Comments
@TempestasLudi The project itself is small, but that table is a historical record of articles.
The co-worker was trying out different algorithms for sentiment analysis and storing each result IN the original record, as opposed to writing it to a new table and tying it back via a foreign key.
Why shouldn't you do that? Well, what if we change the algorithm and its output varies? Would we add yet another column? What if we need to run other algorithms? Would we keep appending a column per algorithm to that table?
With a separate table I can just drop the table. The other way, I have to freeze the source table and wait for the column drop to finish, blocking legitimate writes.
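A minimal sketch of that separate-table approach, using Python's sqlite3 (every table, column, and algorithm name here is hypothetical, not from the actual project):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# The client's original data: treat it as read-only history.
conn.execute("""
    CREATE TABLE articles (
        id   INTEGER PRIMARY KEY,
        body TEXT NOT NULL
    )
""")

# Derived results live in their own table, tied back via a foreign key.
# A new algorithm (or a new version of one) is just new rows here,
# never a new column on articles.
conn.execute("""
    CREATE TABLE sentiment_results (
        article_id INTEGER NOT NULL REFERENCES articles(id),
        algo       TEXT    NOT NULL,
        score      REAL    NOT NULL,
        PRIMARY KEY (article_id, algo)
    )
""")

conn.execute("INSERT INTO articles (id, body) VALUES (1, 'some article text')")
conn.execute("INSERT INTO sentiment_results VALUES (1, 'naive-bayes-v1', 0.42)")

# Discarding an experiment is one statement that never touches the origin:
conn.execute("DROP TABLE sentiment_results")
conn.close()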
@gitoutofhere I feel you, I was in the same situation. My teammates wanted to append calculated data as two extra columns. I tried to make them understand that it's not a good idea: don't append derived data that can diverge from the original data. But I had to give up.
Original Rant
Coworker: So once the algorithm is done, I will append new columns to the SQL database and insert the output there.
Me: I don't like that. Can we put the output in a separate table and link it using a foreign key? Just to avoid touching the original data, you know, to avoid potential corruption.
C: Yes, sure.
< Two days later - over text >
C: I finished the algo. I decided to append it to the original data to avoid redundancy and save on space. I think this makes more sense.
Me: ahdhxjdjsisudhdhdbdbkekdh
No. Learn this principle:
"The original data generated by the client should be treated like the goddamn Bible! DO NOT EVER CHANGE ITS SCHEMA FOR A 3RD-PARTY CALCULATION!"
Put simply: D.F.T.T.O.
Don't. Fucking. Touch. The. Origin!
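To make D.F.T.T.O. concrete, here's a sketch contrasting the two cleanup paths (again Python's sqlite3 with hypothetical names; how much an ALTER TABLE locks or rewrites varies by database engine):

import sqlite3

conn = sqlite3.connect(":memory:")

# Anti-pattern: the result was appended as a column on the origin.
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT, sentiment REAL)")

# Cleaning up now means altering the origin's schema. Depending on the
# engine this can lock or rewrite the whole table, blocking legitimate
# writes while it runs. (DROP COLUMN needs SQLite >= 3.35.)
conn.execute("ALTER TABLE articles DROP COLUMN sentiment")

# D.F.T.T.O.: results in a side table mean cleanup never touches the origin.
conn.execute("CREATE TABLE sentiment_results (article_id INTEGER REFERENCES articles(id), score REAL)")
conn.execute("DROP TABLE sentiment_results")
conn.close()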
rant
coworkers
data integrity