Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "data cleaning"
-
For the Dutch people on here, the new surveillance law in short:
- dragnet surveillance, data retention of normal data is a maximum of 3 years, encrypted data up to 6 years.
- secret DNA database, data retention up to 30(!!) years.
- use of 0days without having to report them to the vendors.
- third parties may be hacked to get to main targets; if my neighbor is suspected they may legally hack me in order to get to him/her.
Cleaning up (removing backdoors etc) afterwards is not required.
- sharing unfiltered (raw) data gathered through dragnet surveillance with foreign intelligence agencies is permitted, even if it's to a country which doesn't have as much 'democracy' as this country does.
Decide for yourself if you're voting (at all) against or in favor of this law, I'm voting against :)
We do need a new/reformed law, this one is just too intrusive imo.34 -
One comment from @Fast-Nop made me remember something I had promised myself not to. Specifically the USB thing.
So there I was, Lieutenant Jr at a warship (not the one my previous rants refer to), my main duties as navigation officer, and secondary (and unofficial) tech support and all-around "computer guy".
Those of you who don't know what horrors this demonic brand pertains to, I envy you. But I digress. In the ship, we had Ethernet cabling and switches, but no DHCP, no server, not a thing. My proposition was shot down by the CO within 2 minutes. Yet, we had a curious "network". As my fellow... colleagues had invented, we had something akin to token ring, but instead of tokens, we had low-rank personnel running around with USB sticks, and as for "rings", well, anyone could snatch up a USB-carrier and load his data and instructions to the "token". What on earth could go wrong with that system?
What indeed.
We got 1 USB infected with a malware from a nearby ship - I still don't know how. Said malware did the following observable actions(yes, I did some malware analysis - As I said before, I am not paid enough):
- Move the contents on any writeable media to a folder with empty (or space) name on that medium. Windows didn't show that folder, so it became "invisible" - linux/mac showed it just fine
- It created a shortcut on the root folder of said medium, right to the malware. Executing the shortcut executed the malware and opened a new window with the "hidden" folder.
Childishly simple, right? If only you knew. If only you knew the horrors, the loss of faith in humanity (which is really bad when you have access to munitions, explosives and heavy weaponry).
People executed the malware ON PURPOSE. Some actually DISABLED their AV to "access their files". I ran amok for an entire WEEK to try to keep this contained. But... I underestimated the USB-token-ring-whatever protocol's speed and the strength of a user's stupidity. PCs that I cleaned got infected AGAIN within HOURS.
I had to address the CO to order total shutdown, USB and PC turnover to me. I spent the most fun weekend cleaning 20-30 PCs and 9 USBs. What fun!
What fun, morons. Now I'll have nightmares of those days again.9 -
What I did wrong during my home office cleaning session this morning:
- put soap on my mouse mat
- snapped my enter key
- vacuumed up my F8 key
- absent-mindedly cut my ethernet cable
- lost my zero key
- dropped my backup hard drive (data was recoverable, but I need a new drive)
- lost one of the nose pads on my glasses
- got a cocktail stick stuck in a USB port
- exploded my mouse by using the wrong type of battery
Things I did good:
- nothing11 -
I'm officially CTC.
Chief Technical Clown 🤡
How do I know? I've yet to write a single line of productive code today. I've spent the day purely as an administrative cog: writing emails, giving data to consultants, supporting juniors, and cleaning up the absolute hellscape that is also known as our Jira project.
I've become exactly what I hate.12 -
The cleaning lady saga continues yet again..
Here in Belgium, cleaning ladies are paid with cheques. All fine and dandy, and apparently the parent organization (Sodexo) even migrated to digital cheques. Amazing!!!
If only they did it properly.
Just now I received an email with my login data.
Login: ${FIRSTNAME}${FIRST2CHARSOFLASTNAME}
Password: I won't reveal the amount of characters.. but it's not even hex. It's just uppercase letters, and far from what I'd deem even remotely secure. Hopefully I'll be able to change that shitty password shortly, and not get it mailed back, even when I ask for recovery. Guess I'll have to check that later - the person who made that account was pretty incompetent when it comes to tech after all. Don't ask me why they did it instead of me. I honestly don't really know either.
With that said, this is a government organization after all... Can I really expect them to hash their passwords?24 -
I've optimised so many things in my time I can't remember most of them.
Most recently, something had to be the equivalent off `"literal" LIKE column` with a million rows to compare. It would take around a second average each literal to lookup for a service that needs to be high load and low latency. This isn't an easy case to optimise, many people would consider it impossible.
It took my a couple of hours to reverse engineer the data and implement a few hundred line implementation that would look it up in 1ms average with the worst possible case being very rare and not too distant from this.
In another case there was a lookup of arbitrary time spans that most people would not bother to cache because the input parameters are too short lived and variable to make a difference. I replaced the 50000+ line application acting as a middle man between the application and database with 500 lines of code that did the look up faster and was able to implement a reasonable caching strategy. This dropped resource consumption by a minimum of factor of ten at least. Misses were cheaper and it was able to cache most cases. It also involved modifying the client library in C to stop it unnecessarily wrapping primitives in objects to the high level language which was causing it to consume excessive amounts of memory when processing huge data streams.
Another system would download a huge data set for every point of sale constantly, then parse and apply it. It had to reflect changes quickly but would download the whole dataset each time containing hundreds of thousands of rows. I whipped up a system so that a single server (barring redundancy) would download it in a loop, parse it using C which was much faster than the traditional interpreted language, then use a custom data differential format, TCP data streaming protocol, binary serialisation and LZMA compression to pipe it down to points of sale. This protocol also used versioning for catchup and differential combination for additional reduction in size. It went from being 30 seconds to a few minutes behind to using able to keep up to with in a second of changes. It was also using so much bandwidth that it would reach the limit on ADSL connections then get throttled. I looked at the traffic stats after and it dropped from dozens of terabytes a month to around a gigabyte or so a month for several hundred machines. The drop in the graphs you'd think all the machines had been turned off as that's what it looked like. It could now happily run over GPRS or 56K.
I was working on a project with a lot of data and noticed these huge tables and horrible queries. The tables were all the results of queries. Someone wrote terrible SQL then to optimise it ran it in the background with all possible variable values then store the results of joins and aggregates into new tables. On top of those tables they wrote more SQL. I wrote some new queries and query generation that wiped out thousands of lines of code immediately and operated on the original tables taking things down from 30GB and rapidly climbing to a couple GB.
Another time a piece of mathematics had to generate all possible permutations and the existing solution was factorial. I worked out how to optimise it to run n*n which believe it or not made the world of difference. Went from hardly handling anything to handling anything thrown at it. It was nice trying to get people to "freeze the system now".
I build my own frontend systems (admittedly rushed) that do what angular/react/vue aim for but with higher (maximum) performance including an in memory data base to back the UI that had layered event driven indexes and could handle referential integrity (overlay on the database only revealing items with valid integrity) or reordering and reposition events very rapidly using a custom AVL tree. You could layer indexes over it (data inheritance) that could be partial and dynamic.
So many times have I optimised things on automatic just cleaning up code normally. Hundreds, thousands of optimisations. It's what makes my clock tick.4 -
Some of the penguin's finest insults (Some are by me, some are by others):
Disclaimer: We all make mistakes and I typically don't give people that kind of treatment, but sometimes, when someone is really thick, arrogant or just plain stupid, the aid of the verbal sledgehammer is neccessary.
"Yeah, you do that. And once you fucked it up, you'll go get me a coffee while I fix your shit again."
"Don't add me on Facebook or anything... Because if any of your shitty code is leaked, ever, I want to be able to plausibly deny knowing you instead of doing Seppuku."
"Yep, and that's the point where some dumbass script kiddie will come, see your fuckup and turn your nice little shop into a less nice but probably rather popular porn/phishing/malware source. I'll keep some of it for you if it's good."
"I really love working with professionals. But what the fuck are YOU doing here?"
"I have NO idea what your code intended to do - but that's the first time I saw RCE and SQLi in the same piece of SHIT! Thanks for saving me the hassle."
"If you think XSS is a feature, maybe you should be cleaning our shitter instead of writing our code?"
"Dude, do I look like I have blue hair, overweight and a tumblr account? If you want someone who'd rather lie to your face than insult you, go see HR or the catholics or something."
"The only reason for me NOT to support you getting fired would be if I was getting paid per bug found!"
"Go fdisk yourself!"
"You know, I doubt the one braincell you have can ping localhost and get a response." (That one's inspired by the BOFH).
"I say we move you to the blockchain. I'd volunteer to do the cutting." (A marketing dweeb suggested to move all our (confidential) customer data to the "blockchain").
"Look, I don't say you suck as a developer, but if you were this competent as a gardener, I'd be the first one to give you a hedgetrimmer and some space and just let evolution do its thing."
"Yeah, go fetch me a unicorn while you're chasing pink elephants."
"Can you please get as high as you were when this time estimate come up? I'd love to see you overdose."
"Fuck you all, I'm a creationist from now on. This guy's so dumb, there's literally no explanation how he could evolve. Sorry Darwin."
"You know, just ignore the bloodstain that I'll put on the wall by banging my head against it once you're gone."2 -
I wrote a prototype for a program to do some basic data cleaning tasks in Go. The idea is to just distribute the files with the executable on our shared network to our team (since it is small enough, no github bullshit needed for this) and they can go from there.
Felt experimental, so I decided to try out F# since I have always been interested with it and for some reason Microsoft adopted it into their core net framework.
I shit you not, from 185 lines of Go code, separated into proper modules etc not to mention the additional packages I downloaded (simple things for CSV reading bla bla)
To fucking 30 lines of F# that could probably be condensed more if I knew how to do PROPER functional programming. The actual code is very much procedural with very basic functional composition, so it could probably be even less, just more "dense"
I am amazed really. I do not like that namespace pollution happens all over F# since importing System.IO gives you a bunch of shit that you wouldn't know where it is coming from unless you fuck enough with Ionide and the docs. But man.....
No need for dotnet run to test this bitch, just highlight it on the IDE, alt enter and WHAM you have the repl in front of you, incremental quasi like Lisp changes on the code can be REPL changed this way, plethora of .NET BCL wonders in it, and a single point of documentation as long as you stay in standard .net
I am amazed and in love, plus finding what I wanted to do was a fucking cakewalk.
Downside: I work in a place in which Python is seen as magic and PHP, VB.NEt and C# is the end all be all of languages. If me goes away or dies there will be no one else in this side of the state to fuck with F#
This language needs to be studied more. Shit can be so compact, but I do feel that one needs to really know enough of functional programming to be good at it. It is really not a pure language like Haskell (then again, haskell is the only "mainstream" pure functional language ain't it not?) but still, shit is really nice and I really dig what Microhard is doing in terms of the .net framework.
Will provide later findings. My entire team is on the Microsoft space, we do have Linux servers, but porting the code to generate the necessary executables for those servers if needed should be a walk in the park. I am just really intrigued by how many lines of code I was able to cut down from the Go application.
Please note that this could also mean that I am a shit Golang dev, but the cut down of nil err checkings do come somewhere.9 -
Finished my project early today. I assumed it would take another day or two since it's primarily research and I had no idea how to progress, but I caught a break and finished it early. I also finished another surprise ticket! yay! I had the rest of the day to myself!
... had!
But then I noticed I had been working on the wrong branch. Fuck. Moving my work over was tedious, as was the cleanup. I kicked myself for good measure. Also, every time I switch branches, I need to run a bloody slow script that runs all the migrations, data tasks, backfills, etc. for the branch. It takes 12-18 minutes. There's a faster version, but it usually breaks things.
Turns out the branch I was supposed to be working on wasn't up to date with master. So I merged that in, leading to....
merge conflicts. Because of course there are conflicts. To make matters worse, I had (and have) no idea which changes were correct because idfk what those 248 new commits are doing. So I guessed at them, ran the script, and (after more waiting) ran a few related specs. Yet more waiting. Sense a pattern here? Eventually they finished, and all the specs passed. H'ray. So I committed the changes, and told Jenkins to kick off a full spec suite, which takes 45+ minutes.
La de da, I go back to cleaning up the previous ticket, pushing reversion commits, etc. Later, I notice the ticket number, look at the branch number I've been working on.... and. Fuuuck. I realize I had put everything on the wrong freaking branch AGAIN. I'm such an idiot. Cue more cleanup, more reversions, running the bloody script again and again. More wasted time, more kicking. ugh.
All of this took well over three hours. So instead of finishing at a leisurely 5:00 like a normal person, I finally stopped around 9pm. and I won't know the Jenkins spec results until morning.
A nice early day?
I should know better.2 -
Last week I was erasing a 2Gb USB thumb while copying some really important shit to my backup disk. I look at the terminal and see it's taking a lot of time to did zeroes on dev/sdb.
Then I realized that dev/sdb is the backups drive and I just erased the firsts sectors of my only fucking backup.
It's ok, I said, let's see what can TestDisk do for me. And it only could find an empty sad partition that had useless shit on it. Whdd couldn't even find the drive. Cat and dd vomited 160Gb of nothing to a file that couldn't be read. I was lost, because I failed doing something I'm really good at. And I did it because I was to stupid to check fstab...
It's the very first time I couldn't recover data, so I'm thinking about delete "Data recovery" from my resume skills and put "Data cleaning. Really effective. I can send you 160Gb of pure horse shit to prove it" instead.2 -
This is a follow-up of my last rant: https://devrant.com/rants/1323422/...
TLDR; My step-son tripped over my HDD power cord, sending it plummeting towards certain death.
So this is just over a year ago. At this point, my GF and I are married, and she's about 7 months pregnant with our daughter. Her son, Nicolas - the one from the last rant - is 13 years old.
So it was a Saturday, and I had Nicolas helping me to clean up the apartment. My wife was off the hook, because, ya know - she's pregnant.
While I was cleaning the living room, I had Nic cleaning the kitchen/dining room area. At this same time, I had my laptop and a 3Tb external USB hard drive on the dining room table, copying a bunch of data or something. This external HDD also had it's own power cord, which was plugged in next to the table.
Next thing I know, I hear an "Ohp!" followed by a crash. It was the horrifying sound of my hard drive plunging 36 inches off the table towards certain death. And death, it had.
Before even checking, I knew this HDD was dead. It took a lot for me not to snap at the kid. I told him to get out of the kitchen and go clean his room. That hard drive... hadn't been backed up. At all, which is on me. Even more so, since that data was really irreplaceable.
Even knowing that the HDD HAD to be dead, I still plugged it in, hoping for a miracle. I got nothing, it wouldn't even spin up.
$ dmesg -w
Showed that linux saw the USB controller and even the HDD controller (it printed out the manufacturer, SeaGate). The data was valuable enough that I was saving up some money to have the data recovered, which would be about $2,000.
However, before I had saved up enough money... My apartment was broken into and all my external HDD's (and some internal ones I had laying around) were stolen.6 -
36.63% of the respondents said they’re not planning to use any new programming languages in the coming 12 months.
But, 18.15% of them said they’re planning to use Python, while 16.83% said they’re planning to use Go, followed by JavaScript with 16.17%.
What about the tools?
Honestly, this was the hardest part of the report since it required very thorough data cleaning, and it turns out developer teams use a wide variety of tools, especially when it comes to testing and project management.24 -
Code fuckup day or what?! After two weeks where I wasn't on my project and a co-worker handled it, I came back to my project and reviewed what he had done so far.
Me: "I don't understand how this new code part here can work?"
Him: "Uhm, actually, it doesn't, somehow."
Me: "..."
Then he had checked in his stuff with spaces while the whole project is with tabs. And variables that were used in a different way, but still under the old name, now completely misleading. Bypassing existing infrastructure and defines with "just for this case" hacks. But the best was tracking higher level state by peeking into lower level data buffers, even pulling out their data definitions into global header files - instead of using proper states in the higher layer itself.
NOT! IN! MY! FUCKING! PROJECT!!!
So I spent the day cleaning up the shit to fight off software rot right in the beginning.4 -
Our tech lead left a mess in the database. He turned his screw-up into an architectural discussion, weighing the benefits of duplicated, messy data vs. keeping it consistent. He suggested that, instead of fixing his broken script and cleaning the mess, we should break all of the other scripts and have them trash the database, too.
They almost believed it.
What a clever maneuver. I wish he would use his cleverness to make good software, instead. -
I was cleaning up my hard drive and deleted some old directories.
I was notified that my backup just started and wanted to look how far along it was.
However, instead of 'ls -l /mnt/DATA/Backup' my shitpile of muscle memory typed 'rm -rf /mnt/DATA/Backup'... That's when my harddrive suddenly had 750GB of free space and I decided that I probably need some sleep.
Before any of y'all wanna lecture me on off-site backups, funny thing: Today I implemented a new daily backup routine (praised be borg) and therefore deleted my somewhat chaotic Backups on my NAS "Because if shit hits the fan, I still have my local Backup"2 -
I was cleaning up dangling images in docker, and I accidentally removed the production database container as well.
Its not a big issue, I can just up the container back and everything should be fine. But after I up the container and connected to the database, I found out there's no data inside. I thought I fucked up, and sent msg in slack channel that I nuked the db.
Later my friend asked me which compose file I am using and that's when I realized I used the wrong config to up the db. Used the correct config to up the database again and everything goes back to normal.
It's friday evening and if I really dropped the db it would be fucking bad weekend....3 -
Is your code green?
I've been thinking a lot about this for the past year. There was recently an article on this on slashdot.
I like optimising things to a reasonable degree and avoid bloat. What are some signs of code that isn't green?
* Use of technology that says its fast without real expert review and measurement. Lots of tech out their claims to be fast but actually isn't or is doing so by saturation resources while being inefficient.
* It uses caching. Many might find that counter intuitive. In technology it is surprisingly common to see people scale or cache rather than directly fixing the thing that's watt expensive which is compounded when the cache has weak coverage.
* It uses scaling. Originally scaling was a last resort. The reason is simple, it introduces excessive complexity. Today it's common to see people scale things rather than make them efficient. You end up needing ten instances when a bit of skill could bring you down to one which could scale as well but likely wont need to.
* It uses a non-trivial framework. Frameworks are rarely fast. Most will fall in the range of ten to a thousand times slower in terms of CPU usage. Memory bloat may also force the need for more instances. Frameworks written on already slow high level languages may be especially bad.
* Lacks optimisations for obvious bottlenecks.
* It runs slowly.
* It lacks even basic resource usage measurement.
Unfortunately smells are not enough on their own but are a start. Real measurement and expert review is always the only way to get an idea of if your code is reasonably green.
I find it not uncommon to see things require tens to hundreds to thousands of resources than needed if not more.
In terms of cycles that can be the difference between needing a single core and a thousand cores.
This is common in the industry but it's not because people didn't write everything in assembly. It's usually leaning toward the extreme opposite.
Optimisations are often easy and don't require writing code in binary. In fact the resulting code is often simpler. Excess complexity and inefficient code tend to go hand in hand. Sometimes a code cleaning service is all you need to enhance your green.
I once rewrote a data parsing library that had to parse a hundred MB and was a performance hotspot into C from an interpreted language. I measured it and the results were good. It had been optimised as much as possible in the interpreted version but way still 50 times faster minimum in C.
I recently stumbled upon someone's attempt to do the same and I was able to optimise the interpreted version in five minutes to be twice as fast as the C++ version.
I see opportunity to optimise everywhere in software. A billion KG CO2 could be saved easy if a few green code shops popped up. It's also often a net win. Faster software, lower costs, lower management burden... I'm thinking of starting a consultancy.
The problem is after witnessing the likes of Greta Thunberg then if that's what the next generation has in store then as far as I'm concerned the world can fucking burn and her generation along with it.6 -
I recently tried to apply the same data analytics rationale that I use at work to my personal life. This is not a rant, it is more like an data storytelling of an actual use case I would like some input on.
I set a goal - gotta thin up a bit and calm down my ticker - and got a (almost unreasonably expensive) field expert consultant to yell at me about it for a couple hours.
I unravel the metrics - there is like a million weight-related KPIs and most say nothing at all. I have never seen an non-infrastructure measurable subject that could not be resumed to 2-5 performance metrics. I got overall weight, how well my nine-years-old business suit fits me, heart rate, and day-after relative muscle pain (it will make sense soon).
Then its data-pipeline time. I bought a cheap weight scale and smartwatch, and every morning I input the data in an app. Yes, I try to put on the suit every morning. It still does not fit.
After establishing a baseline, I tried to fit different approaches. Doing equipment-free exercises, going to the gym, dieting. None was actually feasible in the long run, but trying different approaches does highlight the impacts and the handling profile of each method.
Looking at the now-gathered data, one thing was obvious - can't do dieting because it is not doable to have a shopping list and meals for me and another for the family.
Gym is also off the table - too much overhead. I spend more time on the trip there and back than actually there.
And home exercise equipment is either super crappy or very expensive. But it is also the most reasonable approach.
So it is solutions time. I got a nice exercise bycicle (not a peloton), an yoga mat (the wife already had that one) and an exercise program that uses only those two resources. Not as efficient without dieting, not as measurable and broad as the gym, but it fits my workflow. Deploy to production!
A few months pass and the dataset grows. The signal is subtle but has support - it works! The handling, however, needs improvement, since I cannot often enough get with the exercise program. Some mornings are just after some hard days.
I start thinking about what else I can improve in the program, but it is already pretty lean and full of compromises.
So I pull an engineer and start thinking about the support systems and draft profile. What else could be draining my willpower and morning time?
Chores. Getting the kids ready for school, firing up the moka pot, setting the off-brand roomba, folding the overnight-dried clothes, cooking breakfast, doing the dishes, cleaning the toilets. All part of my morning routine. It might benefit from some automation.
Last month I got that machine our elders call "wasteful" and "useless crap lazy entitled Americans invented because they feel oh-so-insulted for simply doing something by hand like everyone always did" - a "dish-washer".
Heh, I remember how hard was to convince my mother-in-law that an remote-controled electric garage door would not make she look like an spoiled brat.
Still to early to call, but I think that the dishwasher just saved me about 25 mins every morning. It might be enough to save willpower for me to do more exercise.
This is all so reflective of all data analytics cases really are out in the wild - the analytics phase seems so small compared to the gathering and practical problem-solving all around. And yet d.a. is what tells you that you are doing the wrong thing all along. Or on what you should work next.7 -
So.. how does being in a company makes you a better engineer?
For eg, this year my team shipped these 2 interesting features (among other things) that were never part of our product.
these were crazy ideas, that got liked by our users, and might have impacted the finances of company positively.
However the engineers (not me, i worked on them even less) that worked on it, did they got their knowledge growth? they just analysed the old codebase, its shitty architecture and drawbacks, and added the feature around it in a similar manner. so basically it was building debris over the debris
but is this growth? if that engg was me, would my experience in dealing with debris and building debris over it (that somehow works) be helpful for any other organisation?
And number (2) : is my organisation even a good one if its allowing to build debris over debris and not leaving a space for discussion or cleaning the mess or thinking about better architectures, data structures for scalability and robustness?
An engineer's growth should be made by giving them a chance to explore new and best solutions and not the best hacks i guess1 -
How it started:
Need to replace in a lot of SQL files certain stuff...
find . -type f -iname '*.sql' -exec sed -i 's|new|old|g' {} \;
12 hours later that find executed a shell script containing roughly 120 lines of text pipelining.
The jolly of inconsistent workflows.
Different SQL format stylings... Makes fun when single line string replace needs to be extended to multiline RegEx handling. Or matching SQL comment configuration..
Different line endings. MacOS, Windows, Unix, Bukkake.
Different charsets / collations. Anyone wants latin1_swedish_ci... utf8... utf16... :/
Realizing some people even left sensitive data inside the SQL files (e.g. API Tokens..... Yayyyyyyy).
...
Ugh. It's never a one liner. It's never easy. -.-
I hate cleaning up messy shit.3 -
People think Machine Learning is all about using Super Complex Prediction Models...
But turns out to devote most time in data gathering and data cleaning(preprocessing).2 -
!rant
I've always been wondering why do tech companies need everyone to have a strong grasp of algos and data structures?
I've been coding most of my life but didn't get a CS degree so ended up in IT but I kind of want to get into a tech company as my thinking is the quality of code much higher (I spend a lot of time cleaning up other people's code and prod issues over the years...), I've been learning Algo/DS but when I see those technical questions on CareerCup, I go WTF.... it's this the kind of problems you guys do every day?6 -
[NN]
Day3: the accuracy has gone to shit and continues to stay that way, despite me cleaning that damn data up.
Urghhhhhhhh
*bangs head against the wall, repeatedly*10 -
So this web company i joined had a page load time in minutes. The free text search (inverted index search, based on elasticsearch) queries would return results in 10-45 seconds (should be milliseconds always). The indexes had no schema. And they would crawl data and feed into mssql db, which had a 2 gb/db limit on the free version. So everytime the db hit the limit, a new db was created and the name was incremented by one.
Had a very tough time cleaning up that mess. Plus the architect who had made this architecture was on his way out and unhelpful to the core.
What was worse was that most of the changes i did were very simple changes that should have been done long back. Basic sanity changes.4 -
Did the simplest corridor gen I could think of. The tile that makes up corridors is different than the tile that makes up the rooms. They are drawn the same here though. In the floor data structure they are different types. This will allow me to easily place doors the like. The dots are potential door placements.
Now that I have simple room gen working I can work on filling it with 3d models to make up walls, doors, etc.
Most of the time the rooms connect on the whole map. But once in a while they do not. I like this as I will incorporate mining. The final map will be much bigger. This is 32x32 and I want 256x256. I will need to figure out how to determine room density versus grid size.
I need to spend some time cleaning up the code and try and generalize the code. I will need to allow for pregenned rooms as well with defined entry points. The entry points on these rooms is all random. It will probably be tricky to do random room to pregenned room corridors. Proximity seems to work. So prox to a predefined door location should work.5 -
Do you have ever tried to recover a very valuable shredded stack of paper (4 sheets)? They are shredded into A LOT OF PIECES and not stripes as I hoped for!!
After 5 hours work I have found 15 pieces which fit together! I am so pissed about myself and my incompetence when it comes to data cleaning 😡😠🤬14 -
I don't know how much of this can be considered data loss but one one of my uni classmates frustrated by some hellish tasks (cleaning some old code files probably) decided that everything in that particular directory won't be of any further need, so she procede to rm -rf it.. only to discover that the terminal opened in that dir was another one and her current one (the one she bashed that unforgiving rm) was in fact a standard freshly opened term where any term would open.. in the user's (only user) home dir... such a face she had when all her codes, homeworks, projects and everything went to oblivion 😂😂 jokes aside it was a good thing that the semester was almost finished, all hws submited and no important data was there as she dual booted with ubuntu and some windows, but funny thing how such a honest mistake can ruin not only your day, but maybe your entire semester1
-
So a Buggy production code caused data discrepency. This is fixed, and now the only task left is cleaning up the data, and this will be in my backlog forever.
-
I started running a Database benchmark yesterday morning, with my system configuration, expected time to complete was 36hrs(arround), so I left it and made sure no one disturbs (I stuck a note in the monitor) because it was on common system in the lab.
Then I went to my other work.
Evening ,I came to check the progress, my monitor was switched off, I thought its in power saving mode!
Fuck, I bend down and see the CPU is off!
Wtf!! Who shut it down ,even after the note.
Then I saw the electric outlet was off!
Then after wards asking ppl in the lab, they told ,the cleaning person was cleaning the switches, so yeah she could have by mistake!
* I facepalmed *
So again, I set it up with frustration!
Today morning ,I came to see the progress
FML, from no where ,
" It's in Windows automatic repair loop! "
It's been 3hrs, trying to get out of that loop without loosing the data.
1TB of data is there, took 1month to setup all the things
Fuck Microsoft for adding these kind idiotic stuff in windows.
Is there a spirit in the lab not allowing me to do benchmark? -
I have a ton.
This one is more about my own foolhardiness, but I also learned a ton and came out stronger & wiser.
when I started out as a dev (my very first job!) and had not learned to say no.
I was a novice way out of my playground.
Like lifting a full booking platform from legacy php to Laravel & launching it like yesterday because it’s high season.
Didn’t know the full domain, so I just built something quite different in a week and shafted the existing db into it.
Obviously wasn’t feature complete or anything, so it resulted in maintaining legacy while building the new one and because it was already live and on different domains, we didn’t fully know which ppl went to, I had to every day painstakingly back port data from both platforms.
What I initially thought would take a few weeks that was launched in 3 days, spanned across 2 years plus one year refining and cleaning up my mess.1 -
I work as a data engineer in my company. My senior calls himself data scientist- he is 29 years old and recently did one MOOC on data science!
I wonder when my colleagues will find out about how much he really knows.
Till then I am cleaning and arranging his data, while he sits and earns a big fat package by citing one data scientist tag to his profile!! -_-1 -
Nothing much to ready today, keep scrolling..
I just asked you to keep scrolling, I am using this space to think out loud...
Damn you bloody rebel.. whatever..
Finally after a rough week, festivals, interviews, work stress, and pending tasks, I got a free weekend for myself to be with myself.
I managed to do bare minimum at work. My new line manager isn't quite pleased with how team and I am functioning but whatever.
On Fridays, I usually end the day early and start with personal tasks. I managed to finish some long pending activities.
Today, I was able to do a deep cleaning of digital housekeeping. Sorted some clashes with parents. manage to de-stress and relax my stiff neck muscles.
Apart from that I guess, I am all prepared to interview and get hired for a company on foreign land. I am confident that I can relocate to EU.
And for now, I am actively pursuing two of my hobbies, Music and Finances. I love managing my finances and learning more about technical aspects of audio and listening to more and more music.
I feel happier, relaxed, and calm. Having things under control is such a wonderful feeling.
And I am slowly building a framework to earn, manage, invest, and grow my finances. It's turning out really well. I have setup the base infrastructure.
For music, I have figured the fundamentals and now I will go out buy myself an DAC/AMP to build a portable rig.
This shit is so awesome and makes me happy. I am able to socialise at the end of each day so that keeps me going during the lock-down phase.
I have figured the top key and important things to do at work for my profile and I actually enjoy those.
1. Product discovery - talking to users/customers and finding their pain areas and opportunities to build the solution
2. Product vision/strategy - Dreaming on how the product would evolve and laying out a solid plan to materialise those dreams.
3. Roadmap and prioritisation - this should be self explanatory
4. Success metrics - I really want to get into data and I am getting opportunities to do so. This is super fun. This will help me analyse and show the impact of the what we are building and measuring it while making sure that LT recognises my and my teams' efforts.
I want to and I will excel these 4 keys skills of my profile and be more efficient at my job.
This will give me more time to pursue my hobbies (which will change over time and want to enjoy them the most while I am at them).
Guys, after a rough 2021, the end of the year seems promising with a lot of leaves and short vacation coming up.
Apart from all this, what is more important here is that I got the career and life clarity that I was struggling with for past few months.
For whoever has read till here, YOU ARE BLOODY AWESOME and thank you from the bottom of my heart for being there for me always.
I am grateful to be a part of this community and have awesome friends like you all who have been with me though my ups and downs since 2016.
LOVE YOU ALL :)3 -
This morning I found out that the code I wrote to convert json data to a new format in our DB was giving errors and a bunch of questions got saved with the wrong property. It was assumed when it was triaged with my boss that we would only see one key property so the code written by me so the code was aimed at that. Well some questions have multiple keys for no reason. They are mostly floating data that hasn't been wiped clean because the develop who wrote this use json data in psql with no validation or data cleaning. This edge case was also never caught on PR reviews and we got a pretty heavy review process. I'm not being blamed for it. Most of it I think all the devs feel bad we didn't catch this because it affected us greatly. I've been working all morning trying to resolve it with my boss and just now in the evening we stopped. I just feel like I'm not a good dev at all and just want advice on how to deal with situations like this. I'm a new dev and this is my first job I have held for almost a year2
-
Spring cleaning my laptop. Moved my music etc to external hdd. Somehow I deleted all music. Ease of USE data recovery charging $65 for license. No previous version to restore. When I scan with Ease I can see my files. Someone help?24
-
Every time you give me a CSV to import, why is it arranged differently to the last one, missing info that is required for the point of the thing I'm importing it into, and I have to spend 2 hours going through it with a text editor and even a hex editor? Your data entry is about as clean as my arse after a particularly spicy burrito.2
-
I spent 4 days making this:
https://txstc55.github.io/us_crime_...
Cleaning data, learning threejs, optimizing the search because threejs is slow as shit, etc
Tell me I’m awesome (please)