Search - "big data"
-
buzzword translations:
"cloud" -> someone's computer
"big data" -> lots of somewhat irrelevant data
"ai" -> if if if if if if if if if if if if if else
"algorithm" -> something that works but you don't know why
"secure" -> https://
"cyber security" -> kali linux + black hoodie
"innovation" -> adding something completely irrelevant such as making a poop emoji talk
"blockchain" -> we make lots of backups
"privacy" -> we store your data, we just don't tell you about it
-
Buzzword dictionary to deal with annoying clients:
AI—regression
Big data—data
Blockchain—database
Algorithm—automated decision-making
Cloud—Internet
Crypto—cryptocurrency
Dark web—Onion service
Data science—statistics done by nonstatisticians
Disruption—competition
Viral—popular
IoT—malware-ready device
-
I miss old internet.
- without politics
- without robots
- without money
- without big portals
- without commercials
- without advertising
- without data centers
- without ipv6
but with great usenet and community
Shit fuck I’m old
-
I'd like to extend my heartfelt fuck-you to the following persons:
- The recruiter who told me that at my age I wouldn't find a job anymore: FUCK YOU, I'll send you the candles from my 55th birthday cake, you can put all of them in your ass, lit.
- The Project Manager that after 5 rounds of interviews and technical tests told me I didn't have enough experience for his project: be fucked in an Agile way by all members of your team, standing up, every morning for 15 minutes, and every 2 weeks by all stakeholders.
- The unemployment officer who advised me to take low level jobs, cut my expenses and salary expectations: you can cut your cock and suck it, so you'll stop telling bullshit to people.
- The moron that gave me a monster technical assignment on Big Data, which I delivered, and didn't give me any feedback: shove all your BIG DATA in your ass and open it to external integrations.
- The architect who told me I should open my horizons, because I didn't like React: put a reactive mix in your ass and close it, so your shit will explode in your mouth.
- The countless recruiters who used my CV to grow their DB, offering fake jobs: print all your DB on paper and stuff your ass with that, you'll see how big you will be.
To all of them, really really fuck you.
-
*Interview*
Interviewer: We have an opening. Are you interested to work?
Me: What is it that I'll be doing?
I: What technologies and languages do you know?
Me: I know Scala, Java, Spark, Angular, Typescript, blah blah. What is your tech stack?
I: Any experience working on frontend?
Me: Yes. But what do you use for it?
I: Can you work with databases?
Me: I can, on SQL based. What are yours?
I: Can you do big data processing?
Me: I know Spark, if that's what you are asking for. What is it that you actually do?
I: Any experience in cloud development?
Me: Yes. AWS? Azure? GCP?
I: Do you know CI CD?
Me: Excuse me.. I've been asking a lot of questions but you're not paying attention to what I'm asking. Can you please answer the questions I asked.
I: Yes. Go ahead.
Me: What will be my position?
I: A full stack developer.
Me: What technologies do you use in your project?
I: We use all the latest tech.
Me: Like?
I: All latest tech.
Me: You mentioned big data processing?
I: Yes. Processing data from DB and generating reports.
Me: what do you use for that?
I: Java.
Me: Are you planning to rebuild it using Spark or something and deploy in the cloud?
I: No we're not rebuilding it. Just some additions to the existing.
Me: Then what's with cloud? Why did you ask for that?
I: Just to know if you're familiar.
Me: So I'll be working with Java. Okay. What do you use for UI?
I: Flash
Me: 🙄
I sat for a couple of minutes contemplating life.
I: Are you willing to join?
Me: No. Not at all. Thank you for the offer.
-
This can annoy the hell out of me. When people ask if they can have my Facebook or WhatsApp or something and I'm like 'sorry, I don't have that', they ask why, I explain it's for privacy reasons, and they go like 'oh, you're a little paranoid, are ya?'.
There's a motherfucking big difference between wanting control over your data as much as possible and being paranoid.
Fucking hell.
-
Me: you should not open that log file in Excel, it's almost 700 MB
Client: it's okay, my computer has 4 GB RAM
Me: *watching the client's computer crash*
Client: the file is broken!
Me: no, you just need to use a more memory-efficient tool, like R, SAS, Python, C#, or like anything else!
-
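On the 700 MB log file above: a minimal sketch of what "a more memory-efficient tool" could look like in Python, assuming a plain-text log with one entry per line. The function name `count_matching_lines` and the `ERROR` marker are made up for illustration; the rant doesn't say what the log actually contains.

```python
import os
import tempfile


def count_matching_lines(path, needle):
    """Count lines containing `needle`, reading one line at a time.

    Iterating over the file object streams it from disk, so memory use
    stays roughly constant regardless of the file size.
    """
    count = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if needle in line:
                count += 1
    return count


# Tiny stand-in for the real log (hypothetical contents).
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR disk full\n")
    sample_path = tmp.name

error_count = count_matching_lines(sample_path, "ERROR")
print(error_count)
os.remove(sample_path)
```

The same loop works on a 700 MB file, because nothing here ever holds more than one line in memory at once.
-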
I guess that is what you get for bringing up security issues on someone's website.
Not like I could read, edit or delete customer or company data...
I mean what the shit... all I did was try to help and he gives me THIS? I even offered to help... maybe he got angry cause I kind of threw it in his face that the whole fucking system is shit and that you can create admin accounts with ease. No, it's not a framework or anything, just one big PHP file with GET parameters deciding which function to use. One fucking file where everything goes into.
-
If Big O notations were emojis. This chart shows you common big-Os with emoji showing how they'll make you feel as your data scales. Source: blog.honeybadger.io
-
Was programming on the privacy site REST api.
Needed a break and started searching for a good movie or documentary.
Found a documentary about big data/mass surveillance.
I now have loads of motivation for programming on this again as this showed me the importance of secure services/software.
-
A few years ago:
In the process of transferring MySQL data to a new disk, I accidentally rm'ed the actual MySQL directory, instead of the symlink that I had previously set up for it.
My guts felt like dropping through to the floor.
In a panic, I asked my colleague: "What did those databases contain?"
C: "Raw data of load tests that were made last week."
Me: "Oh.. does that mean that they aren't needed anymore?"
C: "They already got the results, but might need to refer to the raw data later... why?"
Me: "Uh, I accidentally deleted all the MySQL files... I'm in Big Trouble, aren't I?"
C: "Hmm... with any luck, they might forget that the data even exists. I got your back on this one, just in case."
Luck was indeed on my side, as nobody ever asked about the data again.
-
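On the deleted MySQL directory above: a small sketch of a guard that removes a path only if it is a symlink, assuming the goal is to drop the link and keep its target. `remove_symlink_only` is a hypothetical helper, not anything from the story, and creating symlinks may need extra privileges on Windows.

```python
import os
import shutil
import tempfile


def remove_symlink_only(path):
    """Remove `path` only if it is a symlink; never touch a real directory."""
    if not os.path.islink(path):
        raise ValueError(f"refusing to remove {path!r}: not a symlink")
    # os.unlink removes the link itself; the target stays in place.
    os.unlink(path)


# Demo in a scratch directory (hypothetical names).
workdir = tempfile.mkdtemp()
real_dir = os.path.join(workdir, "mysql_data")
link = os.path.join(workdir, "mysql")
os.mkdir(real_dir)
os.symlink(real_dir, link)

remove_symlink_only(link)
link_gone = not os.path.lexists(link)
target_survived = os.path.isdir(real_dir)

try:
    remove_symlink_only(real_dir)
    refused = False
except ValueError:
    refused = True

shutil.rmtree(workdir)
```

Pointing the helper at the real directory raises instead of deleting, which is the whole point of the check.
-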
So Facebook provided unlimited data access to loads of companies including spotify/microsoft and other big names.
Although there are privacy rules, those companies had deals which excluded them from these privacy rules.
I don't think my custom DNS server or a pihole is enough anymore, let's firewall block all Facebook's fucking ip ranges.
Source: https://fossbytes.com/facebook-gave...
-
Manual Data Entry: Most boring job
This reminds me of one conversation with one of my faculty..
Faculty: Why not try some Machine Learning Project?
Me: Cool. Any ideas you have already thought
Faculty: Comes up with a really noble idea
Me: Awesome idea. But we need data
Faculty: Don't worry. I will get it. Just help me setup Hadoop (see the irony.. no data yet, and he wants big data setup)
Me: But we don't have data. Let's focus on data collection, Sir
Faculty: I will get it. Don't worry. Trust me.
( I did setup for him twice coz he formatted the system on which I did the setup first time)
After 6 months,
Me: (same question) Sir, Data??
Faculty: I got it.
Me: Great. Give me, I can start looking into it from today.
Faculty: Actually, it's in a register written manually in a different language (which even I can't understand) I will hire data entry guys to convert it into English digital contents.
Me: *facepalm*
Road to Manual data entry to Big Data
Dedicating this pencil to the individuals keeping the register up to date and Sir in hopes of converting it into big data..
Long way to go..
-
Although it might not get much follow up stuffs (probably a few fines but that will be about it), I still find this awesome.
The part of the Dutch government which keeps an eye on data leaks, how companies handle personal data, if companies comply with data protection/privacy laws etc (referring to it as AP from now on) finished their investigation into Windows 10. They started it because of privacy concerns from a few people about the data collection Microsoft does through Windows 10.
It's funny that whenever operating systems are brought up (or privacy/security) and we get to why I don't 'just' use windows 10 (that's actually something I'm asked sometimes), when I tell that it's for a big part due to privacy reasons, people always go into 'it's not that bad', 'oh well as long as it's lawful', 'but it isn't illegal, right!'.
Well, that changed today (for the Netherlands).
AP has concluded that Windows 10 is not complying with the Dutch privacy and personal data protection law.
I'm going to quote this one (trying my best to translate):
"It appears that Microsoft's operating system follows every step you take on your computer. That gives a very invasive image of you", "What does that mean? Do people know that, do they want that? Microsoft should give people a fair chance for deciding this by themselves".
They also say that unless explicit lawful consent is given (with enough information on what is collected, for what reasons and what it can be used for), Microsoft is, according to law, not allowed to collect its telemetry through Windows 10.
"But you can turn it off yourself!" - True, but as the paragraph above said, Dutch law requires that people are given more than enough information to decide what happens to their data, and collection is not allowed until it is explicitly/lawfully OK'd by a person who has had enough information to make a well-educated decision.
I'm really happy about this!
Source (Dutch, sorry, only found it on a Dutch (well respected) security site): https://security.nl/posting/534981/...
-
Some companies be like-
.. In job posting - We are the next big thing. We are going to change the industry. We are like Google / Facebook etc...
..in Introduction - We are the next big thing. We are going to change the industry. We are like Google / Facebook etc...
.. in Interviews - We are the next big thing. We are already changing the industry. Think of us like Google / Facebook etc...
.. during Interviews - Our interview process is rigorous because we are the next big thing. We are going to change the industry. We are like Google / Facebook etc...
.. questions in interviews - Since we are Google / Facebook, please answer questions on Java, C/C++, JS, react, angular, data structure, html, css, C#, algorithms, rdbms, nosql, python, golang, pascal, shell, perl...
.. english, french, japanese, arabic, farsi, Sinhalese..
.. analytics, BigData, Hadoop, Spark,
.. HTTP(s), tcp, smpp, networking,.
..
..
..
.. starwars, dark-knight, scarface, someShitMovie..
You must be willing to work anytime. You must have 'no-excuses' attitude
.........................................
Now in Salary - Oh... well... yeah... see.... that actually depends on your previous package. Stocks will be given after 24 re-births. Joining bonus will be given once you lease your kidneys.
But hey, look... We got free food.
Well, SHOVE THAT FOOD UPTO YOUR ASS.
FUCK YOU...
FUCK YOUR 'COOL aka STUPID PIZZA BEER - CULTURE'.
FUCK YOUR 'FLAT- HIERARCHY'.
FUCK YOUR REVOLUTIONARY-PRODUCT.
FUCK YOU!
-
Just wanted to say a 'thank you' to all people who bear with my privacy stuffs! I know quite some people who installed messaging apps, signed up to privacy services and so on, solely because --> I <-- want to communicate in private and I realise (I've always realized that though) that that can be tough sometimes.
Also a thank you to those people for not requiring me to get data fed into the big companies :).
Thanks!
-
My client is trying to force me to sign an ethics agreement that would allow them to sue me if found in breach of it. At the same time they are scraping eBay's data without their consent and refuse to sign the licence agreement. Apparently they don't understand irony.
-
Mum: Is this the big data?
Father: Do you know anything about Bitcoin? Can you explain me what it is?
-
National Health Service (NHS) in the UK got hacked today... Workers at the hospitals could not access patient and appointment related data... How big a cheapskate you gotta be to hack a free public health service that is almost dying from funding shortages anyway...
-
When I was 10 my younger brother saved over my fully completed pokedex in Pokemon blue.
First big data loss taught me a good life lesson. Now I back up everything on a local server.
-
Overheard some guy talking about robotics on the phone, turns out it was all about MS excel macros.
People need to stop abusing terms like big data, AI etc. to make themselves sound 'smart' 🙄
-
I've dreamed of learning business intelligence and handling big data.
So I went to a university info event today for "MAS Data Science".
Everything sounded great. Finding insights out of complex datasets, check! Great possibilities and salary.
Yay! 😀
Only after an hour did they explain that the main focus of this course is on leading a library, museum or an archive. 😟 huh why? WTF?
Turns out, they've relabeled their librarian education course to Data Science to get people's attention.
Hey you cocksuckers! I want my 2 hours of wasted life time back!
Fuck this "english title whitewashing"
-
You may know about my dumb CTO, if not, read here: https://devrant.io/rants/854361/...
Anyway, the dumbass emailed me this weekend asking “what is big data?”
So I replied: “...it’s when you use a large font in your code...”
He thanked me. I bet he will be at some presentation somewhere and will reference using large fonts in an IDE!!!
-
"Big data" and "machine learning" are such big buzz words. Employers be like "we want this! Can you use this?" but they give you shitty, ancient PC's and messy MESSY data. Oh? You want to know why it's taken me five weeks to clean data and run ML algorithms? Have you seen how bad your data is? Are you aware of the lack of standardisation? DO YOU KNOW HOW MANY PEOPLE HAVE MISSPELLED "information"?!!! I DIDN'T EVEN KNOW THERE WERE MORE THAN 15 WAYS OF MISSPELLING IT!!! I HAD TO MAKE MY OWN GODDAMN DICTIONARY!!! YOU EVER FELT THE PAIN OF TRAINING A CLASSIFIER FOR 4 DAYS STRAIGHT THEN YOUR GODDAMN DEVICE CRASHES LOSING ALL YOUR TRAINED MODELS?!!
*cries*
-
Today a colleague was making weird noises because he was modifying some data files where half of the data needed to be updated with a name field, there were 4 files all about 1200 lines big.
I asked how he was doing and he said he was ready to kill himself. After he explained why, I asked why he was doing it manually. He said he normally uses regex for it but he couldn't do this with a regex.
I opened VS Code for him, used the multiselect thing (CTRL+D) and changed one of the files in about 2 minutes. Something he had been working on for over half an hour already.... He thanked me about a million times for explaining it to him.
If you ever find yourself in a position where you have a tedious task which takes hours, please ask if somebody knows a way of doing it quicker. Doing something in 2 minutes is quite a bit cheaper and better for your mental state than doing the same thing manually in 3 hours (our estimate).
-
What kind of cum gargling gerbil shelfer stores and transmits user passwords in plain text, as well as displays them in the clear, Everywhere!
This, alongside other numerous punishable by death, basic data and user handling flaws clearly indicate this fucking simpleton who is "more certified than you" clearly doesn't give a flying fuck about any kind of best practice that if the extra time was taken to implement, might not totally annihilate the company in lawsuits when several big companies gang up to shower rape us with lawsuits over data breaches.
Even better than that is the login fields don't even differentiate between uppercase or lowercase, I mean WHAT THE ACTUAL FUCK DO YOU SELF RIGHTEOUS IGNORANT CUNTS THINK IS GOING TO HAPPEN IN THIS SCENARIO?
-
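On the plain-text passwords above: a minimal sketch of the usual alternative, storing a salted hash and comparing in constant time, using only Python's standard library. The iteration count and the function names are assumptions for illustration; take real parameters from current guidance rather than from this sketch.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for the demo; tune for real use


def hash_password(password):
    """Return (salt, digest) for storage; the password itself is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected_digest)


salt, stored = hash_password("Correct Horse")
exact_match = verify_password("Correct Horse", salt, stored)
wrong_case = verify_password("correct horse", salt, stored)
print(exact_match, wrong_case)
```

Because the hash is computed over the exact bytes typed, a different letter case fails verification, and nothing that could be displayed "in the clear" is ever kept.
-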
I always thought that this could only happen with big orgs with precious data. One of my coworkers sent me this last night
-
I absolutely hate the way we are taught programming in Indian colleges.
FML #1: I'm pursuing a UG CS course, and this semester, I only had one subject of Computers, that too only 1 credit. The rest with all electronics.
FML #2: In that 1 credit course, we had to make a C++ project which had "data handling". No one cares if you build something cool or not, just that a project should have "extensive use" of data handling.
FML #3: Source code had to be >= 1000 lines. This is the only place where ADDING MORE LINES OF CODES THAN REDUCING IT is appreciated. Had to stuff my code with all kinds of comments and violating the basic principle of DRY.
So, yeah, we're fucked big time. 😥
-
Developer vs Tester
(Spoiler alert: developer wins)
My last development was quite big and is now in our system testing department. So last week I got a call from the tester every 20 minutes, saying that something did not work as expected. About 90% of the time I looked at the testing setup or the logs and told him that the data was wrong or he had used the tool wrong. After a couple of days I got mad because of his frequent interruptions. So I decided to make a list. Every time he came to me with an "error" I checked it and made a tally mark for "User Error" or "Programming Error". He did not like that much, because the User Error column started to grow fast:
User Errors: ||||| |||
Programming Errors: |||
Now he checks his testing data and the logs 3 times before he calls me and he hardly finds any "errors" anymore.
-
Whoever implemented the data import in Numbers on Mac needs to be lined up against a wall and shot with needles until they wish they were dead.
Why on all of god's unholy green and shitty earth would I want data I import (EVEN IN CSV FOR FUCK SAKE) to be delimited by an arbitrary text width? WHAT THE ACTUAL FUCK
WHY WHY why would I EVER want to delimit my carefully structured data by fucking text width instead of new line or comma? AAAAARRRHHH
And what fucking big brain genius made this the DEFAULT SETTING for imported text AND CSV FILES. IT STANDS FOR COMMA SEPARATED VALUES YOU FUCK BOI MAYBE JUST MAYBE I WANT IT SEPARATED BY FUCKING COMMMMMMMAAAAASSSSSS
-
GOT AN A+ FOR MY LAST PROJECT OF HIGH SCHOOL!!! SO FUCKING HAPPY!!!
(by the way, we built a search engine for this project. A pretty big and fast one too)
-
A company gave a placement talk in college today.
First, they talked about their company's facts and figures, which no one was interested in.
Second, they talked about Amazon and Jeff's vision, AirBnB and their revolutionary idea, more than their own company and products.
Third, they showed some testimonial videos of their employees and customers.
"What the fuck is going on?" I thought. We were there to get information about a placement test.
Buzzwords started coming in. Machine Learning, Artificial Intelligence, Big Data and what not.
Last 15 minutes, a guy came. He talked about test date, test format and test topics, finally.
An hour and half wasted for 15 minutes of information.
Fuck placement talks.
-
From a design meeting yesterday:
MyBoss: "The estimate hours seem low for a project of this size. Is everything accounted for?"
WebDev1: "Yes, we feel everything for the web site is accounted for."
-- ding ding...my spidey sense goes off
Me: "What about merchandising?"
MerchDevMgr: "Our estimate pushed the hours over what the stakeholders wanted to spend. Web department nixed it to get the proposal approved."
MyBoss: "WTF!? How the hell can this project go anywhere without merchandising entering the data!?"
WebDev2: "It's fine. We'll just get the data from merchandising and enter it by hand. It will only be temporary."
Me: "Temporary for who? Are you expecting developers to validate and maintain data?"
WebDev1: "It won't be a big deal."
MyBoss: "Yes it is! When the data is wrong, who are they going to blame!?"
WebDev1: "Oh, we didn't really think of that."
MerchDevMgr: "I did, but the CEO really wants this project completed, but the Web VPs would only accept half the hours estimated."
Me: "Then you don't do it. Period. It's better to do it right the first time than to half-ass it. How do you think the CEO will react to finding out developers are responsible for the data entry?"
MerchDevMgr: "He would be pissed."
MyBoss: "I'm not signing off on this design. You can proceed without my approval, but I'll make a note on the document as to why. If you talk to Eric and Tom about the long term implications, they'll listen. At the end of the day, the MerchVPs are responsible to the CEO."
WebDev1: "OK, great. Now, the database, it should be SQLServer ..."
I checked out after that...daydreamed I was a viking.
-
You've heard it!! To become a Web 3.0 daveluper, you have to do Blockchain programming, and you get additional investor points when you do artificial intelligence, Big Data and IoT 😝
-
True story.
Some clients (especially in India) don't want to pay, but they want everything to be implemented in the project.
Big data.... Check
Machine learning.... Check
Deep learning..... Check
Espresso maker.... Check.
They want all the buzzwords that are buzzing to be put in your project and they want you to put it in the 'cloud', for which you have to pay.....
-
Creating an anonymous analytics system for the security blog and privacy site together with @plusgut!
It's fun to see a very simple API come alive with querying some data :D.
Big thanks to @plusgut for doing the frontend/graphs side on this one!
-
Everybody talking about Machine Learning like everybody talked about Cloud Computing and Big Data in 2013.
-
I just got four CSV reports sent to me by our audit team, one of them zipped because it was too large to attach to email.
I open the three smaller ones and it turns out they copied all the (comma separated) data into the first column of an Excel document.
It gets better.
I unzip the "big" one. It's just a shortcut to the report, on a network share I don't have access to.
They zipped a shortcut.
Sigh. This'll be a fun exchange.
-
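On the CSV-in-one-column reports above: a sketch of how such rows could be split back into fields, assuming each cell in the first column still holds one intact comma-separated line. The sample rows and the name `split_first_column` are invented for illustration.

```python
import csv


def split_first_column(cells):
    """Parse each cell as one CSV record, honouring quoted commas."""
    return list(csv.reader(cells))


# Hypothetical first-column contents copied out of the spreadsheet.
first_column = [
    "id,name,amount",
    '1,"Smith, Jane",10.50',
    "2,Bob,7.25",
]

rows = split_first_column(first_column)
for row in rows:
    print(row)
```

Using `csv.reader` rather than `str.split(",")` keeps a quoted value such as `"Smith, Jane"` together as one field.
-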
When there are no widely approved Swedish translations of big data terminology, such as "big data" itself. When discussing these kinds of terms you have to resort to using the English words for them, which results in a horrible language mix, Swenglish.
-
My co-workers hate it when I ask this question on a technical interview, but my common one is "what is the difference between a varchar(max) and varchar(8000) when they are both storing 8000 characters"
Answer, you cannot index a varchar(max). A varchar(max) and varchar(8000) both store the data in the table but a max will go to blob storage if it is greater than 8000.
No one ever knows the answer but I like to ask it to see how people think. Then I tell them that no one ever gets that right and it isn't a big deal that they don't know it, as I give them the answer.
-
Client: we need a big data implementation in AWS to be fully HA and DR.... Money is no object
*3 weeks later when the bill comes in *
Client: it's too expensive, we don't need this HA stuff, we don't even know what it stands for anyhow, so can you take it out? But the system still needs 24/7 availability....
-
We are building this big-data engine for a client's product for which we were using a cluster on GCP and they were billed ~1100$ for the last month's usage.
The CTO - the CHIEF FUCKING TECHNOLOGY OFFICER - told us to hook up 5-6 laptops in our server room and create our own cluster because they cannot afford such a big bill.
-
Spend 14 hours a week studying more with my free time.
Things to be studied:
-discrete math
-data structures
-algorithms
-coding challenges
-problem defining
-abstraction
-other relevant maths
Other things I want to improve:
-confidence at work
-reaching out to teams with questions
-social skills
-time management
-enjoying the little things
-patience
-consistency (with everything above)
Last big thing would be being more conscious with what type of data/platforms I am digesting everyday. Just like a good diet I want to get in the habit of consuming “good” useful content that’s thought provoking or knowable rather than fast food social media carbs
Wish everyone a productive New Year!
-
Coworker: since the last data update this query kinda returns 108k records, so we gotta optimize it.
Me: The api must return a massive json by now.
C: Yeah we gotta overhaul that api.
Me: How big do you think that json response is? I'd say 300Kb
C: I guess 1.2Mb
C: *downloads json response*
Filesize: 298Kb
Me: Hell yeah!
PM: Now start giving estimates this accurate!
Me: 😅😂
-
veekun/pokedex
https://github.com/veekun/pokedex
It's essentially all meta you need to make a pokemon game, in csv files.
Afaik, they ripped the information from the original games, so you can be sure about their validity.
I love how it's easy to use, isn't some weird ass formatted wiki and even has scripts to load it into your database.
Me being a huge pokemon fan, that's the non plus ultra.
-
My code review nightmare?
All of the reviews that consisted of a group of devs+managers in a conference room and a big screen micro-analyzing every line of code.
"Why did you call the variable that? Wouldn't it be more efficient to use XYZ components? You should switch everything to use ServiceBus."
and/or using the 18+ page coding standard document as a weapon.
PHB: "On page 5, paragraph 9, sub-section A-123, the standards dictate to select all the necessary data from the database. Your query is only selecting 5 fields from the 15-field table. You might need to access more data in the future and this approach reduces the amount of code change."
Me: "Um, if the data requirements change, wouldn't we have to change code anyway?"
PHB: "Application requirements are determined by our users, not you. That's why we have standards."
Me: "Um, that's not what I ..."
PHB: "Next file, oh boy, this one is a mess. On page 9, paragraph 2, sub-section Z-987, the standards dictate to only select the absolute minimum amount of the data from the database. Your query is selecting 3 fields, but the application is only using 2."
Me: "Yes, the application is not using the field right now, but the user stated they might need the data for additional review."
PHB: "Did they fill out the proper change request form?"
Me: "No, they ...wait...Aren't the standards on page 9 contradictory to the standards on page 5?"
PHB: "NO! You'll never break your cowboy-coding mindset if you continue to violate standards. You see, standards are our promise to customers to ensure quality. You don't want to break our promises...do you?"
-
A lot of brainwashed people don't care about privacy at all and always say: "I've got nothing to hide, fuck off...". But that is not true. Any information can be used against you in the future when "authorities" release some kind of social credit system like China's. Stop giving your data away for free to big companies.
https://medium.com/s/story/...
-
Currently, a classmate and I are working on our technical thesis.
It is all about industry 4.0, IIoT, big data and stuff.
This week, we presented our interim results to our supervisor. He is very pleased with our work and made the following suggestion:
He thinks it would be awesome to publish our work on our own GitHub repository and make it open source because he is convinced that this thesis is able to kind of "set a new standard" in some specific fields of using big data analysis in production processes.
I guess I'm kind of proud :)
-
Remove all the outdated and unwanted topics which were taught during Indus Valley civilization like: 8080 microprocessor, Java 6, Software Testing principles etc. And add more interesting and realistic topics like: Algorithm design, graphs and other data structures, Java 8 (at least for now), big data, Basics about AI, etc.
-
User: Hey, we got a big issue with one of your tools. One of your pages isn't loading.
Me: Ok, so when did this happen?
User: We don't know? It's been like that for a long time though, so we thought it was normal 😃
Me: ....ok. So do you know what data is supposed to appear?
User: Uhhh we're not sure either. Since, you know, it's been like that for a while.
Just great 😑
-
This might actually be my first real rant.
Whatever fucking cockgoblin decided that making dynamics GP so fucking confusing needs to suck a big bag of dicks. I'm so fucking tired of having to google every damned table name and column name because nothing makes any motherfucking sense.
Am I supposed to instinctively know what PM20201 does? What data it holds? I don't mind reading documentation. But it's hard to even know where to start when the shitbird API and database are more complicated than calculating orbital fucking decay.
I am done. Fuck you gp. Fuck you and your nonsense. I guess our sales people don't get to know when an invoice was paid.
-
DEAR CTOs, PLEASE ASK THE DEVELOPER OF THE SOFTWARE WHICH YOU ARE PLANNING TO BUY IN WHAT LANGUAGE AND WHAT VERSION IT IS WRITTEN IN.
Background: I worked a LONG time for a software company which developed a BIG CRM software suite for a very niche sector. The software company was quite successful and got many customers; even big companies bought our software. The thing is: the software is written in Ruby 1.8.7 and Rails 2. Some customer servers are even running Debian Squeeze... Yes, this setup is still in production use in 2022. (Rails 7 is the current version.) I really don't get why no one asked about the specific setup, they just bought it. We always told our boss that we needed time to upgrade. But he told us every time that no one pays for a tech upgrade... So there it is, many TBs of customer data are in systems which are totally old, not updated and possibly with security issues.
-
One thing I've noticed about devRant is the ratio of web dev/mobile dev posts to database/architecture/big data dev posts. There's A LOT of you web peeps out there, and not enough data dudes, which I guess justifies my constant demand, salary and lack of competition. Just an observation.
-
Dear Friends,
As a husband, I've sat next to my wife through eight miscarriages, and while drowning my sorrows on Facebook, faced the inundation of pregnancy and baby ads. It's heartbreaking, depressing, and outright unethical.
How can we, as developers who conquer the world with software solutions, not solve this problem? Let's be honest, it's not that we cannot solve this problem, it's that we won't solve it.
We're really screwing this one up, and I'm issuing a challenge - who's out here on devRant that can make the first targeted "Shiva" ad campaign? Don't tell me you don't have the data in your system, because we all know you do. Your challenge is to identify the death of a loved one, or a miscarriage, and respectfully mourn the loss with no desire to make money from those individuals.
Fucking advertise flower delivery services and fancy chocolates to the people in THEIR inner circle, but stop fucking advertising pregnancy clothes to my wife after a miscarriage. You know you can do it. Don't let me down.
https://washingtonpost.com/lifestyl...
-
Client from a big company requested that all sensitive data should be encrypted, passwords included.
We agreed that was OK, and that we were already saving the hashes for the passwords.
The reply was "Hashes should be encrypted too"
-
Me - Yeah great so you say it's big data we are gonna be analyzing and having to store, are you currently utilizing a service and aggregating any of it into smaller manageable segments?
Client - well yeah it's lots and lots of data, we can share it with you if you sign an NDA.
Me - ok... sure, how are you gonna share it with me.
Client - oh I can email you the spreadsheet.
Me - .... Spreadsheet ... Um... Ok... 'Stands up and walks away to tell this as the most interesting meeting of the month, to someone that will get it'
--
Buzzword for the win!9 -
Someone at work snuck something past the censors.
Our Hadoop servers all have "bigd" in their name 😂5 -
Best exp:
( ͡ ͡° ͜ ʖ ͡ ͡°)
\╭☞ \╭☞ learning python and working with big data
Worst one:
(╯°□°)╯︵ learning php and visiting classes of programming at my college1 -
I hate it when marketing people decide they're technical - quote from a conference talk I regrettably sat through:
"The fourth industrial revolution is here, and you need to make sure you invest in every aspect of it - otherwise you'll be left in the dust by companies that are adopting big data, blockchain, quantum computing, nanotech, 3D printing and the internet of things."
Dahhhhhhhhhh6 -
Every 2019 tech startup: We do deep machine learning with big data in the cloud.
Investors: Please take our money!5 -
Goals for the next 100 weeks eh?
- Teach my 8yo and 11yo to be awesome java coders
- Take my 2yo to her first day at school
- Grow my department in work to over 40 people worldwide
- Start and finish my Masters degree in Big Data*
- Speak at 5 major international conferences (1000+ ppl)11 -
fuck code.org.
here are a few things that my teacher said last class.
"public keys are used because they are computationally hard to crack"
"when you connect to a website, your credit card number is encrypted with the public key"
"digital certificates contain all the keys"
"imagine you have a clock with x numbers on it. now, wrap a rope with the length of y around the clock until you run out of rope. where the rope runs out is x mod y"
bonus:
"crack the code" is a legitimate vocabulary word
we had to learn modulus in an extremely weird way before she told the class that it was just the remainder, but more importantly, we werent even told why we were learning mod. the only explanation is that "its used in cryptography"
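Taking the clock picture literally, a rope of length y wrapped around a clock with x positions stops at y mod x (the quote as reported has the operands the other way round). A small sketch, with hypothetical function names, showing that the remainder and the step-by-step wrapping agree:

```python
def rope_end(clock_size: int, rope_length: int) -> int:
    """Position where the rope ends: the remainder of rope_length / clock_size."""
    return rope_length % clock_size


def rope_end_by_walking(clock_size: int, rope_length: int) -> int:
    """Same answer, found by moving one position per unit of rope."""
    position = 0
    for _ in range(rope_length):
        position += 1
        if position == clock_size:
            position = 0  # passed the top of the clock, wrap around
    return position
```

For a 12-position clock and a rope of length 27, both give 3: two full laps, then three more steps.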
i honestly doubt she knows what aes is.
to sum it up:
she thinks everything we send to a server is encrypted via the public key.
she thinks *every* public key is inherently hard to crack.
she doesnt know https uses symmetric encryption.
i think that she doesnt know that the authenticity of certificates must be checked.7 -
At work the other day...
Guy: "Oh hey I was thinking if you could help me with an application to visualize some data."
Me: "Ooookay...what did you have in mind?"
Guy: "I think we have XML files that could be turned into graphs...oh and we could add some trend lines. (Getting more excited) And maybe we could supplement it with live data...oh hey and maybe we could add real time alerts via email..."
Me: *thinks to self...there is no way in hell I am starting to work on something that he is literally coming up with requirements as he's talking* "I need specifics...so go take some time, think it through and get back to me with concrete details and examples."
Guy: "Ok. That should be enough to get you started for now at least."
That would be a big fuck no, good sir. Haven't started and won't start it. He has never mentioned it to me again since then.4 -
Hey !
A big question:
Assume we have an Android app which graphs a sound file.
The point is: the user is able to zoom in/out, so the whole data must be read at the beginning, but as the file gets longer, the load time increases.
What can I do to prevent this?3 -
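No answer to the sound-graph question above is included here. One common approach, offered only as a sketch and not as what the poster ended up doing, is to draw a reduced overview: split the samples into as many buckets as there are pixels on screen and keep just the minimum and maximum of each. The function name and sample representation below are assumptions, and the sketch is in Python rather than Android code.

```python
from typing import List, Sequence, Tuple


def downsample_min_max(samples: Sequence[float], buckets: int) -> List[Tuple[float, float]]:
    """Reduce samples to at most `buckets` (min, max) pairs for drawing."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    count = len(samples)
    if count == 0:
        return []
    buckets = min(buckets, count)  # never create an empty bucket
    result = []
    for index in range(buckets):
        start = index * count // buckets
        end = (index + 1) * count // buckets
        chunk = samples[start:end]
        result.append((min(chunk), max(chunk)))
    return result
```

The overview can be computed once (or at a few zoom levels) and cached, so only the visible range needs full-resolution samples when the user zooms in.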
The GitHub graphql API is pretty neat, mostly because it's a great example of a product where graphql has advantages over REST. As a code reviewer for repos with hundreds of simultaneous PRs, I use it to filter through branches for stuff that needs my attention the most.
NewRelic's NRQL API is also quite nice, as it provides an unusual but very direct interface into the underlying application metrics.
I'm also a big fan of launchlibrary, purely because I love spaceflight, and their API is an extremely rich and actively maintained resource. This makes it a great data source for playing around with plotting & statistics libraries — when I'm learning new languages or tools, I prefer to make something "real" rather than following a tutorial, and I often use launchlibrary as a fun and useful data backend. -
"How we use Tensorflow, Blockchain, Cloud, nVidia GPU, Ethereum, Big Data, AI and Monkeys to do blah blah... "4
-
Recently found an artist called "Big Data". His song titles describe what Big Data actually is really well:
-
Time for a soap box rant.
I just found this in one of our projects. I've simplified the example to make it more anonymous.
When I see code like this it automatically means there is a lack of attention to enumerations and/or understanding of what they are.
One may argue that in a certain execution of code it's a minor performance hit and therefore insignificant. It's still a performance hit. Furthermore, it takes even less time to do it the right way than it does to do it the wrong way.
Every one of these lines will enumerate the list from the beginning to try and find that one element you're interested in. Big O notation, people.
Throw that crap into a dictionary or hashset or similarly applicable data structure with direct reads at the beginning of your logic so that it only gets enumerated ONCE when the data structure instance is created. Then access it however many times you want.
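The original snippet did not survive on this page, so here is a hypothetical stand-in (in Python, with made-up `Item` records) for the pattern being described: scanning the list on every lookup versus building a dictionary once and reading from it directly.

```python
from dataclasses import dataclass


@dataclass
class Item:
    code: str
    label: str


items = [Item("A1", "alpha"), Item("B2", "beta"), Item("C3", "gamma")]


def find_by_scan(code: str):
    """Walks the list from the beginning on every call: O(n) per lookup."""
    for item in items:
        if item.code == code:
            return item
    return None


# Enumerate the list ONCE, then every lookup is a direct read: O(1) on average.
items_by_code = {item.code: item for item in items}
```

After that, `items_by_code["B2"]` replaces each repeated scan, and `items_by_code.get("ZZ")` returns `None` for a missing key.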
Soap box rant over.15 -
The world is talking about AI, self-driving cars, big data, IoT and there are robots driving around on Mars.
And here I stand, trying to figure out why a small change in a silly batch-script works on Windows7 and raises an error on Windows XP.
In 2020.2 -
BUZZWORD BUZZWORD AAAAAH
ARTIFICIAL INTELLIGENCE
BLOCKCHAIN
ALGORITHM
CLOUD
IOT
BIG DATA
SaaS
DEVOPS
5G
AR
VR
AAAAH BUZZWORD HERE BUZZWORD THERE3 -
First year: intro to programming, basic data structures and algos, parallel programming, databases and a project to finish it. Homework should be kept track of via some version control. Should also be some calculus and linear algebra.
Second year:
Introduce more complex subjects such as programming paradigms, compilers and language theory, low level programming + logic design + basic processor design, logic for system verification, statistics and graph theory. Should also be a project with a company.
Year three:
Advanced algos, data structures and algorithm analysis. Intro to computer and data security. Optional courses in graphics programming, machine learning, compilers and automata, embedded systems etc. Ends with a big project that goes in depth into a CS subject, not a regular software project in Java basically.4 -
How to get investors wet:
“My latest project utilizes the microservices architecture and is a mobile first, artificially intelligent blockchain making use of quantum computing, serverless architecture and uses coding and algorithms with big data. also devOps, continuous integration, IoT, Cybersecurity and Virtual Reality”
Doesn’t even need to make sense11 -
Perhaps more of a wishlist than what I think will actually happen, but:
- Everyone realises that blockchain is nothing more than a tiny niche, and therefore everyone but a tiny niche shuts up about it.
- Starting a new JS framework every 2 seconds becomes a crime. Existing JS frameworks have a big war, until only one is left standing.
- Developing for "FaaS" (serverless, if I must use that name) type computing becomes a big thing.
- Relational database engines get to the point where special handling of "big data" isn't required anymore. Joins across billions of rows don't present an issue.
- Everyone wakes up one day and realises that Wordpress is a steaming pile of insecure cow dung. It's never used again, and burns in a fire.9 -
Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it… — Dan Ariely4
-
PyTorch.
2018: uh, what happens when someone uses a same name attack? - No big deal. https://github.com/pypa/pip/...
2020: I think that's a security issue. - Nanana, it's not. https://github.com/pypa/pip/...
2022: malicious package extracts sensitive user data on nightly. https://bleepingcomputer.com/news/...
You had years to react, you clowns.6 -
I received 2 job offers:
1: c++ / c# / unity developer for a VR studio, tons of vr visors and shit to use
2: python / Java/somethingelse developer for machine learning, iot, big data
Offer n1 is from a small business 35 employees - casual outfit
Offer n2 is from medium/big business with 100-200 employees - suit and tie for all.
Same economic offer, 2 different and divergent paths on different but trending topics.
What do you choose and why18 -
Fuck you Intel.
Fucking admit that your hardware has a problem!
"Intel and other technology companies have been made aware of new security research describing software analysis methods that, when used for malicious purposes, have the potential to improperly gather sensitive data from computing devices that are operating as designed. Intel believes these exploits do not have the potential to corrupt, modify or delete data"
With Meltdown one process can fucking read everything that is in memory. Every password and every other sensitive bit. Of course you can't change sensitive data directly. You have to use the sensitive data you gathered... Big fucking difference you dumb shits.
Meltdown occurs because of hardware implemented speculative execution.
The solution is to fucking separate kernel and user address space.
And you're saying that your hardware works how it should.
Shame on you.
I'm not saying that I don't tolerate mistakes like this. Shit happens.
But not having the balls to admit that it is because of the hardware makes me fucking angry.5 -
Oath (Yahoo) just open sourced their data-processing & search engine!
It looks fricken cool, can't wait to play with it... and even more I can't wait to see what people make with it!
Yahoo!
[announcement](https://oath.com/press/...)
[docs](http://docs.vespa.ai/documentation/...)4 -
I see the industry popularizing Machine Learning programs using AI to implement ethical Blockchain as a Javascript framework using Scrum techniques for Big Data Web2.0 in Responsive Virtual Reality for your IoT Growth Hacking operations.3
-
We started working with some pretty big (in data volume) client. Around 4.000 projects with about 10 to 15 deliverables by project. Our software helps them plan/manage that.
US : Hey, so on this page we only display first 10, so it is fast and you can adjust using filters.
Client: No, I want to see all 4.000 projects on the same page
US : Well, for one year it will generate : 4000x10(deliverables)x12 editable fields. Your browser will crash. (No time to add virtual scroll)
Client: No, I want to see all 4.000 projects on the same page
US : Ok, here is pagination to help you.
Client: No, I want to see all 4.000 projects on the same page
US: …
Tomorrow is going to be fun.17 -
During an internship, I spent some time automating reports with VBA. Basically, imagine a few big excel sheets with 1000 formulas and a few thousand lines of VBA.
One of the reports was handed directly to our boss's boss's boss. After 4 weeks, he came to me and asked why the table entry in row 23 or so was always 84. Well.. I don't know. This data is automatically calculated / retrieved from a database. Went and checked, already sweating, and found that
THE OTHER INTERN COULDN'T FIX A FORMULA SO HE OVERRODE IT WITH PLAIN TEXT. WITH A FUCKING PLAIN VALUE OF 84. A FOOKING EXCEL SHEET WITH A THOUSAND DIFFERENT FORMULAS AND LOTS OF VBA. Needless to say, everything is password protected now.1 -
"Our company encourages cryptocurrency big data agile machine learning, empowerment diversity, celebrate wellness and synergy, unpack creative cloud real-time front-end bleeding edge cross-platform modular success-driven development of digital signage, powered by an unparalleled REST API backend, driven by a neural network tail recursion AI on our cloud based big data linux servers which output real time data to our Wordpress template interactive dynamic website TypeScript applet, with deep learning tensor flow capabilities.
Don't get what the fuck I just said? Udemy offers countless courses on python based buzzwords. Be the first out of 13 people to sell your soul and private information, and you'll get the first three minutes of the course free!"3 -
So while applying for an internship at a new company, they wanted me to make an account and do some things to get used to the website... That's great, until I learned their website is fucking garbage!
Takes 5 seconds to load any page (they import and link so much shit, it's poorly optimized), their website is vulnerable to JavaScript injection (in many different places), and I'm sure it will be vulnerable to SQL injection too.
Their design looks bad, icons are terrible, no common design flow, super busy. And they are talking about using machine learning and big data? Bitch you need to fucking make your site usable first!! I've contacted them and will give them 30 days to fix their shit before I write about it -
Want to make someone's life a misery? Here's how.
Don't base your tech stack on any prior knowledge or what's relevant to the problem.
Instead design it around all the latest trends and badges you want to put on your resume because they're frequent key words on job postings.
Once your data goes in, you'll never get it out again. At best you'll be teased with little crumbs of data but never the whole.
I know, here's a genius idea, instead of putting data into a normal data base then using a cache, lets put it all into the cache and by the way it's a volatile cache.
Here's an idea. For something as simple as a single log let's make it use a queue that goes into a queue that goes into another queue that goes into another queue, all of which are black boxes. No rhyme or reason, queues are all the rage.
Have you tried: Lets use a new fangled tangle, trust me it's safe, INSERT BIG NAME HERE uses it.
Finally it all gets flushed down into this subterranean cunt of a sewerage system and good luck getting it all out again. It's like hell except it's all shitty instead of all fiery.
All I want is to export one table, a simple log table with a few GB to CSV or heck whatever generic format it supports, that's it.
So I run the export table to file command and off it goes only less than a minute later for timeout commands to start piling up until it aborts. WTF. So then I set the most obvious timeout setting in the client, no change, then another timeout setting on the client, no change, then i try to put it in the client configuration file, no change, then I set the timeout on the export query, no change, then finally I bump the timeouts in the server config, no change, then I find someone has downloaded it from both tucows and apt, but they're using the tucows version so its real config is in /dev/database.xml (don't even ask). I increase that from seconds to a minute, it's still timing out after a minute.
In the end I have to make my own and this involves working out how to parse non-standard binary formatted data structures. It's the umpteenth time I have had to do this.
These aren't some no name solutions and it really terrifies me. All this is doing is taking some access logs, store them in one place then index by timestamp. These things are all meant to be blazing fast but grep is often faster. How the hell is such a trivial thing turned into a series of one nightmare after another? Things that should take a few minutes take days of screwing around. I don't have access logs any more because I can't access them anymore.
The terror of this isn't that it's so awful, it's that all the little kiddies doing all this jazz for the first time and using all these shit wipe buzzword driven approaches have no fucking clue it's not meant to be this difficult. I'm replacing entire tens of thousands to million line enterprise systems with a few hundred lines of code that's faster, more reliable and better in virtually every measurable way time and time again.
This is constant. It's not one offender, it's not one project, it's not one company, it's not one developer, it's the industry standard. It's all over open source software and all over dev shops. Everything is exponentially becoming more bloated and difficult than it needs to be. I'm seeing people pull up a hundred cloud instances for things that'll be happy at home with a few minutes to a week's optimisation efforts. Queries that are N*N and only take a few minutes to turn to LOG(N) but instead people renting out a fucking off huge ass SQL cluster instead that not only costs gobs of money but takes a ton of time maintaining and configuring which isn't going to be done right either.
I think most people are bullshitting when they say they have impostor syndrome but when the trend in technology is to make every fucking little trivial thing a thousand times more complex than it has to be I can see how they'd feel that way. There's so bloody much you need to do that you don't need to do these days that you either can't get anything done right or the smallest thing takes an age.
I have no idea why some people put up with some of these appliances. If you bought a dish washer that made washing dishes even harder than it was before you'd return it to the store.
Every time I see the terms enterprise, fast, big data, scalable, cloud or anything of the like I bang my head on the table. One of these days I'm going to lose my fucking tits.10 -
Saw a question on SO asking why foreach was slow with big data.
The code provided was 6 nested foreachs (basically a cartesian product between an array of arrays, and 4 other arrays).
Inside, a select query and an "update or create" operation.
"But why is foreach so slow?"4 -
"What app is that?"
"It's like yik yak for developers..."
"Like FML for programmers..."
"Like one big groupme for computer scientists..."
"Like Josh from Intro to Data Structures..."
I'm running out of ways to describe devRant.1 -
Fucking shit uni is such a waste of time. We are learning Apache Spark in Big Data module. Fucking losers have Spark 1.6.0 installed while the latest version is 2.2.1 right now.
What a bunch of cunts. We are paying tons of money to study deprecated shits and a degree. A fucking degree that is not even on a piece of paper anymore.
Fuck this shit man.6 -
Worked as a student in a big company. Just doing data entry and checking product data. It was a nice part-time job alongside my computer science studies. After a year I asked for something else and switched to the Android team. I said I could do a little bit of Java and wrote unit tests for another half year. That was the point where I really learned coding and got experience. I would never have learned so much in my studies because I was lazy. Now I can call myself an Android developer. Still love the company for giving me this opportunity.
-
My manager's boss just committed to a delivery date a month from now. We don't know what is to be delivered, nor does the client. We are supposed to work on a platform that we know nothing about. And of course the catchphrase is: yeah just use big data and spark. I'm dying...5
-
i want to get my own social network up and running.
so far ive got -
login 100% securely
register (1000% securely)
view someone’s profile (10^7% securely)
to add -
scrypt (maybe bcrypt, however scrypt looks like the better option)
friend a user
track their every move (ill use facebooks and googles apis for that)
to describe my product -
ai
blockchain
iot
big data
machine learning
secure
empower
analysis
call me when im a gazillionaire
but seriously, im making a social network and i hope its done by wk105 tbh3 -
Years ago, we were setting up an architecture where we fetch certain data as-is and throw it in CosmosDb. Then we run a daily background job to aggregate and store it as structured data.
The problem is the volume. The calculation step is so intense that it will bring down the host machine, and the insert step will bring down the database in a manner where it takes 30 min or more to become accessible again.
Accommodating for this would need a fundamental change in our setup. Maybe rewriting the queries, data structure, containerizing it for auto scaling, whatever. Back then, this wasn't on the table due to time constraints and, nobody wanted to be the person to open that Pandora's box of turning things upside down when it "basically works".
So the hotfix was to do a 1 second threadsleep for every iteration where needed. It makes the job take upwards of 12 hours where - if the system could endure it - it would normally take a couple minutes.
The solution has grown around this behavior ever since, making it even harder to properly fix now. Whenever there is a new team member there is this ritual of explaining this behavior to them, then discussing solutions until they realize how big of a change it would be, and concluding that it needs to be done, but...
not right now.2 -
So here I am testing some python code and writing to a file. No big deal. But damn is it taking a long time to get data back from this API. Ah it's fine I'll let it work in the background.
40 minutes later.
Oh! The requests timed out. No big deal. I'll just cut out the parts that are already done.
1st request in.
I wonder what the file is looking like.
Only showing 1 request.
waitaminute.jpeg
I should have more than that.
*Suddenly realizes that I was writing to the file and not appending.
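The poster's script is not shown, so this is only a sketch of the difference between the two modes. Opening with mode `"w"` truncates the file every time, while `"a"` keeps what is already there. The `save_result` helper and the file name are hypothetical.

```python
import os
import tempfile


def save_result(path: str, line: str) -> None:
    """Append one result per line; mode "a" never truncates existing content."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")


# Usage: two separate runs both survive in the file.
path = os.path.join(tempfile.mkdtemp(), "results.txt")
save_result(path, "request 1")
save_result(path, "request 2")
with open(path, encoding="utf-8") as handle:
    lines = handle.read().splitlines()
```

With `"w"` in place of `"a"`, `lines` would hold only the last request, which matches what the post describes.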
Fuuuuuuuuuuuuuuuuuuck2 -
Google.
They’re doing amazing things but they are just too big now... Too much of a monopoly and the data is scary too.3 -
Hi.. one month ago i started to learn JavaScript (my first programming language)
In the 2nd project we created a data dashboard. I did my very best to write functional JS code while 2 other girls worked on the CSS and HTML.
I'm really proud of my work (1st time!)
A few guys told me JavaScript is awful and difficult, but in a few weeks we will start on jQuery.
In 2 weeks I'm going to participate in AngelHack Santiago Hackathon 2018.
I need some advice, for me it's a really big step10 -
When I wrote my first algorithm that learns...
So in order to onboard our customers onto our software we have to link the products in their database to the products in ours. This seems easy enough, but when you actually start looking at their data you find it's a fuck up of duplications, bad naming conventions, and only 10% or so have distinct identifiers like a supplier code, model no. or barcode. After a week or 2 they find they can't do it and ask for our help and we take over. On average it took 2 of our staff 1-2 weeks to complete the task, manually searching one record of theirs against our db at a time. This was a big problem since we only had enough resources to onboard 2-4 customers a month, meaning slow growth.
I realized when looking at different customers' databases that although the data was badly captured, it was consistently badly captured, similar to how crap file names will usually contain the letters 'asd' because they're typed with the left hand.
I then wrote an algorithm that fuzzy matched against our data and the past matches of other customers' data, creating a ranking algorithm similar to Google page search. After auto matching the majority of results, the top 10 ranked search results for each product in their db are shown to a human 1 at a time and they either click the correct result or select "no match" and repeat until it is done, at which point the algo will include the captured data in ranking future results.
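The poster's actual algorithm is not shown. As a rough sketch of the idea described (fuzzy matching against the catalog plus previously confirmed matches, ranked, with human confirmations fed back in), here is a version using Python's standard `difflib`. The function names, the data layout and the product names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(
    customer_name: str,
    catalog: Dict[str, str],
    past_matches: Dict[str, List[str]],
    top_n: int = 10,
) -> List[Tuple[float, str]]:
    """Score each catalog product by its best match against its own name
    or any customer spelling previously confirmed for it."""
    scored = []
    for product_id, catalog_name in catalog.items():
        score = similarity(customer_name, catalog_name)
        for alias in past_matches.get(product_id, []):
            score = max(score, similarity(customer_name, alias))
        scored.append((score, product_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_n]


def confirm_match(past_matches: Dict[str, List[str]], product_id: str, customer_name: str) -> None:
    """Record a human-confirmed match so later rankings can reuse it."""
    past_matches.setdefault(product_id, []).append(customer_name)
```

Each confirmation adds another known bad spelling for a product, which is how consistently bad data from one customer can help rank the next customer's records.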
It now takes a single staff member 1-2 hours to fully on board a customer with 10-15k products and will continue to get faster and adapt to changes in language and naming conventions. Making it learn wasn't really my intention at the time and more a side effect of what I was trying to achieve. Completely blew my mind. -
Why is everyone happy about Google clip? It's the single most scary instance of a big brother appliance that exists today. What are they going to do with the data? They say it's to save memories of your kid or your dog. There's already something like that. It's called a brain and paying attention to your damn life. I don't want to be saved in your shitty memories just bc you are so insecure about remembering your fuck*ng memories.
I'm sorry for the outburst but that sh*t is solving a problem nobody had and it's getting applauded like those heaven's gate motherf*ckrs that say that life is improved by these shitty beliefs.26 -
Pffff...... Wanna make an app tomorrow...
Got no clue what to make....
Maybe something with big AI learning data machine. Yeah I think that hits all the right buzzwords :P
Any ideas you're willing to share?2 -
Yeah so seems like huge companies are literally just throwing tech buzzwords like "blockchain" and "cloud" and "big data" for marketing purposes.This annoys me.2
-
Client: THIS IS CRITICAL, SOME DATA HAS BEEN DELETED, WHAT ZE FUUK HAPPENED, UNDO THIS FAST
Us: so after carefully reviewing the code, related resources and the network traffic we conclude that it was never sent in the first place.
*closes issue*
I'm glad we got such a meaningful bug report on the same day a production system started failing, one big deployment that was like a boss with 3 phases, an unnecessarily long meeting and an app developer that wanted me to break HTTP standards.1 -
The scrum master for the project I'm working on decided to help out with changing some code (I'll add he's got a master's in software engineering and very proud about it..aka..big ego). It took him two days...yes two days to write the attached code.
I reviewed his code and sent back a response (code took about 15 seconds to write) including the link to the logging documentation explaining what fields were and were not necessary. Not sure how it will look in devRant ...
var data = new InformationalDataPoint
{
    Properties =
    {
        ["RMANumber"] = rma,
        ["InvoiceID"] = invoiceId
    }
};
Logger.Log(data);
He's stopped talking to me. Our next scrum meeting with the product owner should be ...um...awkward. -
If you're currently in college and wish to get placed in a major tech giant like Amazon or Facebook:
Don't learn React.js, instead learn Linked lists.
Don't learn Flutter, instead learn Binary search trees.
Don't learn how to perform secure Authorization with JWTs, instead learn how to recursively reverse a singly linked list.
Don't learn how to build scalable and fault tolerant web servers, instead learn how to optimally invert a binary search tree.
These big tech companies don't really care what real world development technologies you've mastered. Your competence in competitive programming and data structures is all that matters.
The system is screwed. Or at least I am.18 -
I've just realised, I don't care if Facebook are sharing my data... I mean, what I get more ads for men with big penises and ripped abs? So what...13
-
Data Disinformation: the Next Big Problem
Automatic code generation LLMs like ChatGPT are capable of producing SQL snippets. Regardless of quality, those are capable of retrieving data (from prepared datasets) based on user prompts.
That data may, however, be garbage. This will lead to garbage decisions by lowly literate stakeholders.
Like with network neutrality and pii/psi ownership, we must act now to avoid yet another calamity.
Imagine a scenario where a middle-manager level illiterate barks some prompts to the corporate AI and it writes and runs an SQL query in company databases.
The AI outputs some interactive charts that show that the average worker spends 92.4 minutes on lunch daily.
The middle manager gets furious and enacts an Orwellian policy of facial recognition punch clock in the office.
Two months and millions of dollars in contractors later, and the middle manager checks the same prompt again... and the average lunch time is now 107.2 minutes!
Finally the middle manager gets a literate person to check the data... and the piece of shit SQL behind the number is sourcing from the "off-site scheduled meetings" database.
Why? because the dataset that does have the data for lunch breaks is labeled "labour board compliance 3", and the LLM thought that the metadata for the wrong dataset better matched the user's prompt.
This, given the very real world scenario of mislabeled data and LLMs' inability to understand what they are saying or accessing, and the average manager's complete data illiteracy, we might have to wrangle some actions to prepare for this type of tomfoolery.
I don't think that access restriction will save our souls here, decision-flumberers usually have the authority to overrule RACI/ACL restrictions anyway.
Making "data analysis" an AI-GMO-Free zone is laughable, that is simply not how the tech market works. Auto tools are coming to make our jobs harder and less productive, tech people!
I thought about detecting new automation-enhanced data access and visualization, and enacting awareness policies. But it would be of poor help, after a shithead middle manager gets hooked on a surreal indicator value it is nigh impossible to yank them out of it.
Gotta get this snowball rolling, we must have some idea of future AI housetraining best practices if we are to avoid a complete social-media style meltdown of data-driven processes.
Anyone care to pitch in?14 -
INTERVIEW. It tells everything about the company. I recently applied for a "big" company for the position of ML Engineer. The Job description was like "someone with good knowledge of visual recognition, deep learning, advanced ML stuff, etc." I thought great, I might be a good fit. A guy called me the next day. Introduced himself as a manager of the Data Science team with 8+ years of experience. Started the talk saying "it is just an informal intro". But things escalated very quickly. Started shooting Data Science questions. He was asking questions in a very bookish way. Tells me to recite formulas (like big formulas). When I explained to him a concept, he was not understanding anything. Wanted a very bookish answer. I quickly realized I know more about ML stuff than him (not a big deal) and he is arrogant as fuck (not accepting my answers). Plus, he has no knowledge about Deep Learning. At the very end, he tells me "man, you need to clear up your fundamentals". WTH??? My fundamentals. Okay, I am not Einstein or Hinton, but I know I was answering things correctly. I have read books and research papers and blogs and all. When I don't know about things, I tell straight away. I don't cook answers. So the "interview" ended. I searched that man on LinkedIn. Got to know he teaches college students Data Science and ML. For a fee of 50,000 INR. It's a big amount!! Considering the things he teaches. You can find the same stuff (with far higher quality) free of cost (on Coursera, Udacity, YouTube, free books, what not). He is a cheater. He is making fool of college students. That is why I sometimes hate "experience". 8+ years of exp and he is such an a**hole!! BTW, I thanked God for saving me from that company. Can't imagine such an arrogant boss.
TLDR: Be vigilant during interviews. It tells a lot about the company.4 -
A couple of years ago, we decided to migrate our customer's data from one data center to another. This is the story of how it went well.
The product was a Facebook canvas and mobile game with 200M users, which represented approximately 500 GiB of data to move, stored in MySQL and Redis. The source was in Dallas, and the target was New York.
Because downtime prevents users from spending their money on our "free" game, we decided to avoid it as much as possible.
In our main MySQL table (manually sharded into 100 tables), we had a modification TIMESTAMP column. We decided to use it to check whether a user needed to be copied to the new database. The rest of the data consisted of a savegame stored as gzipped JSON in a LONGBLOB column.
A Go program was developed to continuously track whether a user's data needed to be copied again every time progress was made on their savegame. The process went like this: first the JSON was unzipped to detect bot users with no progress, which we simply dropped; then the data was exported to a custom binary file with fast compression to reduce its size. Next, the exported file was copied to the new servers using rsync, and a second Go program did the import into the new MySQL instances.
The first pass took 1 week to copy, the second took 1 day, the third a couple of hours, and so on. In the end, copying the latest versions of all the savegames took roughly a couple of minutes.
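Why the passes keep getting shorter can be sketched in a few lines. This is only an illustration in Python, not the Go programs from the story: the `Row` class, the integer clock and the in-memory dicts are all made up here to stand in for the MySQL shards and the TIMESTAMP column.

```python
from dataclasses import dataclass


@dataclass
class Row:
    user_id: int
    modified: int    # stand-in for the MySQL modification TIMESTAMP
    savegame: bytes  # stand-in for the gzipped JSON blob


def copy_pass(source, target, since):
    """Copy every row modified at or after `since`; return the number copied."""
    copied = 0
    for row in list(source.values()):
        if row.modified >= since:
            target[row.user_id] = Row(row.user_id, row.modified, row.savegame)
            copied += 1
    return copied


# Hypothetical data: three users, all last modified at time 0.
source = {i: Row(i, 0, b"save-v1") for i in range(3)}
target = {}

pass1_start = 1                              # clock value when pass 1 begins
first = copy_pass(source, target, since=0)   # full copy: every row

# While pass 1 was running, user 1 kept playing (time 5).
source[1] = Row(1, 5, b"save-v2")

pass2_start = 10                             # a third pass would use since=pass2_start
second = copy_pass(source, target, since=pass1_start)  # only rows touched since pass 1 began
```

Each pass only has to move what changed while the previous pass was running, so the shorter a pass gets, the less there is to catch up on in the next one.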
On the Redis side, some of the data was cache that we knew could be dropped without impacting the user experience. The rest was big chunks of data, so we simply ran SCAN on each Redis instance and produced the same kind of custom binary files. The process was fast enough to run once during the migration: it took 15 minutes because we were able to parallelise across the 22 instances.
It took 6 months of meticulous preparation. On D-day, the process went smoothly, but we shut down our service for one long hour because of a typo in a domain name.
-
Encryption, Data, Servers, Protection, Certificate
oOOO WEE, I use big ear old words so I must be a hacker.2 -
Building an interface for a client between industrial power quality meters and a database that serves data to a webapp.
The client had heard, from some manager at a big firm, of a way of sending data between the meter and a raspberry.
At the time we were using modbus to connect the meter to a raspberry. This method was tested and proven to work. Both devices could talk to each other in modbus.
The client kept demanding that we use mbus, and was not listening to any reason, because the firm had suggested it. In the end we went from modbus to mbus to send it to the raspberry. There the mbus was converted back to modbus, because the meter could not communicate in mbus.
Really weird experience to program something so useless. But protesting about it was going nowhere and taking more time than the changes would take to implement.
-
A big project in my company had an annoying race condition that caused data to get deleted: when two processes finished in the wrong order, they hit the DB and overwrote each other's work.
Long story short. Fixed the bug and in the process the codebase shrunk by 60%. I didn’t have to delete the rest of the code, but the bug was due to a function in the legacy section of the code, and found out that it was the only function used in that section.
So I deleted it. Rewrote the function so it upserts. And bam. Smaller, cleaner code :)1 -
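The upsert fix in the race-condition story above could look roughly like this. It is a sketch using SQLite from Python, not the project's actual code: the `results` table, the `version` column and `save_result` are invented for the example, and the version guard is an assumption about how "finished in the wrong order" might be made harmless.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (job_id TEXT PRIMARY KEY, payload TEXT, version INTEGER)"
)


def save_result(conn, job_id, payload, version):
    """Insert the row, or update it only when this write is newer."""
    conn.execute(
        """
        INSERT INTO results (job_id, payload, version)
        VALUES (?, ?, ?)
        ON CONFLICT(job_id) DO UPDATE SET
            payload = excluded.payload,
            version = excluded.version
        WHERE excluded.version > results.version
        """,
        (job_id, payload, version),
    )
    conn.commit()


# The newer write lands first, the older one arrives late.
save_result(conn, "job-1", "from process B", 2)
save_result(conn, "job-1", "from process A", 1)

row = conn.execute(
    "SELECT payload, version FROM results WHERE job_id = 'job-1'"
).fetchone()
```

Because insert-or-update happens in one statement, there is no separate delete step for a second process to race against, and the late write from process A is simply ignored.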
Is it just me, or are the media / journalists once again putting a stupidly unfair pessimistic spin on that SpaceX launch?
"SpaceX rocket launches but explodes shortly into flight"
"Musk's SpaceX big rocket explodes on test flight"
"SpaceX rocket explosion: None injured or killed"
They've said time and time again, it's the first test of a massively complex rocket that's bigger than anything that's ever gone before it, and success is just defined as "getting off the launch pad" and collecting data. They did that and then some.
But instead of spreading excitement about the data, the fact it launched, that it's a world first, etc. - it's all doom and gloom, implying that the whole thing was a failure and people could have died 🙄
And people wonder why I have a low opinion of journalists.15 -
https://youtu.be/hkDD03yeLnU?t=8s
"I'll create a GUI interface using Visual Basic, see if I can track an IP address." 🤨🤔
I'll just blockchain a neural network for AI using big data in Delphi. -
At one point, my laptop's hard drive went down. Turns out, Windows had written some garbage data to the MFT and fucked up the file structure. Luckily I was able to restore a big chunk of the data using Recuva. After saving the most important files, I cleaned the disk and reinstalled Windows. All good so far. I put the laptop's drive and my recovery disk into my desktop to put back the files. During the install it forced me to make an account, which I wanted to delete. So I ran "rmdir /users /s" and went to grab a cup of coffee. Turns out, cmd was pointed at my recovery disk instead of my laptop disk. My whole backup wiped.
-
I fixed my big data processing code, I think. If all works as planned, I'll wake up to some processed data that I can do some statistical analysis on... I hope...8
-
Been looking into 2D maps for a game. I am learning how to use tools that do autotiling. I want to have generated worlds for terrain. It is interesting how the scope of what you are learning starts expanding rapidly and can overwhelm you. I started out wanting to learn autotiling. This went from that to autogen, to modifying terrain, to how to store generated terrain, to how to store the difference between autogen and player-modified, to how to separate things into chunks, to how to store a whole world's worth of data! Like dude, chill. Just learn how to use autotiling first. Then learn autogen, then learn how to efficiently chunk things. Also the 2D data won't be big, so just store the data you genned, modified or not. The worlds don't have to be ultra huge. Really, stop freaking out about what it could be and see what it is. JUST FUCKING ITERATE!
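The "just store what you genned" route really is small. A minimal sketch, assuming a made-up three-tile terrain ("~", ".", "^"), a 64x64 map and JSON as the save format; none of that comes from a real engine or autotiling tool.

```python
import json
import random


def generate_map(width, height, seed):
    """Generate a grid of tiles from a seed (placeholder for real autogen)."""
    rng = random.Random(seed)
    return [[rng.choice("~.^") for _ in range(width)] for _ in range(height)]


def save_map(tiles):
    """Serialize the whole grid, player edits included, one string per row."""
    return json.dumps({"tiles": ["".join(row) for row in tiles]})


def load_map(text):
    return [list(row) for row in json.loads(text)["tiles"]]


world = generate_map(64, 64, seed=42)
world[10][20] = "#"          # a player edit, stored like any other tile

saved = save_map(world)
restored = load_map(saved)
```

A 64x64 map saved this way is under 5 KB, so chunking and diffing against the generator can wait until the worlds are actually big enough to need them.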
It is wild to watch yourself get featuritis without learning how to crawl first. Just divide and conquer.
-
I never knew that I was a good mentor at SQL, especially at PL/SQL.
I gave a task to a new member of my team: to fill 5 tables with data from 15 other tables.
I briefed him well on the tables' contents and structure. He spent about 3 days creating 25 different queries in order to fill the 5 tables.
After I saw the 25 queries, I told him that he could do it with 1 main query and 5 insert statements.
So I spent 1 hour on training, in order to build, run and explain how to create the best SQL statements for this task.
(First 5 minutes)
It looked so simple at the beginning, starting with 1 simple join, but after some steps he could no longer follow what I was doing.
(Remaining 55 minutes)
I explained the SQL statements I had created and how Oracle works.
Now, every time he meets me, he is so thankful to me for teaching him all those Oracle SQL tips in 1 hour.
Now he works only with big data and he loves SQL.
-
Testers in my team have been told like 1000 times to follow the style guides that we all follow. That's not that big a deal. The big deal is that they were put on this project without any mathematics background, when the project is all about geometric stuff. So after I, as a developer, have put in so many hours explaining to them why the tests are not covering the requirements, or why the tests are red because they are initializing the data completely wrong, I ask them pretty please to do the coding style checks. And I have already spent 4 hours reviewing code, because not only do I have to go through the maths and really obscure testing code to ensure that the tests are correct, but on every line I have to write at least 4 or 5 style corrections. And some are not even about the code being clean, but about using the wrong namespaces or not sticking to the internal data types. For fuck's sake, this is embedded software and has to comply with certain security standards...
-
Agile development of a decentralised AI, using a neural network based on Blockchain technology for big data.
Is that enough buzzwords to make an employer happy? :p2 -
So I joined a course for big data analysis. And they setup a lab specifically for us. Pulling us away from the usual computer labs
AND GUESS WHAT THEY DON'T EVEN BLOODY HAVE MYSQL INSTALLED. THEY'VE CHARGED ME A FORTUNE AND THEY DIDN'T EVEN CARE TO INSTALL THE BASIC SOFTWARES AND ITS ALREADY BEEN 2 WEEKS WITH THE START OF THE COURSE AND NOTHING.
F**king hate this man!!!!10 -
Can't get over how many big companies get away with poor/no documentation for their own APIs. The past week i have been working with a large insurance company that only via email threads explained what endpoint to send files to and what username I could use to get this to work.
I also worked with a major courier service last month that only had a two page document for all their methods and one of the pages was explaining the transportation of data via imagery haha.1 -
Well, my country has a Degree called Bsc.CSIT which literally means Bachelors of Science Computer Science and Information Technology. I completed that degree and was employed right after I completed my degree. I have worked in two offices and no one cares what degree I have.
So I think Degree is not that necessary here in Nepal as long as you can get the job done.
Now I am about to pursue a Big Data related degree hope that is not as worthless as my current degree.1 -
The word, "Code" being used as a verb,
"Cloud",
"Big Data",
Recruiters,
Scratch,
Any other crappy "Teach your kid to code!" Product,
And finally, mondays.
This is the comprehensive list of buzzwords and things in general that make me want to die right now7 -
Manager:
Hey this client sent us a list with all of their employees in this format... we would tell them to input it themselves but they're a pretty big client, so could you do it?
Me: Sure
*3 hours later*
... why am I taking so long...?
I look back at my code, and see that I've done a whole framework to input data into our system, which accepts not only the client's format but it's actually pretty abstract and extensible for any format you'd like, all with a thorough documentation.
*FACEPALM*
Why can I do this with menial jobs and not for our main code?3 -
What if people, life, humanity, the universe is just a cluster of CPUs running a giant Recurrent Neural Network algorithm? 🤔
-Sun and food == power source
-People == semiconductors
-Earth/a Galaxy == a single CPU
-Universe == a local grouping of nearby nodes, so far the ones we've discovered are dead or not on the same data transport protocol/port as us
-Universal Expansion == the search algorithm
-Blackholes: sector failures
-Big Bang == God turns on his PC, starts the program
-Big Crunch == rm -rf4 -
Is python a good language for building a RestAPI? Personally I don't have any experience with python yet, but what I've gathered, is that python is great for scripting, and big data.
I have a bit of knowledge about Node.js, and I really like the structure, and it's so easy to make an API using express.js.
I've already read a bunch of articles about it, but I'd like to know what the community feels about the two languages?21 -
I don't know why they asked so many algorithms, data structures and big O questions during the interview, when all they wanted me to do was to maintain some legacy, tightly coupled spaghetti code with no architecture, documentation, tests or any kind of engineering behind it :/
-
!rant
Does anyone know what the **day-to-day** differences are between working in IT (banks, hedge funds) vs tech (Google, Facebook, Netflix).
In my mind, I see Hell and Heaven. And there's a giant wall in between called "technical interviews + algorithms and data structures".
I'm on the Hell side... And not sure if I should climb the wall 😔
Is the wall even that big?8 -
TL;DR - (almost) childhood trauma due to Western Digital crap products led to a lot of data loss and a pledge to not trust or purchase their products for the rest of my life.
....
So, I got my first ever Western Digital 2TB My Book, back when 2TB was a really big thing. While in the midst of moving (not copying) a LOT of data to it, the damn disk just.. died. There was no fall, no power outage, no damage, it just stopped working. I was out of words and out of options. Tried yanking out the disk and connecting it directly to a system, but no luck, because it looked like it was the HDD mobo that died.
Also, stupid young me did not realise back then that, even if I "moved" the data, the original data was still most likely in its original location, and so never bothered with a recovery.
Lots of good stuff lost that day.
And as with a lot of you, my disaster recovery system kicked up 10 fold. Now I have redundant local and cloud backup copies of all critical and otherwise unattainable data.
As you may have guessed, I never bought another Western Digital product ever again. My internal HDDs are Seagate, and the external is a surprisingly long-lived Toshiba Canvio.
-
Hello,
I just quit my job at a big market research company. It was disturbing how many processes there depended on Excel and obscure Visual Basic scripts.
They load data from a database, do typical database tasks with Excel and upload it back into the database.
PhDs run complex statistical computations through an Excel interface that passes the request to R.
Instead of spending an hour on Python, they execute stupid tasks in Excel by hand. Day after day, month after month.
WHY? My colleagues were not dumb, but instead of learning SQL and some Python they built insane Excel tables.
Maybe it's time pressure. But this Excel insanity costs much more time in the end.
-
There's this company that works in the RPA (Robotic Process Automation) domain. They describe their work as using A.I and Machine Learning to collect Big Data whilst following the latest trends in IoT, Blockchain, Web 6.9, and any other fancy term they can use, while in fact, well... they're using outdated software that uses VB6 modules to scrape some images, and no employee has ever written a single line of code.
-
Deleted 5 databases, 3 of which contained important data. A team had been working on a big feature for months on those databases. All of the data is gone.
-
What would be the better approach for loading files that are very large or very numerous in Java?
1) Loading data parallel through multiple threads
2) Loading data in series in a single thread
3) any other methodology?
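Options 1 and 2 are easy to put side by side. The sketch below is in Python rather than Java (Java's ExecutorService gives the same shape); the file names, sizes and the worker count of 4 are arbitrary, and which variant wins depends on the disk and the file sizes, so it only makes sense to time both on the real data.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load(path):
    """Read one file fully into memory."""
    return Path(path).read_bytes()


def load_sequential(paths):
    return [load(p) for p in paths]


def load_parallel(paths, workers=4):
    # pool.map returns results in the same order as the input paths.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load, paths))


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(8):
        p = Path(tmp) / f"part-{i}.bin"   # hypothetical test files
        p.write_bytes(bytes([i]) * 1024)
        paths.append(p)
    same = load_sequential(paths) == load_parallel(paths)
```

Many small files tend to gain from threads because the waits overlap, while one very large file on a single spinning disk often does not, which may be why the timings vary.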
Just asking because the loading time varies in both cases.
-
For all the hate that Java gets, this *not rant* is to appreciate the Spring Boot/Cloud & Netty for without them I would not be half as productive as I am at my job.
Just to highlight a few of these life savers:
- Spring security: many features but I will just mention robust authorization out of the box
- Netflix Feign & Hystrix: easy circuit breaking & fallback pattern.
- Spring Data: consistent data access patterns & out of the box functionality regardless of the data source: eg relational & document dbs, redis etc with managed offerings integrations as well. The abstraction here is something to marvel at.
- Spring Boot Actuator: Out of the box health checks that check all integrations: Db, Redis, Mail,Disk, RabbitMQ etc which are crucial for Kubernetes readiness/liveness health checks.
- Spring Cloud Stream: Another abstraction for the messaging layer that decouples application logic from the binder ie could be kafka, rabbitmq etc
- SpringFox Swagger - Fantastic swagger documentation integration that allows always up to date API docs via annotations that can be converted to a swagger.yml if need be.
- Last but not least - Netty: Implementing secure non-blocking network applications is not trivial. This framework has made it easier for us to implement a protocol server on top of UDP using Java & all the support that comes with Spring.
For these & many more I am grateful for Java & the big big community of devs that love & support it. -
Client: "..and you will send us the big data via mail... right?"
At the end we send a 100 kb file.1 -
I hate doing front-end development...
I was hired along with another dev to build a webapp to manage the personnel of this big (2000+) company.
I made the backend and some of the frontend (mainly handling the data movement between the two), but my partner was let go after we delivered a first version because "there was not enough work for both of us".
The backlog is months of work for me and now I have to do everything and it's wearing me down...
I want to quit but it's paying well and I don't want to search for something new.
What do?6 -
I will major in AI. No, I will major in Big data, wait, I want to major in cloud too. I think I should first complete the courses I enrolled on cloud academy or the tens of courses in enrolled on Udemy on all the domains possible first! So many technologies, so many dreams!
-
Story Time: About Priorities and Sales
So at this point I'm working tech support for a company that makes some super cool networking equipment, think big data / data centers and such.
This company had grown at a good pace but the support team had not (such is the way for all tech support eventually). So I get a call from a frantic sales guy:
Sales: "OMG, where are we with this ticket?!?!? It's a P2 ticket!!!"
Me: "Well the ticket came in 30 minutes ago, I emailed them some questions, but just so you know I have 8 P2 tickets, and 4 P1 tickets.... so it will be a while."
Sales: "OMG! Make my customer's ticket a P1!!"
Me: "Sure."
-call ends-
-30 minutes passes-
-sales calls again-
Sales: "OMG, where are we with this ticket?!?!? It's a P1 ticket!!!"
Me: "Well I haven't gotten to them yet... just so you know I have 7 P2 tickets, and 5 P1 tickets.... "
Sales "ARGH!"
ʅ(´◔౪◔)ʃ1 -
Using an api: ok, this url (.../xml/endpoint) gives me an xml-document. Oh, and there is a node whose text contains html markup, interesting.
Using the same endpoint, but requesting json: yep, that's the same data, there even is this big html string and... why is this string in a json object wrapped inside "<![CDATA[...]]>"?
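Until the API gets fixed, the wrapper can be peeled off on the client side. A small sketch, assuming the field is called `description` (that name is invented here) and that the whole string is wrapped in exactly one CDATA section.

```python
import json
import re

# Matches a string that consists of a single CDATA section.
CDATA = re.compile(r"^\s*<!\[CDATA\[(.*)\]\]>\s*$", re.DOTALL)


def strip_cdata(value):
    """Return the text inside a CDATA wrapper, or the value unchanged."""
    match = CDATA.match(value)
    return match.group(1) if match else value


raw = '{"description": "<![CDATA[<p>Hello <b>world</b></p>]]>"}'
doc = json.loads(raw)
html = strip_cdata(doc["description"])
```

Strings without the wrapper pass through untouched, so the helper is safe to apply to every text field of the response.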
If i ever see a courtroom from the inside i'll plead insanity.2 -
Why is python supposedly something big data people use ? Sounds like r and stats and well I don’t see the adoption of that though python is used somewhat I note in a lot of Linux apps and utilities
Just seems strange that an interpreted language would be used that way to me or am I an idiot ?35 -
So I live in the middle of nowhere and therefore I have a very limited choice of different ISPs. The short version of the choices is a fast but very limited in data size or one that works 99% of the time (I'll talk about the 1% later) but doesn't have limits on downloads. So I obviously chose the second one.
It works pretty great most of the time and I don't have any problems usually... The problem with "usually" is that the 1% of the time it doesn't work is all it needs to frustrate me. I could be downloading a massive file and around 70% the Internet decides to disconnect. It wouldn't really be a big deal if it wouldn't cause the file to get corrupted.
My point is that if you're going to share a big file, don't upload it to mega, mediafire, dropbox or anything like that. Just use torrents. They work way better for big files.2 -
F*ck JavaFX. I mean, how does a GUI framework not have a standard navigation procedure? It is not even possible to create a page with a constructor. In many other frameworks, when I wanted to pass data to a page, I just had to write
"new MyPage(SomeClass someObject)"
but in JavaFX I first have to create a constructor, link the FXML file to it, and then show the page.
Actually I am not angry. It is a big mistake to expect a good GUI framework from a company that has a website like that in 2018.
-
While smoke testing in production, I had to delete the sample entity I created to test the released feature, which is not a big deal
Until 20 minutes later, when I realized that I attached a couple of sub entities under it that contained actual live data1 -
Every single time I present a tool for data visualization:
"Oh that's great! Have you considered integrating it with service XYZ? It would be great to see the data from XYZ alongside this."
The answer I would like to give:
"No, you retard! Nobody gives a fuck about your crappy service! Nobody uses it, not even your own team! This is the 10th service that I've been asked to integrate and I don't have time to dig into the details of yet-another-shit. If you have time to waste, please go ahead but don't bother me."2 -
So my customer wanted me to collect Big Data using SQLITE... (Yeap SQLITE, like wtf...)
F*** my career.14 -
I am excited about all of the AI blockchain technology using IoT running in the cloud, as a service. It has all of the bells and whistles -- big data, hyper converged infrastructure, seamless integration, a sleek dashboard with everything in a single pane of glass. On top of all of that, it's future proof!1
-
Can anyone tell me what a developer should follow in order to stay up to date? It's just too long a list.
I have been a back end developer, became a big data developer, then moving to becoming a full stack developer. Now I don't know who I am anymore. -
One of my colleagues who thinks he knows all about "big data": "I try to put everything in hadoop, that is my philosophy".
We don't even have a hadoop cluster. -
Just started a new job at a big co.
Expected to implement small new feature, no sweat about 30 LOC. Unfortunately no unit tests, no way to test without real data.
Spend 2 weeks trying to get it to run on the test rack. Lo and behold the entire testing system has been sitting broken for months and nobody knew. Why is all the documentation so vague!???5 -
Random opinion question:
I'm working on a thing where the user provides a big CSV and we process it and put it in the database, or update existing records.
This data impacts other things, but the data isn't front and center as a group in the application for them to notice / see again (well, they can query for it).
I'm thinking of taking the CSV and then presenting them with a table showing how we processed that data giving them a chance to review it before they commit it to the database...
I like this idea for two reasons:
1. If something goes sideways there's a chance someone will see it and I'm not sure I can do enough validation on a big ass CSV from god knows where to be sure we're going to process it right... (I'm going to do some validation but just can't cover it all)
2. It takes some of the mystery out of what happened / is happening for the user.
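The review-before-commit idea can be prototyped without touching the database at all. In this sketch the columns `id` and `name`, the `existing` dict standing in for the table, and the three actions are all assumptions made for illustration.

```python
import csv
import io


def preview(csv_text, existing):
    """Classify each CSV row as insert, update or error; changes nothing."""
    plan = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        key = (row.get("id") or "").strip()
        name = (row.get("name") or "").strip()
        if not key or not name:
            plan.append((line_no, "error", row))
        elif key in existing:
            plan.append((line_no, "update", row))
        else:
            plan.append((line_no, "insert", row))
    return plan


def commit(plan, existing):
    """Apply only the rows that passed validation."""
    for _, action, row in plan:
        if action != "error":
            existing[row["id"].strip()] = row["name"].strip()


existing = {"1": "Ada"}                      # stand-in for the real table
upload = "id,name\n1,Ada Lovelace\n2,Grace\n,Nobody\n"

plan = preview(upload, existing)             # this is what the user would see
actions = [action for _, action, _ in plan]
```

`commit(plan, existing)` would only run after the user has looked at the table and confirmed, and the error rows give them something concrete to fix in the file.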
Anyone try this in the past? Seems reasonable, but lots of things do before they go sideways.7 -
Using Oracle ADF along with ADF Faces to build a simple learning management system. No JavaScript, no external stylesheets, all inline styles, no client side validation, doing form submit for every field's onblur event triggering a server-side validation, creating a VO for every damn page requiring data, creating an EO for every DB table or view, adding big-ass custom queries for most EOs to join on multiple tables, frequent N+1 queries, etc.,
I don't remember the rest of the problems
-
Just found the most embarrassing security hole. Basically a skeleton key to the data of millions of users. Names, email addresses, zip codes, orders. If the email indicates a birthdate, even more shit if you chain another vector. Basically an order id / hash pair that should allow users to enter data AND SHOULD ONLY AUTHORIZE THEM TO THE SITE FOR ENTERING DATA. Well, what happened was that a non-matching hash/id pair will not provide an auth token, but it will create a session linked to that order.
Long story short: call URL 1, enter the foreign ID, get an error, access the order overview site, profit. Obviously a big fucking problem and I still had to run directly to our CEO to get it prioritized because product management thought a style update would be more important.
Oh, and of course the IDs are counted upwards. Making them random would be too unfair towards the poor black hats out there.
-
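Both halves of the order-link hole above have short fixes. The sketch below is a guess at the shape, not the real system: the rant does not say how the hash is computed, so the HMAC scheme, `SECRET_KEY` and the session dict are assumptions.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side secret


def new_order_id():
    """Unguessable order ID instead of a counter."""
    return secrets.token_urlsafe(16)


def sign(order_id):
    return hmac.new(SECRET_KEY, order_id.encode(), hashlib.sha256).hexdigest()


def open_session(order_id, presented_hash):
    """Return a session only when the id/hash pair matches, else None."""
    expected = sign(order_id).encode()
    if not hmac.compare_digest(expected, presented_hash.encode()):
        return None  # a mismatch creates no session of any kind
    return {"order_id": order_id, "scope": "data-entry"}


order_id = new_order_id()
good = open_session(order_id, sign(order_id))
bad = open_session(order_id, "not-the-hash")
```

The important part is the early return: a failed check leaves nothing behind that a later request to the order overview could pick up.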
Can gamedevelopers stop using lua as their freaking scripting language..
Every time I try and figure out how tables work and think I finally get it it throws a big fuck you curve ball.
Oh and then they use json file to store the data of a table except that those json interfaces are complete retards.
If you are going to support json files then why the fuck won't you put in a small fucking inconsequential JS interpreter so you can actually find some docs covering something more complex than those simple-minded t[guildName] = "guild"
Another thing, why the fuck does lua not use {} like every other language. I use those curly brackets to figure out where shit starts and ends half the freaking time.
Fuck this I'm out for today...
And a big fuck you with both middle fingers to any dev that thinks lua is a great scripting language for plugins.3 -
So I'm in a scenario I'm uncomfortable with and need some encouragement from fellow devRanters. (Looong post)
I've been working at this startup for about 10mths (since I graduated). They have been really good to me since the start, and overlooked some fuck ups I did at first.
But now I've been way more experienced , picked it up really quick. And I've basically redesigned several of their admin solutions and data products. Also, I'm basically their entire data analysis team now. I do backend (node, PHP, MySQL) and analysis for them (stats, deep learning, python, big data packaging for clients).
But seeing as I've moved in their company, and have been consulted on several major decisions, as well as built a really good relationship with some of their clients. I still haven't seen a raise, moreover I've been told that I'm expected to work from 8am to 5:30pm (9.5hrs no overtime pay). Which really pisses me off, since I know I'm worth more than what I'm paid (about 40k a year).
My brother (who's also a dev) suggested to tell them that I'm not happy at work due to this. And quit if they don't react well.
How should i bring this up? Should I really quit? This is all new grounds.6 -
I just realized that I subconsciously believe more lines of code means slower code.
It's not intellectual. I understand that a few lines of code are often just calling other code. That this is not how Big O works, that it does not replace benchmarking, and that some data structures require a lot of code for an immense speedup. E.g.: B-trees with nodes sized to the page size for big amounts of data read from secondary storage.
But still, when I see a function with just 3 to 5 lines, my inner monkey believes it must be fast.
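A tiny counterexample for the inner monkey, with two made-up functions that return the same count. The short one hides a full list scan inside `x in b`; the longer one pays a few extra lines for set lookups.

```python
def count_common_short(a, b):
    # One line, but `x in b` scans the list each time: O(len(a) * len(b)).
    return sum(1 for x in a if x in b)


def count_common_longer(a, b):
    # More lines, but set lookups are O(1) on average: O(len(a) + len(b)).
    seen = set(b)
    total = 0
    for x in a:
        if x in seen:
            total += 1
    return total


a = list(range(1000))
b = list(range(500, 1500))
same = count_common_short(a, b) == count_common_longer(a, b)
```

Running both through `timeit` with larger lists shows the gap growing with input size, which line count alone would never predict.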
Know your biases, I guess.3 -
I'm not sure if it is a dev experience, but definitely boosted my morale.
In 2014, my company (in India) sent me to attend a conference in Boston. The conference was about big data.
When I came back, I wrote a blog post about Apache Spark in my company's blog. Because of the blog, my name got mentioned in a prominent newspaper's article about Apache Spark.
PS: That is my only blog post till date -
Guys, I think they are asking about Big Data Loss. People thought it's about Data Loss.
My Big Data Loss experience is when everyone in Hong Kong starts talking about big data using spreadsheets. So lost.1 -
In my project to process CSV file it needs to be transformed into XML file. Basically 100 MB CSV is turned into 3GB XML. They can be deleted afterward but it is not implemented and no one cares. Our storage gains around 1-2 TB data each month. Right now we have ~200 TB of XML in our s3
I think I can add Big Data to my CV *sarcasm*5 -
So, My usual dual boot setup is Linux(Dev stuff) and Windows(For gaming) and for Linux I always create different partition just to make sure I don't fuck up while removing Linux(mostly switching). So at one time I wanted to switch to CentOS and I accidentally deleted windows partition of C drive which had like 90+GB just for Steam....it was not Big loss but still it was pain downloading data for steam games.
-
Went to a Big Data workshop, now I know why there's an elephant as Hadoop's logo and how it came about. Still no clue how it works.
-
I find GPT3/ChatGPT an interesting development but at the same time I'm afraid that the spread of deep learning is going to take away further power from individuals and small companies to put it in the hands of big tech companies: the only ones who can afford to hoard countless GPUs/TPUs and exabytes of data to train top performing AIs.9
-
~rant
I think we need to change way how websites deliver themselves to its users. This HTML CSS JS clusterfuck is just a huge PITA in the ass.
What is a website?
It's an application where users find, communicate or share information, can buy or sell their penis pumps and loads of shady stuff.
Why must a website (the delivered application) be split into multiple languages/scripts and lots of HTTP requests?
In my opinion, PWA is a start to make us look at websites more like apps as we are used to on the machine, but they don't solve the mess.
Per my experience, many people working on websites regularly confuse what's executed on the server and what is on the client. They send data to the client via XHR, for example full DB tables of private data, just to then filter it in their beloved Array.filter function.
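What filtering on the server instead of in Array.filter means can be shown with a toy table. The `users` schema, the sample rows and `team_members` are invented for this sketch; SQLite stands in for whatever database sits behind the XHR endpoint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, team TEXT)"
)
conn.executemany(
    "INSERT INTO users (name, email, team) VALUES (?, ?, ?)",
    [
        ("Ann", "ann@example.com", "ops"),
        ("Bob", "bob@example.com", "dev"),
        ("Cy", "cy@example.com", "dev"),
    ],
)


def team_members(conn, team):
    """What the endpoint should send: matching rows, public columns only."""
    rows = conn.execute(
        "SELECT id, name FROM users WHERE team = ? ORDER BY id", (team,)
    )
    return [{"id": r[0], "name": r[1]} for r in rows]


payload = team_members(conn, "dev")
```

The email column and the rows of other teams never leave the server, so there is nothing private sitting in the browser waiting to be filtered out.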
You can tell those people again and again and this is why I start thinking that the Web, as we know it, needs a big change.14 -
I'm gonna spin this as ridiculously awesome meeting. My company is currently expanding the local satellite office in to a full site. Part of that includes building a local presence for recruiting efforts.
I was part of a meeting I organized at my alma mater between my executive partner and two deans of the college. I am leading the effort to help them align their curriculum with modern practices, training them on free software licences with my company, and more. As well, there's an opportunity to train students on an untouched area of big data in the medical industry.
Less than 2 years with the company and partners (local, national, and international) in the company within my work area are sending me kudos.1 -
(Day 1 of Database Class)
Database Prof - "Your final project is a program or app that deals with big data and showcases data analysis. Make something that can be used by consumers and has real-world application."
(After the Day of Showcasing Projects)
Database Prof -
(What he should have said on Day 1) - "Make something that makes me laugh with added data." -
Anything missing?
"We are applying deep learning, NLP, machine learning, Big Data frameworks and other technologies to produce outstanding fintech products in areas of robo advisors, stocks and cryptocurrencies analysis, digital assistants, prediction of customer behavior, deep learning analysis of alternative data (satellite images) and other areas."
http://www.alpha-quantum.com/4 -
I found out the importance of time complexity. It might not seem like a big difference between O(1) and O(2). But there's a big difference hardcoding 500 lines and 1000 lines of data.
I made a navigation app for school using dijkstra's algo. However it had no data available so I had to hardcode it. Long story short, there was a ton of hardcoding. Always try to improve the time complexity of the code you write.2 -
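A side note on the navigation-app rant above: in big-O notation constant factors are dropped, so O(1) and O(2) name the same class, and 500 versus 1000 hardcoded lines is a constant-factor difference rather than a complexity one. The hardcoding itself can shrink if the graph is kept as data and built in a loop. A sketch with Dijkstra on an invented four-node campus; the place names and distances are made up.

```python
import heapq


def build_graph(edges):
    """Turn (a, b, weight) tuples into an undirected adjacency map."""
    graph = {}
    for a, b, weight in edges:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def dijkstra(graph, start):
    """Shortest distance from `start` to every reachable node."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical map data; in a real app this could come from a CSV or JSON file.
EDGES = [
    ("gate", "hall", 40),
    ("hall", "library", 25),
    ("hall", "cafeteria", 60),
    ("library", "cafeteria", 20),
]

distances = dijkstra(build_graph(EDGES), "gate")
```

Adding a building then means adding one tuple to the data, not more code.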
Knew Android Studio Emulator was heavy but didn’t know it can take more than 6 PetaByte (~6,000,000 GB).🤣 Are they preparing it to run big-data queries on emulator.
-
I eagerly wait for the day some people will realize and believe that Hadoop is a file system and NOT a database!!
-
I usually like PHP, because it is easy to use, but FUCK! Can you just let me free the fucking memory by myself? Setting variable to null doesn't work, unset doesn't work either. I am still getting fucking memory exhausted error.
There is literally no data stored anywhere, because I unset every fucking thing.
gc_collect_cycles() doesn't work either, probably because this crap thinks there is a reference for this variable somewhere.12 -
My big data loss was when I left my one-month college project and personal data on the laptop of a friend who didn't know what ` rm -rf ` does??
-
I’ve been reading this book “Designing Data-Intensive Applications” by Martin Kleppmann. The concepts are really well explained !!
-
I don't like it when clients decide which tech to use in the project. I've got some weird tech requests like:
1. Move the existing database from PostgreSQL to Hadoop because Hadoop is Big Data (it's kind of like moving from Amazon RDS to Amazon S3, just why? Have you indexed and clustered your PostgreSQL tables?)
2. Move from MySQL to PostgreSQL because MySQL causes deadlocks (maybe their previous developer was just a fucking moron)
In this situation we just explain why we don't use that and propose an alternative solution. If they insist on their solution, we either ignore it or decide not to continue the project.5 -
Perhaps as a tip for the junior devs out there, here's what I learned about programming skills on the job:
You know those heavy classes back in college that taught you all about Data Structures? Some devs may argue that you just need to know how to code and you don't need to know fancy Data Structures or Big O notation theory, but in the real world we use them all the time, especially for important projects.
All those principles about Sets, (Linked) lists, map, filter, reduce, union, intersection, symmetric difference, Big O notation... They matter and are used to solve problems. I used to think I could just coast by without being versed in them.. Soon, mathematics and Big O notation came back to bite me.
Three example projects I worked in where this mattered:
- Massive data collection and processing in legacy Java (clients want their data fast, so better think about the performance implications of CRUD into Collections)
- ReactJS (oh yes, maps and filters are used a lot...)
- Massive data collection in C# where data manipulation results are crucial (union, intersection, symmetric difference,...)
Overall: speed and quality mattered (better know your Big O notation or use a cheat sheet, though I prefer the first)
Yes, the approach can be optimized here, but often we're tied to client constraints, with some room if we're lucky.
I'm glad I learned this lesson. I would rather have skills in my head and in memory than having to look up things and try to understand them all the time.5 -
I work on a web app that has to manage a huge ton of data, and some pages need to display a big part of it.
On this page, we had some checkbox lists to display, so even more data. One of them wasn't behaving correctly tho, so we asked support what could be the problem.
Answer : It might have too much data to display.
No shit Sherlock.
Answer : Please provide us a lighter version of it.
Ok, I'm gonna do a lighter version with very little data so you can test a situation we will never encounter. Thanks ! -
I've been working in industry for 2-3 months after graduating from CompuSci last year, doing big data stuff surrounded by people with huge amounts of experience. I've learnt a lot but I'm still being overwhelmed by all the stuff I'm being told to do that seems second nature to my seniors and there's not enough time to Google it all and understand it ;____;3
-
Play Store's $25 registration fee - for getting PWA listed in their shitty catalogue? Who in the right mind would even jump in this clusterfuck of store to find a *web* app? For all you know, Google, there is such thing as QR codes - and customers can just scan the code (or type in that sweet address). Voila! Boom!!! Ching-ching!
Hello-hello, monopolistic cashgrabage! I came to inform you that your TWA bullshit is unneeded in ETHICAL space. The only ones who would benefit from this thing are permission-hungry publishers. And I'm already sick of this culture where people are put into store bubbles. You can't hide the fact that this data and features you provide, with "native" layer, may be misused in a jiffy - and by big players, no less. Of course, as a vile dumpster that you are, you don't mind it.
Don't even bring up a battery consumption that comes with PWA and browser. This doesn't matter if you use an app for some 2 minutes to tick your mental checkboxes! I'm just sick of app stores and native apps that collect the data without normal warning, and dare to take more than 1 second to fucking load the cached data. Take a lesson or two from PWAs that collect (probably useful) cache, instead of my specs, and load almost instantly.12 -
!rant
Started a Data Science course on Big Data University. The outcomes are heavily dependent on domain experts/stakeholders!!! Since all the answers could be false positives, you need to decide what makes sense with the help of domain experts. And most data scientists are not from a programming background; they are domain experts who turned into data scientists. Thoughts on whether I should continue learning big data/data science, knowing that I have knowledge in information retrieval and search engines? -
I looked at an SQL server today from a customer, talked with one of their devs and he said that he's unable to understand why the server misbehaves... All (!) queries were optimized, but they have 'big data queries'... Migraine started, I had a very bad feeling. Monitoring? Nooooppeeee. Migraine kicks in. Connected to server. SHOW GLOBAL VARIABLES...
After a bit of scrolling I found a lot of misconfigured variables (e.g. extremely large join buffers, unrealistic buffer sizes), a high slow query count (nearly 60 % of COM_SELECT) and a few variables that were unknown to me.
Then came the version line.
5.0.46
Yes. 5.0.46.
Big data? Well... 30 GB of usage data.
I called the company back... The dev told me sternly that this was the production server (I had hope...) and that I lie - neither the version, nor the variables could be the problem.
A coworker had to verify it and our manager had to do the communication... Worst, most traumatic working day I ever had. -
Hi guys, I need to vent with you. I live in Portugal. I graduated in computer science with 16 (0-20). While I was graduating I worked at my university programming for IoT and big data fields. I have one article published in a scientific journal. I was looking for a job in my country, and I have gone to 5 interviews where they wanted to pay me about 700 maximum because they say this is my first job. The house rent is about 300, and with food and daily needs I don't have money left for the simple things in life. It's sad that companies don't value people; they just think about money. It's sad that our work and knowledge are not valued...7
-
Trying not to get too hyped about AI, ML, Big Data, IoT, RPA. They are big names and I'd rather focus and expand skills in mobile and web dev
-
After a year of using mongo in prod and personal projects I have realised some things. It's really nice early in a project, especially when there are changing requirements, and for small projects or proofs of concept.
But when you make commercial software, things tend to get more complex and relational. Stakeholders want reporting and even report building, which a document store isn't the best at.
With most projects, when they get big, things get relational, and this becomes more and more expensive to handle in terms of compute power and developer time.
I don't doubt mongo has its place, maybe as a secondary specialised data store or if the project is inherently document oriented.
Blog over.7 -
Why is everyone into big data? I like almost all kinds of technology (programming, Linux, security...) But I can't get myself to like big data /ML /AI. I get that its usefulness is abundant, but how is it fascinating?6
-
It took me a month to teach myself web dev with jQuery
- made 3 sites for school projects
Took me more than a month to learn the MEAN stack.
- taught it to students as a TA in software engineering class for 3rd year while I was at 4th year.
Took me 3 months approx to learn RoR and Clojurescript at my current work.
- year later I am one of the main devs, and pushed the company towards big Data while implementing scrum and pushing for devtasks priority.
Learned React but I am still struggling to figure out how to start a new project.
And I am still fighting every time I need to center in CSS.
Am I a bad dev mommy?5 -
I was part of a team that monitored 1700 radio stations world wide and reported the airplay to a BigData system. There were only a few engineers on the project and it was 15 years ago, before big data design or technologies were widely available. Our biggest challenge - heat from running the servers so hard.
-
TL;DR: What do you hate about the current interview process for software dev positions?
I have been reading interview-related posts on Reddit and other places and I have noticed that there is a lot of hate, especially from more senior devs, towards the typical software dev interview pattern, i.e. the one focused on algorithms and data structures, and I don't understand why. The current methods may be far from ideal but I think they do a good job of eliminating the false-positives. Plus, I can't think of a better alternative. Sure, by using current interview methods some good devs might get rejected because they haven't used/needed/studied many algorithms and data structures after they left college, but for any big company that gets thousands of applications every year, that wouldn't be a big issue compared to the negative impact a false-positive may create. I am still in college so I may be biased; I would like to hear your thoughts on this.3 -
Unless you are an analyst who deals with a lot of information*, your relatively simple computer programs should not require hundreds of megabytes of RAM. For the love of God, please remember this.
*“Big data” sounds stupid.7 -
Opinions please.
I want to share a small model in my iOS app. Now on Android I'd do this with ViewModelProviders, but on iOS I'm going with SharedDataContainer, which is basically a singleton class that stores key-value data.
Is there any better approach? Data will not be bigger than 10 list items with guid (key) and int (value)
However; when I have big data I do cache on disk or hello OOM exceptions (or whatever they call that bitch on iOS) -
Opinion: Julia will not catch on, because the two-language problem is not a big issue for most people in the industry, and most Python data scientists will refuse to migrate.
-
!rant
Rant from my previous work as a consultant Data Engineer (wish I had known this site back then).
During my stay at the place, we have a big client whose contact with us was an incompetent stressful fellow.
I single-handedly build a humongous automated data pipeline using Airflow. I am very proud of my baby as my first massive project and check it obsessively for every possible flaw, especially when writing down documentation for the poor soul that would take my place.
Luckily for me, everything is working as intended, until of course on my last day of work, shit hits the fan, and everything breaks down.
After a moment of initial panic: it was Thursday morning, we had a Machine Learning model to run over the weekend, predictions to make and reports to write and a very lovely next week deadline, I calm down.
"I won't be dealing with this shit anymore, starting from 18:00 PM and anyway Fear Is The Mind Killer."
Quite sure that it couldn't have been my code, I start looking at various logs when the culprit was clear. The B(ig) S(tupid) C(lient) changed the whole schema of the data he was feeding to us.
I call him: he has no idea of what was done to the data. Hell, at first he doesn't seem to remember what the deal with schema, data, and SQL is (the guy was supposed to be a big shot in the IT department). It turns out he hired one of our competitors to do his side of the collection pipeline. He tries to get mad at me, but everything he throws bounces back to him. I am calm yet ruthless pointing out how every major hiccup had been his fault and that I could quickly reach to his board of directors explaining why their Machine Learning model was late.
Result: he apologizes, extends our deadline, and I get a round of applause from other juniors who would have to deal with me had I failed.
Never am I happier to not work as an underpaid cannon fodder apprentice in a shitty consultant firm. -
!rant (I got down voted for this on Stack Overflow, so I try to discuss the issue with a more professional crowd.)
In a Software Engineering class, we had an assignment to read Parnas' seminal paper on modularization [0]. In this paper, two approaches of dividing a software into modules are discussed:
Traditional Approach: A flow chart is drawn to work out the single processing steps and the program's high-level flow. Then every processing step is turned into a module. This approach doesn't yield very good results.
New Approach: Every design decision will be turned into a module by the means of information hiding. This approach leads to much better results.
My personal interpretation of the term design decision is that the modules are identified as data structures rather than as processing steps of an algorithm. This makes sense, because data structures are much more suitable for information hiding than processing steps of an algorithm. (The information inside a data structure is hidden behind functions, whereas a function only hides more detailed processing steps and no information; the information is actually passed in as arguments.)
Why does the second approach work so much better than the first approach? Here comes my second interpretation: The single processing steps of an algorithm are not replaceable (and thus not reusable), whereas it's possible to convert data structures into other data structures.
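To make that interpretation concrete, here is a minimal C sketch of a data structure hidden behind functions. The `LineStore` module and all of its function names are made up for this illustration and are not taken from Parnas' paper. Callers only see the function declarations, so the hidden design decision (a fixed-size array here) could be replaced by another data structure without touching them.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Public interface: callers only know that a LineStore exists. */
typedef struct LineStore LineStore;
LineStore *linestore_new(void);
int linestore_add(LineStore *s, const char *line);
const char *linestore_get(const LineStore *s, size_t index);
size_t linestore_count(const LineStore *s);
void linestore_free(LineStore *s);

/* Hidden design decision: lines are kept in a fixed-size array.
 * In a real project this part would live in its own .c file. */
#define LINESTORE_CAPACITY 16

struct LineStore {
    char *lines[LINESTORE_CAPACITY];
    size_t count;
};

LineStore *linestore_new(void)
{
    /* calloc zeroes the struct, so count starts at 0. */
    return calloc(1, sizeof(LineStore));
}

/* Returns 0 on success, -1 if the store is full or out of memory. */
int linestore_add(LineStore *s, const char *line)
{
    if (s->count == LINESTORE_CAPACITY) {
        return -1;
    }
    size_t len = strlen(line) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, line, len);
    s->lines[s->count++] = copy;
    return 0;
}

/* Returns the stored line, or NULL if index is out of range. */
const char *linestore_get(const LineStore *s, size_t index)
{
    return index < s->count ? s->lines[index] : NULL;
}

size_t linestore_count(const LineStore *s)
{
    return s->count;
}

void linestore_free(LineStore *s)
{
    if (s == NULL) {
        return;
    }
    for (size_t i = 0; i < s->count; i++) {
        free(s->lines[i]);
    }
    free(s);
}
```

Swapping the array for a linked list would only change the part below the "hidden design decision" comment, which is the kind of replaceability the interpretation above is about.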
And here's my question: Could that be the reason why software development using workflow engines (based on BPMN, for example) never really took off?
My personal experience is that the activities created in such workflows are hardly ever reused, but there often are big data structures passed around all the involved activities, even if most of the activities use only one or two of them.
My question exaggerated: Could we get rid of all those clumsy workflow engines by giving managers Parnas' paper to read?
[0]: On the criteria to be used in decomposing systems into modules (Parnas 1972)2 -
Today my project manager called Hadoop a data warehouse and a Big Data lake in a meeting. I couldn't decide whether to laugh my ass off or spend the next 30 mins explaining to her what Hadoop actually is.2
-
So my future ISP Jio Fiber is rumoured to be using DPI. The main proof came when an executive said "It’s called Deep Packet Inspection, and what you can do with the analytics of that is mind-boggling," in a news article. https://reuters.com/article/...
Should I be afraid or am I just being paranoid? Also, should I just switch to another ISP altogether if they are using DPI?
Also mini rant :- They make it harder to use your own router by not allowing bridge mode on their router, and custom ONTs don't seem to work. The best option is to connect the LAN port of their router to the WAN port of your router and disable wifi on their router3 -
I run a small internal dashboard for my company. One of the big parts of this workflow is collecting data from various sources, so I can start using it. I collect it all into a SQL DB so it's in one place.
What is this called? Should this be a different job role, not the developers?8 -
!rant
I just talked with a client because they tried to analyse some data we sent and they fucked up. The client told me they can't fix it because they can't access the data because they have a devious virus. xD You're the god damn IT department of a pretty big company. wtf? really how?1 -
My most cutting edge story would be working on Huawei fusionsphere cloud for some big firm.
They needed some scripts to setup servers for their newly built data center which will handle the live feed from traffic cameras from all over the country.
It was during college but I think it still holds the top place except for the new one that I'm working on right now. -
i feel its a great time to be a developer we have so many toys to play with
machine learning, scientific python, nodejs, frontend js frameworks, nosql, NLP, elasticsearch, mongodb, open source .net, big data with java, arduino..., VR, 3d printing
what toys are you playing with? -
I work as a data engineer in my company. My senior calls himself data scientist- he is 29 years old and recently did one MOOC on data science!
I wonder when my colleagues will find out about how much he really knows.
Till then I am cleaning and arranging his data, while he sits and earns a big fat package by citing one data scientist tag to his profile!! -_-1 -
Should I do a Hadoop big data course? I am thinking of doing it this summer on Simplilearn. I am a third-year undergraduate student in IT. I am a Java guy and good with RDBMS.. Should I learn it then?2
-
Turned up on customer site yesterday to do a day's SME work for them like I have done every week for the last 3 months..
As I walk in, they took a decision 15 mins earlier to power off the platform I'm working on to do a backup (on a big data platform?) and it's down till 13:30...
Irony? The minute they finally let me turn it on, new data arrives in the platform, so their backup is out of date, and they wouldn't need backups if they'd followed my original design and distributed it over two data centres....
Oh and they 'forgot' I was coming so there was little / nothing to do for the rest of the day either
Clients can be a PITA but I can't really complain.... Easy day though! -
This guy was giving an introductory course on Big Data one year, was boring as f, came in class with unreadable 80 slides presentations, asked us to re-code one of the assignment he gave us for the term exam. I went to two of his classes and still rocked the assignments, flunked the exam tho.
-
Connecting local test server with live db for testing purposes. Needs 10 min to start up because much data is preloaded.
Checked against 0 instead of null in code. Big fat null pointer error greets. Another 10 mins lost. -
30-year-old PHP code (PHP 5.3). One big global variable holding system settings, entire row sets of data! and database cursors. Oh and HTML was mixed in between. Worst part, I had the task to secure the application. SQL injection didn't even exist back then.2
-
What is everyone's opinion on companies/organisations 'too big to fail'...?
I was just pondering on how 'just Google it' has become so 'natural' as a way of saying search the Internet. The more I think about it, the less I like it.
I know the chances of them failing/crumbling are nearly zero (hence the name) but if an org, e.g. Alphabet, made some shit decisions and bankrupted their company, what would happen then? Any ideas? I don't mean in terms of social fallout, economic etc.
I mean in terms of network infrastructure, them being such a central part of 'the web', all their DNS services, their backbone links, Google Drive, Google Fiber etc. What would happen to all user data? Just be destroyed?
I've never 'seen' a large tech company collapse, but just wonder as to how that process would work for such a huge organisation, and the literal mountains of data they have which will need destroying or relocating.
Inb4 watch Mr robot hurrr5 -
A medical equipment that you can attach to employees and excruciatingly kill them as soon as they say things like (please note that the list is not limited and we should use a speech to text API to provide NLP states for the meaning - I want to catch all false negatives!! Kill them all!!!!):
- It works on my machine
- I tested it before!
- Haskell is a terrible language
- Big data and actionable insights
- why do you need unit tests here?
- I am a recruiter
- Anything that comes with the following construction as well: "I don't have anything against X, but..."
Any other suggestions of phrases?2 -
This nonsense gave me an idea. Now I want to start building a big, organised, db for good bad examples. I can think of so many uses for it if everything is tagged/categorised well.
Thank you rando LinkedIn reject, you gave me the best birthday gift I could hope for... another potential branch of my data architecture to play with new data in new and to be discovered ways!
The site of the rando is athensnexus.com5 -
How many languages does one have to learn...? Learnt C, C++ and Java because of college courses. Learnt HTML, CSS, and vanilla JS because I wanted to learn frontend. Now learning R for big data analytics. Today, I came to know that I need to learn more Java or start learning Python for Hadoop...!!
😧😵1 -
So I'm in a situation where I have to send a big set of data (from a large number of profiles), but I can't because the framework used was designed for sending small amounts of data (from a single profile), so I get a timeout.
I should take it as a challenge, a hard one but a challenge. Gonna be fun (and tiring too I guess)1 -
I am currently creating a module where I have to put data into an XLS sheet from given data that contains a date column.
I have generated the sheet, and the respective date column also has the Date format, which is the default in Microsoft Excel.
But the big question now is why I am not able to sort the data by the date column; the sorting is not working correctly.
If anyone has ever worked on this please tell!!3 -
I was a frontend developer, and I am new to hadoop or anything related to big data.
I am currently working as a Hadoop developer and I get to work on one of the existing codebases. I am also trying to recollect the Java which I learnt during college.
Can you please provide me any inputs on how to get started with Hadoop, and a personal viewpoint on the scope and future of Hadoop? Also a rough time span of how long it took you to get out of the noob zone.
If you could provide me with a good tutorial or blog that would be awesome.
Thanks in advance1 -
In the relatively-near future , we will laugh out loud when we think back of the times when every person had a big box of CPU (or even chunky laptops) ,because the movement towards cloud based systems and further development of AI will make the CPU'S an absurd entity.
every device we would have would just be a window (😂) to access the central system.3 -
I'm working on a pretty big dataviz project. There was supposed to be a deadline at the end of the month for a first full demo.
This morning I wake up to an email from the client saying the deadline moved to next Monday and they just "forgot to tell me before".
Also, the dev from the client's company who is supposed to prepare the data/API to be used for the project said he'll be able to send something at the end of the week.
They're clearly not going to get what they want for Monday...2 -
I need to tell you the story of my MOAB (Mother of all bugs).
I need to write some stuff in C (which i am fairly used to) and have a function that allocates memory for a Matrix on the heap. The matrix has a rows and columns property and an associated data array, so it looks like this
struct Matrix{
uint8_t rows;
uint8_t columns;
uint8_t data[];
};
I allocate rows*columns + 2 bytes of memory for it.
I also have a function to zero it out which does something like this
for(int i=0; i < rows*columns;i++){
data[i]=0;}
Let's come to the problem:
On my Mac the whole stuff works and passes all tests. We tried the code on a Linux machine and suddenly the code crashed in various places, sometimes a realloc got an invalid pointer, sometimes free got an invalid pointer and basically the code crashed at arbitrary points randomly.
I was confused af because did i really make THAT many errors?
I found out that all errors occurred when testing my matrices so i looked more into it and observed it through the debugger.
Eventually i came to the function that zeroes out my matrix; the loop counter went unusually high and i wondered if my matrix really was that big.
Then i saw it
The matrix wasn't initialised yet
It had arbitrary data that was previously in the heap.
It zeroed out a huge chunk of the heap space.
It literally wrote a zero to a shitload of addresses which invalidated many pointers.
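For anyone who wants to avoid the same trap, here is a minimal sketch of one possible fix, assuming the struct shown above. The helper names `matrix_create` and `matrix_zero` are hypothetical and not from the original code. The idea is simply that the allocator writes `rows` and `columns` before anything else can read them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as in the rant: two size bytes followed by the data. */
struct Matrix {
    uint8_t rows;
    uint8_t columns;
    uint8_t data[]; /* flexible array member */
};

/* Hypothetical helper: allocate a matrix and set its dimensions right
 * away, so no later code ever reads garbage sizes from the heap. */
struct Matrix *matrix_create(uint8_t rows, uint8_t columns)
{
    size_t count = (size_t)rows * columns;
    struct Matrix *m = malloc(sizeof *m + count);
    if (m == NULL) {
        return NULL;
    }
    m->rows = rows;
    m->columns = columns;
    memset(m->data, 0, count);
    return m;
}

/* Hypothetical helper: zero the data. Only safe once rows and columns
 * hold the real dimensions, which matrix_create guarantees. */
void matrix_zero(struct Matrix *m)
{
    assert(m != NULL);
    memset(m->data, 0, (size_t)m->rows * m->columns);
}
```

Using `sizeof *m + count` instead of a hand-counted `+ 2` also keeps the allocation size correct if the header fields ever change.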
You can imagine my facepalm2 -
I saw The Big Short. Shows how connecting data when others don't or can't can present big opportunities. Christian Bale's character was very cool.
-
What if IBM's AI is a way to gather data on a bigger scale. I see many big companies and governments relying on it rather than having their own local AI servers. What do you think? 🤔3
-
Big Data is like sex as teenagers: everyone claims to be doing it, but few have actually had a proper experience.1
-
Data structure and Analysis
For experienced ones, add "System Design".
...path for big fat packages. -
I'm just dumping 10 GB of data remotely from a MySQL db, because my el cheapo VPS ran out of space
can you suggest a good book?
oh, actually I already found one, the title is "Prepare your fucking server/workspace properly if you want to play around with a lot of data"5 -
!rant
BIG FUCKING SHOUTOUT: THANKS!
Thanks Devrant for being the only app (for entertainment on my device) that works well with fucking 64kbit/s when out of data mid month.
Fuck YouTube watching over mobile data instead of wlan when at home.... F*** me -
Another question for Database-Gurus:
Is a MySQL table with 4 columns and about 42000+ rows considered 'big' or should the table be split in smaller pieces?5 -
The new UK law for data sharing with the government is crazy: it makes it law for service providers to hold browsing-history data, and for big sites like Google, Facebook and so on to retain human-readable access to their data if they offer a service to the UK. What steps do we take to protect the data and the service but also follow this law? I can't see any sensible way to follow it.
What are your views and ideas going forward? At the moment the UK has made it law even tho the EU said stop this madness, so let's take it as read that it's there. Is there a sensible way to do this, or are we going to have to provide a means for UK users' data to be back doored?11 -
Honest question. When do you consider yourself a "Big data engineer"?
Today I managed to create a system that collects historical metrics from monitoring tools every 5 minutes and does all sorts of crazy transformations to make them ingestible by Grafana Mimir over the OTLP protocol. Doing 600 GB a day, millions of active time series, .... And I still feel it's "small"
Thoughts?5 -
I don't know how many of you use the IBM Watson API (Personality Insights). We use it in our office. They send back a huge amount of data known as Big 5, needs, etc. They find the personality of a person from his speech, like anger, happiness, etc. I don't understand how they calculate them, and yet every client trusts that whatever IBM tells them is correct. If you had built that feature yourself, too many questions would have come up.
that's the difference between mnc and a startup2 -
So last year we started off with an IOT smart home project combined with SAP HANA, everything went well with the hardware side of it. Wired up everything and functioning smooth as butter. When we try to connect to the HANA cloud db to store sensor data... we find out that Arduino isn't supported. A big FML!!2
-
I ain't getting any summer internship so thinking to do a good course on big data and hadoop. Can't find the free proper source for beginners😕! Any suggestions? Whats your plan btw..m thinking to dive into web dev as well.2
-
Some dev removed everything from the only server that doesn't have any backup.
We know this server is temporary, but we didn't want to remove the data without fixing the current bugs on this temporary server.
So he deleted everything, and said he made a big mistake and went offline for two days.4 -
What are you guys studying/what are your professions? What do you think of your study or profession? Also, what do you develop/what do you use to develop? I'm going to be studying bioinformatics next year, we'll be using java and python. The idea is to write programs that can find links in big data that stems from research on diseases and genetics. My two favourite subjects were always biology and computer science, (even though computer science in my middle school is a joke) so this study really appealed to me. I'm curious about you guys.4
-
Which book do you guys recommend for Big Data Analytics? I am starting out as a beginner. Pls comment!9
-
It is sometimes shocking to see 10+ developers working on a fairly big project (online quiz). Missing data binding operations here and there, as a result, bunch of sql injections, which successfully led to the entire db full of questions and answers sitting on my desktop.
Vulnerabilities have been reported, took them 2 weeks to understand what happened and fix them.
Pretty sad :/1 -
Back when I was starting out in a full stack role, I worked on a fairly big chunk of functionality that would trigger off a few entry points. It was wonderful for a few months. As we approached go live, our QA team started reporting weird intermittent issues. The logic wouldn't "trigger" the first time, but would on subsequent saves. Worse yet, the state required resetting of data every time we needed to test. Three weeks later, it boiled down to a 2 second time difference between the database's GETDATE() values and the new Date() object we passed in from our application.
I'll never forget that one system should be the source of truth again. -
Question to you all: do you really think you own your computer or system/data when almost all sites/services out there state very clearly in their EULA (Fuck yes ours) that they might use your data how they see fit? This does not stop with websites; Mac, Windows and some Linux distros also do this.
I for one stop thinking that I own data but I just change a few bits to make it look different these days, everything on your computer is not yours, we its and hardware, read the ELUA/TOS many hold the right to recall, revoke and so on use of the items to the point you paid for it they will take it back.
Items are now sending keylogs, data usage and app usage data to MS, Apple, some big Linux distros, and YES this happens, don't fool yourself: Apple and MS both admit this happens, and both the US and UK are now requesting these companies to let them have full access to this data. If it was not there they wouldn't want it.
This wont stop me from messing with code and loving tech but do you really feel you own anything anymore?
I don't :P7 -
So I've been studying masters in business analytics and big data. So far it's been 2 months and I don't have a clear picture about what big data actually is.
I guess that's normal and no cause of concern right guys ? -
How do I start learning IoT? I mean, here is what I understood after searching for a while: IoT usually consists of hardware devices/sensors/robots that generate data or do something, transmit this data to some server where calculations are performed, and then show the result to the user. And there are some kits that cost a big amount which you gotta buy... is that all right?
Guidance please :) -
We need a domain specific language for AI that is tailored for big data. So many tools are just not scalable to the size needed for these massive AI problems. It needs to be able to conceptualize and handle the fattest data in the industry.
We should call the language: Your Mom -
What do you think happens when enterprise software meets big data and user-generated content? Idk, ask GitHub. These guys are sitting on a goldmine. The paradise of every big company. The only reason they're not FAANG is cos it's niche, but they'll probably be influential (read: big bad) in the coming years
I predict the Copilot thing is the benevolent side. Or maybe it only seems so because it's still in its infancy and hasn't aggressively started snatching most developer jobs. What will become of us when that time comes? What other form of technology will computers still need our help to create? -
Can anyone suggest a way to store and perform CRUD operations on a 10 billion x 10 billion matrix? Is there any way possible?
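For scale: a dense 10 billion x 10 billion matrix has 10^20 cells, which at 8 bytes per cell is about 800 exabytes, so storing every cell is off the table. The question only becomes tractable if the matrix is mostly zeros, which is an assumption here since the rant doesn't say. Under that assumption, a rough sketch is to store only the non-zero cells, keyed by (row, col). SparseMatrix and its methods are hypothetical names, and a real system would back this with a database or a distributed key-value store rather than an in-process dict.

```python
class SparseMatrix:
    """Dictionary-of-keys sparse matrix: only non-zero cells take up memory."""

    def __init__(self, n_rows, n_cols):
        self.n_rows = n_rows
        self.n_cols = n_cols
        self._cells = {}  # maps (row, col) -> non-zero value

    def _check(self, row, col):
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise IndexError(f"cell ({row}, {col}) is outside the matrix")

    def set(self, row, col, value):
        # Create or update a cell; writing zero removes the stored entry.
        self._check(row, col)
        if value == 0:
            self._cells.pop((row, col), None)
        else:
            self._cells[(row, col)] = value

    def get(self, row, col):
        # Read a cell; anything not stored is implicitly zero.
        self._check(row, col)
        return self._cells.get((row, col), 0)

    def delete(self, row, col):
        # Delete a cell by resetting it to the implicit zero.
        self.set(row, col, 0)

    def stored(self):
        # Number of cells that actually occupy memory.
        return len(self._cells)
```

Memory use then grows with the number of non-zero cells instead of with the 10^20 logical cells. If the matrix is genuinely dense, this sketch does not help.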
-
Have opted for Big Data and analytics this sem as my Dept. elective. Can't understand anything in the lectures (as usual). Any suggestions on how to start learning? (books, tutorials, courses, etc.)
-
While I was doing my degree, there was this faculty member who had a master's degree in big data.
She asked the whole class to install 'hadoop' on their machines as an assignment, in a situation where most of the class didn't even properly know what Linux was.
I installed it and showed it to her.
She: Shreyans, can you help me install it on my system, I'm getting some errors.
Me: Sure ma'am.
She: On what Linux did you install it?
Me: Linux Mint ma'am.
She: But Mint's setup won't work for my Ubuntu. Do you have a setup for my Ubuntu?
Me: Whaaat!?
And I stood there frozen thinking what to reply..
#facepalm #facechair #facetable
#facehammer #facePanzer -
That moment when you waste two hours of your work life trying to find a dataset in a sea of crap to answer your boss's question...
-
Account service needs migrating to AWS, 'cause that's where everything is going.
Manager has got it in her head that a document store would be ideal for this.
My knee-jerk reaction was a big No; I was told we'd discuss this at a later time.
My main argument here is that the data is inherently relational, and now I'm looking for more.
Any ideas why a document store is not a good fit for accounts?
Thanks! -
Jira on Android 🙄
I had an overview of the backlog, blocked, active issues, could see who works on what and so on ...
Accidentally pressed the back button, returned to the project. Now it's a single big list with only issue type, description and deadline. It's pretty much a guessing game which feature is open. There seems to be no way to change that, not even by deleting data.
How did this happen?? Let me change it back! -
Mexico just got hit by a big earthquake and people are organizing lots of ways to help.
> Some guys started a webpage and they are adding useful information and data for people. They created a repo on GitHub to improve the information.
> Mexican devs started discussing which technology is better for solving imaginary problems about scaling the servers, concurrency, creating a CMS, creating a public API, tokens for publishing the API... instead of using something quick like Firebase or some Trello board to just publish info. -
tfw you're learning a new library and a function gives an error suggesting xyz in data. You assume you're using the function wrong. 5 hours later...turns out there is xyz in data ya big dummy!
-
What is a good Python project to work on to showcase my skills to employers that work with big data and AI?
-
What is the Hot Tech (Best Emerging Technology) of this generation?
Machine Learning, Internet of Things, Big Data, Android development. -
So, I feel wayyy behind the tech curve right now.
The SSD implementations you see online, they're still just a bunch of separate sort of chaos machines that contain the standard perceptron-like model of a weight, cost, and bias, right? They just kind of inferred their values by training like any other neural network, in its separate parts, and just fed pieces of output data generated by other parts of the neural network to it, right?
I mean, it's implemented with PyTorch, so it's basically a really big array of tuples, in a sense, that are manipulated in a specific way.
And then CNNs just feed data back into another trained piece of the model, right?
I'm curious because object classification is about the ONLY thing I've seen work even close to properly lol
there is just so much fraud these days. sigh.
and so many lamentable tech choices and attempts... like node lol -
For all of you who like to work with big amounts of data: GHTorrent, a project that produces almost 180GB of CSV files containing GitHub data.
http://ghtorrent.org/downloads.html -
Once upon a time I was working with an engineer who loved sed and awk a bit too much. We had data stored in SharePoint that was retrievable via an RSS feed. Said engineer insisted on using curl to grab the feed and sed/awk to parse the HTML ...
I on the other hand suggested using libcurl (primarily for NTLM auth support) and parsing the RSS feed using libxml.
Which engineer do you think management decided to support?
Hint: Reusability and maintainability were big requirements in this project. -
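To illustrate why a real XML parser is the more maintainable choice in the story above, here is a small sketch. The rant's actual proposal was libcurl plus libxml in C; Python's standard xml.etree.ElementTree is swapped in here purely for brevity, fetching the feed (and the NTLM auth) is left out, and the sample feed and the parse_rss_items name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document standing in for the SharePoint feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Team documents</title>
    <item>
      <title>Design notes</title>
      <link>https://example.com/docs/1</link>
    </item>
    <item>
      <title>Release plan &amp; schedule</title>
      <link>https://example.com/docs/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(rss_text):
    # Parse the feed as XML and collect the title and link of every <item>.
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
            }
        )
    return items
```

The parser handles entity decoding (the &amp;amp; in the second title), whitespace and element ordering on its own, which are the things a sed/awk pipeline tends to trip over when the feed's layout shifts.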
I'm working on a download feature and spent a full 8 hours debugging locally to find out why the download is not working (this is not the main issue).
And here's what I found. The whole problem is clear and I get it now: I've been using fs to save the log (you know, download data is big and it hurts my eyes even when using console.table), and using nodemon to run the project.
Image is just for illustration lol
What is the best searching algorithm for big data technologies like Machine learning and Neural networks?
ANY GUESS!!!
Comment it. -
Daily coding would be VS Code.
> Lots of extensions and works well if the project isn't too big.
Quick and cheeky edits is Notepad++.
> "Open in Notepad++"
Server-side edits is vim.
> I don't really know any other terminal editors.
IDE would be the IntelliJ platform.
> It's just built very nicely.
For SQL (which I don't do very much) I took a liking to Azure Data Studio.