devRant - A fun community for developers to connect over code, tech & life as a programmer

Search - "encoding hell"

45

PonySlaystation

20863

8y

*wrestling commentator voice*
"In this week's episode of encoding hell:
The iiiinnnfamous UTF-8 Byte Order Mark veeeersus PHP!"

For an online shop we developed, there is currently a CSV upload feature in review by our client. Before we developed this feature, we worked out a very precise specification together with the client, including the file format and encoding (UTF-8).

After the first test day, the client informed us that there were invalid characters after processing the uploaded file.
We checked the code and compared the customer's file with our template.
The file was encoded in ISO-8859-1 and NOT as specified UTF-8.
But whatever, we had to add an encoding check, thus allowing both encodings from now on.
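
For the curious, an encoding check like that could look roughly like this. It's a minimal sketch in Python, not the shop's actual PHP code, and the function name decode_csv_bytes is made up for illustration. The idea: try strict UTF-8 first, and only fall back to ISO-8859-1 if the bytes are not valid UTF-8.

```python
def decode_csv_bytes(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding_used) for an uploaded file.

    Hypothetical helper: only the two encodings from the rant are considered.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every possible byte to a character, so this
        # fallback never fails. That is why UTF-8 has to be tried first.
        return raw.decode("iso-8859-1"), "iso-8859-1"
```

The order matters: since every byte sequence is "valid" ISO-8859-1, checking it first would silently turn real UTF-8 into mojibake.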

Well well well welly welly fucking well...

Test day 2: We receive an email from said client that the CSV is not working, again.
This time: UTF-8 encoding, but some fields had more columns with different values than specified.
Fucking hell.
We tell the customer that.
(I was about to write a nice death threat novel to them, but my boss held me back)

Testing day 3, today:
"The uploading feature is not working with our file, please fix it."
I tried to debug it, but only got misleading errors. After about 30 minutes, at 20 stacks of hatred, I finally had an idea to check the file in a hex editor:

God fucking what!?!!?!11?!1!!!?2!!

The encoding was valid UTF-8, all columns and fields were correct, but this time the file contained something different.
Something the world does not need.
Something nearly as wasteful as driving a monster truck in first gear from NYC to LA.

It was the UTF-8 Byte Order Mark.
3 bytes of pure hell.
Fucking 0xEFBBBF.
The archenemy of PHP and sane people.

If the devil had sex with the ethernet port of a rusty Mac OS X Server, then 9 microseconds later a UTF-8 BOM would have been born.

OK, maybe if PHP could actually cope with these bytes of death without crashing, that would be great.
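
For comparison, here is a rough sketch of how the three bytes can be dropped before parsing. It's in Python rather than PHP, the function name read_csv_rows is hypothetical, and a comma delimiter is assumed since the actual file spec isn't shown here. Python's "utf-8-sig" codec skips a leading BOM if there is one and otherwise behaves like plain UTF-8.

```python
import codecs
import csv
import io


def read_csv_rows(raw: bytes) -> list[list[str]]:
    """Decode uploaded bytes and parse them as CSV, ignoring a leading BOM.

    Hypothetical helper; assumes comma-separated, UTF-8 encoded input.
    """
    # "utf-8-sig" strips a leading EF BB BF if present, else decodes as UTF-8.
    text = raw.decode("utf-8-sig")
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + b"sku,price\r\nA1,9.99\r\n"
    print(read_csv_rows(with_bom))
```

Without that step, the BOM ends up glued to the front of the first header name, which would explain the misleading errors.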

rant encoding hell bom utf-8

3
22

Root

77073

6y

Testing hell.

I'm working on a ticket that touches a lot of areas of the codebase, and impacts everything that creates a ... really common kind of object.

This means changes throughout the codebase and lots of failing specs. Ofc sometimes the code needs changing, and sometimes the specs do. It's tedious.

What makes this incredibly challenging is that different specs fail depending on how I run them. If I use Jenkins, I'm currently at 160 failing tests. If I run the same specs from the terminal, I get 132. If I run them from RubyMine... well, I can't run them all at once because RubyMine sucks, but I'm guessing it's around 90 failures based on spot-checking some of the files.

But seriously, how can I determine what "fixed" even means if the issues arbitrarily pass or fail in different environments? I don't even know how the CLI and RubyMine *can* differ, if I'm being honest.

I asked my boss about this and he said he's never seen the issue in the ten years he's worked there. So now I'm doubly confused.

Update: I used a copy of his db (the same one Jenkins is using), and now rspec reports 137 failures from the terminal, and a similar ~90 (again, a guess) from RubyMine based on more spot-checking. I am so confused. The db dump has the same structure, and rspec clears the actual data between tests, so wtf is even going on? Maybe the encoding differs? But the failing specs are mostly testing logic?

none of this makes any sense.
i'm so confused.

It feels like I'm being asked to build a machine when the laws of physics change with locality. I can make it work here just fine, but it misbehaves a little at my neighbor's house, and outright explodes at the testing ground.

rant what is logic rspec testing arbitrary failures

4
16

kiki

36855

8y

Just imagine the world where we have CAPITALIZED DIGITS.

rant fuck utf-16 bullshit encoding hell

4
6

ThatPerlDeb

1520

8y

OK what the actual fuck is going on within this company.

TL;DR: Spaghetti Copy/Pasted code that made me mad because it's just a mess

I just looked into a code file to search for a specific procedure regarding the creation of invoices.
I thought "Oh this is gonna be a quick look-through of like 1000 lines MAX". Turns out this script is 11317 fucking lines long and most of its logic is written there multiple times (up to 6-7). And I'm not talking about a simple 10 lines or something. No! Logic of over 300 lines.. copy & pasted over .. and over .. and over?! I mean what the fuck did this guy drink when he wrote this.

Alsooo 10000 of those 11317 lines is ONE FUNCTION.. I kid you not! It's just a gigantic if / else if construct that, as I said before, contains copy-pasted code all over the place.
Sadly my TL thinks that code cleanup / optimization is "not necessary as long as it works" like wtf dude. If anyone wants to ever fix something in this mess or add a new feature they take a few hours longer just to "adjust" to this fucking shit.
This is a nightmare. The worst part: This is not the only script that has shit like this. We got over 150 "modules" (Yeah, we ATTEMPTED something OOP-ish but failed miserably) that sometimes have over 15000 lines which could be easily cut down to 1/3 and/or split into multiple files.
Let's not start about centralization of methods or encoding handling or coding standards or work code review or .. you get the point because there's a character limit for one rant and I guess I'd overshoot that by a lot if I'd start with that. Holy shit I can't wait until my internship is over and I can leave this code-hell!!

undefined fml mess wtf

2
5

Lensflare

21314

1y

Why the hell do languages like Kotlin (Java) and C# handle dates and datetimes in such a needlessly complicated way?
There are multiple types with different implementations and concepts like local time or time zones represented by those types. Some of them have capabilities like serialization, some of them don’t.
Parsing and encoding are tied to the types.

Why? Take Swift as an example:
It has one single Date type (including time) which represents a point in time independent of any calendar, time zone, encoding or format.
There is a DateFormatter to parse dates coming from APIs (ISO strings, timestamps or whatever) and to format them for the UI as a string in any language (localization), for any region, in any format.
If you just want a container for the date time components themselves (which the concept of local date time seems to be in those languages), you can use the DateComponents type. If you are interested in dates from the perspective of a calendar, there is a Calendar type.

Everything makes sense and the different concepts are decoupled from each other as they should be.

Damn! My memory about C# is a bit hazy but Kotlin, I’m disappointed in you! Date handling is a horrible mess!
Ok, I guess I can blame it on Java and JVM.

rant kotlin date time wtf

6


devRant © 2021 Hexical Labs LLC