Story: Password hashing and UTF-8

Context: PHP 5.6, a 270 kloc legacy project that was 15+ years old. This happened ~3 years ago. tl;dr at bottom.

Passwords were hashed and verified with an obsolete scheme. I was given the task of updating our password handler so that, from then on, new password hashes would be generated with PHP's built-in password hashing functions.

It was decided that existing passwords had to keep working, rather than prompting users to set a new password. The old password verification therefore had to function alongside the new one.
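
One way to let both schemes coexist is to check the new hash format first and fall back to the legacy check. The following is a minimal sketch in Python rather than PHP, purely for illustration. The `pbkdf2$` prefix, the SHA-1 legacy scheme, `LEGACY_SECRET`, and every function name are assumptions, since the post does not show the real code. Returning an upgraded hash after a successful legacy login is also an assumption, not something the post says was done.

```python
import hashlib
import hmac
import os

# Hypothetical secret key; the real one is not shown in the post.
LEGACY_SECRET = b"legacy-secret"


def legacy_hash(password, salt):
    """Assumed stand-in for the old scheme: SHA-1 over salt + password + secret."""
    return hashlib.sha1(salt + password + LEGACY_SECRET).hexdigest()


def modern_hash(password, salt=None):
    """Stand-in for a modern password hash (PBKDF2 here, not PHP's own function)."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password, salt, 100000)
    return "pbkdf2$" + salt.hex() + "$" + digest.hex()


def verify(password, stored, legacy_salt):
    """Return (matches, upgraded_hash); upgraded_hash is set only after a legacy match."""
    if stored.startswith("pbkdf2$"):
        # New-format hash: recompute with the embedded salt and compare.
        salt = bytes.fromhex(stored.split("$")[1])
        return hmac.compare_digest(modern_hash(password, salt), stored), None
    if hmac.compare_digest(legacy_hash(password, legacy_salt), stored):
        # Legacy hash matched: hand back a new-format hash for the caller to store.
        return True, modern_hash(password)
    return False, None
```

A caller would store `upgraded_hash` whenever it is not `None`, so legacy hashes gradually disappear as users log in.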

The previous password handler was split across multiple classes, due to (I assume) poor structuring and an attempt to shoehorn in an object-oriented approach. Furthermore, it abused global variables.

A new password handler had to be created.

I implemented the new password verification and creation methods (which now used PHP's internal password functions), and they worked perfectly. Next, I had to get the old password verification to work.

I removed all obsolete methods from the old handler, and was left with a hashing function which took in a password, salt, and a secret key. I copied this code into the new handler.

It failed. It returned "Password does not match" for old passwords. I was unsure what had happened here. I did all sorts of shotgun debugging. I ended up with two versions of the login page next to each other, which used the old and new code respectively. I started modifying the original code, extracting variables, logging, you name it. I ended up with exactly the same snippets of code in both password handlers, and yet it failed.

The culprit? The character encoding.

Because this project was over a decade old, the .php file was encoded as 'windows-1252'. When I created the new password handler, my IDE set the new file's encoding to 'UTF-8'. When I then copied the secret across, my IDE converted the string to 'UTF-8'. The characters looked identical on screen, but the non-ASCII ones were now stored as different bytes, which changed the value of the secret and caused every old password verification to fail. The solution was to manually build the string from the byte values of the old secret.
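
The effect is easy to reproduce. The sketch below uses Python instead of PHP for illustration. The secret `s3cr€t` and the SHA-1 construction are made up, since the post shows neither the real secret nor the real hashing code.

```python
import hashlib

# Hypothetical secret with one non-ASCII character ("s3cr€t"); the real one is not shown.
secret_text = "s3cr\u20act"

# Bytes the old windows-1252 file held for that literal.
legacy_bytes = secret_text.encode("cp1252")
# Bytes the same literal becomes once the file is saved as UTF-8.
utf8_bytes = secret_text.encode("utf-8")


def legacy_hash(password, salt, secret):
    """Assumed stand-in for the old scheme: SHA-1 over salt + password + secret."""
    return hashlib.sha1(salt + password + secret).hexdigest()


# A hash created back when the secret lived in the windows-1252 file.
stored = legacy_hash(b"hunter2", b"salt", legacy_bytes)

# Same characters on screen, different bytes underneath.
print(legacy_bytes.hex(), utf8_bytes.hex())
print(legacy_hash(b"hunter2", b"salt", utf8_bytes) == stored)  # False

# The fix: spell the secret out as byte escapes so the file encoding is irrelevant.
pinned_secret = b"s3cr\x80t"
print(legacy_hash(b"hunter2", b"salt", pinned_secret) == stored)  # True
```

In PHP the same idea is a double-quoted string with `\x..` escapes, for example "s3cr\x80t" for this made-up secret.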

It is these extreme, obscene scenarios that make working with legacy projects a living hell. In this case, my IDE was at fault for changing the character encoding.

But my IDE is not the root problem. No, I blame it on the lack of maintenance from previous developers. Not keeping the codebase up to standard causes problems like this in the long run.

tl;dr: copied hash secret to a file with another encoding. IDE changed the byte values for those characters, causing password verification to fail. fml.

  • 5
    Something similar happened to me recently as well. It wasn't the character encoding but the bloody line-ending character at the end of the string 🙈. It's crazy-making...
  • 2
    I had a similar problem 4 years ago with some editor inserting a UTF-8 BOM at the beginning of the file and causing server errors after deployment.

    I only found out by accident after I opened the file in a hex editor out of desperation.
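
Both comments come down to bytes that a normal editor view hides. As a rough sketch (in Python, with a hypothetical helper name and a made-up sample file), checking the raw bytes for a leading UTF-8 BOM or a trailing line ending could look like this:

```python
import codecs


def describe_hidden_bytes(raw):
    """Report a leading UTF-8 BOM and a trailing line ending in raw file content."""
    findings = []
    if raw.startswith(codecs.BOM_UTF8):
        findings.append("utf-8 BOM")
    if raw.endswith(b"\r\n"):
        findings.append("trailing CRLF")
    elif raw.endswith(b"\n"):
        findings.append("trailing LF")
    return findings


# Made-up file content: a BOM in front of a PHP one-liner that ends in a newline.
sample = codecs.BOM_UTF8 + b"<?php echo 'hi';\n"
print(describe_hidden_bytes(sample))
print(sample[:3].hex())  # the three BOM bytes, as a hex editor would show them
```

For a real file, the bytes would come from opening it in binary mode, for example `open(path, "rb").read()`.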