The thing about Unicode is that it can make text that humans automatically read as ASCII hard to parse.

Comments
  • 6
    s/unicode/utf-16/

    FTFY
  • 10
    @tokumei I find this difficult to read.
  • 8
    My eyes hurt
  • 0
    They are hurting!!
  • 1
    I actually went through every Unicode character that could pass for an ASCII character and wrote them down in one Java method. It's not that much to write, because they are mostly in one big block and you can just map them to real ASCII.
  • 4
    @EaZyCode Unicode's first 128 code points (U+0000 to U+007F) are exactly ASCII. That means UTF-8 and ASCII are byte-for-byte identical for all standard ASCII characters.
  • 1
    @devios1 I mean stuff like the Greek alphabet (Α) and the Latin alphabet (A), which look almost identical. Or you can add 0xFEE0 to any printable ASCII character (0x21 to 0x7E) and get its fullwidth lookalike.
  • 0
    @EaZyCode Interesting. What is the use case of that?
  • 4
    @devios1 Anywhere someone can type Unicode and you want to process the text. For example, if you want devRant statistics and you search for certain words, you would never find this rant. If you pre-process it first, you get the rant in plain ASCII and the search works. I used it for an anti-advertisement program (just a small fun project).
    The use case for the tricks themselves is to bypass such filters when they lack that pre-processing.
  • 0
    @EaZyCode yay, someone got it.
  • 0
    My brain typically slowed down while reading this.
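
The pre-processing discussed in the comments above can be sketched roughly as follows. This is a minimal Python illustration, not the Java method mentioned in the thread. The names `to_fullwidth`, `fold_to_ascii`, and `LOOKALIKES` are hypothetical, and the lookalike table is a tiny hand-picked sample; a real filter would need a far larger mapping.

```python
import unicodedata

# Distance between printable ASCII (U+0021..U+007E) and the fullwidth
# forms (U+FF01..U+FF5E), i.e. the 0xFEE0 offset mentioned in the thread.
FULLWIDTH_OFFSET = 0xFEE0

# Hypothetical, deliberately incomplete sample of cross-script lookalikes.
# Unicode normalization does NOT fold these, so they need an explicit table.
LOOKALIKES = {
    "\u0391": "A",  # GREEK CAPITAL LETTER ALPHA
    "\u0392": "B",  # GREEK CAPITAL LETTER BETA
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
}


def to_fullwidth(text):
    """Disguise printable ASCII as fullwidth forms (the filter-bypass trick)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        elif ch == " ":
            out.append("\u3000")  # IDEOGRAPHIC SPACE, the fullwidth space
        else:
            out.append(ch)
    return "".join(out)


def fold_to_ascii(text):
    """Undo the disguise before searching or filtering."""
    # NFKC replaces compatibility characters (fullwidth forms among them)
    # with their ordinary equivalents.
    normalized = unicodedata.normalize("NFKC", text)
    # Then apply the explicit table for lookalikes from other scripts.
    return "".join(LOOKALIKES.get(ch, ch) for ch in normalized)
```

With these definitions, a plain substring search for `"rant"` misses `to_fullwidth("devRant rant")`, while the same search on `fold_to_ascii(to_fullwidth("devRant rant"))` finds it. The two-stage design reflects an assumption worth stating: normalization handles the "one big block" of fullwidth forms, but Greek or Cyrillic letters are distinct characters in their own right and only an explicit mapping catches them.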