15
AlgoRythm
246d

Something I learned the hard way: the steak emoji is 4 bytes in UTF-8, whereas a lot of the commonly used emojis are only 3 bytes.
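
For anyone who wants to check the byte counts, here is a minimal Python sketch. It assumes UTF-8 as the encoding, and the 3-byte sample characters are illustrative picks of mine, not ones named in the post. `utf8_len` is a hypothetical helper name.

```python
# UTF-8 byte length of a few single-code-point emoji.
# The 3-byte samples are illustrative picks, not taken from the post.
samples = {
    "cut of meat (steak)": "\U0001F969",
    "heavy black heart": "\u2764",
    "white smiling face": "\u263A",
    "sparkles": "\u2728",
}


def utf8_len(text: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


for name, ch in samples.items():
    print(f"U+{ord(ch):04X} {name}: {utf8_len(ch)} bytes")
```

The split comes from the code point, not from anything emoji-specific: code points from U+0800 to U+FFFF take 3 bytes in UTF-8, and everything above U+FFFF takes 4.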

Comments
  • 9
    Gotta have an extra byte of that stake
  • 3
    Depending on your circle, many commonly used emoji are more than 4 bytes. The trans flag, for instance, is

    White flag (4 bytes)
    ZWJ (3 bytes)
    Transgender symbol (3 bytes)

    Interestingly, the transgender symbol on its own is hardly ever used.
  • 1
    Encoding issues.

    Never assume the byte length of a given character
  • 0
    Both are just a single grapheme cluster though.
  • 3
    Can we just get rid of emoji? They are stupid and always have been, and people have clearly wasted far too much time and effort on them.
  • 1
    You could remove emojis, but most of Unicode's complexity comes from supporting other languages people actually speak, with emojis piggybacking on that, so it would not really help.
  • 0
    @spongessuck high codepoints and zwj are used by Arabic and Mandarin, alongside a bunch of other scripts that together are more popular than Latin.
  • 2
    @lorentz I'm not at all saying go back to ASCII, but there's a difference between symbols for the purpose of writing and symbols for the purpose of sending someone a picture of an eggplant.
  • 0
    @gollark ='( is much more expressive than 😢
  • 0
    @spongessuck In our current system for representing writing the eggplant comes more-or-less for free
  • 1
    @lorentz Well, that's something new I learned about the flag of my people!
  • 1
    @Alexanderr 凸(^▼‿ ▼^)凸
  • 0
    is this some kind of pleb joke i'm just too UTF32 to understand?
  • 0
    @lorentz Yeah it's interesting and unexpected that many emoji are just combinations of other emoji - rather than an entirely separate thing

    I'm not sure, but I guess the family emoji is just a combo of 3-4 others. 👪👨‍👩‍👧‍👦
  • 0
    @tosensei In the name of future-proofing, each character now takes up the size of 10 average Rihanna songs. This way, in case we ever communicate using Rihanna songs, they can get about 10 times longer before we run out of space.
  • 2
    Just happened to listen to this talk, it starts from the very beginning of encoding and explains even the pride flag in the end, as well as why Windows only supports three flag emoji. https://youtu.be/gd5uJ7Nlvvo/...
  • 0
    That talk btw explained why people see Chinese characters in devRant. And it was quite entertaining, so I recommend checking it out if you've got an hour of time where you need someone speaking to your ears.
  • 1
    @electrineer I'll watch when I get home!
  • 2
    @AlgoRythm well, 4 bytes per char are more than enough to represent the _total informational value_ of each and every Rihanna song, so.....
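
The combined emoji discussed in the comments (the trans flag and the four-person family) can be taken apart with a short Python sketch. It assumes UTF-8 and the fully-qualified sequences, which also carry variation selectors (U+FE0F) that the comment's breakdown leaves out. `breakdown` is a hypothetical helper name, and the character names come from Python's `unicodedata` module, so they are the formal Unicode names rather than the everyday ones.

```python
import unicodedata


def breakdown(text):
    """Return (code point, Unicode name, UTF-8 byte count) per code point."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), len(ch.encode("utf-8")))
        for ch in text
    ]


# white flag, VS16, ZWJ, transgender symbol, VS16
trans_flag = "\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"
# man, ZWJ, woman, ZWJ, girl, ZWJ, boy
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"

for label, seq in [("trans flag", trans_flag), ("family", family)]:
    print(f"{label}: {len(seq)} code points, {len(seq.encode('utf-8'))} UTF-8 bytes")
    for code_point, name, size in breakdown(seq):
        print(f"    {code_point} {name}: {size} bytes")
```

Both sequences render as one glyph where the font supports them, but `len()` in Python counts code points, so neither the code point count nor the byte count tells you how many characters a reader sees.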