4
phorkyas
41d

More hours wasted debugging the thing I hate most about programming: strings!

Don't get me started on C strings, that abomination from hell. Inefficient, error-prone. Memory corruption through off-by-one errors, BSODs from out-of-bounds access, seen it all. No, it's strings in general. Just untyped junk data in undocumented formats. Everything has to be parsed back and forth. And this is not limited to our stupid, stupid code base, as I was reminded when reading about the security issues of using innerHTML, or when having to fight CMake again.

So back to the issue this rant is about. CMake, like other scripting languages such as bash, has its peculiarities when dealing with the enemy (i.e. strings), e.g. all the escaping. The thing I fought against was getting CMake's fixup_bundle to work on macOS. It was a bit pesky to debug. But in the end it turned out that my file path had a "//" instead of a "/", and the path comparison was just a string comparison without any path normalization.
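
For illustration, a minimal sketch in C of why that bites, and of the kind of normalization that would have saved me. This is not what fixup_bundle does internally; `collapse_slashes` and `same_path_lexically` are made-up names, the 1024-byte buffers are an arbitrary assumption, and the whole thing is purely lexical (no symlinks, no "." or "..", and it ignores that POSIX lets a leading "//" mean something special).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, collapsing runs of '/' into a single '/'.
 * Purely lexical: no symlink, "." or ".." handling.
 * Returns false if dst (of size dst_size) is too small. */
static bool collapse_slashes(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    char prev = '\0';
    for (; *src != '\0'; ++src) {
        if (*src == '/' && prev == '/')
            continue;               /* skip the duplicate separator */
        if (out + 1 >= dst_size)
            return false;           /* keep room for the terminator */
        dst[out++] = *src;
        prev = *src;
    }
    if (dst_size == 0)
        return false;
    dst[out] = '\0';
    return true;
}

/* Hypothetical helper: compare two paths after collapsing slashes. */
static bool same_path_lexically(const char *a, const char *b)
{
    char na[1024], nb[1024];        /* assumed upper bound for this sketch */
    if (!collapse_slashes(a, na, sizeof na) ||
        !collapse_slashes(b, nb, sizeof nb))
        return false;
    return strcmp(na, nb) == 0;
}
```

A plain `strcmp("/usr//lib/foo", "/usr/lib/foo")` reports a difference, while `same_path_lexically` treats the two as the same path.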

Stop giving us enough string to hang ourselves!

Comments
  • 2
    String escape, quoting etc. — yes, I feel your pain.

    But fucking CMake. It's in a class of its own. Look at its quoting rules, it's fucking hilarious. Also: https://dev.to/slurpsmadrips/...
  • 6
    What's really great about C strings: you can have a pointer to somewhere in the string and hand that over to any string parsing routine to parse the remainder of the string.
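
    A small sketch of that idea, assuming a "key=value" input format chosen only for illustration (`parse_value` is a made-up name): find the '=' and hand a pointer to the remainder straight to `strtol`, without copying anything.

    ```c
    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    /* Parse the integer after '=' in a "key=value" string.
     * Returns 0 on success and stores the number in *out, -1 otherwise. */
    static int parse_value(const char *line, long *out)
    {
        const char *eq = strchr(line, '=');
        if (eq == NULL)
            return -1;
        const char *rest = eq + 1;      /* points into the same buffer, no copy */
        char *end;
        long v = strtol(rest, &end, 10);
        if (end == rest)
            return -1;                  /* no digits after '=' */
        *out = v;
        return 0;
    }
    ```

    The `end` pointer that `strtol` hands back works the same way: it points at whatever is left, ready for the next parsing routine.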
  • 6
    Try wearing a g-string if that makes you feel better
  • 0
    @electrineer nah, no fling for the g-string. Better no clothes at all. (somehow I was thinking about guitars first - probably my head wanted to avoid awkward pictures..)
  • 2
    I prefer Pascal strings.
    One byte of overhead; much faster and easier to work with.

    Downside: limited to 255 bytes. 🤷🏻‍♀️
  • 0
    @Fast-Nop there might be nice hacks, but in general I don't like the length information only being retrievable in O(n), and most operations also being O(n).
  • 0
    @Root The Twitter of programming languages.
  • 1
    @phorkyas In C, at least, there's little stopping one from wrapping the pointer and length in a struct, and writing one's own string manipulation routines. The library string-manipulating facilities are sufficiently small that it is not such a great undertaking.

    For C++ of course there is std::string, which is a workable compromise.
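
    Roughly what I have in mind, as a sketch only. The `str_view` type and the `sv_*` names are invented for this example, and a real version would want more routines (and an owning variant).

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical string view: pointer plus explicit length, no terminator needed. */
    typedef struct {
        const char *data;
        size_t len;
    } str_view;

    static str_view sv_from_cstr(const char *s)
    {
        str_view v = { s, strlen(s) };  /* the only O(n) step */
        return v;
    }

    static bool sv_equal(str_view a, str_view b)
    {
        return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
    }

    /* Sub-view of v; offset and count are clamped to the view's bounds. */
    static str_view sv_sub(str_view v, size_t offset, size_t count)
    {
        if (offset > v.len) offset = v.len;
        if (count > v.len - offset) count = v.len - offset;
        str_view r = { v.data + offset, count };
        return r;
    }
    ```

    Length is O(1), and taking a substring is pointer arithmetic with no copying and no writing of terminators into the buffer.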
  • 0
    @phorkyas Pascal strings.

    They’re C strings with one small change: string[0] contains the length of the string. This allows you to fetch the last character in two operations instead of O(n). It also helps with e.g. memcpy, comparisons, etc.

    It’s the light version of what @halfflat suggested.
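
    A rough sketch of the layout in C, for anyone who hasn't seen it. `pstring` and the `ps_*` helpers are names made up here, not from any library.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical Pascal-style string: byte 0 is the length, bytes 1..len are the text. */
    typedef struct {
        unsigned char bytes[256];       /* 1 length byte + up to 255 characters */
    } pstring;

    /* Build from a C string; fails if it does not fit in 255 bytes. */
    static bool ps_from_cstr(pstring *p, const char *s)
    {
        size_t n = strlen(s);
        if (n > 255)
            return false;
        p->bytes[0] = (unsigned char)n;
        memcpy(p->bytes + 1, s, n);
        return true;
    }

    static size_t ps_len(const pstring *p) { return p->bytes[0]; }

    /* Last character: read the length, then index. Caller ensures the string is non-empty. */
    static char ps_last(const pstring *p)
    {
        assert(ps_len(p) > 0);
        return (char)p->bytes[p->bytes[0]];
    }
    ```

    The single length byte is also exactly where the 255-byte limit comes from.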
  • 1
    @Root You speak right from my soul.
    (German English, but too late to find the correct translation)

    (I meant for bringing up Pascal strings. Kept searching too long for a translation.)
  • 2
    @phorkyas That's mostly an issue if it suddenly turns into O(n^2), which it does when repeatedly appending strings to a given string.

    In that case, the usual solution is a custom strcat that returns a pointer to the null byte (after appending).
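
    A minimal sketch of such a routine. The name `append` is just a placeholder; POSIX `stpcpy` has the same return convention, if you have it available.

    ```c
    #include <assert.h>
    #include <string.h>

    /* Copy src to dst and return a pointer to the terminating null byte in dst.
     * Like strcpy/strcat, the caller must guarantee dst has enough room. */
    static char *append(char *dst, const char *src)
    {
        while ((*dst = *src) != '\0') {
            ++dst;
            ++src;
        }
        return dst;
    }
    ```

    Usage is `end = append(end, piece);` in a loop, starting with `end` at the beginning of the buffer. Each call continues where the last one stopped instead of rescanning from the start, so building a long string stays linear.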
  • 1
    @halfflat I still have fond memories of Pascal. The compiler was so fast (TurboPascal). And just clicking together your dialogs in Delphi felt cool in 1998.
    Now I'm mostly in C/C++, which is OKish, only macOS manages to be a bigger PITA than the whole distro hell of Linux put together.
  • 1
    @halfflat Usually I love a rant, especially one that pisses on something as well established and painful as CMake. But then again: aren't a lot of the pains self-inflicted? This dumb tool was only meant to generate your build system/files, so why should it write binary output itself? Maybe you can open a tin with a fork, but no wonder it'll be painful. (But yeah, computing: just give me a NAND operation to build a CPU, a Turing-complete language to do universal computing, or a write/read primitive to own your device. So sure, you can do _anything_ in CMake.)
  • 0
    I’m not a C expert, but isn't this more of a CMake problem than a C problem? Aren’t C strings really fucking simple?
  • 0
    @jesustricks So they seem, but then someone messes up some edge case and there is the next vulnerability. Or Linux introduces a hack like abstract namespace sockets (which I love) with a NUL terminator at the start!

    CMake is a different story. Also extremely string-based. E.g. a list is just a string with elements delimited by ';'.
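
    To show what that model implies, here is a toy version in C. It is a deliberately simplified sketch, not CMake's actual implementation (no escaping, no bracket handling), and `list_length` is a made-up function.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Count the elements of a ';'-delimited list string.
     * Simplified model: no escaping or bracket handling; empty string = 0 elements. */
    static size_t list_length(const char *list)
    {
        if (*list == '\0')
            return 0;
        size_t n = 1;
        for (; *list != '\0'; ++list)
            if (*list == ';')
                ++n;
        return n;
    }
    ```

    Under this model "a;b;c" has three elements, and a single value that happens to contain a ';' quietly becomes two.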