so far, a pretty nice tutorial book about making a parser and interpreter, but when i saw this, my thought was "don't you fucking dare..."

...but he does. flow control via exceptions.
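for reference, the pattern in question looks roughly like this (a minimal sketch with made-up names like ReturnSignal and Interpreter, not the book's actual code): evaluating a return statement throws a dedicated exception, and the code that performs the function call catches it:

```java
// Sketch of return-via-exception in a tree-walking interpreter.
// All class and method names here are illustrative.
final class ReturnSignal extends RuntimeException {
    final Object value;
    ReturnSignal(Object value) {
        // Disable the stack trace: this exception is pure control flow.
        super(null, null, false, false);
        this.value = value;
    }
}

class Interpreter {
    // Evaluating a return statement throws; none of the blocks between
    // here and the function-call frame need to know about it.
    void executeReturn(Object value) {
        throw new ReturnSignal(value);
    }

    // The function call catches the signal and turns it back into a value.
    Object callFunction(Runnable body) {
        try {
            body.run();
        } catch (ReturnSignal signal) {
            return signal.value;
        }
        return null; // function body ended without an explicit return
    }
}
```

the usual defense is that, with the stack trace disabled, the throw is cheap, and it skips any number of nested blocks without each of them having to cooperate.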

  • 0
    he claims.
    thoughts? any other non-wtf ways you can think of?
  • 3
    Sounds and looks like Java.
  • 1
    @Midnight-shcode Honestly, that's a fucked up lazy bum excuse.

    I would raise an eyebrow in an eye-to-eye situation and ask the person if they seriously meant this or if they were deliberately trying to make me angry.

    Usually it's recursive calls and tree traversal for binary trees.

    There might even be some fitting data-structure libraries around - after all, binary trees are pretty common.

    Otherwise it's not that hard to implement....
  • 2
    IIRC the way I did it in Rust when I wrote a small C compiler: the lexer would adjust the current line/token/column etc., and the parser (which builds the IR) would just return, because it recursively puzzles the pieces together
  • 0
    "fucked up lazy bum excuse"

    i agree, but i'll defer definitive opinion until i've read the chapters where he implements the same parser and interpreter in C. if they're good, i'm gonna forgive him being a lazy bum in the java half of the book =D
  • 0
    @IntrusionCM wait what? what are you talking about with trees? of course he's using trees, he's traversing an AST, and he implemented it all in the book naturally, no problem there. i was complaining about using exceptions for control flow, what are YOU complaining about?
  • 0
    ... now this is something i'm still not sure i wholly understand. how do you implement an actual *compiler* of your own? barring using LLVM, do you... actually hand-craft and hand-type the actual machine code fragments for all the operations?

    i mean... yeah, that makes sense, but for some reason it seems strange/incredible to me. i can grok the whole process from the source code up to just before the executable, but for some reason the idea of writing a program where i define which sequences of bytes to emit for each operation, and that program then spits out an actual working executable... that is still slightly crazy/unimaginable to me, and more so the more i read about cpu architectures and instruction sets.
  • 0

    If I understood correctly, exceptions are used to create control flow for executing the generated AST...

    If you want to avoid exceptions, you need to implement a control flow graph - so you know which method was traversed.
    A stack trace is just an iteration over a control flow graph, printing the information inside.

    My assumption was that you didn't want to generate a control flow graph for the Java language, but rather for the parser you seem to have implemented.

    Which requires the implementation of another data structure, as you need to track the order of execution....

    * "control flow graph" usually means a visual representation of the executed code. I didn't know a better word to describe what I mean.
  • 0
    @IntrusionCM oh, okay, yes, I understand now.

    sorry, I was tired.

    well, my thought, though, was that in the case of handling just the return statement, all you need is to bubble up the AST to one level above the nearest function/method, no?

    no need to actually keep any call tree, or even a stack, just implement early returns for anything that evals a list of statements (such as a block node, and... hmm... that's all, basically)... no? am I missing something?

    make statement nodes able to return a value: normal statements return null, the return statement returns the actual value on eval. then each time you eval a statement node, just check whether it returned anything; if yes, we know it's either the return statement itself or a value bubbling up the AST, and we just keep returning until we're back at the node that did the call(), at which point we use/discard that return value...?

    would be annoying to implement the check in all the nodes, but... would be more correct(?)
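roughly, a sketch of that idea (all names hypothetical, not the book's code): every statement's execute() returns null normally, and a wrapper object when a return is bubbling up, so a block stops early and passes the wrapper along:

```java
import java.util.List;

// Distinguishes "return null;" from "no return happened at all".
final class ReturnValue {
    final Object value;
    ReturnValue(Object value) { this.value = value; }
}

interface Stmt {
    // null = fell through normally; non-null = a return is bubbling up.
    ReturnValue execute();
}

final class Return implements Stmt {
    final Object value;
    Return(Object value) { this.value = value; }
    public ReturnValue execute() { return new ReturnValue(value); }
}

final class Block implements Stmt {
    final List<Stmt> statements;
    Block(List<Stmt> statements) { this.statements = statements; }
    public ReturnValue execute() {
        for (Stmt stmt : statements) {
            ReturnValue signal = stmt.execute();
            if (signal != null) return signal; // early exit: keep bubbling up
        }
        return null;
    }
}
```

the function-call node would be the one place that unwraps the ReturnValue instead of passing it further up.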
  • 1
    @Midnight-shcode You start top to bottom: scanning, lexing, parsing to IR, do whatever optimisations you wanna do, chuck it over to assembly, and then you either use another assembler or start writing your own. But the asm step is very nice for verification/debugging, same for pretty-printing your AST and IR

    I mean, for flat binaries you just take the assembly and encode the instructions as machine code; the ISA defines which numbers mean what. You also plan your memory, put the things at the right place, etc.

    For ELF you put the same machine code in (position-independent code, I guess) and fill in the fields according to the ELF spec; then your OS's linker+loader will happily jam it in and run it. If you're still hungry you can write a linker+loader yourself, which just means you have to translate the symbols to actual memory addresses etc. and copy in the memory for the libs, if any are used
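To give a taste, the "encode the instruction as machine code" step can be as literal as this sketch (assuming x86-64, where `mov eax, imm32` is opcode 0xB8 followed by the immediate in little-endian byte order and `ret` is 0xC3; the Emitter class itself is made up):

```java
import java.io.ByteArrayOutputStream;

// Emits raw x86-64 machine code bytes, one method per instruction shape.
// This is the kind of table/lookup a backend performs for every IR node.
class Emitter {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    void movEaxImm32(int imm) {
        out.write(0xB8);               // opcode: MOV EAX, imm32
        out.write(imm & 0xFF);         // immediate, least significant byte first
        out.write((imm >> 8) & 0xFF);
        out.write((imm >> 16) & 0xFF);
        out.write((imm >> 24) & 0xFF);
    }

    void ret() {
        out.write(0xC3);               // opcode: RET
    }

    byte[] bytes() { return out.toByteArray(); }
}
```

So `mov eax, 42; ret` becomes the six bytes B8 2A 00 00 00 C3; wrap those in an ELF header (or jam them into a flat binary) and you have something the CPU will run.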
  • 0
    @matt-jd yeah, so it's basically the way i thought it is.

    BUT... it still seems incredible. the AST -> assembly step seems... so different from all the previous ones. why? because the ISA is so huge and contains so many instructions, loads of which seem to be very specific "weird" ones which combine several "normal" instructions into one for high optimization. like, basically, i've seen instructions that go "(x * y) + z", i think. and that's an example which still makes sense to me, but... that kind of stuff.

    i mean... any programming language itself has loads of "instructions" (functions and such), but there i don't feel pressured to learn and use them all. in the context of assembly/machine code, it seems to me that when you only use a small subset of the simple obvious ones, the resulting code will be extremely inefficient and slow...

  • 1
    @Midnight-shcode sorry, I had missed this comment somehow. I wouldn't say that the code is inefficient; RISC ISAs have only a few basic instructions, so there aren't that many fancy instructions to begin with. And really, if you want to use such instructions, all you have to do is add some code in your expression parsing to detect such use cases :)
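That detection can be a plain pattern match over the expression tree. The "(x * y) + z" instruction mentioned above is a fused multiply-add (FMA), and spotting where one could be emitted might look like this (node classes are hypothetical, not from the book):

```java
// Minimal expression tree with just enough shape to match against.
abstract class Expr {}

final class BinOp extends Expr {
    final char op;
    final Expr left, right;
    BinOp(char op, Expr left, Expr right) {
        this.op = op; this.left = left; this.right = right;
    }
}

final class Var extends Expr {
    final String name;
    Var(String name) { this.name = name; }
}

class FmaMatcher {
    // True if the tree has the shape (a * b) + c or c + (a * b),
    // i.e. where a backend could emit one fused multiply-add.
    static boolean matches(Expr e) {
        if (!(e instanceof BinOp add) || add.op != '+') return false;
        return isMul(add.left) || isMul(add.right);
    }

    private static boolean isMul(Expr e) {
        return e instanceof BinOp b && b.op == '*';
    }
}
```

A real backend does this kind of matching (instruction selection) for every fancy instruction it wants to use; everything that doesn't match falls back to the simple mul + add sequence.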
  • 2
    oh by the way, if you use something like GCC you can use the -S option to get assembly, and then you can compare it with what you would generate! so benchmark benchmark benchmark, check the difference, it might be smaller than you think!