Comments
-
You're not alone! Let's find a nice bridge together, preferably within walking distance of your home. You don't want to arrive in hell tired, huh. It's a lot of walking there. They never tell you about the mountains, but hell is full of them. Terrible people don't like mountains :)
-
writing a compiler is easy, but only after you've done it a *few* times at least :)
-
retoor 10982d @Liebranca still, it can massage your brain forever. So many ways to choose from. But the Crafting Interpreters book kind of showed me that some approaches are simply figured out already. That did kick the creativity a bit.
-
@12bitfloat are you:
[A] doubtful it gets easier
[B] despairing at the prospect of starting over
[C] both
either case, it's an iterative process at heart. chars to tokens to expressions to blocks to instructions and then to value if const: it really is a chain of maps.
the odd part is that the flow isn't linear due to the "value if const" bit making recursion inevitable. which raises the question: must i be able to execute the instructions *while* compiling? to which the answer is yes, you must, and if you don't design around this you're gonna have a bad time. whether you prefer to think about it as such or not, you're still inevitably building a virtual machine.
point being: once you have a clear mental model of the system (and what it does) it becomes much easier to implement. but to achieve that, you have to reduce complexity, which relates to **relationships between layers**, not necessarily size; it'll still take time, obviously.
i could go on but comment size. -
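A minimal sketch of the "chain of maps" idea above, for a hypothetical toy expression language with only integers, names, `+`, `*` and parentheses. Every name here (`tokenize`, `parse`, `fold`, `emit`) is made up for illustration and is not anyone's actual compiler. The `fold` step is where the "value if const" recursion shows up: to fold a constant subtree the compiler has to evaluate it, so a tiny evaluator ends up living inside the compiler.

```python
import re

# Hypothetical toy grammar: integer literals, identifiers, + * ( )
TOKEN = re.compile(r"\s*(\d+|[+*()]|[A-Za-z_]\w*)")
PREC = {"+": 1, "*": 2}
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def tokenize(src):
    """chars -> tokens"""
    src, pos, out = src.rstrip(), 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"bad character at offset {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out

def parse(tokens):
    """tokens -> expression tree (nested tuples), by precedence climbing"""
    def atom(i):
        t = tokens[i]
        if t == "(":
            node, i = expr(i + 1, 0)
            if i >= len(tokens) or tokens[i] != ")":
                raise SyntaxError("missing ')'")
            return node, i + 1
        if t.isdigit():
            return ("const", int(t)), i + 1
        return ("var", t), i + 1

    def expr(i, min_prec):
        lhs, i = atom(i)
        while i < len(tokens) and tokens[i] in PREC and PREC[tokens[i]] >= min_prec:
            op = tokens[i]
            rhs, i = expr(i + 1, PREC[op] + 1)
            lhs = (op, lhs, rhs)
        return lhs, i

    node, i = expr(0, 0)
    if i != len(tokens):
        raise SyntaxError("trailing tokens")
    return node

def fold(node):
    """expression -> expression; constant subtrees are executed at compile time"""
    if node[0] in OPS:
        lhs, rhs = fold(node[1]), fold(node[2])
        if lhs[0] == "const" and rhs[0] == "const":
            return ("const", OPS[node[0]](lhs[1], rhs[1]))
        return (node[0], lhs, rhs)
    return node

def emit(node, out):
    """expression -> instructions for a hypothetical stack machine"""
    if node[0] == "const":
        out.append(("push", node[1]))
    elif node[0] == "var":
        out.append(("load", node[1]))
    else:
        emit(node[1], out)
        emit(node[2], out)
        out.append(("add",) if node[0] == "+" else ("mul",))
    return out

def compile_expr(src):
    return emit(fold(parse(tokenize(src))), [])
```

With this, `compile_expr("2 * (3 + 4)")` collapses to a single push of the folded value, while `compile_expr("x + 2 * 3")` keeps the load of `x` and folds only the constant half.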
@Liebranca please continue. I could talk and read about it endlessly as well. I think @lorentz too.
-
@Liebranca I mostly struggle with the enormous complexity of all the little moving parts which have to integrate
For example ABI: Just the amd64 sysv rules are complex enough on their own: Structs get recursively split up into 8 byte chunks then assigned registers or stack space accordingly depending on a bunch of subrules. But ultimately let's say a struct is passed on the stack, my backend should be able to produce e.g. `add [rbp+16], 8` -- directly interacting with the param passed on the stack instead of load/add/store
Obviously that's not impossible to implement, and I think my current plan should work out. But you do get what I mean? That "value" has to snake through 2 ssa (+1 non ssa) IRs, through instruction selection, through register allocation and then I still have to be able to make that connection
I'm making progress, but designing something this large and interconnected is definitely a challenge -
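To make the "8 byte chunks" part concrete, here is a heavily simplified sketch of eightbyte classification. It assumes flat structs whose fields are scalars of size 1, 2, 4 or 8 with natural alignment, and only models the INTEGER, SSE and MEMORY classes. Nested structs, arrays, x87 and vector types, and the rule about running out of registers are all left out, and the helper names `layout` and `classify` are hypothetical.

```python
def layout(fields):
    """Return (offsets, total_size) assuming alignment == size for each scalar."""
    offset, offsets, max_align = 0, [], 1
    for size, _kind in fields:
        offset = (offset + size - 1) // size * size  # round up to the field's alignment
        offsets.append(offset)
        offset += size
        max_align = max(max_align, size)
    total = (offset + max_align - 1) // max_align * max_align  # tail padding
    return offsets, total

def classify(fields):
    """fields: list of (size_in_bytes, "int" | "float") -> one class per eightbyte."""
    offsets, total = layout(fields)
    if total > 16:
        # Scalar-only structs larger than two eightbytes are passed in memory.
        return ["MEMORY"]
    classes = [None] * ((total + 7) // 8)
    for (size, kind), off in zip(fields, offsets):
        chunk = off // 8  # naturally aligned scalars never straddle a chunk
        cls = "INTEGER" if kind == "int" else "SSE"
        # Simplified merge rule: INTEGER wins over SSE inside one eightbyte.
        if classes[chunk] != "INTEGER":
            classes[chunk] = cls
    return classes
```

Under these assumptions, `classify([(4, "int"), (4, "int")])` yields a single INTEGER eightbyte, which is exactly the "two values packed in one physical register" case, and three 8-byte integers fall through to MEMORY.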
@Liebranca For example I've given up on actually implementing full SYSV abi, because it allows passing multiple values packed in a single physical register. Now all downstream logic has to be able to cope with that: A physical register can now actually be multiple subregisters and reg alloc has to respect that, the illusion that each SSA reg can be stored in a physical register is broken (I already disallow aggregates in ssa regs, also for complexity reasons)...
It's a really deep rabbit hole -
@whimsical retoori do you realize you are opening the gates of hell i will never shut up
@12bitfloat not implying but "little moving parts" can sometimes also mean "blocks that could've been isolated have cthulhu as umbilical cord". is it *really* an absolute necessity?
since you're talking about the abi, guess i'll give you a related example. suppose i respect the calling convention even when i'm not inside main, nor making syscalls, nor inside a symbol that is exported for use by other programs. why ain't i such a boy scout. but i'd be following an interfacing rule for non-interfacing code, which i don't *have* to do. and then extra work that i could've handled only in very specific cases would instead be propagated to the entire codebase for no benefit.
now, that itself is a non issue, but if you're keeping more than seven or so concerns like that in mind at the same time, all the time... not to sound like uncle bob but that's liquid.
holdup more comment. -
@Liebranca Well, I have to follow *some* ABI
And that ABI, whatever it is, needs to account for passing values via registers and via memory, and via memory if we don't have enough registers left, and via registers if it's a small enough struct so we don't nuke performance
And it has to deal with returning stuff in both registers and memory
Ya see the issue? :P -
@12bitfloat maybe i derailed this conversation but interesting nonetheless.
you have some way of knowing what to bother preserving from the caller's context, right? you don't just blindly push everything on each call.
well then, you don't need a fixed convention because you have access to that information. there's but a handful of exceptions: recursion, function pointers, and extern/system calls. in *those* cases it's mandatory.
but if i have an internal symbol which isn't recursive, isn't calling any externals or making indirect calls, and is not itself used for indirect calls, then i can do whatever i want. unadvisable if i was handrolling assembly, of course, because i'd just lose track of things. but that's not what we're doing here is it.
you can pass arguments or return results in non-standard locations when it's convenient, and if you're juggling registers, it usually is. but my brain is tortellini right now so i'll leave it as homework for the reader to figure out the rest. -
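One way to read the conditions above as a check over a call graph, as a rough sketch only. The dictionary shape and the field names (`exported`, `address_taken`, `makes_indirect_calls`, `calls`) are assumptions made up for this example, and any callee that is not in the dictionary is treated as external. It simply encodes the exceptions as listed in the comment and is deliberately conservative.

```python
def reaches_itself(name, functions):
    """True if `name` can call itself, directly or through other known functions."""
    seen, stack = set(), list(functions[name]["calls"])
    while stack:
        callee = stack.pop()
        if callee == name:
            return True
        if callee in seen or callee not in functions:
            continue  # already visited, or an external we cannot look into
        seen.add(callee)
        stack.extend(functions[callee]["calls"])
    return False

def needs_fixed_convention(name, functions):
    """Conservative: any listed exception forces the standard calling convention."""
    f = functions[name]
    calls_external = any(c not in functions for c in f["calls"])
    return (f["exported"] or f["address_taken"] or f["makes_indirect_calls"]
            or calls_external or reaches_itself(name, functions))

def fn(calls=(), exported=False, address_taken=False, makes_indirect_calls=False):
    """Small helper to build a hypothetical function record."""
    return {"calls": set(calls), "exported": exported,
            "address_taken": address_taken,
            "makes_indirect_calls": makes_indirect_calls}

# Hypothetical program: only `helper` is free to use an ad hoc convention.
PROGRAM = {
    "main":   fn(calls=["helper", "fact", "log"], exported=True),
    "helper": fn(),
    "fact":   fn(calls=["fact"]),
    "log":    fn(calls=["write"]),  # "write" is not defined here, so it is external
}
```

In this example `helper` is the only function for which `needs_fixed_convention` returns False, so it is the only one where arguments and results could go wherever the register allocator finds convenient.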
@Liebranca I agree with you that I don't have to follow any specific specification (which to be fair i'm not because of complexity :P)
But do note that I'm writing a real compiler with an x86_64 machine code backend. I legitimately have to put things into physical registers or on the stack for this to work, and I'd like to not make something *completely* shitty so I'd also like to support some pretty important optimizations like folding ptr calculations into x86 memory operands
I've written a stack machine based compiler before and that was fine-ish (still pretty complex :P) but for x86 that just won't cut it -
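A small sketch of what "folding ptr calculations into x86 memory operands" can look like. The IR here is a hypothetical dictionary mapping a value name to `("add", a, b)` or `("shl", a, k)`, with plain integers as constants and any undefined name treated as a value already sitting in a register. It ignores real-world concerns such as the displacement having to fit in 32 bits, negative displacements, and whether the folded temporaries have other uses.

```python
def fold_address(name, defs):
    """Flatten an address computation into (base, index, scale, disp)."""
    base = index = None
    scale, disp = 1, 0
    work = [name]
    while work:
        v = work.pop()
        d = defs.get(v) if isinstance(v, str) else None
        if isinstance(v, int):
            disp += v                       # constants accumulate into the displacement
        elif d and d[0] == "add":
            work += [d[1], d[2]]            # look through additions
        elif (d and d[0] == "shl" and d[2] in (1, 2, 3)
              and index is None):
            index, scale = d[1], 1 << d[2]  # shift by 1..3 becomes scale 2, 4 or 8
        elif base is None:
            base = v
        elif index is None:
            index = v                       # second plain term: index with scale 1
        else:
            raise ValueError("too many terms for a single memory operand")
    return base, index, scale, disp

def format_operand(base, index, scale, disp):
    """Render in Intel-style syntax; assumes disp is non-negative."""
    parts = []
    if base:
        parts.append(base)
    if index:
        parts.append(f"{index}*{scale}")
    if disp or not parts:
        parts.append(hex(disp))
    return "[" + " + ".join(parts) + "]"

# Hypothetical IR: t3 = rbx + (rax << 2) + 0x1337
DEFS = {
    "t1": ("shl", "rax", 2),
    "t2": ("add", "rbx", "t1"),
    "t3": ("add", "t2", 0x1337),
}
```

Here `format_operand(*fold_address("t3", DEFS))` gives `[rbx + rax*4 + 0x1337]`, so three IR instructions collapse into one addressing mode instead of separate shift and add instructions.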
@12bitfloat my backend is flat assembler lmao. i compile first to bytecode, and then transpile that into fasm, with the big-brain idea being oh i can eventually support more than one architecture if i write more translators.
but my virtual machine is *kind of* like an x86, and what i output has to be valid x86_64 assembly code for it to work, so wink wink, not all *that* different. -
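A toy illustration of the "bytecode first, then one translator per architecture" idea. This is not the actual VM or translator discussed here: the three-opcode register bytecode, the register mappings, and the function names are all invented for the example.

```python
# Hypothetical register bytecode: (opcode, destination, source-or-immediate).
BYTECODE = [("mov", "r0", 2), ("mov", "r1", 3), ("add", "r0", "r1"), ("ret",)]

def x86_64(op, args):
    """Translate one bytecode instruction to Intel-syntax x86_64 text."""
    regs = {"r0": "rax", "r1": "rcx"}
    ops = [regs.get(a, str(a)) for a in args]
    return (f"{op} " + ", ".join(ops)) if ops else op

def aarch64(op, args):
    """Translate the same instruction to AArch64 text (three-operand add)."""
    regs = {"r0": "x0", "r1": "x1"}
    if op == "ret":
        return "ret"
    dst = regs[args[0]]
    src = regs[args[1]] if isinstance(args[1], str) else f"#{args[1]}"
    if op == "mov":
        return f"mov {dst}, {src}"
    if op == "add":
        return f"add {dst}, {dst}, {src}"
    raise ValueError(f"unsupported opcode: {op}")

def translate(bytecode, backend):
    """Map every bytecode instruction through the chosen backend."""
    return [backend(op, args) for op, *args in bytecode]
```

Because the bytecode is shaped like a two-operand machine, the x86_64 backend is nearly a rename, while the AArch64 backend has to expand `add` into its three-operand form. That asymmetry is the price of designing the VM to resemble one target.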
@Liebranca How are you dealing with the complexities of x86 though!??
I mean you also have to follow some ABI when doing function calls, and you also have to deal with x86 memory operands which can be as complex as `[rbx + rax * 4 + 0x1337]` (and should be as complex for performance!), you have to implement register allocation, etc.
I'm actually curious -
@12bitfloat mainly? i like writing assembly, and spent a good time doing it, so it's kinda easy for me to generate assembly. i then just designed the vm to be similar enough to make it viable, and that was that, pretty much.
here's that 'complex' memory operand getting decoded https://github.com/Liebranca/... the 'seg' part is from back when i was doing segmented memory, which isn't actually the case anymore. anyway, it's actually like what... 40 SLOC? :)
ok im cheating because it's perl. i should rewrite it in C to be honest. but i'll use my preprocessor, which does let me use perl inside C macros. hehe.
but anyway defining your own ISA kinda helps, which is what i did here. you just take the instructions you actually care about and the rest...
EDIT: oooh my bad, i read the operand wrong. anyway, it's just swapping out what is multiplied by the scale. -
whimsical 146223h @Liebranca I think you can convert it with AI with some effort. AI is quite good at C.
-
Liebranca 130223h @whimsical reviewing that would be such a nightmare that i think i like my chances better with a straight-up rewrite. -
12bitfloat1096413h -
@12bitfloat perl is more or less bash on steroids. it's very convenient for string processing, and surprisingly fast at it too, so i tend to use it a lot. but 95% of this virtual machine stuff i actually shouldn't be doing in perl like at all lmao.
Ain't nothing like compiler development to make you want to jump off a bridge
I went in with my childish naivety. How wrong I was... lmao
rant
compiler dev
i want to die
rust
fml