54
Geoxion
3y

Heck yes!

I implement a stack trace in my embedded systems!

Whenever a device crashes, it writes a stack dump to an unused part of RAM.
After it has rebooted and is connected to the server again, it uploads the stack dump.

The server then opens the correct firmware ELF file, walks the stack and matches the addresses against the debug info in the ELF.

The result? A beautiful stack trace with file names, function names and line numbers.

No more guessing where random crashes come from.
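For anyone wondering how the device can tell a real dump from leftover garbage in RAM after a reboot, here is a minimal sketch of one way to do it: a magic marker plus a checksum over the record. The layout, the field names, `MAGIC` and `STACK_WORDS` are all made up for illustration and are not my actual format. On a real device the record would also have to live in a memory region the startup code leaves alone, which this host-side sketch does not show.

```rust
// Hypothetical dump layout: every name and size here is an assumption.
const MAGIC: u32 = 0xDEAD_C0DE;
const STACK_WORDS: usize = 64;

#[derive(Clone, Debug, PartialEq)]
struct CrashDump {
    magic: u32,
    firmware_version: u32,
    registers: [u32; 17], // r0-r12, sp, lr, pc, xpsr
    stack: [u32; STACK_WORDS],
    checksum: u32,
}

impl CrashDump {
    /// Build a dump and seal it with a checksum.
    fn new(firmware_version: u32, registers: [u32; 17], stack: [u32; STACK_WORDS]) -> Self {
        let mut dump = CrashDump {
            magic: MAGIC,
            firmware_version,
            registers,
            stack,
            checksum: 0,
        };
        dump.checksum = dump.compute_checksum();
        dump
    }

    /// FNV-1a hash over every field except the checksum itself.
    fn compute_checksum(&self) -> u32 {
        let words = std::iter::once(self.magic)
            .chain(std::iter::once(self.firmware_version))
            .chain(self.registers.iter().copied())
            .chain(self.stack.iter().copied());
        let mut hash: u32 = 0x811C_9DC5;
        for word in words {
            for byte in word.to_le_bytes().iter() {
                hash ^= *byte as u32;
                hash = hash.wrapping_mul(0x0100_0193);
            }
        }
        hash
    }

    /// After a reboot: only upload the record if it looks like a real dump.
    fn is_valid(&self) -> bool {
        self.magic == MAGIC && self.checksum == self.compute_checksum()
    }
}
```

After a cold power-up the region holds random data, so `is_valid()` fails and nothing is uploaded. After a crash and reset, the marker and checksum match and the dump can be sent to the server.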

Comments
  • 11
  • 1
    Nice
  • 9
    The stack copy survives a power cycle in RAM?
  • 0
    @kwilliams yup. All memory is kept intact. However, all conventional memory is then reinitialized to bring it to a state that the program expects.
    All 'unused' memory is left untouched.
  • 4
    What kind of RAM do you have that holds data when there's no electricity? O.o
  • 4
    @h4xx3r a reboot is just a reset. No power is being cut.
  • 2
    @h4xx3r there's also SRAM. It's often used close to the CPU, and systems can have a bit of dedicated SRAM/flash or a CMOS-like setup for things like this.
  • 3
    @hjk101 often something like that is called backup ram. You often don't have a whole lot of it, like 4k. Very nice to keep your encryption keys stored in.

    Other, fancier chips have memory banks that you can selectively shut down for power savings.

    Anything is possible! That's what I really like about embedded.
  • 2
    Good stuff man! 😄 Nothing makes me happier than knowing why/where a program has crashed. With embedded this can be a real challenge, I imagine. Just out of interest: which type of hardware are we talking about here?
  • 2
    @jkommeren my current project has been specced to be overkill because we didn't know the full requirements beforehand.

    And it turned out to be very overkill haha.

    So the MCU is an STM32H753. It has 2 MB flash, 600-something KB RAM and runs at 480 MHz.

    But one of the F7 series at 180 MHz and 1 MB flash would also have sufficed, though we do make good use of the double precision FPU...

    If I really did my best, then I think I could fit it all in 0.5 MB, but that'd be stretching it.
  • 2
    Great stuff. Could you share some info on how you did it?
  • 3
    @Scade Yeah sure!

    When the server receives the stack dump, it checks which firmware version it's from. The device reports that itself. The server then tries to find the corresponding ELF file.

    Btw, the stack dump also includes a register dump.

    The ELF file is compiled with debug info.
    Then, using https://crates.io/crates/addr2line, I can look up the function, file and line that a given program counter corresponds to.

    Then you need to walk the stack. The new PC is the current LR (link register). Then you need to check with addr2line again to get the location of that PC.

    You also need to deal with interrupts, so if the LR has a special value, you need to unwind the stack yourself. You need to read the stack to retrieve the registers from before the interrupt. Now you can use addr2line again.

    Repeat this until you've got the entire stack analyzed and you've got a stack trace!

    No variable display yet, but I haven't put much effort into that.
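The walking steps above can be sketched roughly like this. It is a simplified illustration, not my actual code. It assumes a Cortex-M core, a single stack, and a hypothetical `frame_info` lookup that stands in for the call-frame info you would really get from the ELF's debug data. `Regs`, `StackDump`, `FrameInfo`, `step` and `walk` are all names invented for the example. Each PC that `walk` returns would then be passed to addr2line.

```rust
/// Registers needed for unwinding, taken from the register dump.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Regs { pc: u32, lr: u32, sp: u32 }

/// The uploaded stack bytes; `base` is the address of `words[0]`.
struct StackDump { base: u32, words: Vec<u32> }

impl StackDump {
    fn read(&self, addr: u32) -> Option<u32> {
        let offset = addr.checked_sub(self.base)?;
        if offset % 4 != 0 { return None; }
        self.words.get((offset / 4) as usize).copied()
    }
}

/// Hypothetical per-PC frame description (stand-in for DWARF call-frame info).
#[derive(Clone, Copy)]
struct FrameInfo {
    frame_size: u32,        // bytes this function currently has on the stack
    lr_offset: Option<u32>, // where LR was saved relative to SP, if it was
}

/// Assumption: Cortex-M EXC_RETURN values have the form 0xFFFFFFxx.
fn is_exc_return(lr: u32) -> bool { lr >= 0xFFFF_FF00 }

/// Unwind one frame; returns None when the walk cannot continue.
fn step<F: Fn(u32) -> Option<FrameInfo>>(regs: Regs, stack: &StackDump, frame_info: &F) -> Option<Regs> {
    let info = frame_info(regs.pc)?;
    let return_addr = match info.lr_offset {
        Some(off) => stack.read(regs.sp.checked_add(off)?)?,
        None => regs.lr, // leaf function: LR is still in the register
    };
    let caller_sp = regs.sp.checked_add(info.frame_size)?;
    if is_exc_return(return_addr) {
        // Interrupt entry: the hardware pushed r0-r3, r12, lr, pc, xpsr.
        let lr = stack.read(caller_sp.checked_add(20)?)?;
        let pc = stack.read(caller_sp.checked_add(24)?)?;
        let xpsr = stack.read(caller_sp.checked_add(28)?)?;
        // Bit 4 clear means an extended frame that also holds FPU state.
        let mut size = if return_addr & 0x10 != 0 { 32 } else { 104 };
        // xPSR bit 9 marks an extra word of alignment padding.
        if xpsr & (1 << 9) != 0 { size += 4; }
        Some(Regs { pc: pc & !1, lr, sp: caller_sp.checked_add(size)? })
    } else {
        // Normal call: the return address is the caller's PC (Thumb bit cleared).
        Some(Regs { pc: return_addr & !1, lr: return_addr, sp: caller_sp })
    }
}

/// Collect the PC of every frame, innermost first.
fn walk<F: Fn(u32) -> Option<FrameInfo>>(start: Regs, stack: &StackDump, frame_info: F, max_depth: usize) -> Vec<u32> {
    let mut pcs = vec![start.pc];
    let mut regs = start;
    while pcs.len() < max_depth {
        match step(regs, stack, &frame_info) {
            Some(next) => { pcs.push(next.pc); regs = next; }
            None => break,
        }
    }
    pcs
}
```

The `max_depth` limit is there because a corrupted stack can otherwise send the walk in circles.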