Is there a programmatic way to check for stack corruption?


ARM9 has JTAG/ETM debugging support on-die; you should be able to set up a data access watchpoint covering e.g. 64 bytes near the top of your stacks, which would then trigger a data abort, which you could catch in your program or externally.

(The hardware I work with only supports 2 read/write watchpoints, not sure if that's a limitation of the on-chip stuff or the surrounding third-party debug kit.)

This document, which is an extremely low-level description of how to interface with the JTAG functionality, suggests you read your processor's Technical Reference Manual -- and I can vouch that there's a decent amount of higher-level info in chapter 9 ("Debug Support") for the ARM946E-S r1p1 TRM.

Before you dig into understanding all this stuff (unless you're just doing it for fun/education), double-check that the hardware and software you're using won't already manage breakpoints/watchpoints for you. The concept of "watchpoint" was a bit hard to find in the debugging software we use -- it was a tab labelled "Hardware" in the add breakpoint dialog.


Another alternative: your compiler may support a command-line option to add function calls at the entry and exit points of functions (some sort of "void enterFunc(const char * callingFunc)" and "void exitFunc(const char * callingFunc)"), for function cost profiling, more accurate stack tracing, or similar. You can then write these functions to check your stack canary value.

(As an aside, in our case we actually ignore the function name that is passed in (I wish I could get the linker to strip these) and just use the processor's link register (LR) value to record where we came from. We use this for getting accurate call traces as well as profiling information; checking the stack canaries at this point would be trivial too!)

The problem is, of course, that calling these functions changes the register and stack profiles for the functions a bit... Not much, in our experiments, but a bit. The performance implications are worse, and wherever there's a performance implication there's the chance of a behavior change in the program, which may mean you e.g. avoid triggering a deep-recursion case that you might have before...
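To make the entry/exit hook idea concrete, here is a minimal sketch assuming a GCC- or Clang-style toolchain, where building with -finstrument-functions makes the compiler call __cyg_profile_func_enter and __cyg_profile_func_exit around every function. Other compilers use different hook names and signatures (the one described above passes a function name string instead). The canary array, its value, and the two bookkeeping variables are placeholders for illustration; on a real target the canary words would sit at the overflow end of each stack.

```c
#include <stddef.h>
#include <stdint.h>

#define CANARY_VALUE 0xC0DEFACEu   /* arbitrary illustrative pattern */
#define CANARY_WORDS 4

/* Placeholder for the words at the overflow end of a stack.
   On real hardware this would be a pointer into the stack region itself. */
static volatile uint32_t stack_canary[CANARY_WORDS] = {
    CANARY_VALUE, CANARY_VALUE, CANARY_VALUE, CANARY_VALUE
};

/* Illustrative bookkeeping: how often damage was seen, and where we were. */
static volatile unsigned corruption_count;
static void *volatile last_call_site;

/* The hooks themselves must not be instrumented, or they would recurse. */
__attribute__((no_instrument_function))
static int canary_intact(void)
{
    for (size_t i = 0; i < CANARY_WORDS; i++) {
        if (stack_canary[i] != CANARY_VALUE) {
            return 0;
        }
    }
    return 1;
}

__attribute__((no_instrument_function))
static void check_canary(void *call_site)
{
    if (!canary_intact()) {
        corruption_count++;
        last_call_site = call_site;   /* roughly what LR would tell you */
    }
}

/* Called by the compiler at every function entry (-finstrument-functions). */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    check_canary(call_site);
}

/* Called by the compiler at every function exit (-finstrument-functions). */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    check_canary(call_site);
}
```

Recording the call site instead of halting keeps the hooks cheap; in practice you would probably trap into the debugger or log on the first failure, since the first damaged check is the one closest to the culprit.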


Very late update: these days, if you have a clang+LLVM based pipeline, you may be able to use AddressSanitizer (ASan) to catch some of these. Be on the lookout for similar features in your compiler! (It's worth knowing about UBSan and the other sanitizers too.)


What compiler are you using? I'm guessing an OS-specific one. If you're using GCC, you may be able to use the Stack-Smashing Protector (the -fstack-protector family of options). This might work as a fix for your production system to prevent the issue, and would also allow you to detect it in development.

To effectively check for stack corruption, you need to check your available stack space, put guards on both sides of the stack arguments before the call, make the call, and then check the guards on the call's return. This kind of change generally requires modification to the code which the compiler generates.
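The guard-and-check sequence can be illustrated by hand on an ordinary buffer. This is only a sketch of the idea, not what a compiler emits for real stack frames: the guard size, fill byte, and all function names below are made up for the example, and fill_buffer stands in for whatever call you suspect of writing out of bounds.

```c
#include <stddef.h>
#include <string.h>

#define GUARD_BYTES   8
#define PAYLOAD_BYTES 16
#define GUARD_FILL    0xA5   /* arbitrary illustrative pattern */

/* One contiguous region laid out as [guard][payload][guard]. */
typedef struct {
    unsigned char bytes[GUARD_BYTES + PAYLOAD_BYTES + GUARD_BYTES];
} guarded_buf;

/* Fill both guards with the pattern and clear the payload. */
static void guarded_init(guarded_buf *g)
{
    memset(g->bytes, GUARD_FILL, sizeof g->bytes);
    memset(g->bytes + GUARD_BYTES, 0, PAYLOAD_BYTES);
}

static unsigned char *guarded_payload(guarded_buf *g)
{
    return g->bytes + GUARD_BYTES;
}

/* Returns 1 if both guards still hold the pattern, 0 if either was touched. */
static int guarded_check(const guarded_buf *g)
{
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (g->bytes[i] != GUARD_FILL) {
            return 0;
        }
        if (g->bytes[GUARD_BYTES + PAYLOAD_BYTES + i] != GUARD_FILL) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical callee: trusts the caller's length and writes that many bytes. */
static void fill_buffer(unsigned char *dst, size_t len)
{
    memset(dst, 0x11, len);
}

/* Set the guards, make the call, then verify the guards on return.
   len must stay within PAYLOAD_BYTES + GUARD_BYTES for this demo. */
int call_with_guards(size_t len)
{
    guarded_buf g;
    guarded_init(&g);
    fill_buffer(guarded_payload(&g), len);
    return guarded_check(&g);
}
```

Here call_with_guards(PAYLOAD_BYTES) reports intact guards, while call_with_guards(PAYLOAD_BYTES + 1) reports damage because the extra byte lands in the trailing guard.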


When working on an embedded platform recently, I looked high and low for ways to do this (this was on an ARM7).

The suggested solution was what you've already come up with: initialize the stack with a known pattern and make sure that pattern still exists after returning from a function. I thought the same things: "there's got to be a better way" and "hasn't someone automated this?" The answer to both questions was "No", and I had to dig in just as you've done to try to find where the corruption was occurring.
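For reference, the known-pattern approach looks roughly like the sketch below. It assumes a full-descending stack (growing toward lower addresses, as is usual on ARM), so the lowest words are the last to be used. The array fake_stack is a stand-in so the code can run on a host; on a target you would use the real stack bounds, typically from linker symbols, and paint in startup code before the stack is in use. The pattern value and function names are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS   64
#define PAINT_PATTERN 0xDEADBEEFu   /* arbitrary illustrative pattern */

/* Stand-in for a task stack; index 0 is the overflow end
   because the stack is assumed to grow downward. */
uint32_t fake_stack[STACK_WORDS];

/* Fill the whole region with the pattern before it is used. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = PAINT_PATTERN;
    }
}

/* Count the untouched words starting from the overflow end.
   The result is the remaining headroom (a high-water mark). */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == PAINT_PATTERN) {
        n++;
    }
    return n;
}

/* If even the last word has been overwritten, the stack has
   probably run past its limit. */
int stack_overflow_suspected(const uint32_t *base, size_t words)
{
    return stack_unused_words(base, words) == 0;
}
```

Checking stack_unused_words periodically, or after returning from a suspect function, narrows down when the headroom disappears. It cannot catch a write that happens to store the pattern value itself, or one that jumps over the painted area entirely.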

I also "rolled my own" exception vectors for the data_abort, etc. There are some great examples on the 'net of how to backtrace the call stack. This is something you could do with a JTAG debugger: break when any of these abort vectors is hit and then investigate the stack. This can be useful if you only have 1 or 2 breakpoints (which seems to be the norm for ARM JTAG debugging).