SEH Equivalent in Linux or How do I handle OS Signals (like SIGSEGV) and yet keep continuing


As you said, you could catch SIGSEGV via signal() or sigaction().

Continuing is not really advisable, as this would be undefined behaviour, i.e. your memory might be corrupted, which might let other test cases fail as well (or even terminate your whole process prematurely).

Would it be possible to run the test cases one by one as a sub process? This way, you could check the exit status and will detect if it terminated cleanly, with an error or due to a signal.

Running the test cases in a separate thread will have the same problem: you do not have memory protection between your test cases and the code driving the test cases.

The suggested approach would be:

1. fork() to create a child process.

2. In the child process, you execve() your test case. This could be the same binary with different arguments (to select a certain test case).

3. In the parent process, you call waitpid() to wait for the termination of the test case. You received the pid from the fork() call in the parent process.

4. Evaluate the sub-process status with the WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG macros.
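The steps above could be sketched roughly as follows. The function names run_test_process and report are hypothetical, and argv is assumed to hold the full path of whatever executable runs one test case; error handling is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] as a child process and wait for it.
   Returns 0 and fills *status on success, -1 if fork/waitpid failed. */
static int run_test_process(char *const argv[], int *status)
{
    fflush(NULL);                       /* do not duplicate buffered output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, environ); /* only returns on failure */
        _exit(127);
    }
    while (waitpid(pid, status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}

/* Hypothetical helper: decode the status with the W* macros. */
static void report(const char *name, int status)
{
    if (WIFEXITED(status))
        printf("%s: exited with code %d\n", name, WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("%s: killed by signal %d\n", name, WTERMSIG(status));
}
```

A crash in one test case then shows up as WIFSIGNALED with WTERMSIG equal to SIGSEGV, and the driver simply moves on to the next case.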

If you need timeouts for your test cases, you can also install a handler for SIGCHLD. If the timeout elapses first, kill() the child process. Be aware that you may only call async-signal-safe functions from signal handlers; see signal(7).
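One way to sketch the timeout, with some liberties: the handler below is a no-op, and the parent keeps SIGCHLD blocked and collects it synchronously with sigtimedwait(), which avoids doing real work inside a handler. For brevity the child calls the test function directly instead of exec'ing. The names test_fn, run_with_timeout, quick_test and hanging_test are made up for illustration, and the sketch assumes the driver has no other children that could raise SIGCHLD in the meantime.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

typedef int (*test_fn)(void);          /* hypothetical: returns 0 on success */

static void on_sigchld(int sig) { (void)sig; }   /* no-op handler */

/* Returns the waitpid() status (or -1 on error); sets *timed_out to 1
   if the child had to be killed because the time limit elapsed. */
static int run_with_timeout(test_fn fn, time_t seconds, int *timed_out)
{
    struct sigaction sa, old_sa;
    sigset_t chld, old_mask;
    int status = -1;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;
    /* Keep SIGCHLD pending until sigtimedwait() picks it up. */
    sigprocmask(SIG_BLOCK, &chld, &old_mask);

    *timed_out = 0;
    fflush(NULL);
    pid_t pid = fork();
    if (pid == 0) {
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        _exit(fn() == 0 ? 0 : 1);
    }
    if (pid > 0) {
        struct timespec limit = { seconds, 0 };
        int rc;
        do {   /* note: an EINTR restart waits the full limit again */
            rc = sigtimedwait(&chld, NULL, &limit);
        } while (rc < 0 && errno == EINTR);
        if (rc < 0) {                  /* EAGAIN: the limit elapsed first */
            *timed_out = 1;
            kill(pid, SIGKILL);
        }
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR) { }
    }
    sigaction(SIGCHLD, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return status;
}

/* Hypothetical test cases used to exercise the helper. */
static int quick_test(void) { return 0; }
static int hanging_test(void) { for (;;) pause(); return 0; }
```

A test case that hangs is then reported as WIFSIGNALED with SIGKILL and timed_out set, while a well-behaved one is decoded with WIFEXITED and WEXITSTATUS as usual.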

Just a further note: execve() is not really required. After fork(), the child process can just proceed and call the selected test case directly.


To complement sstn's answer, on Linux, you could have processor and system specific C code which:

  • installs a signal handler using sigaction(2) with SA_SIGINFO
  • uses the third argument to that signal handler, which is a (machine specific) ucontext_t* pointer
  • analyzes the machine specific context state (i.e. the machine registers in the mcontext_t held in the uc_mcontext field of that ucontext_t) - see getcontext(3) for details; by "disassembling" the code at the faulting instruction pointer you will be able to know which operation failed, and you can get the faulting address.
  • modifies and repairs that machine state, which means changing the process address space by calling mmap(2) and/or modifying some machine registers through that mcontext_t
  • returns from your signal handler into a "repaired" state, perhaps at a different instruction address.

This of course is non-portable and painful to code and debug. You may need to disable some compiler optimizations, use asm instructions or volatile pointers, etc...

On Debian or Ubuntu see the /usr/include/x86_64-linux-gnu/sys/ucontext.h header file.
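To make the idea concrete, here is a small Linux sketch of the "repair and return" approach. It takes the easy route: instead of disassembling anything, it reserves a PROT_NONE region up front and, on a fault inside that region, makes the page accessible with mprotect(2) (rather than mmap(2)) so the faulting instruction is simply retried. All names (lazy_region_create, on_segv, fault_count, last_fault_pc) are made up for illustration. Reading REG_RIP assumes x86-64 with glibc, and mprotect() is not on the official async-signal-safe list, so treat this as an experiment and not as production code.

```c
#define _GNU_SOURCE            /* for REG_RIP and MAP_ANONYMOUS on glibc */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

static char *lazy_base;                  /* start of the reserved region */
static size_t lazy_size;                 /* its size in bytes */
static size_t page_size;
static volatile sig_atomic_t fault_count;
static volatile uintptr_t last_fault_pc; /* only filled in on x86-64 */

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    int saved_errno = errno;
    char *addr = (char *)info->si_addr;  /* faulting data address */
    ucontext_t *uc = (ucontext_t *)ctx;  /* machine state at the fault */

#if defined(__x86_64__)
    last_fault_pc = (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#else
    (void)uc;
#endif

    if (addr >= lazy_base && addr < lazy_base + lazy_size) {
        uintptr_t page = (uintptr_t)addr & ~((uintptr_t)page_size - 1);
        if (mprotect((void *)page, page_size, PROT_READ | PROT_WRITE) == 0) {
            fault_count++;
            errno = saved_errno;
            return;                      /* retry the faulting instruction */
        }
    }
    /* Not a fault we can repair: restore the default action, so the
       retried instruction crashes the process as usual. */
    signal(sig, SIG_DFL);
    errno = saved_errno;
}

/* Reserve `pages` inaccessible pages and install the repairing handler.
   Returns NULL on failure. */
static volatile char *lazy_region_create(size_t pages)
{
    struct sigaction sa;
    void *p;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    lazy_size = pages * page_size;
    p = mmap(NULL, lazy_size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    lazy_base = p;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, NULL) < 0)
        return NULL;
    return lazy_base;
}
```

The first access to each page of the returned region faults once, gets repaired in the handler, and then proceeds as if nothing happened. The pointer is volatile so that the compiler does not reorder or drop the accesses that are supposed to fault.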

IIRC some old version of SML/NJ played such tricks.

Read signal(7) very carefully and study the ABI specification for your processor, e.g. the x86-64 ABI specification.


In practice, you might also use (more easily) siglongjmp(3) from the signal handler. You might also deliberately violate the signal(7) rules. You could use Ian Taylor's libbacktrace library (he works on GCC at Google); it works better if your application and its libraries have debug info (e.g. are compiled with g++ -O1 -g2). See also GNU libc backtrace(3) and dladdr(3).
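A minimal sketch of the siglongjmp(3) variant follows. The names try_call, on_fault, harmless and touch_guard_page are hypothetical; the last one just provokes a genuine SIGSEGV by writing to a PROT_NONE page. Keep in mind that after a real crash the process state may already be damaged, so jumping out only "works" in the sense that control comes back to the caller.

```c
#define _GNU_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

static sigjmp_buf recover_point;
static volatile sig_atomic_t caught_signal;

static void on_fault(int sig)
{
    caught_signal = sig;
    siglongjmp(recover_point, 1);   /* never returns to the faulting code */
}

/* Runs fn() with a temporary SIGSEGV handler installed.
   Returns 0 if fn() returned normally, the signal number if it faulted,
   or -1 if the handler could not be installed. */
static int try_call(void (*fn)(void))
{
    struct sigaction sa, old;

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGSEGV, &sa, &old) < 0)
        return -1;

    caught_signal = 0;
    /* savemask = 1: siglongjmp() also restores the signal mask, which
       otherwise would still have SIGSEGV blocked from the handler. */
    if (sigsetjmp(recover_point, 1) == 0)
        fn();

    sigaction(SIGSEGV, &old, NULL);
    return caught_signal;
}

/* Hypothetical test bodies. */
static void harmless(void) { }

static void touch_guard_page(void)
{
    /* The mapping is leaked on purpose; this is only a demonstration. */
    volatile char *p = mmap(NULL, 4096, PROT_NONE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p != MAP_FAILED)
        p[0] = 1;                   /* faults: the page is inaccessible */
}
```

Calling try_call(touch_guard_page) should give back SIGSEGV, and try_call(harmless) should give back 0, with the previous SIGSEGV disposition restored in both cases.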


Handling SIGSEGV is rumored to be not very efficient on Linux. On GNU/Hurd you would use its external pager mechanism.


Another possibility is to run the tested program from the gdb debugger. Recent versions of gdb can be scripted in Python, so you could automate a lot of things. This might be practically the most portable approach (since recent gdb has been ported to many systems).

Addenda

Recent (June 2016) 4.6 kernels, or future or patched ones, might be able to handle page faults in user space, notably through userfaultfd, but I don't know much about the details. See also this question.