Signals and Syscalls
CS 301 Lecture, Dr. Lawlor
A signal is when the OS calls you:
- To inform you that some data arrived on the network or from the disk, SIGIO.
- To tell you to stop running, SIGTERM or SIGKILL.
- To inform you that your code just accessed out of bounds memory, SIGSEGV.
A syscall is when you call the OS:
- To ask the OS to perform some I/O, like screen output or network traffic. (write)
- To ask the OS who you are, or what time it is. (getpid)
Both, of course, depend on exactly which OS you're using!
The common feature to both signals and syscalls is the CPU's
"interrupt" feature, where the CPU stops what it was doing and calls
the operating system.
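As a small sketch of the "you call the OS" direction, here are the two syscalls named above, reached through their ordinary C library wrappers. The function names say_hello and who_am_i are made up for this illustration, and the code assumes a UNIX-like system with unistd.h.

```c
#include <string.h>     /* strlen */
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* write, getpid, STDOUT_FILENO */

/* Ask the OS to perform some I/O: returns the number of bytes written,
   or -1 on error. (say_hello is a hypothetical name for this sketch.) */
long say_hello(void)
{
    const char *msg = "Hello from write()\n";
    return (long)write(STDOUT_FILENO, msg, strlen(msg));
}

/* Ask the OS who you are: returns this process's ID. */
pid_t who_am_i(void)
{
    return getpid();
}
```

Each wrapper does little more than load registers and trap into the kernel, which is the part covered in the syscall section of these notes.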
Writing a Signal Handler
Signals can be seen as a standardized interface for delivering
interrupts to user programs. Exactly like interrupts, a signal handler
is just a subroutine that gets called when something weird happens.
Overall signal delivery looks like this:
- Something causes an interrupt--a hardware device needs attention,
or a program reads a bad memory address, divides by zero, executes an
illegal or privileged instruction, etc.
- The CPU looks up the OS interrupt service routine in the interrupt table (or "interrupt vector", for some strange reason.)
- The OS's interrupt service routine figures out if it can handle
the interrupt, or if it should deliver the interrupt to a process as a
signal.
- To deliver a signal, the OS essentially just calls your process's subroutine.
To set yourself up to receive signals (add a signal
handler), you just call an operating system routine like signal. You pass in the name of the signal you want to receive, and a function
to execute once the signal is received. For example:
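#include <stdio.h>  /* printf */
#include <stdlib.h> /* exit */
#include <signal.h> /* signal, SIGSEGV */
/* Called when the signal arrives; i is the signal number. */
void myHandler(int i)
{
    printf("Sorry dude--you just hit signal %d\n",i);
    exit(1);
}
int foo(void) {
    int *badPointer=(int *)0;
    printf("Installing signal handler\n");
    signal(SIGSEGV,myHandler); /* <------------- */
    printf("Signal handler installed. Segfaulting...\n");
    (*badPointer)++; /* write through a null pointer: raises SIGSEGV */
    printf("Back from segfault?!\n");
    return 0;
}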
Which on my machine prints out:
Installing signal handler
Signal handler installed. Segfaulting...
Sorry dude--you just hit signal 11
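One caveat about the handler above: printf and exit are not on the POSIX list of async-signal-safe functions, so calling them from a handler is fine for a demo but can misbehave in a real program. A more careful sketch, assuming a UNIX-like system, sticks to write and _exit. The names safe_handler and crash_in_child and the exit code 42 are invented for this example. The crash happens in a forked child so the caller survives to look at the result.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler that only uses async-signal-safe calls. */
static void safe_handler(int sig)
{
    static const char msg[] = "Caught a fatal signal; exiting\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    (void)sig;
    _exit(42);  /* _exit skips the atexit/stdio cleanup that exit() would run */
}

/* Fork a child that installs the handler and then segfaults.
   Returns the child's exit status, or -1 if something went wrong. */
int crash_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* volatile pointer variable: the compiler cannot assume it is null */
        int *volatile bad = NULL;
        signal(SIGSEGV, safe_handler);
        (*bad)++;   /* raises SIGSEGV, which runs safe_handler */
        _exit(0);   /* not reached */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the handler ran, crash_in_child should report the handler's exit code rather than a death by signal.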
Signals are available on all POSIX operating systems (including Linux and Mac OS X), and the standard C library provides a limited subset on Windows. They include:
- SIGSEGV, segmentation fault, is delivered when your
program accesses an out-of-bounds memory address. If you manipulate the
memory map, you can actually resume from this signal!
- SIGFPE, floating-point (or arithmetic) exception, is
delivered when you divide by zero or encounter a problem with
floating-point (like a signalling NaN).
- SIGILL, illegal instruction, is delivered when your
program hits an invalid instruction, usually caused by overwriting your
own code or jumping to a bad function pointer.
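On UNIX machines, there's also a slightly more sophisticated interface called sigaction. With the SA_SIGINFO flag, the signal handler function for sigaction also receives a siginfo_t, which describes the signal (for example, the faulting address), and a context pointer (a ucontext_t), which includes machine register info.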
Signals can also be used to indicate that I/O is ready (SIGIO,
enabled using "fcntl"), that a timer has expired (SIGALRM, SIGPROF,
or SIGVTALRM, enabled using "setitimer"), that the operating system
wants you to shut down (SIGTERM, SIGQUIT, SIGKILL, all UNIX-specific),
that various events have happened on the terminal (SIGHUP, SIGWINCH,
SIGPIPE, SIGTTIN, SIGTTOU, all UNIX-specific), or for
application-defined purposes (SIGUSR1/SIGUSR2, which must be sent
explicitly). See signal.h for the full list of signals.
Signals, exactly like interrupts, are hence a generic "catch-all"
notification mechanism, used for a variety of unrelated tasks.
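For the application-defined signals, "sent explicitly" just means some process calls kill or raise. A minimal sketch, assuming a single-threaded UNIX program (on_usr1 and send_myself_usr1 are names made up for this example):

```c
#include <signal.h>

/* sig_atomic_t is the one type a handler may safely write to. */
static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int sig)
{
    (void)sig;
    got_usr1 = 1;   /* just record that the signal arrived */
}

/* Install a SIGUSR1 handler, send ourselves the signal, and report
   whether the handler ran (1), did not run (0), or setup failed (-1). */
int send_myself_usr1(void)
{
    got_usr1 = 0;
    if (signal(SIGUSR1, on_usr1) == SIG_ERR) return -1;
    if (raise(SIGUSR1) != 0) return -1;  /* returns after the handler has run */
    signal(SIGUSR1, SIG_DFL);
    return got_usr1;
}
```

Setting a flag and returning is about the safest thing a handler can do; the main program then checks the flag whenever it is convenient.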
Direct Operating System Calls: Syscalls
Normally, to interact with the outside world (files, network, etc) you
just call some function, usually the exact same function you'd call
from C or C++. But sometimes, such as when you're implementing
a C library, or when there is no C library call to access the
functionality you need, you want to talk to the OS kernel
directly. There's a special x86 "interrupt" instruction to do
this, called "int".
On Linux, you talk to the OS by loading up
values into registers then executing "int 0x80". Register rax
describes what to do (open a file, write data, etc) and rbx, rcx, rdx,
rsi, and rdi have the parameters describing how to do it. This
register-based parameter passing is similar to how we call functions in
64-bit x86, but the Linux kernel accepts this convention in both 32 and 64
bit mode. (From 64-bit code, "int 0x80" still enters the 32-bit
interface, so only the low 32 bits of each register are used, and
pointers must fit in 32 bits.) Other operating systems like BSD store syscall
parameters on the stack, like the 32-bit x86 call interface!
Konstantin Boldyshev has a good writeup and examples of Linux, BSD, and BeOS x86 syscalls, and a list of common Linux syscalls.
(The full list of Linux syscalls is in
/usr/include/asm/unistd_32.h.) Here's a 64-bit version of his
Linux example:
push rbx ; <- we'll be using rbx below, and it's a saved register (hallelujah!)
; "int 0x80" system call numbers are listed in "asm/unistd_32.h"
mov rax,4 ; the system call number of "write".
mov rbx,1 ; first parameter: 1, the stdout file descriptor
mov rcx,myStr ; data to write
mov rdx,3 ; bytes to write
int 0x80 ; Issue the system call
pop rbx ; <- restore rbx to its old value
ret
section .data
myStr:
db "Yo",0xa
This 64-bit version matches the way you make 32-bit Linux system calls.
There's also a second, marginally faster way to make 64-bit system calls, using a *different*
list of syscall numbers under /usr/include/asm/unistd_64.h.
Like the 32-bit version, the system call number is passed in rax, but the
parameters are in rdi, rsi, rdx, r10, r8, r9, somewhat like a function
call but with slightly different registers: r10 replaces rcx, because
the instruction itself overwrites rcx and r11. Instead of "int
0x80", for this interface you use the "syscall"
instruction. The return value is still in rax.
; "syscall" instruction call numbers are listed in "asm/unistd_64.h"
mov rax,1 ; the (new) system call number of "write".
mov rdi,1 ; first parameter: 1, the stdout file descriptor
mov rsi,myStr ; data to write
mov rdx,3 ; bytes to write
syscall ; Issue the system call
; leave syscall's return value in rax
ret
section .data
myStr:
db "Yo",0xa
section .text
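The same 64-bit "syscall" convention can be sketched from C using GCC-style inline assembly. This assumes 64-bit x86 Linux and a GCC or Clang compiler; on anything else the sketch falls back to the C library's generic syscall() wrapper. The names raw_syscall3, raw_write_stdout, and raw_getpid are invented for this example.

```c
#include <sys/syscall.h>  /* SYS_write, SYS_getpid */
#include <unistd.h>       /* syscall(), getpid() */

/* Issue a system call that takes up to three parameters. */
static long raw_syscall3(long number, long a, long b, long c)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* rax = call number; rdi, rsi, rdx = parameters; result returns in rax.
       The syscall instruction itself overwrites rcx and r11.
       On failure the kernel returns a negative errno value here. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(number), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Fallback: let the C library make the call (returns -1 and sets errno). */
    return syscall(number, a, b, c);
#endif
}

/* write(1, text, length) without going through the libc write wrapper. */
long raw_write_stdout(const char *text, long length)
{
    return raw_syscall3(SYS_write, 1, (long)text, length);
}

/* getpid() without going through the libc getpid wrapper. */
long raw_getpid(void)
{
    return raw_syscall3(SYS_getpid, 0, 0, 0);
}
```

Using the SYS_ constants from sys/syscall.h instead of hard-coded numbers keeps the call numbers matched to whichever list the platform actually uses.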
Windows system call numbers keep changing, so direct system calls
aren't at all easy to use on Windows. The current numbers are
embedded in the system call stubs in ntdll.dll, which kernel32.dll
calls into. This is partly a security
feature, to make it harder to write portable Windows viruses.