Call and Return Implementation, Tail Recursion
CS 301 Lecture, Dr. Lawlor
(Call and return info moved here from the "The Stack" lecture note, because we only reached it today...)
Call and Return
OK, so far we've seen that the stack gets used in assembly language for:
- Temporary storage, like small arrays in the program. You just "sub esp, N" to allocate N bytes starting at esp; as long as you be sure to "add esp,N" to give those bytes back before your function returns. One
nice part about the stack is that once you move the stack pointer over
an area, those bytes are YOURS until you give them back, unlike
registers, which almost always get overwritten when you call another
function.
- Saving preserved registers.
There's one more place the stack gets used, and that's to keep track of
where "ret" should go when you return from a function. This is
very simple--"ret" jumps back to the address on the top of the
stack. "call" pushes this return address before jumping into the
new function.
Instruction
|
Equivalent Instruction Sequence
|
call bar
|
push next_instruction
jmp bar
next_instruction:
|
ret
|
pop rdx (rdx or some other scratch register; ret doesn't modify any registers)
jmp rdx
|
For example, there's one subtle difference between these two pieces of
code: in the first case, we go and come back; in the second case, we
leave forever.
|
Assembly
|
C/C++
|
Call
|
call make_beef mov eax,0xC0FFEE ret
make_beef: mov eax,0xBEEF ret
(Try this in NetRun now!)
Returns 0xC0FFEE, because we come back from "make_beef".
|
int make_beef(void); int foo(void) { make_beef(); return 0xC0FFEE; }
int make_beef(void) { return 0xBEEF; }
(Try this in NetRun now!)
Also returns 0xC0FFEE, for the same reason.
|
Jump
|
jmp make_beef mov eax,0xC0FFEE ret
make_beef: mov eax,0xBEEF ret
(Try this in NetRun now!)
Returns 0xBEEF, because we never come back from "make_beef".
|
int foo(void) { goto make_beef; return 0xC0FFEE; make_beef: return 0xBEEF; }
(Try this in NetRun now!)
Again, "make_beef" never comes back, so we get 0xBEEF.
|
It's easy to manually add code to jump back from "make_beef", like this:
jmp make_beef
come_back:
mov eax,0xC0FFEE
ret
make_beef:
mov eax,0xBEEF
jmp come_back
(Try this in NetRun now!)
But the "call" instruction allows "ret" to jump back to the right place
automatically, by pushing the return address on the stack. "ret"
then pops the return address and goes there:
push come_back ; - simulated "call" -
jmp make_beef ; - continued -
come_back: ; - end of simulated "call" -
mov eax,0xC0FFEE
ret
make_beef:
mov eax,0xBEEF
pop rcx ; - simulated "ret" -
jmp rcx ; - end of simulated "ret" -
(Try this in NetRun now!)
Optimization: Tail Recursion
Because calling functions takes some overhead (push return address,
call function, do work, pop return address, return there), recursion is
slower than iteration. For example:
C++ Plain Recursion
|
Assembly Plain Recursion
|
int sum(int i) { if (i==0) return 0; else return sum(i-1)+i; } int foo(void) { return sum(10000000); }
(Try this in NetRun now!)
|
mov rdi,10000000 call sum ret
; sum function: computes sum of numbers from 0...i ; one parameter: i, in rdi: number of recursions left sum: cmp rdi,0 ; check if we're done jle base_case push rdi sub rdi,0x1; i-1 call sum ; recursion step pop rdi add rax,rdi ; partial + i ret
base_case: ; no more iterations--return zero mov rax,0 ret
(Try this in NetRun now!)
|
Folks who love recursion have found an interesting optimization
called "tail recursion", where you arrange for there to be *nothing*
for the function to do after recursing, so there's no point in your
children returning to you--you just "jmp" to them, not "call", because
you don't want them to ever come back to you. The base case is the only "ret". Here's an example:
C++ Tail Recursion
|
Assembly Tail Recursion
|
int sum(int i,int partial) { if (i==0) return partial; else return sum(i-1,partial+i); } int foo(void) { return sum(10000000,0); }
(Try this in NetRun now!)
|
mov edi,1000000000 ; sum first argument mov esi,0 ; partial result call sum ret
sum: mov eax,esi cmp edi,0 je base_case lea esi,[rax+rdi*1] sub edi,0x1 jmp sum ; tail recursion step!
base_case: ret
(Try this in NetRun now!)
|
Tail recursion eliminates both the memory used on the stack for the
return addresses, and the time taken for the call and return. It
can make recursion exactly as fast as iteration. (Curiously, you
can always transform an iteration into a recursion, and vice versa,
although the code may get nasty.)
Sadly, my version of the gcc compiler doesn't seem to do the tail
recursion optimization anymore on 64-bit machines, although it used to
do them on 32-bit machines.
Why you care #1: Stack Space Usage
Every time you call a nested function, the stack has to hold the
address to return to. This actually takes up a few bytes of stack
space per call, so a deeply-recursive function can run out of space
pretty quickly. For example, this code runs out of stack space
and exits (rather than crashing or printing the return value) for an
input value as low as 10 million:
int silly_recursive(int i) {
if (i==0) return 0;
else return i+silly_recursive(i-1);
}
int foo(void) {
std::cout<<"Returns: "<<silly_recursive(read_input());
return 2;
}
(Try this in NetRun now!)
The same computation works fine (aside from integer overflow) when
written as an iteration, not a recursion, because iteration doesn't
touch the stack:
int silly_iterative(int n) {
int sum=0;
for (int i=0;i<=n;i++) sum+=i;
return sum;
}
int foo(void) {
std::cout<<"Returns: "<<silly_iterative(read_input());
return 2;
}
(Try this in NetRun now!)
Why you care #2: Buffer Overflow Attack
Another place understanding call and return come in handy is in writing secure code. Here's some insecure code:
int happy_innocent_code(void) {
char str[8];
cin>>str;
cout<<"I just read a string: "<<str<<"! I'm a big boy!\n";
return 0;
}
void evil_bad_code(void) {
cout<<"Mwa ha ha ha...\n";
cout<<"...er, I can't return. Crashing.\n";
}
int foo(void) {
//void *p=(void *)evil_bad_code; /* address of the bad code */
//printf("evil code is at: '%4s'\n",(char *)&p);
happy_innocent_code();
cout<<"How nice!\n";
return 0;
}
(Try this in NetRun now!)
The "cin>>str" line in happy can overwrite happy's stack space
with whatever's in the read-in string, if the read-in string is longer
than 7 bytes. So you can get a horrific crash if you just enter
any long string, because the correct return address is overwritten with
string data.
But it gets worse. Note that we never explicitly call
"evil_bad_code", but the commented-out code helped me craft the attack
string "aaaabbbbccccDŠeeeeffff", where the funky binary bytes of that attack
string get written into the part of the stack that should be storing
happy's return address. If we overwrite this with the address of
evil code, happy will return directly to evil bad code, which then can
do anything it likes. Kaboom!
Be sure to use "std::string", not raw arrays of char, in all your input data!
There's a pretty informative writeup on this by the hacker Aleph One called "smashing the stack for fun and profit".
Luckily, most network-facing code nowadays (including NetRun itself)
uses strings properly, and isn't vulnerable to buffer overflow exploits
like this. Modern compilers, like gcc 4, automatically include
protection against stack smashing (try the above on my 64-bit machine,
which has this new compiler!).