mov edi,1000"call" and "ret" match one another, so myfunction's "ret" goes back to our first function. If you use "jmp" instead of "call", "ret" will return all the way back out to main, so this code will return 1000:
call myfunction
add eax,7
ret
myfunction:
mov eax,edi ; copy our first parameter into eax (to be returned)
ret ; go back to foo
"call" and "ret" match one another, so myfunction's "ret" goes back to our first function, and the code above returns 1000+7 = 1007. If you use "jmp" instead of "call", "ret" will return all the way back out to main, so this code will return 1000:

mov edi,1000
jmp myfunction
add eax,7 ; <- never executed!
ret
myfunction:
mov eax,edi ; copy our first parameter into eax (to be returned)
ret ; go back to main

In fact, the only difference between "jmp" and "call" is what happens at the following "ret"; so if your function has nothing left to do after the call returns, you can actually speed up your code by replacing "call" with "jmp". This is a "tail call": it's a way to leave a function without ever coming back.
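The same idea exists in C++: a tail call is a call whose result you return immediately, with nothing left to do afterward. In a sketch like this (the names and bodies are just for illustration), an optimizing compiler is free to compile the call to bar as a plain "jmp":

int bar(int x) { return 2*x; }  // illustrative helper

int tail_caller(int x) {
	return bar(x+1);  // tail call: nothing left to do here, so bar's "ret" returns straight to our caller
}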
Here's the difference between a jump and a call:

jmp somewhere
    Start executing code at somewhere. That is, it sets register rip=somewhere.

call somewhere
comeback:
    Save where to come back on the stack, and go somewhere. That is, it pushes comeback, then sets rip=somewhere.

Instruction      Equivalent Instruction Sequence
call bar         push next_instruction
                 jmp bar
                 next_instruction:
ret              pop rdx    ; rdx or some other scratch register; ret doesn't modify any registers
                 jmp rdx
Here's the call-versus-jump difference again, this time as assembly alongside the C/C++ equivalent:

Call: "call make_beef" (Try this in NetRun now!), which corresponds to the C/C++ of declaring "int make_beef(void);" and calling it from "int foo(void)".

Jump: "jmp make_beef" instead (Try this in NetRun now!).

Again, "make_beef" never comes back, so we get 0xBEEF. We can come back, though, if make_beef ends by jumping to a hard-coded label:
jmp make_beef
come_back:
mov eax,0xC0FFEE
ret
make_beef:
mov eax,0xBEEF
jmp come_back

But the "call" instruction allows "ret" to jump back to the right place automatically, by pushing the return address on the stack. "ret" then pops the return address and goes there:
push come_back ; - simulated "call" -
jmp make_beef ; - continued -
come_back: ; - end of simulated "call" -
mov eax,0xC0FFEE
ret
make_beef:
mov eax,0xBEEF
pop rcx ; - simulated "ret" -
jmp rcx ; - end of simulated "ret" -
There's a very weird hacky way to figure out what address your code
is running from: call the next instruction, and then pop the return
address that "call" pushed!
call nextline
nextline:
pop rax ; rax will store the location in memory of nextline
ret
This is only useful if your code doesn't know where in memory it
will get loaded. This is true for some shared libraries, where
you see exactly the above instruction sequence!
Because calling functions takes some overhead (push the return address,
call the function, do the work, pop the return address, return there), plain recursion is
slower than iteration. For example:
C++ Plain Recursion versus Assembly Plain Recursion: a recursive "int sum(int i)", with the assembly driver loading the argument via "mov rdi,10000000" before the first call.
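The C++ side is the usual recursive sum, something like this sketch (the foo() driver and its count are just illustrative):

int sum(int i) {
	if (i==0) return 0;      // base case
	else return i+sum(i-1);  // the add happens after the call returns, so every call must come back
}

int foo(void) {
	return sum(1000);  // illustrative count; counts in the millions burn stack space fast
}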
Folks who love recursion have found an interesting optimization
called "tail recursion", where you arrange for there to be *nothing*
for the function to do after recursing, so there's no point in your
children returning to you--you just "jmp" to them, not "call", because
you don't want them to ever come back to you. The base case is the only "ret". Here's an example:
C++ Tail Recursion versus Assembly Tail Recursion: "int sum(int i,int partial)", with the assembly driver loading the first argument via "mov edi,1000000000".
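The tail-recursive C++ version carries a running total in the second parameter, something like this sketch (again, the foo() driver is just illustrative):

int sum(int i,int partial) {
	if (i==0) return partial;        // base case: the running total is the answer
	else return sum(i-1,partial+i);  // tail call: nothing left to do after this returns
}

int foo(void) {
	return sum(1000,0);  // illustrative count; start the running total at zero
}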
Tail recursion eliminates both the memory used on the stack for the
return addresses, and the time taken for the call and return. It
can make recursion exactly as fast as iteration. (Curiously, you
can always transform an iteration into a recursion, and vice versa,
although the code may get nasty.)
Sadly, my version of the gcc compiler doesn't seem to do the tail
recursion optimization anymore on 64-bit machines, possibly due to the
new stack alignment rules, although it used to do this optimization
reliably on 32-bit machines.
Here's a recursive sum:

int silly_recursive(int i) {
if (i==0) return 0;
else return i+silly_recursive(i-1);
}
int foo(void) {
std::cout<<"Returns: "<<silly_recursive(read_input());
return 2;
}

The same computation works fine (aside from integer overflow) when written as an iteration, not a recursion, because iteration doesn't touch the stack:
int silly_iterative(int n) {
int sum=0;
for (int i=0;i<=n;i++) sum+=i;
return sum;
}
int foo(void) {
std::cout<<"Returns: "<<silly_iterative(read_input());
return 2;
}

I've met a truly dedicated recursion-centric mathematician who described iteration as a 'degenerate form of recursion'. He preferred writing his recursions in 'tail position', so the compiler itself could put the iterative jump in there.
Here's the perfectly mathematical recursive form of factorial.
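In C++ that's something like this sketch:

int f(int n) {
	if (n==0) return 1;    // 0! is 1
	else return n*f(n-1);  // the multiply happens after the recursive call returns
}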
Adding a variable to store the multiplications computed "so far" puts the recursive call in 'tail position', where we could replace the call with a jmp (since we have nothing to do when the child returns). This would save stack space at runtime, and be about 50% faster. However, the compiler doesn't seem to do it on my 64-bit machine.
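With the "so far" accumulator, a sketch looks like this (call it as f(n,1)):

int f(int n,int sofar) {
	if (n==0) return sofar;      // every multiply is already folded into sofar
	else return f(n-1,sofar*n);  // tail position: nothing to do after the child returns
}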
Since the compiler doesn't seem to do the call->jmp translation, we can do it ourselves with a hideous goto. This looks more like assembly!
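Hand-translating that tail call into a jump gives a sketch like this:

int f(int n,int sofar) {
start:                       // stands in for the function's entry point
	if (n==0) return sofar;
	sofar=sofar*n;           // fold in this "call's" multiply
	n=n-1;                   // update the "argument"
	goto start;              // jump back instead of calling
}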
Instead of the circular "jmp", we can just do a cleaner while loop.
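As a while loop, a sketch looks like this:

int f(int n,int sofar) {
	while (n!=0) {
		sofar=sofar*n;
		n=n-1;
	}
	return sofar;
}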
Might as well make the loop count upwards--then we can get rid of the extra variable. Our transition to iteration is now complete!
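Counting upwards, a sketch looks like this:

int f(int n) {
	int product=1;                      // no second parameter needed anymore
	for (int i=2;i<=n;i++) product*=i;
	return product;
}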
One last stack hazard: the return address lives on the stack right next to local variables like arrays, so an overrun local array can do much worse than corrupt data. For example:

int happy_innocent_code(void) {
char str[8];
cin>>str;
cout<<"I just read a string: "<<str<<"! I'm a big boy!\n";
return 0;
}
void evil_bad_code(void) {
cout<<"Mwa ha ha ha...\n";
cout<<"...er, I can't return. Crashing.\n";
}
int foo(void) {
//void *p=(void *)evil_bad_code; /* address of the bad code */
//printf("evil code is at: '%4s'\n",(char *)&p);
happy_innocent_code();
cout<<"How nice!\n";
return 0;
}

The "cin>>str" line in happy_innocent_code can overwrite happy's stack space with whatever's in the read-in string, if the read-in string is longer than 7 bytes. So you can get a horrific crash if you just enter any long string, because the correct return address is overwritten with string data.