const int commands[]={
0, /* sleeps */
1, /* meow */
1,
2,73, /* eats */
0, /* sleep */
9 /* exit */
};
int foo(void) {
	int index=0;
	while (true) {
		switch (commands[index]) {
		case 0: std::cout<<"sleep\n"; break;
		case 1: std::cout<<"meow\n"; break;
		case 2: /* need a parameter so we know what to eat */
			{
			index++; /* skip over the 2 */
			int what=commands[index];
			std::cout<<"eat "<<what<<"\n";
			break;
			}
		case 9: return 0;
		default: std::cout<<"Unrecognized cat-cmd! cmd="<<commands[index]<<"\n";
		}
		index++;
	}
}
The only interesting thing happening here is the "eat" case: this
command needs a parameter, so we put the parameter right in the array
of commands.
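One consequence of storing parameters inline is that commands occupy different numbers of array entries, so the only way to find the next command is to decode the current one. Here is a minimal sketch of that idea, assuming the same cat command numbers as above. The helper names `command_length` and `count_commands` are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: how many array entries one cat command occupies.
// Only "eat" (2) carries an inline parameter; every other command is one entry.
int command_length(int cmd) {
	return (cmd == 2) ? 2 : 1;
}

// Hypothetical helper: step through the table one whole command at a time,
// counting commands up to and including the exit command (9).
int count_commands(const int *table, std::size_t size) {
	int count = 0;
	std::size_t index = 0;
	while (index < size) {
		int cmd = table[index];
		count++;
		if (cmd == 9) break; /* exit command: stop walking */
		index += command_length(cmd); /* skip the command and any parameter */
	}
	return count;
}

int foo(void) {
	const int table[] = {0, 1, 1, 2, 73, 0, 9}; /* same table as above */
	return count_commands(table, sizeof(table) / sizeof(table[0]));
}
```

The table has seven entries but only six commands, because the 73 is a parameter rather than a command. Starting to decode at the wrong index (say, at the 73) would misread the parameter as a command.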
The CPU is not a cat. So as you might expect, running the above command table on the CPU just crashes horribly:
const char commands[]={
0, /* sleeps */
1, /* meow */
1,
2,73, /* eats */
0, /* sleep */
9 /* exit */
};
int foo(void) {
	typedef int (*fnptr)(void); // pointer to a function returning an int
	fnptr f=(fnptr)commands; // typecast the command array to a function
	return f(); // call the new function!
}
(Don't worry about the hideous C++ syntax for function pointer stuff.)
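However, the CPU is a table-driven machine; it just uses different
values for its commands. For example, the byte "0xc3" tells an x86 CPU to
return from the current function. The byte "0xb0" is followed by
a one-byte parameter, which gets loaded into the low byte of the
register used for the return value. So this works, at least on an x86
system that lets you execute bytes stored in a data array!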
const unsigned char commands[]={ /* unsigned: 0xb0 and 0xc3 don't fit in a signed char */
0xb0,73, /* load a value to return */
0xc3 /* return from the current function */
};
int foo(void) {
	typedef int (*fnptr)(void); // pointer to a function returning an int
	fnptr f=(fnptr)commands; // typecast the command array to a function
	return f(); // call the new function!
}
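The cat interpreter and the CPU have the same shape. As a rough sketch, here is a software loop that interprets just the two x86 opcodes described above. It never executes the bytes directly, so it runs on any machine. The function name `emulate`, the -1 error value, and the choice to model only the low byte of the return register are assumptions made for illustration; a real x86 CPU understands many more opcodes.

```cpp
#include <cstddef>

// Interpret a tiny subset of x86 machine code in software.
// Returns the emulated low byte of the return register, or -1 if the code
// uses an opcode this sketch does not know or runs off the end of the array.
int emulate(const unsigned char *code, std::size_t size) {
	unsigned char al = 0; /* stands in for the low byte of the return register */
	std::size_t index = 0;
	while (index < size) {
		switch (code[index]) {
		case 0xb0: /* load: the next byte is the value */
			if (index + 1 >= size) return -1; /* parameter is missing */
			index++; /* skip over the 0xb0 */
			al = code[index];
			break;
		case 0xc3: /* return from the current function */
			return al;
		default: /* opcode not handled by this sketch */
			return -1;
		}
		index++;
	}
	return -1; /* ran off the end without hitting a return */
}

int foo(void) {
	const unsigned char commands[] = {0xb0, 73, 0xc3};
	return emulate(commands, sizeof(commands));
}
```

Here `foo` returns 73, the value the emulated code loads before returning. The "load" case takes its parameter from the array exactly the way the cat's "eat" case does.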
These raw byte commands that the CPU executes are called "machine
code". "Assembly language" is just a human-readable translation
of machine code. An "assembler", like NASM, reads assembly language and writes executable machine code. A "disassembler", like PE Explorer or IDA Pro (for Windows), or objdump (for Linux or Mac OS X), reads an executable and writes assembly language.
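To make "disassembler" concrete, here is a toy sketch that translates only the two opcodes from earlier (0xb0 and 0xc3) into NASM-style text. The function name `disassemble` is hypothetical, and any byte this sketch does not recognize is simply printed as a raw `db` data byte. Real disassemblers like objdump handle the entire instruction set.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

// Toy disassembler: turn a byte array into one line of text per instruction.
std::string disassemble(const unsigned char *code, std::size_t size) {
	std::ostringstream out;
	std::size_t index = 0;
	while (index < size) {
		switch (code[index]) {
		case 0xb0: /* mov al, <one-byte constant> */
			if (index + 1 < size) {
				index++; /* skip over the 0xb0 to reach the parameter */
				out << "mov al, " << (int)code[index] << "\n";
			} else {
				out << "(truncated mov)\n"; /* parameter byte is missing */
			}
			break;
		case 0xc3: /* return from the current function */
			out << "ret\n";
			break;
		default: /* unknown to this sketch: show it as a data byte */
			out << "db " << (int)code[index] << "\n";
		}
		index++;
	}
	return out.str();
}

int foo(void) {
	const unsigned char commands[] = {0xb0, 73, 0xc3};
	std::cout << disassemble(commands, sizeof(commands));
	return 0;
}
```

For the three-byte array in `foo`, this prints "mov al, 73" followed by "ret".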
If you just want to look at the machine code inside a function, you
can typecast a pointer to the function and start printing bytes of
machine code:
int bar(void) { /* some random function: we look at bar's machine code below! */
	return 17;
}
int foo(void) {
	const unsigned char *data=(unsigned char *)(&bar);
	for (int i=0;i<10;i++) /* print out the bytes of the bar function */
		std::cout<<"0x"<<std::hex<<(int)data[i]<<"\n";
	return 0;
}
We'll be learning about assembly language and machine code for the rest of the semester!