const unsigned char table[]={
0, /*yo! */
1, /*print x */
1, /*print x */
0, /*yo! */
2 /* exit */
};
int foo(void) {
int i=0; /* our location in the table */
while (1) /* always keep looping through the table */
switch (table[i++]) { /* look at the next thing in the table */
case 0: cout<<"Yo!\n"; break; /* single-Yo instruction */
case 1: cout<<"x\n"; break; /* single-X instruction */
case 2: return 0; /* stop looping through the table */
default:
cout<<"Unrecognized table entry!\n";
return -999;
}
}
Rather than having two identical "print x" commands, we can make the "x" command repeatable, by adding a repetition count.
Note that in the table below, 0, a "Yo!" instruction, stands alone, while 1, a "multi-x" instruction, takes two bytes, because the second byte is an x count. The indented "2" is not an exit command, it's the repetition count for the 1 instruction!
const unsigned char table[]={
0, /*yo! */
1, /*print x... */
2, /* ... two times */
0, /*yo! */
2 /* exit */
};
int foo(void) {
int i=0; /* our location in the table */
while (1) /* always keep looping through the table */
switch (table[i++]) { /* look at the next thing in the table */
case 0: cout<<"Yo!\n"; break; /* single-Yo instruction */
case 1: { /* multi-x instruction */
int count=table[i++]; /* next byte in table is the x repeat count */
for (int repeat=0;repeat<count;repeat++)
cout<<"x\n";
break;
}
case 2: return 0; /* stop looping through the table */
default:
cout<<"Unrecognized table entry!\n";
return -999;
}
}
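Once instructions take operand bytes, a table that is cut short can send the loop reading past the end of the array. The sketch below is one way to guard against that, not part of the original example: `run_table` is a hypothetical helper that takes the table as a `std::vector`, collects the output in a string instead of printing it, and treats a missing count byte or a missing exit instruction as an error.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are assumptions for illustration).
// Runs the same three-instruction table format as above, appending whatever
// would have been printed to `out`.
// Returns 0 on a clean exit, -999 on a bad or truncated table.
int run_table(const std::vector<unsigned char> &table, std::string &out) {
    std::size_t i = 0; // our location in the table
    while (i < table.size()) {
        switch (table[i++]) {
        case 0: // single-Yo instruction
            out += "Yo!\n";
            break;
        case 1: { // multi-x instruction: the next byte is the repeat count
            if (i >= table.size()) return -999; // count byte is missing
            int count = table[i++];
            for (int repeat = 0; repeat < count; repeat++) out += "x\n";
            break;
        }
        case 2: // exit instruction
            return 0;
        default: // unrecognized table entry
            return -999;
        }
    }
    return -999; // ran off the end without seeing an exit instruction
}
```

Calling `run_table` on the table `{0, 1, 2, 0, 2}` fills `out` with "Yo!", two lines of "x", and "Yo!" again, then returns 0. A table that ends right after a 1 returns -999 instead of reading a count byte that is not there.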
const unsigned char table[]={
0xb0, /*set x = ... */
7, /* ... this byte */
0xc3 /* exit */
};
int foo(void) {
int x=0; /* our "register" (temporary storage, and return value) */
int i=0; /* our location in the table */
while (1) /* always keep looping through the table */
switch (table[i++]) { /* look at the next thing in the table */
case 0xb0: { /* set-x instruction */
x=table[i++]; /* next byte is the new value for x */
break;
}
case 0xc3: return x; /* stop looping through the table */
default:
cout<<"Illegal instruction!\n";
return -999;
}
}
Our table just has (8-bit) bytes in it, but sometimes we want to be
able to set an entire (32-bit) int. The standard x86 solution to
this is to split the integer into four bytes, stored lowest first
("little-endian" order): first the low byte (lowest value, last two
hex digits), then the not-so-low byte, the not-so-high byte, and the
highest byte, like so:
const unsigned char table[]={
0xb8, /* set x =... */
4, /* low byte is 4 (that is, 0x04) */
1, /* next byte is 1 (that is, 0x01) */
0, /* highest two bytes are both zero */
0,
0xc3 /* return that */
};
int foo(void) {
int x=0; /* register */
int i=0;
while (1) switch (table[i++]) {
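case 0xb8: // bitwise magic! Reassemble x from the next 4 bytes in the table.
// Shift as unsigned, so a high byte of 0x80 or more can't overflow a signed int.
x=(int)((unsigned)table[i]|((unsigned)table[i+1]<<8)|((unsigned)table[i+2]<<16)|((unsigned)table[i+3]<<24));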
i+=4;
break;
case 0xc3: return x;
default:
cout<<"Illegal instruction!\n";
return -999;
}
}
This returns "0x104" (260 in decimal). The "0x04" is the low byte. "0x01" is the next higher byte, and all higher bytes are zero.
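To make the byte order concrete, here is a small sketch that goes both ways: splitting a 32-bit value into four bytes, low byte first, and putting it back together. The helper names `store_le32` and `load_le32` are made up for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: write `value` into out[0..3], low byte first.
void store_le32(std::uint32_t value, unsigned char out[4]) {
    for (int b = 0; b < 4; b++)
        out[b] = (unsigned char)((value >> (8 * b)) & 0xff); // byte b of value
}

// Hypothetical helper: rebuild a 32-bit value from four bytes, low byte first.
std::uint32_t load_le32(const unsigned char in[4]) {
    std::uint32_t value = 0;
    for (int b = 0; b < 4; b++)
        value |= (std::uint32_t)in[b] << (8 * b); // shift byte b back into place
    return value;
}
```

Storing 0x104 this way gives the bytes 4, 1, 0, 0, exactly the operand bytes in the table above, and loading those four bytes gives 0x104 back.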
const unsigned char commands[]={ /* unsigned, so 0xb0 and 0xc3 fit in a byte */
0xb0,73, /* load a value to return */
0xc3 /* return from the current function */
};
int foo(void) {
typedef int (*fnptr)(void); // pointer to a function returning an int
fnptr f=(fnptr)commands; // typecast the command array to a function
return f(); // call the new function!
}
These raw byte commands that the CPU executes are called "machine
code". "Assembly language" is just a human-readable translation
of machine code. An "assembler", like NASM, reads assembly language
and writes executable machine code. A "disassembler", like PE Explorer
or IDA Pro (for Windows), or objdump (for Linux or Mac OS X), reads an
executable and writes assembly language (in NetRun, check the
"Disassemble" box under "Options").
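As a rough illustration of what a disassembler does, here is a toy one that understands only the three opcodes used in the tables above. On x86, 0xb0 is "mov al, imm8", 0xb8 is "mov eax, imm32", and 0xc3 is "ret". The function name `toy_disassemble` and the output format are assumptions for this sketch; a real disassembler handles hundreds of opcodes and far more complicated operand encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy disassembler: translates a byte sequence into one line of
// text per instruction. Only 0xb0, 0xb8, and 0xc3 are recognized; any other
// byte (or an opcode whose operand bytes are cut off) is reported as unknown.
std::string toy_disassemble(const std::vector<unsigned char> &code) {
    std::ostringstream out;
    out << std::hex; // print operands in hexadecimal
    std::size_t i = 0;
    while (i < code.size()) {
        unsigned char op = code[i++];
        if (op == 0xb0 && i + 1 <= code.size()) {
            // one operand byte: the 8-bit value to load
            out << "mov al, 0x" << (unsigned)code[i] << "\n";
            i += 1;
        } else if (op == 0xb8 && i + 4 <= code.size()) {
            // four operand bytes: a 32-bit value, low byte first
            std::uint32_t v = 0;
            for (int b = 0; b < 4; b++)
                v |= (std::uint32_t)code[i + b] << (8 * b);
            out << "mov eax, 0x" << v << "\n";
            i += 4;
        } else if (op == 0xc3) {
            out << "ret\n";
        } else {
            out << "(unknown byte 0x" << (unsigned)op << ")\n";
        }
    }
    return out.str();
}
```

Given the bytes 0xb0, 73, 0xc3, this produces "mov al, 0x49" followed by "ret", since 73 is 0x49 in hex.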
If you just want to look at the machine code inside a function, you
can just do some pointer typecasting and start printing bytes of
machine code:
int bar(void) { /* some random function: we look at bar's machine code below! */
return 17;
}
int foo(void) {
const unsigned char *data=(unsigned char *)(&bar);
for (int i=0;i<10;i++) /* print out the bytes of the bar function */
std::cout<<"0x"<<std::hex<<(int)data[i]<<"\n";
return 0;
}
This prints out the same bytes inside bar that you see in the "Disassembler" tab. Which instructions is the compiler using?