const unsigned char table[]={
20,
20,
2,
2,
2,
20,
20,
0 /* end of the table */
};
int foo(void) {
int i=0; /* location in the table */
while (table[i]!=0) { /* print one entry in the table */
int n=table[i++]; /* number of times to print */
char c='#'; /* character to print */
for (int repeat=0;repeat<n;repeat++) /* print it n times */
std::cout<<c<<" ";
std::cout<<std::endl;
}
return 0;
}
Here's a similar table-driven program that reads two entries in the table each time around the loop. The first table entry is treated as a repetition count, and the second table entry is the letter to repeat. Again, the program stops when it hits a repetition count of zero:
const unsigned char table[]={
/* --- n, character to print n times ---- */
3,'q',
2,'@',
4,'~',
1,'z',
0 /* end of the table */
};
int foo(void) {
int i=0; /* location in the table */
while (table[i]!=0) { /* print one entry in the table */
int n=table[i++]; /* number of times to print */
char c=table[i++]; /* character to print */
for (int repeat=0;repeat<n;repeat++) /* print it n times */
std::cout<<c<<" ";
}
return 0;
}
Note that the above is just a nine-entry table; the relationship between 3 and 'q' is purely conceptual.
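The count/character pairing in the table can also be made explicit in the type system. Here is a minimal sketch, assuming we are free to change the table layout; the names Run, runs, and expand are hypothetical and not part of the programs in these notes. It builds a string instead of printing, so the result is easy to check.

```cpp
#include <string>

/* One table entry: a repetition count paired with the character to repeat.
   (Run, runs, and expand are hypothetical names for this sketch.) */
struct Run {
    unsigned char count; /* number of times to print; 0 marks the end */
    char letter;         /* character to print */
};

const Run runs[] = {
    {3, 'q'},
    {2, '@'},
    {4, '~'},
    {1, 'z'},
    {0, 0} /* end of the table */
};

/* Build the text the table describes: each letter repeated count times,
   each followed by a space. */
std::string expand(const Run *t) {
    std::string out;
    for (int i = 0; t[i].count != 0; i++)
        for (int repeat = 0; repeat < t[i].count; repeat++) {
            out += t[i].letter;
            out += ' ';
        }
    return out;
}
```

Calling expand(runs) gives "q q q @ @ ~ ~ ~ ~ z ". The tradeoff is that a struct table is no longer a flat run of bytes, which is the form the later examples rely on.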
int x=0; std::cin>>x; /* read what to do */
if (x==3) std::cout<<"Lucky three!\n";
else if (x==7) std::cout<<"Seven! Yes!\n";
else if (x==13) std::cout<<"NooooOOOO!!!!\n";
else std::cout<<"meh\n";
If all the "if" statements are testing the same integral value, a "switch" does the same thing:
int x=0; std::cin>>x; /* read what to do */
switch (x) {
case 3: std::cout<<"Lucky three!\n"; break;
case 7: std::cout<<"Seven! Yes!\n"; break;
case 13: std::cout<<"NooooOOOO!!!!\n"; break;
default: std::cout<<"meh\n"; break;
}
Often "switch" is faster than a nested block of "if" statements, especially if there are many possibilities. I'm using "switch" below, so I thought I'd explain this syntax first.
const unsigned char table[]={
0, /*yo! */
1, /*print x */
1, /*print x */
0, /*yo! */
2 /* exit */
};
int foo(void) {
int i=0; /* our location in the table */
while (1) /* always keep looping through the table */
switch (table[i++]) { /* look at the next thing in the table */
case 0: std::cout<<"Yo!\n"; break; /* single-Yo instruction */
case 1: std::cout<<"x\n"; break; /* single-X instruction */
case 2: return 0; /* stop looping through the table */
default:
std::cout<<"Unrecognized table entry!\n";
return -999;
}
}
Rather than having two identical "print x" commands, we can make the "x" command repeatable, by adding a repetition count.
const unsigned char table[]={
0, /*yo! */
1, /*print x... */
  2, /* ... two times */
0, /*yo! */
2 /* exit */
};
int foo(void) {
int i=0; /* our location in the table */
while (1) /* always keep looping through the table */
switch (table[i++]) { /* look at the next thing in the table */
case 0: std::cout<<"Yo!\n"; break; /* single-Yo instruction */
case 1: { /* multi-x instruction */
int count=table[i++]; /* next byte in table is the x repeat count */
for (int repeat=0;repeat<count;repeat++)
std::cout<<'x'<<std::endl;
break;
}
case 2: return 0; /* stop looping through the table */
default:
std::cout<<"Unrecognized table entry!\n";
return -999;
}
}
Note that 0, a "Yo!" instruction, stands alone in the table, while 1, a "multi-x" instruction, takes two bytes, because the second byte is an x count. The indented "2" is not an exit command; it's the repetition count for the 1 instruction!
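Because instructions now have different lengths, the only way to tell an instruction byte from a count byte is to walk the table from the start. The sketch below illustrates that idea for the three-instruction table format above; instruction_length and instruction_starts are hypothetical helper names, not part of the interpreter.

```cpp
#include <vector>

/* Hypothetical helper: how many table bytes one instruction occupies. */
int instruction_length(unsigned char opcode) {
    switch (opcode) {
    case 0: return 1;   /* yo: opcode only */
    case 1: return 2;   /* multi-x: opcode plus a count byte */
    case 2: return 1;   /* exit: opcode only */
    default: return -1; /* unrecognized */
    }
}

/* Return the table index where each instruction begins. Stops after the
   exit instruction, at an unrecognized opcode, or at the end of the table. */
std::vector<int> instruction_starts(const unsigned char *t, int size) {
    std::vector<int> starts;
    int i = 0;
    while (i < size) {
        int len = instruction_length(t[i]);
        if (len < 0) break;   /* unrecognized opcode */
        starts.push_back(i);
        if (t[i] == 2) break; /* exit instruction */
        i += len;             /* skip over any operand bytes */
    }
    return starts;
}
```

For the table {0, 1, 2, 0, 2} this returns the indices 0, 1, 3, 4. Index 2 never appears, because that byte is the count belonging to the instruction at index 1.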
const unsigned char table[]={
0xb0, /*set x = ... */
7, /* ... this byte */
0xc3 /* exit */
};
int foo(void) {
int x=0; /* our "register" (temporary storage, and return value) */
int i=0; /* our location in the table */
while (1) /* always keep looping through the table */
switch (table[i++]) { /* look at the next thing in the table */
case 0xb0: { /* set-x instruction */
x=table[i++]; /* next byte is the new value for x */
break;
}
case 0xc3: return x; /* stop looping through the table */
default:
std::cout<<"Illegal instruction!\n";
return -999;
}
}
Our table just has (8-bit) bytes in it, but sometimes we want to be
able to set an entire (32-bit) int. The standard x86 solution to
this is to split the integer into four bytes, stored low byte first
("little-endian" order): first the low byte (lowest value, last two
hex digits), then the not-so-low byte, the not-so-high byte, and the
highest byte, like so:
const unsigned char table[]={
0xb8, /* set x =... */
4, /* low byte is 4 (that is, 0x04) */
1, /* next byte is 1 (that is, 0x01) */
0, /* highest two bytes are both zero */
0,
0xc3 /* return that */
};
int foo(void) {
int x=0; /* register */
int i=0;
while (1) switch (table[i++]) {
case 0xb8:
x=table[i]|(table[i+1]<<8)|(table[i+2]<<16)|(table[i+3]<<24);
i+=4;
break;
case 0xc3: return x;
default:
std::cout<<"Illegal instruction!\n";
return -999;
}
}
This returns "0x104". The "0x04" is the low byte. "0x01" is the next higher byte, and all higher bytes are zero.
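The byte splitting can be checked in both directions. This is a minimal sketch, assuming a 32-bit unsigned value; split_le and join_le are hypothetical helper names. It uses std::uint32_t rather than int so that a high byte of 0x80 or more never shifts into a sign bit.

```cpp
#include <cstdint>

/* Split a 32-bit value into four bytes, low byte first (hypothetical helper). */
void split_le(std::uint32_t value, unsigned char out[4]) {
    for (int b = 0; b < 4; b++)
        out[b] = (unsigned char)((value >> (8 * b)) & 0xff);
}

/* Reassemble four bytes, low byte first, into a 32-bit value. */
std::uint32_t join_le(const unsigned char in[4]) {
    std::uint32_t value = 0;
    for (int b = 0; b < 4; b++)
        value |= (std::uint32_t)in[b] << (8 * b);
    return value;
}
```

Splitting 0x104 produces the bytes 4, 1, 0, 0, the same four bytes that follow 0xb8 in the table, and joining them gives back 0x104.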
To actually get work done, it's handy to have several different
values loaded up, not just one "x" like above. In assembly
language, we use "registers", which are just integers we can use for
temporary storage. Somehow, we need to be
able to load different values into these registers--here, we use the
low bits of the 0xb8 instruction to determine which register to write
to. For example, "0xb8" will load the integer into register 0
(our main register), while "0xb9" will load the integer into register
1, "0xba" will load into register 2, and so on.
[Table: x86 32-bit load instruction (5 bytes long)]
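The opcode test and the register selection described above each come down to a single bit mask. A small sketch, with hypothetical helper names is_load and load_register:

```cpp
/* True if the opcode is one of the eight load instructions 0xb8..0xbf. */
bool is_load(unsigned char opcode) {
    return (opcode & 0xf8) == 0xb8; /* high five bits are 10111 */
}

/* Register number selected by a load opcode: its low three bits. */
int load_register(unsigned char opcode) {
    return opcode & 7;
}
```

For example, is_load(0xba) is true and load_register(0xba) is 2, while is_load(0xc3) is false.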
Here's the code to simulate multiple registers and our new load instruction:
const unsigned char table[]={
0xb8,0x04,0x1,0,0, /* set register 0 to 0x104 */
0xba,0x05,0x0,0,0, /* set register 2 to 0x5 */
0xc3
};
int foo(void) {
int regs[8]={0}; /* registers, zeroed so an unset register reads as 0 */
int i=0;
while (1) {
unsigned char instruction=table[i++];
switch (instruction) {
case 0xb8: case 0xb9: case 0xba: case 0xbb: case 0xbc: case 0xbd: case 0xbe: case 0xbf:
regs[instruction&7]=table[i]|(table[i+1]<<8)|(table[i+2]<<16)|(table[i+3]<<24);
i+=4; /* skip over entries in table we just read */
break;
case 0xc3: return regs[0];
default:
std::cout<<"Unrecognized table entry!\n";
return -999;
} }
}
We'd also like to be able to add our choice of registers together, which we can do with an "add" instruction, the byte "0x03". The CPU determines which registers to add by reading the next byte, which is called "modR/M" on x86. "modR/M" is a byte, so it has 8 bits. The high two bits are both 1s (indicating a register-to-register add). The next three bits determine the destination register (the register being added to). The low three bits determine the source register (the register being read for the add). Because the register numbers are three bits each, it's easiest to write a modR/M byte in octal, where each digit is one field: 0302 means high bits 3, destination register 0, source register 2.
[Table: x86 register-register add instruction (2 bytes long)]
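The three modR/M fields can be packed and unpacked with shifts and masks. The sketch below covers only the register-to-register form described above; the ModRM struct and the decode_modrm and encode_modrm functions are hypothetical names for illustration.

```cpp
/* Fields of a modR/M byte (struct and function names are hypothetical). */
struct ModRM {
    int mode;   /* high two bits; 3 means register-to-register */
    int dest;   /* middle three bits: register being added to */
    int source; /* low three bits: register being read */
};

/* Pull the three fields out of a modR/M byte. */
ModRM decode_modrm(unsigned char byte) {
    ModRM m;
    m.mode = byte >> 6;
    m.dest = (byte >> 3) & 7;
    m.source = byte & 7;
    return m;
}

/* Build a register-to-register modR/M byte (high two bits both 1). */
unsigned char encode_modrm(int dest, int source) {
    return (unsigned char)((3 << 6) | ((dest & 7) << 3) | (source & 7));
}
```

encode_modrm(0, 2) yields 0302 in octal (0xc2 in hex), and decode_modrm(0302) gives mode 3, destination 0, source 2.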
const unsigned char table[]={
0xb8,0x04,0x1,0,0, /* set register 0 to 0x104 */
0xba,0x05,0x0,0,0, /* set register 2 to 0x5 */
0x03,0302, /* add register 2 to register 0 */
0xc3
};
int foo(void) {
int regs[8]={0}; /* registers, zeroed so an unset register reads as 0 */
int i=0;
while (1) {
unsigned char instruction=table[i++];
switch (instruction) {
case 0x03: {
int modRM=table[i++]; /* figure out what to add based on next byte */
int S=modRM&7, D=(modRM>>3)&7, hi=modRM>>6;
if (hi!=3) return -998; /* high bits must be 1 for a register-register operation */
regs[D] += regs[S]; break;
}
case 0xb8: case 0xb9: case 0xba: case 0xbb: case 0xbc: case 0xbd: case 0xbe: case 0xbf:
regs[instruction&7]=table[i]|(table[i+1]<<8)|(table[i+2]<<16)|(table[i+3]<<24);
i+=4; /* skip over entries in table we just read */
break;
case 0xc3: return regs[0];
default:
std::cout<<"Unrecognized table entry!\n";
return -999;
} }
}
And for the final horror: because above I "happened" to choose the same
numbers as a real x86 CPU uses in its executable table (that is, x86
machine code), we can now execute our table directly on the hardware,
rather than writing an interpreter. Don't worry about the function
pointer syntax in foo; all the interesting stuff is happening in the
bytes of the table, and the CPU hardware that they command!
const unsigned char table[]={
0xb8,0x04,0x1,0,0, /* set register 0 to 0x104 */
0xba,0x05,0x0,0,0, /* set register 2 to 0x5 */
0x03,0302, /* add register 2 to register 0 */
0xc3
};
int foo(void) {
typedef int (*myFn)(void); /* defines a function type that returns int */
myFn f=(myFn)table; /* make "table" into that function type */
return f(); /* call our newly-defined function */
}
The bottom line: this notion of "read a table to figure out what to do next" is very powerful--it's the basis for programmable computers!
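const int program[]={
10,42, /* load 42 into register 0 */
11, 3, /* load 3 into register 1 */
3,0,1, /* regs[0] += regs[1] */
0 /* return */
};
int foo(void) {
int regs[8]={0}; /* registers, all starting at zero */
int where=0; /* our location in the program */
int instruction=0;
while (true) {
instruction=program[where++];
switch (instruction) {
case 0: /* return instruction */
std::cout<<"We're done! The answer is "<<regs[0]<<"\n";
return regs[0];
case 10: /* load the next entry into register 0 */
regs[0]=program[where++];
break;
case 11: /* load the next entry into register 1 */
regs[1]=program[where++];
break;
case 3: { /* add: next two entries are destination and source */
int dest=program[where++];
int src=program[where++];
regs[dest] += regs[src];
break;
}
default:
std::cout<<"ILLEGAL INSTRUCTION ERROR! ERROR!\n";
return -999; /* stop rather than run off the end of the program */
}
}
}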