C++ Structs and Classes in Assembly

CS 301 Lecture, Dr. Lawlor

Classes and Structs

Any class or struct is just an object that contains a bunch of subobjects, called "class members" or "struct fields".

"class" is only available in C++.  "struct" works in either C++ or C.  The only difference between a "class" and a "struct" in C++ is the default access: members of a "class" are private until you add a "public:" label, while members of a "struct" are public by default.

So consider a declaration like:

struct bar {
    int x;
    int y;
};

This is a struct, bar, that contains two fields x and y.  x and y are usually laid out in memory right next to each other, so bar is a total of 8 bytes long: the first 4 bytes are x, and the next 4 bytes are y.  In memory, it's exactly like a two-element array, to the point where you can't always tell whether bar was a struct or an array by looking at the assembly!

There's a cool macro called "offsetof(struct,field)", declared in <cstddef> (<stddef.h> in C), that returns the number of bytes between the start of the struct and the start of that field.  So "offsetof(bar,x)==0" bytes and "offsetof(bar,y)==4" bytes, while "sizeof(bar)==8" bytes.
struct bar {
    int x;
    int y;
};
bar b;
std::cout<<"Location of bar: "<<(void *)&b<<"\n";
std::cout<<"Location of bar.x: "<<(void *)&b.x<<"\n";
std::cout<<"Location of bar.y: "<<(void *)&b.y<<"\n\n";

std::cout<<"sizeof bar: "<<sizeof(b)<<"\n";
std::cout<<"offsetof of bar.x: "<<offsetof(bar,x)<<"\n";
std::cout<<"offsetof of bar.y: "<<offsetof(bar,y)<<"\n";
return 0;


Here's what this program prints out.  Notice that a pointer to the struct has the same value as a pointer to the first element of the struct.
Location of bar: 0x7fffffffe650
Location of bar.x: 0x7fffffffe650
Location of bar.y: 0x7fffffffe654

sizeof bar: 8
offsetof of bar.x: 0
offsetof of bar.y: 4
Program complete. Return 0 (0x0)

In assembly, if rdx was pointing to the start of bar,
    mov [rdx+4],eax
would set bar's y field to eax, because the y field starts 4 bytes from the start of bar.

So:

Classes & Structs in Assembly

In assembly language, there is no syntax for defining classes or structures.  Instead, the "sizeof" and "offsetof" information is stored inside your head, and ideally in your assembly comments!  For example, this code defines an 8-byte class containing two DWORD values:
mov rdx,bar ; points to bar "class"
mov DWORD[rdx+4],11 ; write to y field
mov eax,DWORD[rdx+4] ; read from y field
ret

section .data
bar:
dd 3 ; first field, "x", offset 0 bytes from bar
dd 7 ; second field, "y", offset 4 bytes from bar


Of course, there's nothing magical about allocating the class inside section .data; here's the same idea, where I allocate the 8 bytes for the class on the stack:
sub rsp,8 ; allocate space for a bar
mov rdx,rsp ; points to bar "class"
mov DWORD[rdx+4],11 ; write to y field
mov eax,DWORD[rdx+4] ; read from y field
add rsp,8 ; give back space
ret


Or, finally, I can call malloc to allocate space:
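extern malloc
sub rsp,8 ; keep the stack 16-byte aligned for the call
mov rdi,8 ; allocate space for a bar
call malloc
mov rdx,rax ; points to bar "class"
mov DWORD[rdx+4],11 ; write to y field
mov eax,DWORD[rdx+4] ; read from y field
add rsp,8 ; restore the stack (the bar itself is never freed here)
ret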


Pointers inside Structs

It's almost trivial to put a pointer inside a structure--leave room for 8 bytes, and write the QWORD there.  Of course, the usual problem with pointers is knowing when to follow them!

You can make some really interesting behavior in programs by having one function that follows various pointers inside a struct.  For example, a linked list is just a struct with a pointer to the "next" struct.

Structs in plain C

In plain C, not C++, "class" isn't a keyword, so you have to use "struct".  Also, "struct bar" isn't the same as "bar".  So you need to use a typedef to make both a "struct tag" and an actual typename at the same time:

typedef struct bar_tag {
    int x;
    int y;
} bar;

So now "bar" is a typedef for "struct bar_tag", which acts just like a struct in C++.  This is one case where the C++ version is so much better that the C way has been almost totally forgotten.

Structs and Alignment

"Alignment" is the rule that, for example, a 4-byte object must sit in memory at an address divisible by 4.  On some CPUs (PowerPC and DEC Alpha are prominent examples), reading an int from an unaligned address like 0x10000003 can be 30x to 1000x slower than reading from an aligned address like 0x10000004!  On x64, the penalty for unaligned access is small: either unnoticeable or at worst a twofold slowdown.  On MIPS or SPARC machines, an unaligned pointer access can kill your program!

Here's a runnable example of unaligned access.  It works fine on x86, but crashes on a SPARC or MIPS machine:
static const unsigned char arr[8]={0xaa,0xbb,0xcc,0xdd,0xee,0xff,0x00,0x11};
const int *matey=(const int *)(arr+1); /* <- treat the middle of the array like an int */
return *matey;


To avoid unaligned accesses, the compiler may insert "padding" (unused space) into your structs to improve alignment.

For example,
struct bar {
    int x;
    char z;
    int y;
};

std::cout<<"sizeof(bar)=="<<sizeof(bar)<<"\n";
std::cout<<"offsetof(bar,x)=="<<offsetof(bar,x)<<"\n";
std::cout<<"offsetof(bar,z)=="<<offsetof(bar,z)<<"\n";
std::cout<<"offsetof(bar,y)=="<<offsetof(bar,y)<<"\n";
return 0;

In a perfect world, this would be a 9-byte struct: two 4-byte ints, and one one-byte char.  But to avoid an unaligned access to that last int, the compiler sticks in 3 bytes of padding after the char.

Field       Size
x           4 bytes
z           1 byte
(padding)   3 bytes (to a total of 4)
y           4 bytes

On 32-bit x86 Linux, char is 1-byte aligned (in other words, char never has padding), short is 2-byte aligned (meaning the pointer must be a multiple of 2), and everything else (int, long, long long, and even double) is 4-byte aligned.  On x64 and most other machines, including PowerPC, a builtin type of N bytes must be N-byte aligned; so double is on 8-byte alignment, and a char followed by a double wastes 7 bytes for alignment!

Fighting Padding Waste

Padding is usually there to make the CPU work right, so it's often not something you *should* fight.

Yet padding wastes space, and the padding bytes show up as very strange values when you write a struct directly to a disk file or the network, so we often want to avoid padding.

Curiously, padding ignores sub-struct boundaries: the compiler only inserts padding where a contained field would otherwise be misaligned.  In particular, this means a struct/class containing nothing but char fields will never itself need padding.  So you can make a 5-byte class containing only chars, and it won't have padding.  Put two of them together in a new class, and it'll only be 10 bytes.  Put one 5-byte class plus a 4-byte int together in a class, and it'll be 12 bytes because of padding.

On some machines, like x86-64, the stack itself must be 16-byte aligned whenever you call another function.  This means if you need only 4 bytes of space on the stack and are going to make a call, you have to take 16 bytes, and leave 12 bytes unused!

Bitfields

A "bitfield" is a struct where you tell the compiler you only care about a subset of the bits in each field.  The syntax is just to put a colon and a number of bits after each field.  For example, "t" is just 2 bits long here because of the ":2":
struct bar {
    unsigned char src:3;
    unsigned char dest:3;
    unsigned char t:2; /* just 2 bits long! */
};

bar b;
b.t=3;
b.dest=6;
b.src=2;

printf("b in octal is 0%o\n", *(unsigned char *)&b);
return sizeof(b);

The overall struct is just 1 byte, 8 bits, which is pretty cool.  This example is actually the funk_emu 03ds byte, which is the x86 ModR/M byte.

Be warned that the usual padding and alignment rules apply even to bitfields; so replacing "unsigned char" with "int" above results in a 4-byte struct, because the compiler makes sure "int"s are 4-byte aligned, even in a bitfield!

Fake Integers

One way to avoid padding waste is to build a sort of "synthetic integer" (my terminology, I don't know if there is a standard name!).  This is a class, containing only chars, that uses operator overloading to act like an integer.  Because no real integers are involved, only chars, the compiler won't pad the class:
class nopadint {
public:
    unsigned char data[4];
    int operator=(int i) {
        data[0]=(unsigned char)i;
        data[1]=(unsigned char)(i>>8);
        data[2]=(unsigned char)(i>>16);
        data[3]=(unsigned char)(i>>24);
        return i;
    }
    operator int () const {
        /* shift the top byte as unsigned, so a set high bit can't overflow a signed int */
        return (int)(data[0]|(data[1]<<8)|(data[2]<<16)|((unsigned)data[3]<<24));
    }
};
class bar {
public:
    char x;
    nopadint y;
    bar() {x=3; y=7;}
    void do_stuff_on_bar(int stuff) { x+=stuff;}
};
int foo(void) {
    bar b;
    std::cout<<"size of bar: "<<sizeof(b)<<"\n";
    return b.x+(int)b.y;
}


Compilers are almost frighteningly good at figuring out what you mean by these things: the compiler is able to optimize away both x and y in the above function, and make foo just "return 10"!