Memory Allocation
CS 321 Lecture,
Dr. Lawlor, 2006/02/24
So now
we know how to ask the OS for raw pages of memory. How do you
break those pages up into usable chunks? You use a
smaller-than-page-granularity allocator like "new" (the C++ allocator)
or "malloc" (the C allocator).
Summary
When you ask for n bytes, the memory allocator has to use more than n bytes of memory because:
- The allocator has to keep the memory aligned to at least 8 byte boundaries, so n is often rounded up to 8 bytes.
- The allocator has to keep track of the size of each block (because 'delete" and "free" don't tell you this!), so asking for n bytes almost always results in the allocator using 8+n bytes.
The "housekeeping" data associated with an allocated piece of memory is
invariably stored right before the start of the memory, like this:
stuff
|
size: 8+n+1
|
<- User
returned pointer
here.
-- n bytes of user data go here --
|
So on almost every 32-bit machine (Linux, Windows, MacOS), this code prints the size of an allocated block of memory:
int getBlockSize(void *block) {
int *p=(int *)block;
int *sz=p-1; /* points to the "size" field */
return (*sz - 8)&~1; /* subtract off the 8-byte header, and mask off the "present" flag */
}
int foo(void) {
int *buf=new int[1234];
return getBlockSize(buf);
}
On Windows, the "stuff" field above seems to store the size of the *next* block. E.g., after allocating blocks of size 0xa0, 0xb0, 0xc0, ....
stuff size
p[0]- 407D1B50: 73737373 000000b1 000000a1 p->73737373 73737373 73737373
p[1]- 407D1AA0: 73737373 000000c1 000000b1 p->73737373 73737373 73737373
p[2]- 407D19E0: 73737373 000000d1 000000c1 p->73737373 73737373 73737373
p[3]- 407D1910: 73737373 000000e1 000000d1 p->73737373 73737373 73737373
p[4]- 407D1830: 73737373 000000f1 000000e1 p->73737373 73737373 73737373
p[5]- 407D1740: 00000000 00000730 000000f1 p->73737373 73737373 73737373
On Linux, the "stuff" field
above seems to store the size of the *previous* block, but only if the
previous block is empty. The same picture above is now:
stuff size
p[0]- 0x804a008: 00000000 00000000 000000a1 p->73737373 73737373 73737373
p[1]- 0x804a0a8: 73737373 00000000 000000b1 p->73737373 73737373 73737373
p[2]- 0x804a158: 73737373 00000000 000000c1 p->73737373 73737373 73737373
p[3]- 0x804a218: 73737373 00000000 000000d1 p->73737373 73737373 73737373
p[4]- 0x804a2e8: 73737373 00000000 000000e1 p->73737373 73737373 73737373
p[5]- 0x804a3c8: 73737373 00000000 000000f1 p->73737373 73737373 73737373
Gory Linux Details
The Linux glibc malloc (see glibc-2.3.4/malloc/malloc.c line 1700) has this big comment:
Chunks of memory are maintained using a `boundary tag' method as
described in e.g., Knuth or Standish. (See the paper by Paul
Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
survey of such techniques.) Sizes of free chunks are stored both
in the front of each chunk and at the end. This makes
consolidating fragmented chunks into bigger chunks very fast. The
size fields also hold bits representing whether chunks are free or
in use.
An allocated chunk of memory looks like this:
chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if NOT allocated |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |M|P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| User data starts here... .
. .
. (malloc_usable_space() bytes) .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Where "chunk" is the front of the chunk for the purpose of most of
the malloc code, but "mem" is the pointer that is returned to the
user. "Nextchunk" is the beginning of the next contiguous chunk.
Chunks always begin on even word boundries, so the mem portion
(which is returned to the user) is also on an even word boundary, and
thus at least double-word aligned.
The 'M' bit indicates that this block is allocated directly using mmap().
The 'P' bit indicates that the previous chunk is: 1 present and allocated; or 0 free.
Free chunks are stored in circular doubly-linked lists, and look like this:
chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`head:' | Size of chunk, in bytes |P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Forward pointer to next chunk in free list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Back pointer to previous chunk in free list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Unused space (may be 0 bytes long) .
. .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`foot:' | Size of chunk, in bytes |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The P (PREV_INUSE) bit, stored in the unused low-order bit of the
chunk size (which is always a multiple of two words), is an in-use
bit for the *previous* chunk. If that bit is *clear*, then the
word before the current chunk size contains the previous chunk
size, and can be used to find the front of the previous chunk.
The very first chunk allocated always has this bit set,
preventing access to non-existent (or non-owned) memory. If
prev_inuse is set for any given chunk, then you CANNOT determine
the size of the previous chunk, and might even get a memory
addressing fault when trying to do so.