Calling 'malloc' to Allocate Memory
CS 301 Lecture, Dr. Lawlor
Memory Allocation in General
Memory (like real estate) in theory could be used by anybody for
anything at any time (the anarchist squatter's paradise!). Of
course, in practice, it works a lot better to set up rules by which you
can figure out what memory's yours, and what isn't (e.g., deeds,
leases, rental contracts). So a piece of memory can be:
- Owned by you, and used by you. This is the good kind of memory, the only kind you should be using.
- Owned by somebody else, and erroneously used by you. You
can read or write surprisingly far past the end of an array before
crashing, but along the way you can easily end up overwriting something
used by some other part of the program, which shows up later as a
confusing "memory corruption" error.
- Owned by somebody else, and deadly to even look at. You get
a "segmentation fault" access violation if you access this
memory. The CPU enforces the OS's wishes using the "page table",
which you'll hear about in CS 321 (unless I tell you first!).
You can't tell the difference between the second kind of memory (dangerous, but not
right now) and the third kind (immediate death), so stick to the first kind!
The bottom line is you really need to claim memory before using it,
and then only use the part you claimed. It's easy to accidentally
run off the end of an array (owned by you, the first kind) into other
bytes of memory owned by some other part of the program (the second
kind), or delete an array (so it's no longer owned by you) and use it
later, etc. Sadly, in C++ it's up to you the programmer to make
sure your uses of memory are correct, unlike Java or C# where raw pointers
are normally off limits and array indices are all carefully checked at
run time.
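Since C and C++ won't check for you, one common habit is to carry the array's length around with the pointer and check each index yourself. Here's a minimal sketch of that idea; the helper name "checked_store" is made up for this example and is not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper (not a library function): store value at arr[i],
   but only if index i lies inside the len elements we actually own.
   Returns 1 if the store happened, 0 if it was refused. */
int checked_store(int *arr, size_t len, size_t i, int value)
{
    if (arr == NULL || i >= len) {
        return 0;          /* outside our memory: refuse instead of corrupting */
    }
    arr[i] = value;        /* inside our memory: safe to write */
    return 1;
}
```

With a 4-element array, "checked_store(a, 4, 3, 9)" succeeds, while "checked_store(a, 4, 4, 9)" is refused because index 4 is one past the end.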
Anyway, there are a bunch of different ways for your code to legally claim some memory, including:
- Call the C++ "new" operator, like "int *p=new int[10];". But
"new" isn't easy to call from assembly language, because it's a
compiler-builtin operator, not just a function.
- Call the C function "malloc", which takes a byte count and
returns a pointer, like "int *p=(int *)malloc(40);". This is a
perfectly ordinary function, so it's easy to call from assembly
language.
- Allocate space on the stack (see the next lecture).
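To make the malloc option concrete, here's a small C sketch of the whole claim, use, release cycle. The function name "claim_use_release" is made up for illustration, and the NULL check is my addition (malloc returns NULL when it can't satisfy the request).

```c
#include <stdlib.h>

/* Hypothetical example function: claim room for 10 ints, use only that
   room, then give it back. Returns 7 on success, -1 if malloc fails. */
int claim_use_release(void)
{
    /* malloc wants a byte count: 10 ints, which is 40 bytes when int is 4 bytes */
    int *p = malloc(10 * sizeof *p);
    if (p == NULL) {
        return -1;                 /* nothing was claimed, so nothing to free */
    }
    p[0] = 3;                      /* indices 0..9 are ours */
    p[9] = 4;
    int result = p[0] + p[9];      /* read the values back before releasing */
    free(p);                       /* after this, p is no longer ours */
    return result;
}
```

Writing "10 * sizeof *p" instead of a bare 40 keeps the byte count right even on a machine where int isn't 4 bytes.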
Calling Malloc from Assembly Language
It's a pretty straightforward function: pass the number of *BYTES* you
want as the only parameter, in rdi (that's the 64-bit Linux and macOS
convention; Windows uses rcx), then "call malloc". You'll
get back a pointer to the allocated bytes in rax. To
clean up the space afterwards, copy the pointer over to rdi, and "call
free" (I'm leaving off the free below, because you need the stack to do
that properly).
Here's a complete example of assembly memory access. I call
malloc to get 40 bytes of space. malloc returns the starting
address of this space in rax (the 64-bit version of eax). That
is, the rax register is acting like a pointer. I can then read
and write from the pointed-to memory using the usual assembly bracket syntax:
mov edi, 40; malloc's first (and only) parameter: number of bytes to allocate
extern malloc
call malloc
; on return, rax points to our newly-allocated memory
mov ecx,7; set up a constant
mov [rax],ecx; write it into memory
mov edx,[rax]; read it back from memory
mov eax,edx; copy into return value register
ret
(Try this in NetRun now!)
Rather than copy via the ecx register, you can specify you want a
32-bit memory write and read using "DWORD" in front of the brackets,
like this:
mov edi, 40; malloc's first (and only) parameter: number of bytes to allocate
extern malloc
call malloc
; on return, rax points to our newly-allocated memory
mov DWORD [rax],7; write constant into memory
mov eax,DWORD [rax]; read it back from memory
ret
(Try this in NetRun now!)
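For comparison, here's roughly what the two assembly examples above look like in C. This is a sketch, not a line-for-line translation: the function name "write_then_read" is made up, and the NULL check and the call to free are additions that the assembly versions leave out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical C counterpart of the assembly examples.
   Returns 7 on success, -1 if malloc fails. */
int write_then_read(void)
{
    int32_t *p = malloc(40);     /* like: mov edi,40 / call malloc */
    if (p == NULL) {
        return -1;               /* the assembly versions skip this check */
    }
    *p = 7;                      /* like: mov DWORD [rax],7 */
    int32_t value = *p;          /* like: mov eax,DWORD [rax] */
    free(p);                     /* the assembly versions leave this out */
    return value;
}
```

Using "int32_t" pins the access to 32 bits, which is the same job "DWORD" does in the assembly.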
Malloc on arrays
The typical place you use malloc is to make some space for a
variable-length array. A bunch of "dd" directives works fine if you
know how many integers to allocate (for example, "times 100 dd 0" makes
room for a hundred integers), but you have to know how many you need
when the program is assembled.
To allocate an array of n integers, you can:
- Compute the number of *BYTES* by multiplying by 4 (the number of bytes/integer).
- Call malloc to get a pointer to this many bytes. The pointer to the start of the array is returned in rax.
- Pick a register to hold the array index, such as rcx.
- Multiply rcx by 4 to get a byte offset, and add it to the start of the array in rax. The bracket syntax does both at once: "mov DWORD [rax+4*rcx], 17"
- Be sure to free the array when finished.
For example:
mov edi,10 ; ten integers in our array
imul edi,4 ; multiply by 4 to get a byte count
extern malloc
call malloc
; rax is a pointer to the allocated space
mov rdi,10; n
mov rcx,0 ; i
jmp testloop
initloop:
mov DWORD[rax+4*rcx],ecx; write to integer at index rcx
add rcx,1 ; i++
testloop:
cmp rcx,rdi
jl initloop
mov eax,DWORD[rax+4*7] ; pull out the integer at index 7
ret
(Try this in NetRun now!)
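The same steps can be sketched in C, which makes the multiply-by-4 explicit as "sizeof". The function name "fill_and_fetch" and its error value of -1 are made up for this example; unlike the assembly, it checks the index and frees the array.

```c
#include <stdlib.h>

/* Hypothetical example: allocate n ints, set element i to i,
   and return the element at position index.
   Returns -1 for a bad n or index, or if malloc fails. */
int fill_and_fetch(int n, int index)
{
    if (n <= 0 || index < 0 || index >= n) {
        return -1;                          /* stay inside the part we claim */
    }
    int *arr = malloc((size_t)n * sizeof *arr);   /* bytes = n * bytes per int */
    if (arr == NULL) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        arr[i] = i;                         /* like: mov DWORD[rax+4*rcx],ecx */
    }
    int result = arr[index];                /* like: mov eax,DWORD[rax+4*7] */
    free(arr);                              /* be sure to free when finished */
    return result;
}
```

Calling "fill_and_fetch(10, 7)" matches the assembly example: ten integers, then pull out the one at index 7.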
To allocate an array of n 64-bit "long" values, you just need to replace the "4" bytes/integer above with "8" bytes/long, and switch the DWORD accesses and 32-bit registers to QWORD accesses and 64-bit registers:
mov edi,10 ; ten longs in our array
imul edi,8 ; multiply by 8 to get a byte count
extern malloc
call malloc
; rax is a pointer to the allocated space
mov rdi,10; n
mov rcx,0 ; i
jmp testloop
initloop:
mov QWORD[rax+8*rcx],rcx; write to long at index rcx
add rcx,1 ; i++
testloop:
cmp rcx,rdi
jl initloop
mov rax,QWORD[rax+8*7] ; pull out the long at index 7
ret
(Try this in NetRun now!)
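In C the only change for 64-bit values is the element type, since "sizeof" picks up the 8 bytes per element automatically. Here's a sketch under the same assumptions as before; the names "bytes_for_longs" and "fill_and_fetch64" are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: byte count for n 64-bit values (8 bytes each). */
size_t bytes_for_longs(size_t n)
{
    return n * sizeof(int64_t);
}

/* Hypothetical example: allocate n 64-bit values, set element i to i,
   and return the element at position index.
   Returns -1 for a bad n or index, or if malloc fails. */
int64_t fill_and_fetch64(int n, int index)
{
    if (n <= 0 || index < 0 || index >= n) {
        return -1;
    }
    int64_t *arr = malloc(bytes_for_longs((size_t)n));
    if (arr == NULL) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        arr[i] = i;                  /* like: mov QWORD[rax+8*rcx],rcx */
    }
    int64_t result = arr[index];     /* like: mov rax,QWORD[rax+8*7] */
    free(arr);
    return result;
}
```

For ten values the request is 80 bytes, twice what the 32-bit version needed.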