mov ... DWORD[somePtr] ...
section .data
somePtr:
dd constant0

Here are all the constant-generating "instructions", and the size of the data they make:
Instruction | C++   | Access         | Register | Bits | Bytes
dq 0x3      | long  | QWORD[somePtr] | r _ _    | 64   | 8
dd 0x3      | int   | DWORD[somePtr] | e _ _    | 32   | 4
dw 0x3      | short | WORD[somePtr]  | _ _      | 16   | 2
db 0x3      | char  | BYTE[somePtr]  | _ l      | 8    | 1
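The byte counts in the table can be cross-checked from C++. This is only a sketch: `data_sizes` is a hypothetical helper name, and it uses the fixed-width types from `<cstdint>` because plain `long` is 8 bytes on 64-bit Linux and macOS but only 4 bytes on 64-bit Windows.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Sizes in bytes of the C++ types matching dq, dd, dw, and db, in that order.
// Fixed-width types are used so the result does not depend on how wide
// "long" happens to be on a given platform.
constexpr std::array<std::size_t, 4> data_sizes() {
    return { sizeof(std::int64_t),   // dq: QWORD, 8 bytes
             sizeof(std::int32_t),   // dd: DWORD, 4 bytes
             sizeof(std::int16_t),   // dw: WORD,  2 bytes
             sizeof(std::int8_t) };  // db: BYTE,  1 byte
}
```

Calling `data_sizes()` gives the Bytes column from top to bottom.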
mov eax,DWORD[myInt] ; copy this int into eax
ret
section .data
myInt:
dd 0xa3a2a1a0 ; "data DWORD" containing this value

You can copy a pointer value into a register, too. Here we're dereferencing a pointer stored in a register:
mov rdx, someIntPtr ; copy the address someIntPtr into rdx (like C++: p=someIntPtr;)
mov eax, DWORD [rdx] ; read memory rdx points to (like C++: return *p;)
ret
section .data
someIntPtr: ; A place in memory, where we're storing an integer.
dd 123 ; "data DWORD", our integer

A pointer to an array initially looks just like a pointer to anything else:
mov rcx, myArray ; rcx points to myArray (like C++: p=arr;)
mov eax, DWORD [rcx] ; read memory pointed to by rcx (like C++: return *p;)
ret
section .data
myArray: ; A place in memory, where we're storing some integers.
dd 100 ; "data DWORD", here our array element [0]
dd 101 ; [1]
dd 102 ; [2]
dd 103 ; [3]

Here's an example where we index into our little 4-integer array:
mov eax, DWORD [myArray+4*2] ; read myArray[2]
ret
section .data
myArray: ; A place in memory, where we're storing some integers.
dd 100 ; "data DWORD", here our array element [0]
dd 101 ; [1]
dd 102 ; [2]
dd 103 ; [3]
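The address `myArray+4*2` is a byte offset: each `dd` element is 4 bytes, so element [2] starts 8 bytes past the label. The C++ sketch below makes that arithmetic explicit; `read_at_byte_offset` is a hypothetical helper written for illustration, not part of these notes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// The same four-element array as myArray in the assembly listing.
static const std::int32_t myArray[4] = {100, 101, 102, 103};

// Read a 32-bit int located byte_offset bytes past the start of myArray,
// the way "mov eax, DWORD [myArray+byte_offset]" does.
std::int32_t read_at_byte_offset(std::size_t byte_offset) {
    const unsigned char* base = reinterpret_cast<const unsigned char*>(myArray);
    std::int32_t value;
    std::memcpy(&value, base + byte_offset, sizeof value);  // copy 4 raw bytes
    return value;
}
```

`read_at_byte_offset(4*2)` reads the same element as `myArray[2]`, because C++ array indexing multiplies the index by the element size for you.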
mov DWORD[myInt],7 ; overwrite our int
mov eax,DWORD[myInt] ; copy the modified int into eax
ret
section .data
myInt:
dd 2 ; "data DWORD" containing this value

But if you leave off the "section .data", the constant is stored next to the program's machine code in "section .text" (a weird ancient name; machine code is not human-readable text!). This code section is readable but not writeable, so this segfaults:
mov DWORD[myInt],7 ; overwrite our int
mov eax,DWORD[myInt] ; copy the modified int into eax
ret
myInt:
dd 2 ; "data DWORD" containing this value

You can even store *code* in the modifiable "section .data".
call myFunction
ret
section .data
myFunction:
mov eax,2
ret
The "mov" and "ret" instructions just emit bytes of machine code, identical to:
call myFunction
ret
section .data
myFunction:
db 0xb8,0x02,0x00,0x00,0x00,0xc3 ; code for my function

However, when code is in modifiable memory, you can modify the machine code! For example, if I know what bytes the assembler will output for "myFunction", I can figure out where to go in and modify the "myFunction" machine code, to change what the function returns! In this case, I just want to skip past the 0xb8 (mov opcode) and overwrite the constant being loaded:
mov DWORD[myFunction+1],7 ; overwrite the constant loaded by the first (0xb8) instruction
call myFunction
ret
section .data
myFunction:
mov eax,2 ; <- modified at runtime!
ret

This returns 7, because the bytes of "myFunction" are modified before execution.
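To see exactly which bytes the patch touches, here is a C++ sketch that applies the same overwrite to a copy of the six machine-code bytes. It only edits the bytes and never executes them. `patched_function_bytes` is a hypothetical name, and the sketch assumes a little-endian host such as x86.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Start from the machine code for "mov eax,2 ; ret" and overwrite the
// constant, returning the patched bytes for inspection.
std::array<std::uint8_t, 6> patched_function_bytes(std::uint32_t new_constant) {
    std::array<std::uint8_t, 6> code = {0xb8, 0x02, 0x00, 0x00, 0x00, 0xc3};
    // Skip the 0xb8 opcode at offset 0 and overwrite the 4-byte immediate,
    // like "mov DWORD[myFunction+1],7". On a little-endian host the low
    // byte of new_constant lands first, at offset 1.
    std::memcpy(code.data() + 1, &new_constant, sizeof new_constant);
    return code;
}
```

With `new_constant` set to 7, bytes 1 through 4 become 07 00 00 00, while the 0xb8 opcode and the final 0xc3 (ret) are left alone.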
mov rcx,myFirstData ; cur=head
keep_printing:
mov edi,DWORD[rcx+8] ; print_int(cur->value)
extern print_int
push rcx
call print_int
pop rcx
mov rcx,QWORD[rcx] ; cur=cur->next
cmp rcx,0
jne keep_printing
ret
section .data
myFirstData:
dq mySecondData
dd 3
mySecondData:
dq myThirdData
dd 7
myThirdData:
dq 0 ; END of list
dd 0
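For comparison, here is roughly the same linked list in C++. It is a sketch under a few assumptions: `Node` and `collect_values` are names made up for illustration, the values are collected into a vector rather than passed to `print_int`, and the target is 64-bit so that the pointer occupies the first 8 bytes and `value` sits at offset 8, matching `DWORD[rcx+8]`.

```cpp
#include <cstddef>
#include <vector>

// Same field order as each node in the assembly: a next pointer (dq),
// then a 4-byte value (dd).
struct Node {
    const Node* next;
    int value;
};

// Follow the list the way the assembly loop does: use cur->value, then
// advance to cur->next, and stop once the pointer becomes null (0).
std::vector<int> collect_values(const Node* head) {
    std::vector<int> values;
    for (const Node* cur = head; cur != nullptr; cur = cur->next) {
        values.push_back(cur->value);
    }
    return values;
}

// The three nodes from the data section.
static const Node myThirdData  = {nullptr, 0};       // END of list
static const Node mySecondData = {&myThirdData, 7};
static const Node myFirstData  = {&mySecondData, 3};
```

`collect_values(&myFirstData)` visits all three nodes, including the last one whose value is 0, just as the assembly loop prints a value before it checks the next pointer. A C++ compiler will typically pad each `Node` to 16 bytes, while the assembly packs them into 12; that makes no difference here because the traversal only follows pointers.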
movzx reg,BYTE[address]

Accessing data as bytes is useful for string processing, or to understand what really shows up in memory.
movzx eax,BYTE[myString + 2] ; read this byte into eax
ret
section .data
myString:
db 'w','o','a'
These are all equivalent ways to get the same 3-byte string:

db 0x77
db 0x6f
db 0x61

db 'w'
db 'o'
db 'a'

db 'w','o','a'

db 'woa'

db "woa"
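The same equivalence holds on the C++ side. As a small sketch (assuming an ASCII-compatible character set, and with `woa_bytes_match` being a hypothetical helper name), the characters of the literal "woa" can be compared against the hex bytes directly:

```cpp
#include <cstring>

// True if the string literal "woa" holds the bytes 0x77, 0x6f, 0x61,
// i.e. the character form and the hex form describe the same memory.
bool woa_bytes_match() {
    const unsigned char raw[3] = {0x77, 0x6f, 0x61};
    return std::memcmp("woa", raw, 3) == 0;
}
```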
There are several standard functions that take a "C string": a pointer to a bunch of ASCII bytes, followed by a zero byte.
"puts" is one such standard function, and it prints the string you pass it plus
a newline. We can call puts to print out our string like this:
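mov rdi,myString ; points to string constant below
extern puts
sub rsp,8 ; keep the stack 16-byte aligned at the call
call puts
add rsp,8 ; restore the stack before returning
ret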
section .data
myString:
db 'woa',0 ; need the trailing zero to mark the end of the string...
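In C++ the compiler appends the zero byte to a string literal for you, which is easy to confirm by comparing the stored size with the string length. This is a sketch; `stored_bytes` and `visible_chars` are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstring>

// A C++ string literal gets its terminating zero byte automatically,
// whereas the assembly db line has to spell it out.
static const char myString[] = "woa";   // same bytes as: db 'woa',0

// Number of bytes stored, including the trailing zero.
std::size_t stored_bytes() { return sizeof myString; }

// Number of characters before the zero, which is where puts stops.
std::size_t visible_chars() { return std::strlen(myString); }
```

The one-byte difference between the two results is the terminator that `puts` scans for.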
Here's an example where we load a byte from the middle of an
integer. Note that this returns 0xa2: byte 0 is 0xa0 (the lowest-order,
"little" byte) on our little-endian x86 machines, so byte 2 is 0xa2.
movzx eax,BYTE[myInt + 2] ; read this byte into eax
ret
section .data
myInt:
dd 0xa3a2a1a0 ; "data DWORD" containing this value
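The same experiment can be run from C++ by copying the integer's raw storage into a byte array. This sketch assumes a little-endian host such as x86; `byte_in_memory` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Return byte number `index` (0..3) of a 32-bit value as it is laid out in
// memory, like "movzx eax,BYTE[myInt + index]".
unsigned byte_in_memory(std::uint32_t value, std::size_t index) {
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);  // copy out the raw storage
    return bytes[index];                       // widened to unsigned, like movzx
}
```

On a little-endian machine, `byte_in_memory(0xa3a2a1a0, 0)` is the low byte 0xa0 and index 2 gives 0xa2; a big-endian machine would store the bytes in the opposite order.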