Bits and Hexadecimal

CS 301 Lecture, Dr. Lawlor

"Never trust a man that can count to 1023 on his fingers."
    - Filippo Gioachin's signature

So here's normal, base-1 counting on your fingers.  In base 1, you just raise the number of fingers equal to the value you're trying to represent:
  • To represent two, raise two fingers.
  • To represent six, raise six fingers.
  • To represent 67, grow more fingers.


Base-1 computation on a human hand: all fingers count as 1 unit
This is funky base-2 counting on your fingers. Each finger represents a different value now, so you have to start counting with '1' at your pinky, then '2' with just your ring finger, and '3=2+1' is pinky and ring finger together. '4' is a single raised middle finger. Then '5=4+1' is middle finger and pinky, and so on. Just 10 digits actually allows you to count all the way to 1023, but we'll ignore the thumbs and just use 8 fingers, to count up to 255=128+64+32+16 (left hand palm-up, pinky is 16) +8+4+2+1 (right hand palm-down, pinky is 1).
  • To represent one, raise the 1 finger.
  • To represent three, raise the 2 and 1 fingers together.
  • To represent ten, raise the 8 and 2 fingers together.
  • To represent twenty, raise the 16 (left pinky) and 4 fingers.
  • To represent 67, raise the 64 (left middle finger), 2, and 1 fingers.
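
The finger rules above amount to checking which bits of the number are set. Here is a minimal sketch in C, assuming the eight-finger scheme described above (values 128 down to 1); the name fingers_to_raise is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: lists the finger values (128, 64, ..., 1) to raise
   for a number n in the range 0..255.  Values are written to out[] from
   largest to smallest; the return value is how many fingers are raised. */
size_t fingers_to_raise(unsigned n, unsigned out[8])
{
    size_t count = 0;
    for (unsigned value = 128; value >= 1; value /= 2) {
        if (n & value) {            /* is this finger's bit set in n? */
            out[count++] = value;
        }
    }
    return count;
}
```

For 67, this fills out[] with 64, 2, and 1, matching the last bullet.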

This is actually somewhat useful for counting--try it!

(Note: the numbers four, sixty-four, and especially sixty-eight should not be prominently displayed.  Digital binary counting is not recommended in a gang-infested area.)
Base-2 computation on a human hand: finger values are 8, 4, 2, and 1
Counting on your fingers is "digital" computation--it uses your digits!

Why are bits important?

So most of the usual work you do in C/C++/Java/C# manipulates integers or strings.  For example, you'll write a simple line like:
    x = y + 4;
which adds 4 to the value of y.

But sometimes you have to understand how this works internally.  For example, on a 32-bit machine, this code returns... 0.
    long x=1024;
    long y=x*x*x*4;
    return y;

(Try this in NetRun now!)

Why?  The real answer is 4 billion (and change), which requires 33 bits: a 1 followed by 32 zero bits.  But on a 32-bit machine, all you get is the zeros; the higher bits "overflow" and (at least in C/C++) are lost!  Understanding the bits underneath your familiar integers can help you understand errors like this one.  (Plus, by writing assembly code, you can actually recover the high-order bits after a multiplication if you need them.)
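
To see the lost bit without dropping to assembly, one option is to do the same multiplication at two widths. This is only a sketch: it swaps the "long" from the example for the fixed-width unsigned types in <stdint.h>, because unsigned arithmetic is defined to wrap around in C while signed overflow is not, and because "long" is 64 bits on many current machines. The function names are made up for this example.

```c
#include <stdint.h>

/* 32-bit unsigned arithmetic (assuming a 32-bit int, as on typical
   machines): the result is reduced modulo 2^32, so any bits above
   bit 31 are discarded. */
uint32_t cube_times_4_narrow(uint32_t x)
{
    return x * x * x * 4u;
}

/* Widen to 64 bits before multiplying, so the high bits survive. */
uint64_t cube_times_4_wide(uint32_t x)
{
    uint64_t w = x;
    return w * w * w * 4u;
}
```

With x = 1024, the narrow version returns 0 and the wide version returns 4294967296, the 1 followed by 32 zero bits.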

What's the deal with all this hex?

Humans have used the "place/value" number system for a long time--the Sumerians used base-60 in 4000BC!  (Base-60 still shows up in our measurement of time and angles: hours have 60 minutes, which have 60 seconds; degrees also have minutes and seconds; and the circle is divided into six sections of 60 degrees each.)  The Maya used base 20.  The world standard, though, is now base 10 using Arabic numerals.  For example, decimal 86 = 8 * 10 + 6.

But every computer built today uses binary--1's and 0's--to do its work.  The reason is electrical--0 is no voltage, 1 is voltage.  Having just two states makes it easy to build cheap and reliable circuits; for example, a transistor will threshold the input value and either conduct or not conduct.  A single zero or one is called a "bit". 

OK, so we've got 1's and 0's: how do we build bigger numbers?  The modern standard method is "binary", which is just the place-value system using base 2.  In binary, 1 means 1; 10 (base 2) means 2 (base 10); 100 (base 2) means 4 (base 10); 1000 (base 2) means 8 (base 10); 10000 (base 2) means 16 (base 10); and so on.  Every machine produced today supports direct binary arithmetic.  Sadly, for a human, writing or reading binary is painful and error-prone for large numbers.  For example, one million is 11110100001001000000 (base 2).  So instead, we often use a larger base.
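
The place-value rule can be written out as a short loop: each new digit pushes the earlier digits up one place. The sketch below is illustrative only; from_binary is a made-up name, and it assumes the input holds nothing but '0' and '1' characters, at most 32 of them.

```c
#include <stdint.h>

/* Hypothetical helper: evaluates a string of '0'/'1' characters as a
   base-2 place-value number.  Assumes at most 32 digits and no other
   characters in the string. */
uint32_t from_binary(const char *bits)
{
    uint32_t value = 0;
    for (; *bits != '\0'; bits++) {
        /* Move every earlier digit up one place (multiply by the base),
           then add the new lowest digit. */
        value = value * 2u + (uint32_t)(*bits - '0');
    }
    return value;
}
```

Calling from_binary("1000") gives 8, and from_binary("11110100001001000000") gives one million.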

Back in the 1970's, it was pretty common to use octal (base 8), but the modern standard is hexadecimal--base 16.  Base 16's gotta use 16 different digits, and there are only 10 Arabic numerals, so we use the letters A through F for the remaining digits.  For example, 15 (base 10) is just F (base 16), and one million in hex is F4240 (base 16).  You've got to be careful about the base, though--the string "11" would be interpreted as having the value 1*2+1=3 if it was base 2, the usual 1*10+1=11 if it was base 10, or 1*16+1=17 in base 16!
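
The C standard library's strtol takes the base as its third argument, so it makes a handy way to check how the same digit string reads in different bases. A minimal sketch; the wrapper name parse_in_base is made up for this example, and it does no error checking.

```c
#include <stdlib.h>

/* Thin wrapper around the standard strtol: interprets `digits` as a
   number written in the given base (2 through 36). */
long parse_in_base(const char *digits, int base)
{
    return strtol(digits, NULL, base);
}
```

Here parse_in_base("11", 2) is 3, parse_in_base("11", 10) is 11, parse_in_base("11", 16) is 17, and parse_in_base("F4240", 16) is one million.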

Place/bit number i      | ... | 4            | 3           | 2          | 1  | 0
Decimal: Base-10 (10^i) | ... | 10000        | 1000        | 100        | 10 | 1
Binary: Base-2 (2^i)    | ... | 16 = 2^4     | 8 = 2^3     | 4 = 2^2    | 2  | 1
Octal: Base-8 (8^i)     | ... | 4096 = 8^4   | 512 = 8^3   | 64 = 8^2   | 8  | 1
Hex: Base-16 (16^i)     | ... | 65536 = 16^4 | 4096 = 16^3 | 256 = 16^2 | 16 | 1
Base-n (n^i)            | ... | n^4          | n^3         | n^2        | n  | 1 = n^0

Number bases used throughout time:
Decimal | Hex | Binary
0       | 0   | 0
1       | 1   | 1
2       | 2   | 10
3       | 3   | 11
4       | 4   | 100
5       | 5   | 101
6       | 6   | 110
7       | 7   | 111
8       | 8   | 1000
9       | 9   | 1001
10      | A   | 1010
11      | B   | 1011
12      | C   | 1100
13      | D   | 1101
14      | E   | 1110
15      | F   | 1111
16      | 10  | 10000
Hexadecimal-to-decimal and binary table.

Note that a single digit in base 16 corresponds to exactly 4 bits, since 16=2*2*2*2.  This means it's easy to convert from a binary number to a hex number: take groups of 4 bits and convert each group to a hex digit--or back again: take each hex digit and expand it out to the 4 bits it represents.  For example, 0xF036 is, in binary, 1111000000110110, because you can match up the place-values like this:
Hex Place-Value    | 16^3                | 16^2                | 16                  | 1 = 16^0
Hex Digit          | F                   | 0                   | 3                   | 6
Binary Digit       | 1    1    1    1    | 0    0    0    0    | 0    0    1    1    | 0    1    1    0
Binary Place-Value | 2^15 2^14 2^13 2^12 | 2^11 2^10 2^9  2^8  | 2^7  2^6  2^5  2^4  | 2^3  2^2  2^1  2^0
Converting 0xF036 (hex) to 1111000000110110 (binary)
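
The digit-by-digit expansion in the table can be sketched in C as follows. This is an illustration rather than a library routine: hex_to_binary is a made-up name, and the caller is assumed to supply an output buffer of at least 4*strlen(hex)+1 bytes.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: expands each hex digit of `hex` into the 4 bits it
   represents, writing a string of '0'/'1' characters to `out`.
   Returns 0 on success, or -1 if a character is not a hex digit. */
int hex_to_binary(const char *hex, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    for (; *hex != '\0'; hex++) {
        /* The digit's position in the lookup string is its value, 0..15. */
        const char *p = strchr(digits, toupper((unsigned char)*hex));
        if (p == NULL) {
            return -1;
        }
        unsigned value = (unsigned)(p - digits);
        /* Emit the 4 bits of this digit, highest place-value first. */
        for (int bit = 3; bit >= 0; bit--) {
            *out++ = ((value >> bit) & 1u) ? '1' : '0';
        }
    }
    *out = '\0';
    return 0;
}
```

Given "F036", this writes "1111000000110110": each group of 4 bits depends only on its own hex digit, which is why the conversion needs no arithmetic on the number as a whole.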

Hex really is the only true universal in assembly and disassembly.  For example, here's some random disassembled code (produced using "objdump -drC -M intel /bin/ls" on a Linux machine):
 804a516:       80 fa 3f                cmp    dl,0x3f
 804a519:       0f 84 7c 01 00 00       je     804a69b <exit@plt+0xc2b>
Note that every single number is listed in hex--the addresses, on the left; the machine code, in the middle; and the constants in the assembly, on the right.  A binary file display tool is called a "hex dump".  A binary file editor is called a "hex editor".  That's how common hex is, so for the rest of the class to make sense, you've gotta learn it!
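
The core of a hex dump is just printing each byte as two hex digits. Here is a minimal sketch of that step, not how any particular tool works: hex_bytes is a made-up name, and the caller is assumed to supply an output buffer of at least 3*n bytes (at least 1 byte when n is 0).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats n bytes as space-separated two-digit
   lowercase hex, the same style as the machine-code column in objdump. */
void hex_bytes(const unsigned char *data, size_t n, char *out)
{
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* sprintf returns the number of characters written, so `out`
           always points at the end of the text so far. */
        out += sprintf(out, i ? " %02x" : "%02x", (unsigned)data[i]);
    }
}
```

Fed the three bytes of the cmp instruction above, it produces the text "80 fa 3f".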