Course Review for Final Exam

CS 301 Lecture, Dr. Lawlor

The final exam is Friday, December 17, 3:15pm, in the usual classroom (Chapman 106). It's nominally two hours long, but in reality it shouldn't take much longer than the midterm, about an hour.

Things you should know for the final exam are all contained in the lecture notes and homeworks:

Basics of assembly language: pre-midterm review stuff

bits and bytes

size of char, short, int, long, and pointers

Bitwise operations: & | ^ ~ << >>

what they're useful for: SIMD
how to use them to simulate branch operations: ret=(mask&then) | (~mask&else);

How classes get laid out in memory:

size in bytes of a class
byte offsets from a pointer to the start of the class
padding for alignment

preserved and scratch registers for x86-64
how the stack works, especially around a function call (what happens to the stack at a "ret", for example)
how flags work:

how "cmp rcx,4" and "jle somewhere" actually work together
how to detect overflow
how to do multi-precision operations

existence of and purpose of machine code

Floating-point numbers:

SSE instructions, like "addss" and how to use them to solve float problems.
Parallel SSE instructions, like "addps", and how to use them to solve bigger problems. But be careful of alignment!
Bits used to represent the sign (1 bit), exponent (8 bits), and mantissa (23 bits, plus implicit 1)
Floating-point roundoff:

small + big - big == 0 ?!
big + small == big ?!

Fixes for roundoff:

use a bigger datatype (replace float with double)
switch to integer (but watch out for overflow!)
rearrange floating-point operations to do all small numbers first, then the bigger ones

Performance impact of denormal and NaN floats

High performance computing:

How to time your code, and the vagaries of timers (such as noise and quantization)
Compiler hints, like "inline" and "const", or ways to scare the compiler, like "extern" or "volatile"
Algebraic rearrangement, like replacing x=x*x*x*x*x*x*x*x; with x=x*x; x=x*x; x=x*x;
Strength reduction, like replacing integer x%2ⁿ with x&(2ⁿ-1); or floating point divide "a/b" with multiply "a*(1.0/b)" .
Dynamic translation: sticking together new machine code bytes at runtime.

Other roads to assembly language

Java Bytecode:

Stack-based operands (no registers)
Dynamic translation to assembly

PowerPC assembly language:

Registers named "r3" and such.
Three-operand additions.
Two-instruction 32-bit loads.

GPU computing, like GLSL

Inherently parallel hardware and software
Extraordinarily high performance

Quantum computers

Up to exponential speedups via the laws of physics: superposition, entanglement, and collapse
No useful hardware currently exists: may be coming soon, or may not be physically possible to build!