CS 441/641 Project
Architecture Research Project (Project 2)
CS 441/641, Dr. Lawlor
A substantial chunk of your course grade comes from the two semester
projects. Project 1 was research-oriented. Project 2
will be more applied and implementation-oriented. From the
syllabus:
PROJ2:
a software development or hardware performance analysis project, due
in December.
Here's what's left of the semester:
November 2012
Su Mo Tu We Th Fr Sa
11 12 13 14 15 16 17 <- topic
18 19 20 21 22 23 24 <- working prototype (and Thanksgiving break)
25 26 27 28 29 30
December 2012
Su Mo Tu We Th Fr Sa
2 3 4 5 6 7 8 <- presentations
9 10 11 12 13 14 15 <- final draft (and final exam!)
The deliverables are:
- Topic: In class on November 13, be ready to very briefly (a few sentences) describe your chosen project topic.
- Prototype: Turn in working code by midnight Tuesday, November 20. It doesn't need a performance analysis or writeup by this point.
- Presentations: Prepare about fifteen minutes of clear, easy-to-understand material introducing us to the topic, showing what you did, and discussing performance. 441 students will present Tuesday, December 4; 641 students will present Thursday, December 6. No lecture notes or slides are required, but clear evidence of preparation is!
- Final: Turn in your tuned code and cleanly summarized performance analysis by midnight Thursday, December 13. 641 students should turn in a *brief* (about 4 pages) technical paper summarizing selected prior work, their design, and the results of a performance analysis.
Possible Project 2 Topics
Or choose your own topic! Topics should all be applied
work of the form "Build your own X", not just paper research.
- Build an interesting circuit: extend your HW1 CPU, build a superscalar dependency detection unit, etc.
- Hardware performance analysis: benchmark some test programs that demonstrate some aspect of modern hardware, such as:
- Out-of-order execution (e.g., reorder instructions manually, compare to automatic reordering)
- Branch prediction and speculative execution (e.g., reverse-engineer the x86 branch hardware by comparing always-taken branch performance with even-odd branch performance)
- Dependency tracking (e.g., benchmark performance benefit from decreasing dependency tree depth)
- Cache prefetching and out-of-order loads and stores (e.g., compare cached loads with cached loads matching a previous store)
- Define a new instruction set, with a software or circuit simulator.
- Write and benchmark some code to perform any interesting task quickly on a particular architecture:
- Use bitwise operations to do something simple faster, or do something simple in a fiendishly complex way.
- Use assembly language or your knowledge of branch prediction, caching, etc. to improve the performance of some program.
- Write a dynamic binary translator for any architecture.
- Use SSE or AVX instructions to speed up some code with the power of SIMD.
- Use OpenMP or pthreads to speed up some code with the power of multicore. (But you must get the right answer!)
- Use MPI or sockets to speed up code with the power of clustering.
- Use CUDA or OpenCL to speed up some code with the power of the GPU.
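As an illustration of the hardware performance analysis topics, here is a minimal sketch of the always-taken versus even-odd branch comparison mentioned above. Every function name in it is hypothetical and not from the course materials, and it assumes the compiler leaves the `if` as a real conditional branch, which needs checking in the generated assembly.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names here are hypothetical; none come from the course materials.

// Build a taken/not-taken pattern of length n (1 = branch taken).
std::vector<std::uint8_t> make_always_taken(std::size_t n) {
    return std::vector<std::uint8_t>(n, 1);
}

// Even indices are taken, odd indices are not.
std::vector<std::uint8_t> make_even_odd(std::size_t n) {
    std::vector<std::uint8_t> p(n);
    for (std::size_t i = 0; i < n; ++i) p[i] = (i % 2 == 0) ? 1 : 0;
    return p;
}

// The branch under test: walk the pattern `passes` times and count how
// often the branch was taken. Assumption: the `if` stays a conditional
// branch. Optimizing compilers may emit branch-free or vectorized code
// instead, so inspect the assembly before trusting any timing.
std::uint64_t count_taken(const std::vector<std::uint8_t>& pattern, int passes) {
    assert(passes > 0);
    std::uint64_t taken = 0;
    for (int pass = 0; pass < passes; ++pass) {
        for (std::size_t i = 0; i < pattern.size(); ++i) {
            if (pattern[i]) ++taken;
        }
    }
    return taken;
}

// Average wall-clock nanoseconds per loop iteration for one pattern.
double ns_per_iteration(const std::vector<std::uint8_t>& pattern, int passes) {
    assert(!pattern.empty());
    auto start = std::chrono::steady_clock::now();
    // The volatile store keeps the compiler from discarding the loop.
    volatile std::uint64_t sink = count_taken(pattern, passes);
    auto stop = std::chrono::steady_clock::now();
    (void)sink;
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / (static_cast<double>(pattern.size()) * passes);
}
```

From `main`, comparing `ns_per_iteration(make_always_taken(1 << 20), 100)` with `ns_per_iteration(make_even_odd(1 << 20), 100)` gives one data point. Whether the two numbers differ on a given CPU is exactly what the benchmark is meant to find out, so repeat the runs several times before drawing conclusions.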
Your starting code can be something completely new, something you found on the net (with a citation), an extension of any homework, an example from the lecture notes, etc.
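Starting code can be quite small. For the dependency-tracking topic listed above, for example, a sketch like the following might be enough to begin with: two hypothetical functions (the names are illustrative only) that add up the same array with a dependency chain of depth n and with four independent chains of depth n/4. Floating-point addition is used on the assumption that the compiler will not reassociate it without a flag such as `-ffast-math`, so the chains survive optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names; a sketch, not code from the course.

// One long dependency chain: each add must wait for the previous one.
double sum_chain1(const std::vector<double>& v) {
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) s += v[i];
    return s;
}

// Four independent chains: adds into s0..s3 do not depend on each
// other, so an out-of-order CPU is free to overlap them.
double sum_chain4(const std::vector<double>& v) {
    assert(v.size() % 4 == 0);  // this sketch skips tail handling
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    for (std::size_t i = 0; i < v.size(); i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Timing both functions on a large array whose length is a multiple of 4 would be the first measurement. Note that the two versions add in a different order, so their results can differ in the last bits for general floating-point data.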