Modern Multicore: Intel Sandy Bridge
CS 641 Lecture, Dr. Lawlor
This January a smart guy by the name of Anand Lal Shimpi wrote a good review of Intel's latest multicore chips, the "Sandy Bridge" series. This series has a lot of the features I expect to see in future CPUs:
- SMP + SMT: every part is multicore. Each core has two-way hyperthreading. These multiply out to 4-8 threads running at once.
- Wider SIMD. Sandy Bridge supports Advanced Vector Extensions (AVX), an update to SSE that widens the registers from 128 to 256 bits, so one instruction operates on 8 floats instead of 4.
- Every CPU has a GPU directly attached. Sandy Bridge now has an onboard 100-million-transistor GPU (details here).
AMD and ATI have been talking about this for years now, with the goal
of reducing CPU/GPU latency (the 4us kernel startup time in CUDA, for
example). Intel actually beat them to market, although the Intel
GPU is still substantially slower than even a $70 card from ATI or NVIDIA.
- Speed and thermal "binning". Some chips will tolerate faster clock rates;
these get sold as "K" or "S" series for more money. Some chips
will tolerate lower voltages, which reduces the power consumed; these
get sold as "T" series for more money.
- Utterly elastic clock rate. Once upon a time, a chip had a
fixed clock frequency, like 2.0GHz. Then to run longer on
batteries, manufacturers added features like SpeedStep to drop the clock rate and cut power usage.
Then to get better performance, they added features like Turbo Boost to
automatically overclock the CPU (if you've got thermal
headroom). Now they're actually selling CPUs with more
Turbo Boost power for more money. A modern CPU's internal clock
rate starts with a base clock "BCLK", traditionally generated by the motherboard at around 100-133MHz, which is then multiplied in a circuit called a phase-locked loop (PLL) to hit the chip's gigahertz internal rate.
- Power consumption is lower, even at higher clock rates. The
twin goals of long battery life and a minimal thermal envelope are
now reinforced by rising electric rates and a growing
appreciation of energy generation's long-term costs.
- The ancient 1980s text-based PC BIOS is being replaced with the Unified Extensible Firmware Interface (UEFI), which among other advantages provides a GUI for machine configuration.
- Hardware media decoding. Decoding full-framerate HD video is one of the most compute-intensive operations a modern computer is ever asked to perform. Dedicated
circuits provide parallelism and direct unit-to-unit communication,
which make them faster than software, even GPU software. Anand's tests show Sandy Bridge's dedicated video encoding circuits as faster than even a CUDA encoder on the GeForce GTX 460.
- Cost is as important as performance. This is the recession's impact on hardware!
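
Several of the numbers in the list above are just small multiplications, so here is a minimal Python sketch of the arithmetic: threads at once (cores times hyperthreads), SIMD width (register bits over element bits), and internal clock rate (BCLK times the PLL multiplier). The function names are made up for this example, and the 34x multiplier is an illustrative assumption, not a figure from the review.

```python
# Back-of-the-envelope arithmetic for the feature list above.
# All function names are hypothetical; the multiplier value is an
# illustrative assumption, not a number taken from the review.

def hardware_threads(cores, smt_ways=2):
    """Threads running at once = physical cores x SMT ways per core."""
    return cores * smt_ways

def simd_lanes(register_bits, element_bits=32):
    """How many elements (32-bit floats by default) fit in one SIMD register."""
    return register_bits // element_bits

def core_clock_mhz(bclk_mhz, multiplier):
    """Internal clock rate = base clock (BCLK) x PLL multiplier."""
    return bclk_mhz * multiplier

if __name__ == "__main__":
    # Dual-core and quad-core parts with two-way hyperthreading.
    print("threads:", hardware_threads(2), "to", hardware_threads(4))
    # SSE registers are 128 bits wide; AVX registers are 256 bits wide.
    print("float lanes: SSE", simd_lanes(128), "AVX", simd_lanes(256))
    # A 100MHz BCLK with an assumed 34x multiplier.
    print("core clock (MHz):", core_clock_mhz(100, 34))
```

Raising the multiplier (Turbo Boost) or lowering it (SpeedStep) changes only the last product, which is why the clock rate can be so elastic while the motherboard's base clock stays put.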