Stereo Imaging
CS 481 Lecture, Dr. Lawlor
The human visual system perceives 3D shape from a variety of sources:
- Size, occlusion, lighting, and texture cues work together
to help your brain estimate objects' size and distance. For
example, in this image both black doors are exactly the same
size in the image, but we perceive the door on the horizon as
being much further away, and hence bigger. The same illusion
makes the moon look far bigger near the horizon (where it's
clearly behind the mountains, and hence must be huge) than
near-overhead (with no distance scale to compare against, your
brain isn't forced to confront the moon's enormous size).
- Motion parallax is probably the next strongest visual depth
cue: when you move your head sideways by a distance h, a 3D
object at distance d appears to shift by about h/d radians, so
the effect falls off quickly with distance (see the parallax
sketch after this list). Faking motion parallax is what 3D
graphics is all about, but you've got to make sure the camera
(or the objects) can move sideways relative to each other.
Some systems have a "wiggle" mode that slides the camera back
and forth to enhance motion parallax, and head-tracking systems,
which watch the user's head location, can do a good job of
simulating motion parallax.
- Atmospheric perspective is another strong distance cue at
long ranges. It is caused by sky light scattering into your
line of sight: the more air your view cuts through, the more
its color converges on sky color. For example, the more
distant hills and mountains in the photo below begin to
approach sky color. A simple mathematical model is that with
every mile of distance we blend in a little more sky light
(C(d+1)=0.9*C(d)+0.1*sky), which iterates out to the usual
exponential model C(d)=pow(0.9,d)*C(0)+(1-pow(0.9,d))*sky
(see the fog sketch after this list). The computer graphics
equivalent of this effect is usually called "fog", although in
graphics fog is often absurdly thick. Real atmospheric
scattering takes tens of kilometers to significantly affect
colors, while in games the end of a hallway is often visibly
foggier. Maybe game programmers work in really smoky
buildings, or live in foggy places like Seattle or London!
- Stereo disparity is the difference between the images seen
by your two eyes. Here's an experiment: try covering up one
eye, holding your head perfectly still, and staring at the
world. It looks flat! Stereo disparity is pretty tricky to
add to rendering systems, so of course people are fascinated
by it!
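To make the h/d rule concrete, here's a minimal C++ sketch (the
function name and the 10cm head motion are made up for
illustration) showing how quickly parallax falls off with
distance; atan2 gives the exact angle, which approaches h/d for
distant objects:

    #include <cmath>
    #include <cstdio>

    /* Angle (in radians) by which an object at distance d appears
       to shift when your head moves sideways by h.  For h much
       smaller than d this is approximately h/d. */
    double parallaxAngle(double h, double d) {
        return std::atan2(h, d);
    }

    int main(void) {
        double h = 0.1; /* head moves 10cm sideways */
        for (double d = 1.0; d <= 1000.0; d *= 10.0)
            std::printf("distance %6.0fm -> parallax %.5f radians\n",
                        d, parallaxAngle(h, d));
        return 0;
    }

At 1m the apparent shift is about 0.1 radians; at 1km it's down
to 0.0001 radians, below the roughly 0.0003 radian (one
arcminute) resolution of the human eye.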
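The per-mile blend in the atmospheric perspective model above is
just a recurrence, and iterating it gives the closed-form
exponential used for fog. Here's a minimal C++ sketch (the Color
struct and function name are made up; the 0.9 retention factor is
from the formula above):

    #include <cmath>

    struct Color { float r, g, b; };

    /* Each mile of air keeps 90% of the object's color and blends
       in 10% sky color: C(d+1) = 0.9*C(d) + 0.1*sky.  Applying
       that d times gives the closed form
       C(d) = pow(0.9,d)*C(0) + (1-pow(0.9,d))*sky. */
    Color atmosphericFade(Color object, Color sky, float miles) {
        float keep = std::pow(0.9f, miles); /* surviving object color */
        Color out;
        out.r = keep*object.r + (1.0f-keep)*sky.r;
        out.g = keep*object.g + (1.0f-keep)*sky.g;
        out.b = keep*object.b + (1.0f-keep)*sky.b;
        return out;
    }

Hardware fog (for example, OpenGL's GL_EXP fog mode) computes the
same sort of per-fragment exponential blend toward a fog color.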
Stereo Disparity
Your two eyes see the world slightly differently, and your brain
estimates depth from these differences. To simulate this effect,
you first need to render the world from the point of view of each
eye; that part is easy (see the code sketch after the list
below). You then need to feed each image into the matching eye
independently. That's the hard part, but there are lots of
interesting ways to do it:
- You can show the images side-by-side, and let the user cross
their eyes, or go wall-eyed. Special hardware called a stereoscope
can help with this, and dates back to the Victorian era. I
shoot a lot of my digital photos in stereo, and then just view
them side by side.
- You can build two independent screens, and put one in front of each eye. This looks cool, but so far it's flopped in the marketplace (see Nintendo's 1995 "Virtual Boy").
- You can build one high-resolution screen, and put a sheet of tiny
lenses in front of the screen, so each eye sees a different part of the
screen through the lenses. This is called a lenticular display,
and has the advantage that you don't need to wear anything funky on
your head (it's "autostereoscopic"). The downside is that even
commercial systems currently have fairly pathetic resolution and high prices.
- You can use holography to capture the light from real-life or
digital models. Full holography requires maybe 50,000 dpi, but you
can do a decent job with as few as 5,000 dpi. MIT has built several
realtime computational holographic output devices based on graphics cards.
- You can use a single screen, and show each eye's image in rapid
succession. Lightweight LCD shutterglasses then black out the eye
whose image isn't currently on screen, and the user's brain fuses
the alternating images together. Because the left-right flipping
cuts the per-eye refresh rate in half, it helps to start with a
pretty high-refresh-rate display (90+Hz works best). The upside
is that this is really quite cheap (my eDimensional glasses cost
$40), and it works, with varying degrees of success, on almost
any display. I've gotten poor results on LCD displays and
projectors (they just can't switch between the left and right
image fast enough, resulting in ghosting), passable results on
CRT monitors (there's still a bit of ghosting, because the
phosphors stay lit), and truly incredible results with some DLP
projectors. DLP projectors are designed to flash their images
onto a passive screen, so there's zero ghosting or blurring. The
ARSC Discovery Lab uses shutterglasses too.
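As a concrete version of the "render from each eye" step above,
here's a minimal sketch assuming GLUT and classic fixed-function
OpenGL; the eyeSeparation value and the teapot stand-in scene are
made up for illustration. It draws the two eye views side by
side, suitable for the stereoscope or free-viewing approach above
(as written, the left eye's view is on the left half, which suits
wall-eyed viewing; swap the halves for cross-eyed viewing):

    #include <GL/glut.h>

    const float eyeSeparation = 0.06f; /* eye spacing, world units */

    void drawScene(void) {
        glutSolidTeapot(1.0); /* stand-in for your real scene */
    }

    /* Render the scene as seen from one eye, into one half of
       the window. */
    void drawEye(float eyeX, int vpX, int vpW, int winH) {
        glViewport(vpX, 0, vpW, winH);
        glMatrixMode(GL_PROJECTION);
        glLoadIdentity();
        gluPerspective(60.0, vpW/(double)winH, 0.1, 100.0);
        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        /* Slide the camera sideways by half the eye separation.
           Shifting the look-at point by the same amount keeps the
           two view directions parallel. */
        gluLookAt(eyeX,0.0,5.0,  eyeX,0.0,0.0,  0.0,1.0,0.0);
        drawScene();
    }

    void display(void) {
        int w = glutGet(GLUT_WINDOW_WIDTH);
        int h = glutGet(GLUT_WINDOW_HEIGHT);
        glEnable(GL_DEPTH_TEST);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        drawEye(-0.5f*eyeSeparation, 0,   w/2, h); /* left eye  */
        drawEye(+0.5f*eyeSeparation, w/2, w/2, h); /* right eye */
        glutSwapBuffers();
    }

    int main(int argc, char **argv) {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
        glutInitWindowSize(800, 400);
        glutCreateWindow("side-by-side stereo");
        glutDisplayFunc(display);
        glutMainLoop();
        return 0;
    }

For shutterglasses on a quad-buffered OpenGL context, you'd
instead render the same two views into GL_BACK_LEFT and
GL_BACK_RIGHT (selected with glDrawBuffer) and let the driver
alternate them.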