Stereo Vision

A Brief Introduction
by David Sharp
Any robot or creature that hopes to handle itself well in a real-world
environment needs to take in a great deal of information about its
surroundings. Hobby robotics has long used infrared emitter/detector
pairs as proximity detectors, rangefinders, and line trackers, and
sonar to acquire crude range maps. These and other sensors are
attractive because they are simple and inexpensive, but they quickly
reach their practical limits. It is hard to deny that some form of
vision is essential for a robot to be truly successful in the real
world. For example, a robot needs to tell how far away and how large
objects are, and whether the terrain is negotiable.
We humans use many visual methods, or depth cues, to judge the distance
to the things in our visual field. Here are a few examples:
- Everyone knows how big a basketball is, so if we see a basketball
some distance away, we can use its apparent size to estimate how
far away it probably is.
- If you ride on a train and look out the window, the trees at the
side of the track will pass by much more quickly than the mountains in
the distance, so you can use relative movement to estimate
distances (motion parallax).
- An object that is very close to the eye requires the muscles in the
eye to distort the shape of the lens differently than an object that is
far away, so the effort needed to focus the image gives a hint of
distance (accommodation).
- If you see a building that covers most of another building in
your field of view, you can assume that the first building is in front
of the second building and thus nearer to you (occlusion).
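The basketball cue above can be turned into a little arithmetic. The
following is only a minimal sketch, assuming an ideal pinhole camera
whose focal length is known in pixels; the function name and the
example numbers are made up for illustration and are not part of the
original article.

```python
def distance_from_known_size(real_size_m, apparent_size_px, focal_length_px):
    """Estimate the distance (in metres) to an object of known size.

    Assumes an ideal pinhole camera: an object of real size S at
    distance Z appears s = f * S / Z pixels wide, so Z = f * S / s.
    """
    if apparent_size_px <= 0:
        raise ValueError("apparent size must be positive")
    return focal_length_px * real_size_m / apparent_size_px


# Hypothetical numbers: a basketball is roughly 0.24 m across. If it
# appears 40 pixels wide to a camera with an 800-pixel focal length,
# the estimate comes out to about 4.8 m.
estimate = distance_from_known_size(0.24, 40.0, 800.0)
print(f"Estimated distance: {estimate:.2f} m")
```

The estimate is only as good as the assumed real size, which is why
this cue works for familiar objects and fails for unfamiliar ones.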
However, perhaps the most compelling and interesting depth cue is
binocular disparity, the slight difference between the views seen by
the two eyes, and that is what this page is primarily concerned with.
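To give a feel for why disparity is such a useful cue, here is a
minimal sketch of the standard relation between disparity and depth.
It assumes two identical pinhole cameras mounted side by side with
parallel optical axes, a known focal length in pixels, and a known
baseline (the distance between the cameras). The function name and the
numbers are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth (in metres) of a point seen by two parallel cameras.

    With focal length f (pixels) and baseline B (metres), a point whose
    image shifts by d pixels between the left and right views lies at
    depth Z = f * B / d. Zero disparity corresponds to a point at
    infinity, so it is rejected here.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 700-pixel focal length, cameras 0.1 m apart.
# A large disparity means a near point, a small one a far point.
for d in (35.0, 7.0):
    z = depth_from_disparity(d, 700.0, 0.1)
    print(f"disparity {d:4.1f} px -> depth {z:.1f} m")
```

Note that depth is inversely proportional to disparity, so distant
objects produce tiny disparities and their depth estimates are the
most sensitive to measurement error.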
Created: 2/22/04