PCL/OpenNI tutorial 0: The very basics
One of the most important fields of robotics is computer vision. A machine that is able to "see" its environment can have a lot of applications. For example, think of a robot in a production line that grabs some parts and moves them somewhere else. Or a surveillance system that is able to recognize how many people are in a room. Or a biped robot making its way through a room, evading obstacles such as tables, chairs or other people.
For many years, the most common sensors for computer vision were 2D cameras, which capture an RGB image of the scene (like the digital cameras that are so common nowadays in our laptops and smartphones). Algorithms exist that can find an object in a picture even if it is rotated or scaled, retrieve motion data from a video stream, and even perform 3D analysis to track the camera's position. All these years' worth of research are now available in libraries like OpenCV (Open Source Computer Vision Library).
3D sensors are available, too. The biggest advantage they offer is that it is trivial to get measurements about distances and motion, but the added dimension makes calculations more expensive. Working with the data they retrieve is very different from working with a 2D image, and texture information is rarely used.
During the next tutorials I will explain how to get a common depth sensor working with a 3D processing library.
Depth sensors
3D or depth sensors give you precise information about the distance to the "points" in the scene (a point would be the 3D equivalent of a pixel). There are several types of depth sensors, each working with a different technique. Some sensors are slow, some are fast. Some give accurate, high-res measurements, some are noisy and low-res. Some are expensive, some can be bought for a hundred bucks. There is no "perfect" sensor and which one you choose will depend on your budget and the project that you want to implement.
Stereo cameras
Stereo cameras are the only passive measurement device on this list. They are essentially two identical cameras mounted together a few centimeters apart, so that each one captures a slightly different view of the scene. By computing the differences between both images, it is possible to infer depth information about the points in each image.
A stereo pair is cheap, but perhaps the least accurate sensor. Ideally, it would require perfect calibration of both cameras, which is infeasible in practice. Bad lighting conditions will render it useless. Also, because of the way the algorithm works (detecting corresponding points of interest in both images), it gives poor results with empty scenes or objects that have plain textures, where few interest points can be found. Some models circumvent this by projecting a grid or texture of light onto the scene (active stereo vision).
I will not go into detail about the math involved with the triangulation process, as you can find it on the internet.
Time-of-flight
Time-of-flight (ToF) sensors work by measuring the time it takes a ray or pulse of light to travel a certain distance. Because the speed of light is a known constant, a simple formula can be used to obtain the range to the object. These sensors do not depend on ambient light and have the potential to be very precise.
LIDAR
A LIDAR (light + radar) is essentially a laser range finder mounted on a platform that rotates very fast, scanning the scene point by point. LIDARs are very precise sensors, but also expensive, and they do not retrieve texture information.
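To illustrate how such a scan becomes a set of 3D points, here is a small sketch that converts one measurement (range plus the two beam angles) into Cartesian coordinates. The axis convention (X forward, Y left, Z up) and all the names are assumptions made for this example. An actual device and its driver may use a different convention.

```cpp
#include <cmath>

// A single 3D point, in meters.
struct Point3D
{
    double x, y, z;
};

// Converts one range reading into a point in the sensor frame.
// rangeM:       measured distance along the beam, in meters.
// azimuthRad:   rotation of the platform around the vertical axis, in radians.
// elevationRad: tilt of the beam above the horizontal plane, in radians.
Point3D pointFromBeam(double rangeM, double azimuthRad, double elevationRad)
{
    // Length of the beam's projection onto the horizontal plane.
    const double horizontal = rangeM * std::cos(elevationRad);

    return { horizontal * std::cos(azimuthRad),
             horizontal * std::sin(azimuthRad),
             rangeM * std::sin(elevationRad) };
}
```

Calling this function once per reading while the platform rotates yields one point per measurement, which together make up the scanned scene.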
Links to articles:
PCL/OpenNI tutorial 0: The very basics
PCL/OpenNI tutorial 1: Installing and testing