Researchers at the Johns Hopkins University (Baltimore, MD) have developed a simple video-based object-recognition method that provides a unique description of items difficult to recognize with other techniques. The group’s method requires little computation and produces relatively small object signatures for objects without edges, textures, colors, or patterns. Says researcher Elli Angelopoulou, "We extract a unique description of the object directly from the video images in one step. The method is simple and local and can be performed in real time at video rates." The technique also requires only off-the-shelf equipment.
Motivation
Object recognition is a crucial part of robotics and automation applications. Most systems compare a viewed object with a model, or signature, of the desired object. The signature can be created by extracting features of the object, such as edges. Fast, efficient techniques for object recognition have been developed using pattern recognition, edge-detection, and other methods. "There are lots of good techniques to recognize objects if they have prominent features such as edges, textures, colors, or patterns," says Angelopoulou. Smooth curved objects, however, do not lend themselves to these methods.
Angelopoulou, along with James Williams and Lawrence Wolff at Hopkins’ Computer Vision Laboratory, developed a method that can recognize objects by shape, even if the item is rotated, translated, or appears a different size from the model. The method is based on the Gaussian curvature of the object, sometimes called the total curvature, which Gauss’s theorem establishes as an invariant of a surface.
A property related to the square of the curvature, called DeCov, has been shown to be invariant as long as the surface of the article keeps the same shape. Once the DeCov values have been calculated for the visible parts of the object, the object can be described by a signature distribution that does not change when the object is moved. Signatures generated from DeCov values differ markedly between objects of different shapes. This enables a machine-vision system to identify an item after seeing the model it is looking for in only two to four poses. More-complex, asymmetrical objects might require more poses.
Method
Other researchers have shown that the surface normal and local curvature of an object can be recovered from multiple images of the item taken from the same viewpoint but illuminated from different, precisely known locations. Wolff’s group found that the DeCov invariant can be computed without explicitly calculating the surface normal and without knowing where two of the three light sources are located.
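For context, the conventional approach mentioned here, recovering a surface normal from three images taken under precisely known lighting, can be sketched for a single pixel. This is a minimal illustration of classic three-light photometric stereo, not the Hopkins group's DeCov computation, which the article does not spell out. It assumes a matte (Lambertian) surface, unit-length light directions, and a pixel lit by all three lights; the function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve matrix * x = rhs for a 3x3 system using Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def surface_normal(light_dirs, intensities):
    """Recover the unit normal and albedo at one pixel.

    Assumed Lambertian model: intensity_k = albedo * dot(light_dirs[k], normal).
    light_dirs holds three known unit vectors, one per image.
    """
    g = solve3(light_dirs, intensities)          # g = albedo * normal
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel reflects no light")
    return [c / albedo for c in g], albedo
```

Note that every light direction must be known for this calculation to work, which is the calibration burden the DeCov approach is reported to reduce.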
To create a signature, the group records several poses of the object—for a cylindrically symmetric golf tee, for example, top, side, and bottom views would all be used to create the model object signature, whereas a sphere would require only one pose. For each pose, three images are captured, each showing the object illuminated by a light in a different spot. By using three lighting conditions, each point in the image (lit by all three lights) provides three light-intensity data points showing how much light is reflected to the camera. The DeCov invariant can be calculated directly from these triplets. The DeCov information is then condensed to a pose-invariant object signature composed of a short, fixed-length sequence of real numbers. For recognition, a signature based on one pose of a new object is compared to a database of signatures of known objects.
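The data flow in this paragraph can be sketched in a few lines. The article gives neither the DeCov formula nor the exact form of the signature, so this sketch makes several assumptions: the per-pixel invariant values are supplied by the caller, the fixed-length signature is a normalized histogram, matching uses Euclidean distance, and the database stores one entry per recorded pose. All names and the shadow threshold are hypothetical.

```python
import math


def valid_triplets(img_a, img_b, img_c, shadow_threshold=0.05):
    """Gather per-pixel intensity triplets from three same-viewpoint images.

    Each image is a list of rows of floats in [0, 1]. A pixel is kept only
    if it is lit in all three images (assumed test: above a small threshold).
    """
    triplets = []
    for row_a, row_b, row_c in zip(img_a, img_b, img_c):
        for triplet in zip(row_a, row_b, row_c):
            if min(triplet) > shadow_threshold:
                triplets.append(triplet)
    return triplets


def signature(values, bins=8, lo=0.0, hi=1.0):
    """Condense per-pixel invariant values into a fixed-length sequence.

    Assumption: a normalized histogram stands in for the signature.
    """
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[min(max(idx, 0), bins - 1)] += 1   # clamp out-of-range values
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def best_match(query, database):
    """Return the name of the stored signature nearest to the query.

    database is a list of (name, signature) pairs; a name may appear
    several times, once per recorded pose.
    """
    return min(database, key=lambda entry: math.dist(query, entry[1]))[0]
```

In use, a caller would apply its own invariant function to each triplet from `valid_triplets`, pass the resulting values to `signature`, and hand that signature to `best_match`. Because each signature has a fixed, short length, the comparison cost per database entry stays small regardless of image size.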
The beauty of this method is that the system need not perform many calculations to create a signature, and the database of signatures is relatively small. Most methods either require large databases—such as those based on extracting simple features—or require complex calculations, which slows the recognition process. The DeCov method "bridges this gap between feature complexity and model efficiency," says Angelopoulou.
Further work
The DeCov method does have limitations and is still being developed. The method incorporates color, which may be useful, for example, for distinguishing a Coca-Cola can from a Pepsi can. In some cases, however, the user may not care about color and may want to distinguish the object by shape alone. Ideally, a system would allow a user to choose which cues it uses.
Probable applications include automated manufacturing and tolerancing of parts such as molded plastic pieces. For example, the technique could be used with a robot arm to pack pool balls. The system could distinguish the different colors of balls, allowing the arm to pick a complete set out of a bin of mixed balls. "My favorite example," Angelopoulou says, "is a robot arm that selects and packages an assortment of chocolates."
More information on this work is available on the World Wide Web at http://www.cs.jhu.edu/~angelop. The researchers will also present a paper on the subject next month (paper 2909-21) at Photonics East (Boston, MA).