Uniqueness algorithm mimics human ability to match images across domains
Pittsburgh, PA--Scientists at Carnegie Mellon University have developed a computer algorithm that uses "uniqueness" to find images that are similar to each other across different media—photos, paintings, sketches, and so on.
The research team, led by Alexei Efros, associate professor of computer science and robotics, and Abhinav Gupta, assistant research professor of robotics, found that their surprisingly simple technique performed well on a number of visual tasks that normally stump computers, including matching sketches of automobiles with photographs of cars. The team will present its findings on "data-driven uniqueness" on Dec. 14 at SIGGRAPH Asia (Hong Kong). Their research paper is available online at http://graphics.cs.cmu.edu/projects/crossDomainMatching/.
Most computerized methods for matching images focus on similarities in shapes, colors and composition. That approach has proven effective for finding exact or very close image matches and enabled successful applications such as Google Goggles. But those methods can fail miserably when applied across different domains, such as photographs taken in different seasons or under different lighting conditions, or in different media, such as photographs, color paintings, or black-and-white sketches.
"The language of a painting is different from the language of a photograph," says Efros. "Most computer methods latch onto the language, not on what's being said." One problem, he adds, is that many images have strong elements, such as a cloud-filled sky, that may be superficially similar to elements of other images but only distract from what makes the image interesting to people. He and his collaborators hypothesized that it is instead the unique aspects of an image, relative to the other images being analyzed, that set it apart, and that it is those elements that should be used to match it with similar images.
Lots of number-crunching
The team computes uniqueness against a very large data set of randomly selected images: the unique features of an image are those that best discriminate it from the rest of the random images. In a photo of a person in front of the Arc de Triomphe in Paris, for instance, the person is likely similar to people in other photos and thus would carry little weight in the uniqueness calculation. The Arc itself, however, would carry great weight, because few photos contain anything like it.
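The idea can be sketched in a few lines of numpy. This is a simplified stand-in for the paper's learned discriminative weighting, not the authors' actual implementation: each feature dimension is weighted by how far the query image deviates from the statistics of a large random pool, so common elements (the "people") get near-zero weight and rare ones (the "Arc") dominate the match score. The feature vectors, the two candidate images, and the weighting formula here are all illustrative assumptions.

```python
import numpy as np

def uniqueness_weights(query_feat, random_feats, eps=1e-6):
    # Weight each feature dimension by how strongly it distinguishes the
    # query from the random pool: standardized deviation from the pool mean.
    mu = random_feats.mean(axis=0)
    sigma = random_feats.std(axis=0)
    return (query_feat - mu) / (sigma + eps)

def cross_domain_match(query_feat, candidate_feats, random_feats):
    # Rank candidates by uniqueness-weighted similarity (best match first).
    w = uniqueness_weights(query_feat, random_feats)
    scores = candidate_feats @ w
    return np.argsort(scores)[::-1]

# Toy demo: dimension 0 is a common element ("sky"), dimension 1 a rare
# one ("the Arc"). Nearly every random image has sky; almost none has the Arc.
rng = np.random.default_rng(0)
random_feats = np.column_stack([rng.normal(1.0, 0.1, 500),   # sky everywhere
                                rng.normal(0.0, 0.1, 500)])  # Arc almost never
query = np.array([1.0, 1.0])             # photo with sky AND the Arc
candidates = np.array([[0.5, 0.9],       # sketch of the Arc, little sky
                       [1.2, 0.0]])      # unrelated photo with similar sky
ranking = cross_domain_match(query, candidates, random_feats)
```

A plain dot product would favor the unrelated photo, whose sky matches the query's strongest raw feature; the uniqueness weighting instead ranks the Arc sketch first (`ranking[0] == 0`), because the common sky dimension is down-weighted to near zero.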
"We didn't expect this approach to work as well as it did," Efros acknowledges. "We don't know if this is anything like how humans compare images, but it's the best approximation we've been able to achieve."
In one use, the technique can be combined with large GPS-tagged photo collections to determine the location where a particular painting of a landmark was painted.
The technique also can be used to assemble a "visual memex," or a data set that explores the visual similarities and contexts of a set of photos. For instance, the researchers downloaded 200 images of the Medici Fountain in Paris—paintings, historic photographs, and recent snapshots from various seasons and taken from various distances and angles—and assembled them into a graph, as well as a YouTube video that shows a particular path through the data.
Future work includes using the technique to enhance object detection for computer vision and investigating ways to speed up the computationally intensive matching process.

John Wallace | Senior Technical Editor (1998-2022)
John Wallace was with Laser Focus World for nearly 25 years, retiring in late June 2022. He obtained a bachelor's degree in mechanical engineering and physics at Rutgers University and a master's in optical engineering at the University of Rochester. Before becoming an editor, John worked as an engineer at RCA, Exxon, Eastman Kodak, and GCA Corporation.