Pittsburgh, PA--The Robotics Institute at Carnegie Mellon University is using crowdsourcing from head-mounted cameras to provide subjective information about social groups that would otherwise be difficult or impossible for a robot to ascertain. What individuals are looking at within a group typically identifies something of interest, or helps delineate social groupings--insights that will someday be essential for vision-aided robots designed to interact with humans.
The researchers tested the method using groups of people with head-mounted video cameras. By noting where their gazes converged in three-dimensional (3D) space, the researchers could determine if they were listening to a single speaker, interacting as a group, or even following the bouncing ball in a ping-pong game. The technique was tested in three real-world settings: a meeting involving two work groups; a musical performance; and a party in which participants played pool and ping-pong and chatted in small groups. Head-mounted cameras provided precise data about what people were looking at in social settings, and software algorithms developed by the research team were able to automatically estimate the number and 3D position of "gaze concurrences"--positions where the gazes of multiple people intersected.
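To make the idea of a gaze concurrence concrete, the sketch below finds the single 3D point closest to several gaze lines in the least-squares sense. It is an illustration only and not the research team's algorithm, which also estimates how many concurrences exist. The function names, the sample positions, and the simplification of each gaze to an infinite line through the wearer's head are all assumptions made for this example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0:
        raise ValueError("gaze direction must be non-zero")
    return [c / n for c in v]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def closest_point_to_gazes(origins, directions):
    """Least-squares point nearest to all gaze lines (hypothetical helper).

    Each gaze is modeled as a line through origin o with unit direction d.
    Minimizing the summed squared distances gives the linear system
    sum(I - d d^T) p = sum((I - d d^T) o), solved here with Cramer's rule.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in zip(origins, directions):
        d = normalize(d)
        for i in range(3):
            for j in range(3):
                p_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += p_ij
                b[i] += p_ij * o[j]
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("gaze directions are parallel; no unique point")
    point = []
    for k in range(3):
        Ak = [row[:] for row in A]
        for i in range(3):
            Ak[i][k] = b[i]  # replace column k with the right-hand side
        point.append(det3(Ak) / det)
    return point


# Made-up example: three camera wearers all looking at the same spot.
target = (1.0, 2.0, 3.0)
origins = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
directions = [tuple(t - o for t, o in zip(target, origin)) for origin in origins]
concurrence = closest_point_to_gazes(origins, directions)
```

With noise-free gazes like these, the recovered point matches the shared target up to floating-point error. Real head-mounted footage would add noise in both camera position and gaze direction, and a scene may contain several concurrences at once, which this single-point sketch does not handle.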
The camera images and software algorithms for determining "social saliency" could ultimately be used to evaluate a variety of social cues, such as the expressions on people's faces or body movements, or data from other types of visual or audio sensors. "This really is just a first step toward analyzing the social signals of people," said Hyun Soo Park, a Ph.D. student in mechanical engineering, who worked on the project with Yaser Sheikh, assistant research professor of robotics, and Eakta Jain of Texas Instruments, who was awarded a Ph.D. in robotics last spring. "In the future, robots will need to interact organically with people and to do so they must understand their social environment, not just their physical environment."
Though head-mounted cameras are still unusual, police officers, soldiers, search-and-rescue personnel and even surgeons are among those who have begun to wear body-mounted cameras. Head-mounted systems, such as those integrated into eyeglass frames, are poised to become more common. Even if person-mounted cameras don’t become ubiquitous, Sheikh noted that these cameras someday might be used routinely by people who work in cooperative teams with robots.
The researchers were surprised by the level of detail they were able to detect. In the party setting, for instance, the algorithm didn't just indicate that people were looking at the ping-pong table; the gaze concurrence video actually shows the flight of the ball as it bounces and is batted back and forth. This finding suggests another possible application for monitoring gaze concurrence: player-level views of ball games. Park said if basketball players all wore head-mounted cameras, for instance, it might be possible to reconstruct the game, not from the point of view of a single player, but from a collective view of the players as they all keep their eyes on the ball. Another potential use is the study of social behavior, such as group dynamics and gender interactions, and research into behavioral disorders, such as autism.
More information on gaze concurrence, including a video, is available on the project website at www.cs.cmu.edu/~hyunsoop/gaze_concurrence.html. The researchers reported their findings Dec. 3 at the Neural Information Processing Systems Conference in Lake Tahoe, NV. The research was sponsored by the Samsung Global Research Outreach Program, Intel, and the National Science Foundation.
SOURCE: Carnegie Mellon University; www.ri.cmu.edu/news_view.html?news_id=282&menu_id=238