When Microsoft introduced the Kinect camera for the Xbox in 2010, it started a new era of man-machine interaction. The system was able to track people and their motions, using an infrared (IR) grid projector, a camera, and sophisticated software to create a 3D image.
Software and hardware have improved steadily since then, and professional systems are now available for advanced spatial-sensing tasks. A team from the Fraunhofer Institute for Applied Optics and Precision Engineering (IOF; Jena, Germany) has developed a new high-speed 3D sensor for such tasks. The system delivers full spatial information from 1000 × 1000 pixels at a 36 Hz frame rate. Its resolution is about 0.4 mm at a distance of 1.5 m.
High-speed 3D sensing with a GOBO projector
Industrial 3D sensing systems usually comprise three parts: illumination, camera(s), and a sophisticated computing system. While computers and cameras are more or less stock items, illumination is decisive for the performance of current 3D sensors.
The illumination has two jobs here. First, it has to light the scene sufficiently for image acquisition. That can be a challenge when more than a thousand frames are acquired per second or when exposure times drop below 1/10,000 s. In those cases, flux requirements can easily reach several tens of thousands of lumens.
Second, many systems project a special pattern onto the scene; the pattern helps them find corresponding points in the frames taken by the different cameras around the object. So-called active illumination uses varying patterns, and a computer combines several frames from two cameras to obtain one 3D point cloud.
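One common way to use a sequence of varying patterns is temporal correlation: each pixel records a series of intensities over the pattern sequence, and pixels in the two cameras that see the same surface point should record similar series. The article does not say which matching algorithm the system uses, so the following Python sketch is only an illustration. It assumes normalized cross-correlation and rectified images in which candidate matches lie on the same row; the function names `ncc` and `best_match` are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A constant sequence carries no pattern information; score it as 0.
    return num / den if den > 0 else 0.0


def best_match(left_seq, right_row):
    """Find the right-camera pixel whose temporal sequence fits left_seq best.

    left_seq  : intensities of one left-camera pixel over N patterns
    right_row : list of such sequences, one per pixel on the same
                (assumed rectified) image row of the right camera
    Returns (pixel_index, correlation_score).
    """
    scores = [ncc(left_seq, seq) for seq in right_row]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


if __name__ == "__main__":
    # Synthetic data: 4 right-camera pixels observed under 5 patterns.
    right_row = [
        [0.1, 0.9, 0.4, 0.7, 0.2],
        [0.8, 0.2, 0.6, 0.1, 0.9],
        [0.3, 0.5, 0.9, 0.2, 0.6],
        [0.7, 0.7, 0.1, 0.8, 0.3],
    ]
    # The left pixel sees the same point as right pixel 2, but with a
    # different gain and offset; NCC is insensitive to both.
    left_seq = [0.5 * v + 0.2 for v in right_row[2]]
    print(best_match(left_seq, right_row))
```

Once a match is found, the offset between the two pixel positions can be triangulated into a depth value; that step depends on the camera calibration and is not shown here.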
The team from Fraunhofer IOF has developed an illumination system that uses aperiodic sinusoidal patterns with varying period lengths and amplitudes [1]. These are generated by a special rotating filter that is placed between the light source and the lens, a technique well known from theatrical illumination devices as GOBO (goes before optics).
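To give a feel for what an aperiodic sinusoidal pattern with varying period lengths and amplitudes might look like, here is a minimal Python sketch that builds a one-dimensional intensity profile from chained sine cycles. The function name, the parameter ranges, and the random seed are all assumptions made for illustration; this is not the design of the Fraunhofer IOF filter wheel.

```python
import math
import random


def aperiodic_sinusoid(n_samples, min_period=20, max_period=60,
                       min_amp=0.3, max_amp=1.0, seed=0):
    """Return n_samples intensities in [0, 1] built from chained sine cycles.

    Every cycle gets its own randomly drawn period (in samples) and
    amplitude, so the profile does not repeat regularly. All parameter
    ranges are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    profile = []
    while len(profile) < n_samples:
        period = rng.randint(min_period, max_period)
        amp = rng.uniform(min_amp, max_amp)
        for k in range(period):
            # One full sine cycle, centered on 0.5 and scaled into [0, 1].
            value = 0.5 + 0.5 * amp * math.sin(2.0 * math.pi * k / period)
            profile.append(value)
    return profile[:n_samples]


if __name__ == "__main__":
    stripe = aperiodic_sinusoid(200)
    print(len(stripe), min(stripe), max(stripe))
```

Because each cycle starts at the mid-level of 0.5, consecutive cycles join without a jump in intensity even though their periods and amplitudes differ.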
In the Fraunhofer team's initial setup, the system can acquire 1300 3D point clouds per second. The new sensor, with irritation-free IR illumination, is optimized for measuring people, particularly human faces.
Spatial information is recorded with two 1000 × 1000 pixel IR cameras. This data is merged with the data from a regular color camera. The signals of all three cameras are processed, and the system delivers 36 3D point clouds per second with full color information.
The illuminator emits about 4.5 W and changes patterns at a rate of 360 Hz. An additional ultrasonic distance-measurement system with two separate exits is located on top of the IR illuminator.
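The figures quoted for the new sensor imply a simple timing budget, which the short Python sketch below works out. It assumes that the projected patterns are divided evenly among the point clouds and that every camera pixel yields a valid 3D point, which makes the throughput an upper bound. The function names are hypothetical.

```python
def patterns_per_point_cloud(pattern_rate_hz, cloud_rate_hz):
    """Patterns available per 3D point cloud, assuming an even split."""
    return pattern_rate_hz / cloud_rate_hz


def points_per_second(width_px, height_px, cloud_rate_hz):
    """Upper bound on 3D points per second (every pixel assumed valid)."""
    return width_px * height_px * cloud_rate_hz


if __name__ == "__main__":
    # Values taken from the article: 360 Hz pattern rate, 36 Hz output,
    # 1000 x 1000 pixel cameras.
    print(patterns_per_point_cloud(360, 36))   # 10.0
    print(points_per_second(1000, 1000, 36))   # 36000000
```

Under these assumptions, each point cloud could draw on ten different patterns, and the sensor would output up to 36 million 3D points per second.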
Applications in real-time face monitoring
The initial high-speed system is used for time-resolved crash-test monitoring in the automotive industry. The new IR sensor was developed for direct detection of human faces (see figure). It is able to detect and record people's poses, gestures, and facial expressions, which will be important for all kinds of man-machine interaction. A longer test series is scheduled for face and gesture recognition inside a car. The system's depth of detail also makes it appropriate for interactive training systems or advanced security applications.
Experts from Fraunhofer IOF have already carried out body-motion studies with the original high-speed system. With a temporal resolution of 0.75 ms, they examined rope skipping and the kicking of a soccer ball. With the new eye-safe system, similar motion studies can be performed with a focus on facial details, opening up new applications in medicine and sports science.
The system is currently a prototype. As with most of its projects, Fraunhofer IOF is customizing the system for special applications. In this case, it will be used in a large-scale man-machine interaction research project.
1. S. Heist et al., Opt. Lasers Eng. (2016); http://dx.doi.org/10.1016/j.optlaseng.2016.02.017.