The tragic events of September 11, 2001, almost overnight introduced new challenges to the world of security. The need for more-reliable and versatile sources of video information became a reality.
More than 40 million surveillance cameras are installed in various monitoring applications around the world. With very few exceptions, these are conventional closed-circuit television (CCTV) installations in which the cameras are interlaced analog TV cameras with a maximum resolution of 640 × 480 pixels and a frame rate of 25 or 30 frames per second (f/s). If an object or a person is to be identified using such a camera, the spatial resolution at the object must be at least 45 to 50 pixels per linear foot, which limits the camera field of view to about 14 × 10 ft. In CCTV, the video information is recorded on a standard videocassette recorder (VCR). In most cases, to save tape, the frame rate is reduced to 1 to 5 f/s and the resolution to 320 × 240 pixels, which cuts the maximum field of view to about 7 or 8 ft.
Unfortunately, almost all CCTV cameras are set to monitor much wider areas, which reduces the effective resolution even further. Thus, in most cases, the camera cannot uniquely identify the object of interest, for example a particular person. Rather, the camera can only indicate that someone is entering the building. Furthermore, the CCTV camera-VCR combination does not provide any cross-reference or search capabilities, which makes object and person tracking close to impossible.
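The field-of-view figures follow directly from dividing the pixel count by the required pixel density. The short sketch below reproduces that arithmetic; the function name and the choice of 45 pixels per foot (the lower end of the quoted range) are illustrative assumptions, not part of any camera specification.

```python
def max_field_of_view_ft(width_px, height_px, px_per_ft):
    """Largest scene (width, height) in feet that still meets the pixel density."""
    return width_px / px_per_ft, height_px / px_per_ft


# Assumption: use the lower bound of the 45 to 50 pixels-per-foot rule.
PX_PER_FT = 45

full = max_field_of_view_ft(640, 480, PX_PER_FT)      # live analog resolution
reduced = max_field_of_view_ft(320, 240, PX_PER_FT)   # typical recorded resolution

print(f"640 x 480: {full[0]:.1f} x {full[1]:.1f} ft")
print(f"320 x 240: {reduced[0]:.1f} x {reduced[1]:.1f} ft")
```

At 45 pixels per foot this gives roughly 14.2 × 10.7 ft for the full frame and 7.1 × 5.3 ft for the recorded frame, in line with the rounded figures in the text.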
Filling the new security needs
Most who closely follow surveillance technology developments agree that current surveillance cameras are neither efficient nor up to the level of the latest security requirements. For example, there are more than 1.5 million analog cameras installed in England, most of them in London, and the average Londoner has his or her picture taken more than 100 times per day. Still, no significant arrests have been made with the help of these cameras. The criminals, aware of the cameras and their limited capabilities, just move their area of activities to the nearest corner beyond the camera's field of view. The bombing of the central train station in London several years ago serves as a case study of the current security challenges. More than 1000 people were involved in reviewing the recorded video data from more than 40 cameras installed in the station. Following days of meticulous and exhausting examination of the tapes, it was concluded that a person with a red jacket had placed the deadly package. However, nothing more could be extracted from the taped scenes: nobody was able to identify the person, where he came from, or which direction he went after placing the bomb.
The first attempt at a digital solution was the development of "webcams," low-cost, low-resolution universal-serial-bus (USB) cameras tied to personal computers (PCs) and primarily used for consumer applications such as teleconferencing. The next step up was the introduction of Internet-enabled cameras, also known as "network cameras." The camera is plugged into the existing local-area network (LAN) and the video information can be viewed via the Internet. Most complaints about these cameras concern poor image quality, lack of plug-and-play setup, small image size (typically 320 × 240 pixels, recently 640 × 480 pixels), and relatively slow frame rate (less than 15 f/s even at low resolution). To reduce the required bandwidth, most of the cameras implement standard Motion JPEG (MJPEG) compression. Regardless of the drawbacks, these cameras are gaining popularity; future "smart-camera" systems will probably be based on a similar approach.
In light of September 11, CCTV camera installations are growing rapidly, especially in airports, bus and train terminals, government buildings, public places, private homes, and so on. It is now becoming clear that very soon it will not be humanly possible to monitor, store, retrieve, search, analyze, and correlate all the incoming imaging data. To the outside world, installing more cameras may create a false sense of increased security, but in reality more cameras result in more problems and challenges. The solution is to replace old-fashioned analog CCTV installations with "smart" digital systems in which the camera itself plays an essential role.
Such a smart camera should have megapixel resolution, possibly a multispectral response from ultraviolet to infrared, fast frame rate, wide dynamic range, the ability to perform imaging algorithms (that is, to carry out interpreting tasks), network compatibility, multiple connectivity options (wireless, remote-control capabilities, cross-platform data sharing), and an affordable price.
Capable of implementing simple motion- and pattern-recognition algorithms, the smart camera is connected to a remote server via cable or wireless network (see Fig. 1). From the server, the information is distributed to the end user via the Internet (LAN or wireless) and should be accessible via regular PC, personal digital assistant, or mobile telephone. The user can control the camera performance, resolution, speed, and algorithm implementation. The camera is able to transfer control information to the server and to the other cameras in the same network.
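As an illustration of the kind of simple motion-recognition algorithm such a camera might run, the sketch below uses plain frame differencing on grayscale frames stored as nested lists. This is a generic textbook technique chosen for illustration, not the method of any particular camera; the function names and both thresholds are assumptions.

```python
def motion_fraction(prev_frame, curr_frame, pixel_threshold=25):
    """Fraction of pixels whose gray level changed by more than pixel_threshold."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(c - p) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def motion_detected(prev_frame, curr_frame, pixel_threshold=25, area_threshold=0.02):
    """Report motion when more than area_threshold of the frame has changed."""
    return motion_fraction(prev_frame, curr_frame, pixel_threshold) > area_threshold


# Tiny synthetic 4 x 4 frames: a uniform background, then two bright pixels appear.
background = [[10] * 4 for _ in range(4)]
with_object = [row[:] for row in background]
with_object[1][1] = 200
with_object[1][2] = 200

print(motion_detected(background, background))   # no change between frames
print(motion_detected(background, with_object))  # 2 of 16 pixels changed
```

A camera that runs even this much logic locally only needs to send video or an alert to the server when something in the scene changes, which is the bandwidth and workload saving the smart-camera approach is after.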
The main part of the smart camera is the image sensor. As is well known, CMOS (complementary metal-oxide semiconductor) devices are becoming a strong competitor to CCDs (charge-coupled devices). Initially, CMOS was used only in low-cost consumer video products, but now it has migrated to more-advanced imaging applications. Some remaining issues such as dark current, fixed-pattern noise, lower light sensitivity, and relatively low signal-to-noise ratio are still obstacles to using CMOS in precision applications and high-quality surveillance. With the help of extensive and well-funded research, however, the performance gap between CMOS and CCD is rapidly closing. As array sizes expand and image quality, signal-to-noise ratio, dynamic range, and light sensitivity improve, the migration from CCD to CMOS in demanding imaging applications becomes more feasible.
A set of image-processing capabilities is an essential feature of the smart camera. The camera "brain" should be able to determine the best compromise between resolution and speed to optimize transmission of information to and from other cameras in the network. The camera tasks will shift from simple "vision" tasks such as motion detection and compression to more sophisticated "understanding and interpreting" tasks such as pattern analysis and recognition, active scene monitoring, and decision-making. The actual implementation of the "brain" is wide open, and every camera manufacturer has its own approach. Most existing cameras of this type use a hardware chip as the brain, in which most of the camera functions are preset and cannot be changed. While this type of chip provides faster execution of complicated camera algorithms, given the two-year average development cycle, the risk of such a camera soon becoming obsolete is high.
To add versatility, engineers at Imperx have developed 1-, 2-, and 4-megapixel surveillance cameras based on the programmable system-on-a-chip approach. The brain of these cameras is based on a programmable hardware-software platform, offering flexibility, high speed, and efficiency (see Fig. 2).
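One way to picture the compromise between resolution and speed is as a choice among operating modes under a link-bandwidth budget. The sketch below is a hypothetical policy, not the logic of any actual camera: the `Mode` class, the mode list, the 8 bits per pixel, and the assumption of uncompressed video are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    width: int
    height: int
    fps: float

    def bit_rate(self, bits_per_pixel: float) -> float:
        """Uncompressed video bit rate in bits per second."""
        return self.width * self.height * self.fps * bits_per_pixel


def pick_mode(modes, link_bps, bits_per_pixel=8.0, prefer="resolution"):
    """Return the best mode that fits the link, or None if nothing fits."""
    feasible = [m for m in modes if m.bit_rate(bits_per_pixel) <= link_bps]
    if not feasible:
        return None
    if prefer == "resolution":
        return max(feasible, key=lambda m: (m.width * m.height, m.fps))
    return max(feasible, key=lambda m: (m.fps, m.width * m.height))


# Hypothetical operating modes for a 4-megapixel sensor.
MODES = [
    Mode(2048, 2048, 5), Mode(2048, 2048, 15),
    Mode(1024, 1024, 15), Mode(1024, 1024, 30),
    Mode(640, 480, 30),
]

LINK_BPS = 200e6  # assumed share of the network available to this camera

print(pick_mode(MODES, LINK_BPS, prefer="resolution"))
print(pick_mode(MODES, LINK_BPS, prefer="speed"))
```

With a 200-Mbit/s budget, preferring resolution selects the full 2048 × 2048 frame at 5 f/s, while preferring speed falls back to 640 × 480 at 30 f/s. A programmable platform could switch between such policies in the field, which a fixed-function chip cannot do.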
Choosing an interface standard
A critical function is the communication between the camera and server. It must provide a wide bandwidth for fast data transfer, reliable long-distance spans between repeaters, bidirectional communication, flexible topology, many devices per network, and easy and affordable cabling. In addition, it is desirable that the data format be platform independent. There are several standards used for camera-computer interface, but none addresses all the needs.
USB, historically the first serial camera-computer interface to exceed a transfer rate of 1 Mbit/s, is very popular in digital still cameras. The newer USB 2.0 has a bandwidth of 480 Mbit/s. Accommodating 127 devices per bus, it is bidirectional and can distribute 5 V of power to the camera; however, its cable-length limitation of 5 m per single span (30 m max) presents problems for surveillance applications.
FireWire (IEEE-1394) was developed to address high-speed data transfer. It has a very flexible network topology (tree or star, with up to 63 devices per bus) and can distribute power to the camera. With a bandwidth of 400 Mbit/s (and future expansion to 800 Mbit/s), this standard is widely used in consumer and industrial camera applications. Unfortunately, it has cable-length limitations similar to USB, and thus is not a good candidate for surveillance applications.
An interface standard called SDI (SMPTE 292M, SMPTE 274M) is widely popular in the broadcast industry. Almost all broadcast cameras have SDI interfaces with optical fiber or coaxial cables used as communications media. It has a bandwidth of 1.5 Gbit/s with no significant cable-length limitations. The existing infrastructure used in broadcast (data recorders, wireless interfaces, editing stations) is readily available and can be seamlessly integrated into surveillance equipment. For PC applications, a frame grabber is required. One of the biggest drawbacks is that the standard is not network-compatible; that is, it supports only point-to-point communication.
Camera Link is a relatively new standard primarily used in machine-vision applications. It has very wide bandwidth—more than 5 Gbit/s. With a fiberoptic extension, distance is not an issue. However, similar to SDI, it is not network-compatible. Also like SDI, a frame grabber is required for PC applications and there is no provision for power distribution to the camera.
Ethernet (1000BASE-T) is probably the best candidate for surveillance applications. It has relatively good bandwidth (up to 1 Gbit/s), natively supports bidirectional communication, and is network-compatible with virtually no limit on the number of devices. It does not require special PC hardware and supports cable spans of up to 100 m per segment with good noise immunity and reliability. Ethernet connectivity solves cross-platform data-transfer problems. If the camera is Internet-compatible (supports TCP/IP or a similar protocol), the data can be transported and viewed over different platforms, including mobile means of communication.
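The comparison of interfaces can be condensed into a small requirements filter. The sketch below encodes the approximate figures quoted in this section; the field names and thresholds are illustrative assumptions. The FireWire span is taken as equal to USB's because the text describes it only as similar, and "no significant limit" is modeled as infinity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    name: str
    bandwidth_mbps: float      # approximate figure from the text
    span_m: float              # single cable span; math.inf = no significant limit
    networked: bool            # supports more than point-to-point links
    needs_frame_grabber: bool  # special PC hardware required


INTERFACES = [
    Interface("USB 2.0", 480, 5, True, False),
    Interface("FireWire (IEEE-1394)", 400, 5, True, False),  # span assumed similar to USB
    Interface("SDI", 1500, math.inf, False, True),
    Interface("Camera Link", 5000, math.inf, False, True),   # with fiber-optic extension
    Interface("Ethernet (1000BASE-T)", 1000, 100, True, False),
]


def candidates(interfaces, min_bandwidth_mbps, min_span_m,
               require_network=True, allow_frame_grabber=False):
    """Names of the interfaces that satisfy every stated requirement."""
    return [
        i.name for i in interfaces
        if i.bandwidth_mbps >= min_bandwidth_mbps
        and i.span_m >= min_span_m
        and (i.networked or not require_network)
        and (allow_frame_grabber or not i.needs_frame_grabber)
    ]


# Surveillance-style requirements: long spans, networked, no special PC hardware.
print(candidates(INTERFACES, min_bandwidth_mbps=300, min_span_m=50))

# Relaxing the network and frame-grabber requirements admits SDI and Camera Link.
print(candidates(INTERFACES, 300, 50, require_network=False, allow_frame_grabber=True))
```

Under the surveillance-style requirements only Ethernet survives the filter, which matches the conclusion above. USB and FireWire fail on cable span, and SDI and Camera Link fail on network compatibility.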
PETKO DINEV is chief technical officer of Imperx, 6421 Congress Ave., Suite 204, Boca Raton, FL 33487.