Although 3D imaging technology has existed for several decades, the first commercial 3D imaging products have only become available within the last 10-15 years, following the adoption of High Definition (HD) video cameras by major film studios to produce 3D movies. Since then, 3D imaging technology has evolved rapidly across consumer markets, as well as within the machine vision industry.
Currently, the need for higher levels of process monitoring and automation is driving the improvement of 3D imaging technology within the machine vision industry. The traditional 2D approach is no longer sufficient, as it cannot deliver the accuracy and distance measurement that complex object recognition and dimensioning applications require. Traditional 2D imaging is also incapable of handling complex interaction situations, such as the growing trend for human/robot co-working.
Overview of 3D Imaging
The four main techniques for obtaining a 3D image are stereo vision, structured light 3D imaging, laser triangulation and Time of Flight (ToF). The last three belong to the ‘active’ imaging family, which requires an artificial source of light.
Stereo vision requires two mounted cameras to obtain different visual perspectives of an object. Calibration techniques are used to align pixel information between the cameras and extract depth information, in much the same way that our brains visually measure distance. Transposing this cognitive process into a system therefore requires significant computational effort.
Figure 1. Stereo vision – Source Tech Briefs
Using standard image sensors keeps the cost of stereo vision cameras low, whereas more sophisticated sensors, such as high-performance or global shutter sensors, increase the overall cost of the system.
However, the distance range is limited by mechanical constraints: the need for a physical baseline means that longer ranges require larger modules. Precise mechanical alignment and recalibration are also necessary. Additionally, this technique does not work well in changing light conditions, as it is heavily dependent upon the object’s reflective characteristics.
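The link between baseline and usable range can be illustrated with the standard pinhole relation for a rectified stereo pair, Z = f · B / d. The following is a minimal sketch under that assumption; the function names and the example rig values (1000 px focal length, 10 cm and 20 cm baselines) are hypothetical and do not describe any particular product.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair (ideal pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px: float, baseline_m: float, depth_m: float,
                disparity_error_px: float = 1.0) -> float:
    """First-order depth uncertainty: dZ ~ Z**2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical rig: 1000 px focal length, 10 cm baseline, 20 px disparity.
print(depth_from_disparity(1000.0, 0.10, 20.0))

# Same distance (5 m), two baselines: the wider baseline gives the smaller error.
print(depth_error(1000.0, 0.10, 5.0), depth_error(1000.0, 0.20, 5.0))
```

Because the uncertainty grows with the square of the distance and shrinks with the baseline, extending the range at a given precision calls for a physically wider module.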
Figure 2. Structured light – Source University of Kentucky, Laser Focus World
In the structured light technique, a predetermined light pattern is projected onto an object, and depth information is obtained by analyzing how the pattern is distorted. Since there is no conceptual limit on frame times, this technique avoids motion blur, and it is also robust against multi-path interference.
However, the active illumination requires a complex camera, and precise, stable mechanical alignment between the lens and the pattern projector is needed, so there remains a risk of de-calibration. Additionally, the reflected pattern is sensitive to optical interference in the environment, which limits the technique to indoor applications.
Laser triangulation systems measure the geometrical offset of a line of light, whose value is directly related to the height of the object. This one-dimensional imaging technique is based on scanning the object: the distance between the laser and the point it strikes on the object’s surface determines where the laser dot appears in the camera’s field of view. The term triangulation indicates that the laser dot, the camera and the laser emitter form a triangle.
Figure 3. Laser triangulation
High-resolution lasers are typically used in displacement and position monitoring applications where high accuracy, stability and low temperature drift are required. Unfortunately, this technique covers only a short range and is sensitive to ambient light and to structured and/or complex surfaces, which limits it to scanning applications. Complex algorithms and calibration are also required.
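The relation between the offset of the laser line in the image and the height of the object can be sketched with a simplified first-order model. The assumptions here are illustrative only: the laser is taken to be perpendicular to the reference plane, the camera views the scene at a known angle to the laser axis, and the lens has a constant magnification. Real systems rely on the calibration mentioned above rather than this idealized formula, and all parameter names are hypothetical.

```python
import math


def height_from_offset(offset_px: float, pixel_pitch_m: float,
                       magnification: float, angle_deg: float) -> float:
    """Object height from the laser line's shift on the sensor.

    Simplified model: a height change h moves the spot by
    h * sin(angle) as seen by the camera, which the lens scales
    by `magnification` onto the sensor.
    """
    offset_on_sensor_m = offset_px * pixel_pitch_m
    return offset_on_sensor_m / (magnification * math.sin(math.radians(angle_deg)))


# Hypothetical setup: 5 um pixels, 0.1x magnification, 30 degree viewing angle.
height_m = height_from_offset(offset_px=10, pixel_pitch_m=5e-6,
                              magnification=0.1, angle_deg=30.0)
print(f"{height_m * 1000:.3f} mm")
```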
Time of Flight
The term Time of Flight (ToF) covers all methods that measure distance from a direct acquisition or a calculation of the round-trip time of flight of photons between the camera and the scene. This measurement is performed either directly (D-ToF) or indirectly (I-ToF). D-ToF requires a complex and constraining time-resolved apparatus, whereas I-ToF is simpler in its operation, as it only requires a light source that is synchronized with an image sensor.
The pulse of light is emitted in phase with the shuttering of the camera, and the desynchronization of the returning light pulse is used to calculate the ToF of the photons, and from it the distance between the point of emission and the object. This provides a direct measurement of depth and amplitude in every pixel; the resulting image is referred to as the depth map.
The ToF system has a small aspect ratio and a monocular design with easy, once-in-a-lifetime calibration, and it operates well in ambient light conditions. Its drawbacks include the need for active illumination synchronization and the potential for multi-path interference and distance aliasing.
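The two measurement styles can be illustrated with a short sketch. The direct case is simply d = c · t / 2. For the indirect case, the code below uses a common textbook two-window pulsed scheme, in which the fraction of the returned pulse that falls into a second, delayed shutter window encodes the delay. This scheme is an assumption chosen for illustration; the source does not specify which indirect scheme a given sensor uses, and the function and parameter names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def direct_tof_distance(round_trip_s: float) -> float:
    """D-ToF: distance from a directly measured round-trip time."""
    return C * round_trip_s / 2.0


def pulsed_itof_distance(q1: float, q2: float, pulse_width_s: float) -> float:
    """I-ToF (two-window pulsed scheme, assumed for illustration).

    q1: charge collected in the window aligned with the emitted pulse
    q2: charge collected in the window immediately after it
    The later the echo arrives, the larger the share q2 / (q1 + q2).
    """
    total = q1 + q2
    if total <= 0:
        raise ValueError("no signal collected")
    return 0.5 * C * pulse_width_s * q2 / total


# A 20 ns round trip corresponds to roughly 3 m.
print(direct_tof_distance(20e-9))

# Equal charge in both windows means the echo is delayed by half the pulse width.
print(pulsed_itof_distance(q1=500.0, q2=500.0, pulse_width_s=30e-9))
```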
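Distance aliasing can be illustrated under the assumption that the illumination is periodic, with a repetition or modulation frequency f. In that case an echo that arrives more than one period late is indistinguishable from an earlier one, so distances wrap around at c / (2f). The 20 MHz figure below is a hypothetical value, not a specification of any system described here.

```python
C = 299_792_458.0  # speed of light in m/s


def unambiguous_range(mod_freq_hz: float) -> float:
    """Largest distance measurable without wrap-around for periodic illumination."""
    return C / (2.0 * mod_freq_hz)


def aliased_distance(true_distance_m: float, mod_freq_hz: float) -> float:
    """Distance the system would report once wrap-around is taken into account."""
    return true_distance_m % unambiguous_range(mod_freq_hz)


freq = 20e6  # hypothetical 20 MHz modulation
print(unambiguous_range(freq))      # about 7.5 m
print(aliased_distance(9.0, freq))  # an object at 9 m is reported much closer
```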
Figure 4. ToF operating principle
Only a few 3D systems are in use today, most of which are based on 3D stereo vision, structured light cameras or laser triangulation. These systems typically operate at fixed working distances and require significant calibration for specific areas of detection.
ToF systems are therefore particularly advantageous, as they overcome many of these challenges and provide greater flexibility from an application point of view. However, as a result of pixel complexity and/or power consumption, most commercial solutions today remain limited in image resolution to VGA (Video Graphics Array) or less.
Table 1. 3D imaging techniques ‘top-level’ comparison
CMOS Sensor Solution for ToF
The broad application potential of ToF technology prompted Teledyne e2v to develop the first 3D ToF solution with a true 1.3 megapixel (MP) depth resolution and a 1 inch optical format. This 3D ToF solution is based on a specific high sensitivity, high dynamic range CMOS sensor that enables grey scale image and depth fusion.
Additional product features of the CMOS sensor for ToF imaging include:
- State of the art 1.3 MP depth map resolution: depth map at full resolution, accuracy ±1 cm, high speed
- 3D imaging of fast moving objects at up to 120 frames per second (fps), with a 30 fps depth map at full resolution and high global shutter efficiency
- Large range of 3D detection: from 0.5 m to 5 m
- High Dynamic Range (HDR) of 90 dB
- High sensitivity in the visible and NIR: 50% Quantum Efficiency (QE) at 850 nm, with HDR for day/night vision
- Embedded 3D processing features: multiple regions of interest (ROI) with two windows, binning, and on-chip histogram contextual data
A demonstrator platform has been developed to evaluate the 1.3 MP depth resolution, which is output in either a depth map or a point cloud format.
This ToF demonstrator platform, shown in Figure 5, is a compact 1 inch optical format board camera system based on the high sensitivity 1.3 MP sensor. An embedded multi-integration on-chip function (gated sensor), a light source and optics combine to allow the ToF device to operate at the full 1.3 MP resolution.
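The relationship between the two output formats, depth map and point cloud, can be sketched as follows. This is a minimal illustration rather than the demonstrator's actual processing. It assumes an ideal pinhole camera with known intrinsics (fx, fy, cx, cy, all hypothetical values here) and that each depth value is the distance along the optical axis, with non-positive values marking invalid pixels.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_map_to_point_cloud(depth: Sequence[Sequence[float]],
                             fx: float, fy: float,
                             cx: float, cy: float) -> List[Point]:
    """Back-project every valid pixel (u, v, z) to a 3D point (x, y, z)."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a valid depth
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth map in metres; the 0.0 entry is treated as invalid.
tiny_depth = [[1.0, 2.0],
              [0.0, 4.0]]
cloud = depth_map_to_point_cloud(tiny_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
for point in cloud:
    print(point)
```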
Figure 5. The ToF demonstrator platform
Active Imaging Using ToF with an Adapted 5T CMOS Sensor
Active imaging utilizes an artificial light source. A simple example is the assisted autofocus feature found on most modern cameras, which uses an infrared signal to measure distance in low light conditions. Active imaging can produce images in harsh weather conditions, such as rain and/or fog. Active imaging techniques include range gating and ToF.
Range gating combines two components: a pulsed light wave front and a specific high speed shuttering camera. The light pulse is sent towards the target, and when its reflection returns from the plane of reflection, the camera’s high speed electronic shutter opens at just the right moment.
Range gating allows an image plane distance to be selected, which depends on the synchronization between the light and the sensor. When the target is separated from the camera by a diffusive environment, such as rain, fog or aerosol particles, some of the photons, termed ‘ballistic photons,’ are still capable of crossing the medium back towards the camera.
Although these photons are few in number, capturing them synchronously allows an image to be acquired through the diffusing medium. Range gating works at long distances with almost no limitation other than the power of the light source.
In contrast to range gating, ToF directly measures the time of flight of the light to determine the distance of the reflection plane from the camera. A ToF system therefore requires a very fast global shutter camera when the object is a short distance away from the camera. Unlike range gating, ToF does not focus on a specific image plane and therefore allows direct distance imaging within the range of interest.
As shown in Figure 6, range gating image capture is based on a synchronized camera and light source system that can be operated in slave or master mode, depending on the constraints of the application. This type of camera has an extremely fast global shutter, on the order of hundreds of nanoseconds.
The source emits a pulse of light in response to a trigger issued by the camera at the starting time τ0. At τ1 the pulse reaches the range, where it is reflected or not depending on whether an object is present. In the case of a reflection, the light arrives back at the camera at τ2. At the instant τ3 = τ0 + 2τ, where τ represents the return time of the light, the camera shutter opens.
At this point in time, the camera captures the reflected signal. This cycle is repeated thousands of times during the frame duration to accumulate sufficient signal relative to the readout noise.
The image produced during this process is grey scale and corresponds only to objects that are present within the range. To produce a depth image, it is necessary to sweep a set of images in range gating mode at several depths by adjusting the delay τ. The distance of each point is then computed from this set of images.
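The timing described above can be sketched in a few lines. The shutter delay for a chosen range is the round-trip time 2R / c, and a depth value for one pixel can be estimated from a sweep of gated images. The per-pixel rule used here, picking the delay with the strongest return, is a deliberately simple assumption for illustration; the source does not state how the distance is computed from the set of images, and a practical system might interpolate between neighbouring gates.

```python
from typing import Sequence

C = 299_792_458.0  # speed of light in m/s


def gate_delay_for_range(range_m: float) -> float:
    """Shutter delay after the light trigger that selects a given range."""
    return 2.0 * range_m / C


def range_for_gate_delay(delay_s: float) -> float:
    """Range selected by a given shutter delay."""
    return C * delay_s / 2.0


def depth_from_gate_sweep(delays_s: Sequence[float],
                          intensities: Sequence[float]) -> float:
    """Estimate one pixel's distance as the gate with the strongest return."""
    if len(delays_s) != len(intensities) or not delays_s:
        raise ValueError("need one intensity per delay")
    best = max(range(len(delays_s)), key=lambda i: intensities[i])
    return range_for_gate_delay(delays_s[best])


# Sweep four gates at 1, 2, 3 and 4 m; this pixel responds most at the third gate.
delays = [gate_delay_for_range(r) for r in (1.0, 2.0, 3.0, 4.0)]
pixel_values = [3.0, 10.0, 42.0, 7.0]  # hypothetical grey levels
print(depth_from_gate_sweep(delays, pixel_values))
```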
Figure 6. Range Gating principle
Figure 7. Global Shutter pixel
As shown in Figure 7, the short, synchronous integration times are produced in the image sensor pixel by a global shutter, which can be implemented with a five-transistor (5T) pixel associated with a dedicated phase driver. The signal integration phase is therefore not carried out all at once, but is instead built up through the accumulation of synchronous micro-integrations.
Teledyne e2v has developed a proprietary technique, based on a five-transistor pixel and timing generation on alternate lines, that narrows the time intervals (Δt) down to approximately 10 nanoseconds, which represents a significant improvement in temporal resolution. With its high sensitivity and low noise, the 1.3 MP CMOS image sensor includes a multi-integration or ‘accumulation’ mode.
A high Parasitic Light Sensitivity (PLS) ratio, also referred to as the ‘extinction ratio’, between the pinned photodiode and the storage node capacitor is also required to obtain sharp images, as it rejects the scene background and parasitic light during the camera gating ‘off’ period.
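The benefit of accumulating many micro-integrations before a single readout can be illustrated with a simple noise model. The model is an assumption made for this sketch, not a description of the sensor: the signal is limited by shot noise, and read noise is added once per readout. All numbers are hypothetical.

```python
import math


def snr_accumulated(signal_per_gate_e: float, n_gates: int, read_noise_e: float) -> float:
    """SNR when n_gates micro-integrations are summed in the pixel and read once."""
    signal = signal_per_gate_e * n_gates
    return signal / math.sqrt(signal + read_noise_e ** 2)


def snr_separate_reads(signal_per_gate_e: float, n_gates: int, read_noise_e: float) -> float:
    """SNR if every micro-integration were read out (and its read noise added) separately."""
    signal = signal_per_gate_e * n_gates
    return signal / math.sqrt(signal + n_gates * read_noise_e ** 2)


# Hypothetical values: 0.05 electrons per gate, 2000 gates, 5 electrons read noise.
print(snr_accumulated(0.05, 2000, 5.0))
print(snr_separate_reads(0.05, 2000, 5.0))
```

Under these assumptions the accumulated readout pays the read noise penalty only once, which is why a very small signal per gate can still yield a usable image.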
Figure 8. 5T pixel CMOS with adapted timing and sync circuit needs an adequate extinction ratio to reject the scene background
To further advance ToF technology, Teledyne e2v has developed the BORA 1.3 MP CMOS image sensor for systems operating at short distances and ranges. As one of the few such sensors currently available for industrial use, the BORA 1.3 MP CMOS features an optimized multi-integration mode, excellent performance in low light conditions and an electronic global shutter, while maintaining the accuracy and frame rate performance of existing ToF systems.
The BORA sensor was released in Fall 2017 and is available with a complete support service that helps customers build systems to their specific application requirements. The performance of the BORA sensor is compared in Table 3.
Table 3. ToF platform performance comparison
(1) Accuracy gives the gap between the measured value and the actual value
(2) Temporal noise gives the RMS precision of measurement from frame to frame, which represents the repeatability of the system
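The two metrics defined in the notes above can be computed from repeated measurements of a target at a known distance. This is a minimal sketch of one plausible reading of those definitions: accuracy is taken as the mean measured value minus the actual value, and temporal noise as the standard deviation of the measurements from frame to frame. The sample readings are invented for illustration and are not taken from Table 3.

```python
import statistics
from typing import Sequence


def accuracy(measurements_m: Sequence[float], actual_m: float) -> float:
    """Gap between the mean measured distance and the actual distance."""
    return statistics.fmean(measurements_m) - actual_m


def temporal_noise(measurements_m: Sequence[float]) -> float:
    """RMS deviation from frame to frame (population standard deviation)."""
    return statistics.pstdev(measurements_m)


# Hypothetical readings of one pixel over five frames, target at 2.00 m.
frames = [2.01, 2.03, 2.02, 2.00, 2.04]
print(f"accuracy: {accuracy(frames, 2.00) * 100:.2f} cm")
print(f"temporal noise: {temporal_noise(frames) * 100:.2f} cm")
```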
To improve the effectiveness and autonomy of industrial systems, vision systems for guided robotics and other autonomous machines now require 3D vision for more accurate object recognition. Several 3D techniques exist, each with specific advantages and limitations depending on the application requirements. Time of Flight (ToF) currently offers extraordinary prospects for 3D vision, and is therefore driving the design of a new generation of dedicated CMOS image sensors.
This information has been sourced, reviewed and adapted from materials provided by Teledyne E2V.
For more information on this source, please visit Teledyne E2V.