WO2002008685A2 - Apparatus and method for determining the range of remote objects - Google Patents

Apparatus and method for determining the range of remote objects

Info

Publication number
WO2002008685A2
WO2002008685A2 · PCT/US2001/023535 (US0123535W)
Authority
WO
WIPO (PCT)
Prior art keywords
image sensors
image
range
camera
focus
Prior art date
Application number
PCT/US2001/023535
Other languages
French (fr)
Other versions
WO2002008685A3 (en)
Inventor
Robert Dougherty
Original Assignee
Optinav, Inc.
Priority date
Filing date
Publication date
Application filed by Optinav, Inc. filed Critical Optinav, Inc.
Priority to US10/333,423 priority Critical patent/US20040125228A1/en
Priority to EP01961741A priority patent/EP1316222A2/en
Publication of WO2002008685A2 publication Critical patent/WO2002008685A2/en
Publication of WO2002008685A3 publication Critical patent/WO2002008685A3/en


Classifications

    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B9/00Optical objectives characterised both by the number of the components and their arrangements according to their sign, i.e. + or -
    • G02B9/62Optical objectives characterised both by the number of the components and their arrangements according to their sign, i.e. + or - having six components only
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01BMEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B11/00Measuring arrangements characterised by the use of optical techniques
    • G01B11/02Measuring arrangements characterised by the use of optical techniques for measuring length, width or thickness
    • G01B11/026Measuring arrangements characterised by the use of optical techniques for measuring length, width or thickness by measuring distance between sensor and object
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C3/00Measuring distances in line of sight; Optical rangefinders
    • G01C3/02Details
    • G01C3/06Use of electric means to obtain final indication
    • G01C3/08Use of electric radiation detectors
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B13/00Optical objectives specially designed for the purposes specified below
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/204Image signal generators using stereoscopic image cameras
    • H04N13/207Image signal generators using stereoscopic image cameras using a single 2D image sensor
    • H04N13/236Image signal generators using stereoscopic image cameras using a single 2D image sensor using varifocal lenses or mirrors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/271Image signal generators wherein the generated image signals comprise depth maps or disparity maps

Definitions

  • the present invention relates to apparatus and methods for optical image acquisition and analysis.
  • it relates to passive techniques for measuring the range of objects.
  • passive image acquisition and processing techniques are effective for determining the bearings of objects, but do not adequately provide range information.
  • active techniques are used for determining the range of objects, including radar, sonar, scanned laser and structured light methods. These techniques all involve transmitting energy to the object and monitoring the reflection of that energy. These methods have several shortcomings. They often fail when the object does not reflect the transmitted energy well or when the ambient energies are too high.
  • Range information can be obtained using a conventional camera, if the object or the camera is moving in a known way.
  • the motion of the image in the field of view is compared with motion expected for various ranges in order to infer the range.
  • the method is useful only in limited circumstances.
  • Other approaches make use of passive optical techniques. These generally break down into stereo and focus methods.
  • Stereo methods mimic human stereoscopic vision, using images from two cameras to estimate range.
  • Stereo methods can be very effective, but they suffer from a problem in aligning parts of images from the two cameras. In cluttered or repetitive scenes, such as those containing soil or vegetation, the problem of determining which parts of the images from the two cameras to align with each other can be intractable. Image features such as edges that are coplanar with the line segment connecting the two lenses cannot be used for stereo ranging.
  • Focus techniques can be divided into autofocus systems and range mapping systems.
  • Autofocus systems are used to focus cameras at one or a few points in the field of view. They measure the degree of blur at these points and drive the lens focus mechanism until the blur is minimized. While these can be quite sophisticated, they do not produce point-by-point range mapping information that is needed in some applications.
  • in focus-based range mapping systems, multiple cameras or multiple settings of a single camera are used to make several images of the same scene with differing focus qualities. Sharpness is measured across the images and a point-by-point comparison of the sharpness between the images is made in a way that the effect of the scene contrast cancels out. The remaining differences in sharpness indicate the distance of the objects at the various points in the images.
  • Another improvement of Pentland's multiple camera method is described by Nourbakhsh et al. (U.S. Pat. 5,793,900).
  • Nourbakhsh et al. describe a system using three cameras with different focus distance settings, rather than different apertures as in Pentland's presentation. This system allows for rapid calculation of ranges, but sacrifices range resolution in order to do so.
  • the use of multiple sets of optics tends to make the camera system heavy and expensive. It is also difficult to synchronize the optics if overall focus, zoom, or iris need to be changed.
  • the beamsplitters themselves must be large since they have to be sized to full aperture and field of view of the system. Moreover, the images formed in this way will not be truly identical due to manufacturing variations between the sets of optics.
  • this invention is a camera comprising
  • a beamsplitting system for splitting light received through the focusing means into three or more beams and projecting said beams onto multiple image sensors to form multiple, substantially identical images on said image sensors.
  • the focussing means is, for example, a lens or focussing mirror.
  • the image sensors are, for example, photographic film, a CMOS device, a vidicon tube or a CCD, as described more fully below.
  • the image sensors are adapted (together with optics and beamsplitters) so that each receives an image corresponding to at least about half, preferably most and most preferably substantially all of the field of view of the camera.
  • the camera of the invention can be used as described herein to calculate ranges of objects within its field of view.
  • the camera simultaneously creates multiple, substantially identical images which are differently focussed and thus can be used for range determinations.
  • the images can be obtained without any changes in camera position or camera settings.
  • this invention is a method for determining the range of an object, comprising
  • This aspect of the invention provides a method by which ranges of individual objects, or a range map of all objects within the field of view of the camera can be made quickly and, in preferred embodiments, continuously or nearly continuously.
  • the method is passive and allows the multiple images that form the basis of the range estimation to be obtained simultaneously without moving the camera or adjusting camera settings.
  • this invention is a beamsplitting system for splitting a focused light beam through n levels of splitting to form multiple, substantially identical images, comprising (a) an arrangement of 2^n - 1 beamsplitters which are each capable of splitting a focused beam of incoming light into two beams, said beamsplitters being hierarchically arranged such that said focussed light beam is divided into 2^n beams, n being an integer of 2 or more.
  • This beamsplitting system produces multiple, substantially identical images that are useful for range determinations, among other uses.
  • the hierarchical design allows for short optical path lengths as well as small physical dimensions. This permits a camera to frame a wide field of view, and reduces overall weight and size.
  • this invention is a method for determining the range of an object, comprising
  • this aspect provides a method by which rapid and continuous or nearly continuous range information can be obtained, without moving the camera or adjusting camera settings.
  • this invention is a method for creating a range map of objects within a field of view of a camera, comprising
  • this aspect permits the easy and rapid creation of range maps for objects within the field of view of the camera.
  • this invention is a method for determining the range of an object, comprising
  • This aspect of the invention allows range information to be obtained from substantially identical images of a scene that differ in their focus, using an algorithm of a type that is incorporated into common processing devices such as JPEG, MPEG2 and Digital Video processors.
  • the images are not necessarily taken simultaneously, provided that they differ in focus and the scene is static.
  • this aspect of the invention is useful with cameras of various designs and allows range estimates to be formed using conveniently available cameras and processors.
  • Fig. 1 is an isometric view of an embodiment of the camera of the invention.
  • Fig. 2 is a cross-section view of an embodiment of the camera of the invention.
  • Fig. 3 is a cross-section view of a second embodiment of the camera of the invention.
  • Fig. 4 is a cross-section view of a third embodiment of the camera of the invention.
  • Fig. 5 is a diagram of an embodiment of a lens system for use in the invention.
  • Fig. 6 is a diagram illustrating the relationship of blur diameters and corresponding Gaussian brightness distributions to focus.
  • Fig. 7 is a diagram illustrating the blurring of a spot object with decreasing focus.
  • Fig. 8 is a graph demonstrating, for one embodiment of the invention, the variation of the blur radius of a point object as seen on several image sensors as the distance of the point object changes.
  • Fig. 9 is a graph illustrating the relationship of Modulation Transfer Function to spatial frequency and focus.
  • Fig. 10 is a block diagram showing the calculation of range estimates in one embodiment of the invention.
  • Fig. 11 is a schematic diagram of an embodiment of the invention.
  • Fig. 12 is a schematic diagram showing the operation of a vehicle navigation system using the invention.
  • DETAILED DESCRIPTION OF THE INVENTION The range of one or more objects is determined by bringing the object within the field of view of a camera.
  • the incoming light enters the camera through a focussing means as described below, and is then passed through a beamsplitter system that divides the incoming light and projects it onto multiple image sensors to form substantially identical images.
  • Each of the image sensors is located at a different optical path length from the focussing means.
  • the "optical path length" is the distance light must travel from the focussing means to a particular image sensor, divided by the refractive index of the medium it traverses along the path. Sections of two or more of the images that correspond to substantially the same angular sector in object space are identified. For each of these corresponding sections, a focus metric is determined that is indicative of the degree to which that section of the image is in focus on that particular image sensor. Focus metrics from at least two different image sensors are then used to calculate an estimate of the range of an object within that angular sector of the object space. By repeating the process of identifying corresponding sections of the images, calculating focus metrics and calculating ranges, a range map can be built up that identifies the range of each object within the field of view of the camera.
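  • As an illustration of the procedure just described, the following is a minimal sketch (not taken from the patent); the focus_metric and range_from_metrics helpers are placeholders for whichever focus metric and range formula are chosen:
```python
import numpy as np

def range_map(images, path_lengths, focus_metric, range_from_metrics, block=8):
    """Hypothetical sketch of the per-section range-mapping loop described above.

    images             : list of 2-D arrays, the substantially identical images,
                         one per image sensor (all the same shape)
    path_lengths       : optical path length from the focussing means to each sensor
    focus_metric       : callable(section) -> scalar indicating degree of focus
    range_from_metrics : callable(metrics, path_lengths) -> range estimate for
                         the angular sector covered by the section
    """
    h, w = images[0].shape
    ranges = np.zeros((h // block, w // block))
    for by in range(h // block):
        for bx in range(w // block):
            # Sections of each image covering the same angular sector in object space.
            sections = [im[by * block:(by + 1) * block, bx * block:(bx + 1) * block]
                        for im in images]
            # One focus metric per image sensor for this sector.
            metrics = [focus_metric(s) for s in sections]
            # Combine metrics from at least two sensors into a range estimate.
            ranges[by, bx] = range_from_metrics(metrics, path_lengths)
    return ranges
```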
  • substantially identical images are images that are formed by the same focussing means and are the same in terms of field of view, perspective and optical qualities such as distortion and focal length.
  • images that are not formed simultaneously may also be considered to be "substantially identical", if the scene is static and the images meet the foregoing requirements.
  • the images may differ slightly in overall brightness, color balance and polarization. Images that are different only in that they are reversed (i.e., mirror images) can be considered "substantially identical" within the context of this invention.
  • images received by the various image sensors that are focussed differently on account of the different optical path lengths to the respective image sensors, but are otherwise the same (except for reversals and/or small brightness changes, or differences in color balance and polarization as mentioned above) are considered to be "substantially identical" within the context of this invention.
  • Camera 19 includes an opening 800 through which focussed light enters the camera.
  • a focussing means (not shown) will be located over opening 800 to focus the incoming light.
  • the camera includes a beamsplitting system that projects the focussed light onto image sensors 10a-10g.
  • the camera also includes a plurality of openings such as opening 803 through which light passes from the beamsplitter system to the image sensors.
  • the internal light paths and image sensors are shielded from ambient light. Covering 801 in Figure 1 performs this function and can also serve to provide physical protection, hold the various elements together and house other components.
  • Figure 2 illustrates the placement of the image sensors in more detail, for one embodiment of the invention.
  • Camera 19 includes a beamsplitting system 1, a focussing means represented by box 2 and, in this embodiment, eight image sensors 10a-10h.
  • Light enters beamsplitting system 1 through focussing means 2 and is split as it travels through beamsplitting system 1 so as to project substantially identical images onto image sensors 10a-10h.
  • multiple image generation is accomplished through a number of partially reflective surfaces 3-9 that are oriented at an angle to the respective incident light rays, as discussed more fully below.
  • Each of the images is then projected onto one of image sensors 10a-10h.
  • Each of image sensors 10a-10h is spaced at a different optical path length (Da-Dh, respectively) from focussing means 2.
  • In Figure 2, the paths of the various central light rays through the camera are indicated by dotted lines, whose lengths are indicated as D1 through D25. Intersecting dotted lines indicate places at which beam splitting occurs.
  • image sensor 10a is located at an optical path length Da from focussing means 2, and image sensors 10b-10h are likewise located at optical path lengths Db-Dh, respectively. Each of Da-Dh is the sum, taken over the segments D1-D25 that lie along the corresponding central light ray, of the segment length divided by the refractive index of the prism or spacer through which that segment passes (the first term of each sum being D1/n12, for the segment through prism 12, and the final term for each of image sensors 10b-10h corresponding to the spacer 11b-11h in front of that sensor),
  • where n11b-n11h and n12-n21 are the indices of refraction of spacers 11b-11h and prisms 12-21, respectively.
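  • A small sketch of the optical path length computation defined above (the segment lengths and refractive indices below are illustrative values, not taken from the patent):
```python
def optical_path_length(segments):
    """Optical path length as defined above: for each geometric segment that the
    light traverses between the focussing means and an image sensor (a prism or
    a spacer), divide the segment length by the refractive index of that medium
    and sum the results."""
    return sum(length / index for length, index in segments)

# Illustrative values only: three prism segments in BK7-like glass (n ~ 1.517)
# followed by a thin spacer of the same index and a short air gap (n = 1.0).
Da_example = optical_path_length([(25.0, 1.517), (25.0, 1.517), (12.5, 1.517),
                                  (0.5, 1.517), (0.25, 1.0)])
print(Da_example)
```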
  • the camera of the invention will be designed to provide range information for objects that are within a given set of distances ("operating limits").
  • the operating limits may vary depending on particular applications.
  • the longest of the optical path lengths (Dh in Figure 2) will be selected in conjunction with the focussing means so that objects located near the lower operating limit (i.e., closest to the camera) will be in focus or nearly in focus at the image sensor located farthest from the focussing means (image sensor 10h in Figure 2).
  • the shortest optical path length (Da in Figure 2) will be selected so that objects located near the upper operating limit (i.e., farthest from the camera) will be in focus or nearly in focus at the image sensor located closest to the focussing means (image sensor 10a in Figure 2).
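  • A sketch of how the operating limits constrain the sensor positions, using the standard thin-lens conjugate relation 1/f = 1/x + 1/d as a stand-in lens model (an assumption; the focal length and path lengths below are illustrative, not the patent's values):
```python
def best_focus_distance_m(f_mm, sensor_path_mm):
    """Object distance (metres) rendered in sharpest focus on a sensor whose
    equivalent air path behind a lens of focal length f_mm is sensor_path_mm,
    from the thin-lens relation 1/f = 1/x + 1/d.  Returns infinity when the
    sensor sits exactly at the focal plane."""
    if abs(sensor_path_mm - f_mm) < 1e-9:
        return float("inf")
    return (f_mm * sensor_path_mm / (sensor_path_mm - f_mm)) / 1000.0

# Shortest path -> best focus near the far operating limit;
# longest path -> best focus near the close operating limit.
f = 75.0
for path in (75.0, 75.2, 75.4, 75.6):
    print(path, best_focus_distance_m(f, path))
```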
  • Figure 2 illustrates a preferred binary cascading method of generating multiple images.
  • light entering the beamsplitter system is divided into two substantially identical images, each of which is divided again into two to form a total of four substantially identical images.
  • each of the four substantially identical images is again split into two, and so forth until the desired number of images has been created.
  • if the number of times a beam is split before reaching an image sensor is n, the number of images created is 2^n.
  • the number of individual surfaces at which splitting occurs is 2^n - 1.
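  • As a small check of this counting (a sketch, not part of the patent text):
```python
def binary_cascade(n_levels):
    """For the binary cascading scheme described above: splitting each beam
    n_levels times before it reaches a sensor yields 2**n_levels substantially
    identical images and requires 2**n_levels - 1 partially reflective surfaces."""
    return 2 ** n_levels, 2 ** n_levels - 1

print(binary_cascade(3))   # (8, 7): the Figure 2 arrangement, surfaces 3-9
```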
  • light enters beamsplitter system 1 from focussing means 2 and contacts partially reflective surface 3.
  • partially reflective surface 3 is oriented at 45° to the path of the incoming light, and is partially reflective so that a portion of the incoming light passes through and most of the remainder of the incoming light is reflected at an angle.
  • two beams are created that are oriented at an angle to each other. These two beams contact partially reflective surfaces 4 and 7, respectively, where they are each split a second time, forming four beams.
  • these four beams then contact partially reflective surfaces 5, 6, 8 and 9, where they are each split again to form the eight beams that are projected onto image sensors 10a-10h.
  • the splitting is done such that the images formed on the image sensors are substantially identical as described before.
  • additional partially reflective surfaces can be used to further subdivide each of these eight beams, and so forth one or more additional times until the desired number of images is created. It is most preferred that each of partially reflective surfaces 3-9 reflect and transmit approximately equal amounts of the incoming light. To minimize overall physical distances, the angle of reflection is in each case preferably about 45°.
  • the preferred binary cascading method of producing multiple substantially identical images allows a large number of images to be produced using relatively short overall physical distances. This permits less bulky, lighter weight equipment to be used, which increases the ease of operation. Having shorter path lengths also permits the field of view of the camera to be maximized without using supplementary optics such as a retrofocus lens.
  • Partially reflective surfaces 3-9 are at fixed physical distances and angles with respect to focussing means 2.
  • Two preferred means for providing the partially reflective surfaces are prisms having partially reflective coatings on appropriate faces, and pellicle mirrors. In the embodiment shown in Figure 2, partially reflective surface 3 is formed by a coating on one face of prism 12 or 13.
  • partially reflective surface 4 is formed by a coating on a face of prism 13 or 14
  • partially reflective surface 8 is formed by a coating on a face of prism 12 or 14
  • partially reflective surfaces 5, 6, 7 and 9 are formed by a coating on the bases of prisms 16 or 17, 18 or 19, 12 or 15 and 20 or 21, respectively.
  • prisms 13-21 are right triangular in cross-section and prism 12 is trapezoidal in cross-section.
  • two or more of the prisms can be made as a single piece, particularly when no partially reflective coating is present at the interface.
  • prisms 12 and 14 can form a single piece, as can prisms 15 and 20, 13 and 16, and 14 and 18.
  • it is preferred that the refractive index of each of prisms 12-21 be the same.
  • Any optical glass such as is useful for making lenses or other optical equipment is a useful material of construction for prisms 12-21.
  • the most preferred glasses are those with low dispersion.
  • An example of such a low dispersion glass is crown glass BK7.
  • a glass with a low thermal expansion coefficient such as fused quartz is preferred.
  • Fused quartz also has low dispersion, and does not turn brown when exposed to ionizing radiation, which may be desirable in some applications.
  • prisms having relatively high indices of refraction can be used. This has the effect of providing shorter optical path lengths, which permits a shorter focal length while retaining the physical path length and the transverse dimensions of the image sensors. This combination increases the field of view. This tends to increase the overcorrected spherical aberration and may tend to increase the overcorrected chromatic aberration introduced by the materials of manufacture of the prisms. However, these aberrations can be corrected by the design of the focusing means, as discussed below.
  • Suitable partially reflective coatings include metallic, dielectric and hybrid metallic/dielectric coatings.
  • the preferred type of coating is a hybrid metallic/dielectric coating which is designed to be relatively insensitive to polarization and angle of incidence over the operating range of wavelengths.
  • Metallic-type coatings are less suitable because the reflection and transmission coefficients for the two polarization directions are unequal. This causes the individual beams to have significantly different intensities following two or more splittings.
  • metallic-type coatings also dissipate a significant proportion of the light energy as heat.
  • Dielectric type coatings are less preferred because they are sensitive to the angle of incidence and polarization.
  • a polarization rotating device such as a half-wave plate or a circularly polarizing quarter-wave plate can be placed between each pair of partially reflecting surfaces in order to compensate for the polarization effects of the coatings.
  • a polarization rotating or circularizing device can also be used in the case of metallic type coatings.
  • the beamsplitting system will also include a means for holding the individual partially reflective surfaces in position with respect to each other. Suitable such means may be any kind of mechanical means, such as a case, frame or other exterior body that is adapted to hold the surfaces in fixed positions with respect to each other.
  • the individual prisms may be cemented together using any type of adhesive that is transparent to the wavelengths of light being monitored.
  • a preferred type of adhesive is an ultraviolet-cure epoxy with an index of refraction matched to that of the prisms.
  • Figure 3 illustrates how prism cubes such as are commercially available can be assembled to create a beamsplitter equivalent to that shown in Figure 2.
  • Beamsplitter system 30 is made up of prism cubes 31-37, each of which contains a diagonally oriented partially reflecting surface (38a-g, respectively). Focussing means 2, spacers 11a-11h and image sensors 10a-10h are as described in Figure 2. As before, the individual prism cubes are held in position by mechanical means, cementing, or another suitable method.
  • Figure 4 illustrates another alternative beamsplitter design, which is adapted from beamsplitting systems that are used for color separations, as described by Ray in Applied Photographic Optics, Second Ed., 1994, p. 560 (Fig. 68.2).
  • incoming light enters the beamsplitter system through focussing means 2 and impinges upon partially reflective surface 41.
  • a portion of the light (the path of the light being indicated by the dotted lines) passes through partially reflective surface 41 and impinges upon partially reflective surface 43. Again, a portion of this light passes through partially reflective surface 43 and strikes image sensor 45.
  • the portion of the incoming light that is reflected by partially reflective surface 41 strikes reflective surface 42 and is reflected onto image sensor 44.
  • the portion of the light that is reflected by partially reflective surface 43 strikes a reflective portion of surface 41 and is reflected onto image sensor 46.
  • Image sensors 44, 45 and 46 are at different optical path lengths from focussing means 2, the optical path lengths being computed, as before, from the distances along each light path divided by the refractive indices of the media traversed. It is preferred that the proportion of light that is reflected at surfaces 41 and 43 be such that images of approximately equal intensity reach each of image sensors 44, 45 and 46.
  • the precise design of the beamsplitter system is not critical to the invention, provided that the beamsplitter system delivers substantially identical images to multiple image sensors located at different path lengths from the focussing means.
  • the embodiment in Figure 2 also incorporates a preferred means by which the image sensors are held at varying distances from the focussing means.
  • the various image sensors 10b-10h are held apart from beamsplitter system 1 by spacers 11b-11h, respectively. Spacers 11b-11h are transparent to light, thereby permitting the various beams to pass through them to the corresponding image sensor.
  • the spacer can be a simple air gap or another material that preferably has the same refractive index as the prisms.
  • the use of spacers in this manner has at least two benefits.
  • the use of spacers permits the beamsplitter system to be designed so that the optical path length from the focussing means (i.e., the point of entrance of light into the beamsplitting system) to each spacer is the same, with the difference in total optical path length (from focussing means to image sensor) being due entirely to the thickness of the spacer. This allows for simplification in the design of the beamsplitter system.
  • a spacer may be provided for image sensor 10a if desired.
  • An alternative arrangement is to use materials having different refractive indices as spacers 11b-11h. This allows the thicknesses of spacers 11b-11h to be the same or more nearly the same, while still providing different optical path lengths.
  • the various optical path lengths (Da-Dh in Figure 2) differ from each other in constant increments.
  • it is preferred that the differences in length between the shortest optical path length and the other optical path lengths be X or mX, where m is an integer from 2 to the number of image sensors minus one.
  • this is accomplished by making the thickness of spacer 11b equal to X, and those of spacers 11c-11h from 2X to 7X, respectively.
  • the thickness of spacer 11h should be such that objects which are at the closest end of the operating range are in focus or nearly in focus on image sensor 10h.
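  • A sketch of the constant-increment spacer scheme just described (the increment value below is illustrative, not the patent's):
```python
def spacer_thicknesses_mm(n_sensors, increment_x_mm):
    """Spacer thicknesses for constant path-length increments, as described
    above: image sensor 10a uses no spacer, spacer 11b has thickness X, and
    each further spacer adds one more increment (2X, 3X, ...).  If the spacers
    share the prisms' refractive index n, each step changes the optical path
    length (as defined earlier) by X/n."""
    return [m * increment_x_mm for m in range(n_sensors)]

print(spacer_thicknesses_mm(8, 0.25))   # [0.0, 0.25, 0.5, ..., 1.75]
```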
  • Focussing means 2 is any device that can focus light from a remote object being viewed onto at least one of the image sensors.
  • focussing means 2 can be a single lens, a compound lens system, a mirror lens (such as a Schmidt-Cassegrain mirror lens), or any other suitable method of focussing the incoming light as desired.
  • a zoom lens, telephoto or wide angle lens can be used.
  • the lens will most preferably be adapted to correct any aberration introduced by the beamsplitter.
  • a beamsplitter as described in Figure 2 will function optically much like a thick glass spacer, and when placed in a converging beam, will introduce overcorrected spherical and chromatic aberrations.
  • the focussing means should be designed to compensate for these.
  • an example is a compound lens that corrects for aberration caused by the individual lenses.
  • Techniques for designing focussing means, including compound lenses, are well known and described, for example, in Smith, "Modern Lens Design", McGraw-Hill, New York (1992).
  • lens design software programs can be used to design the focussing system, such as OSLO Light (Optics Software for Layout and Optimization), Version 5, Revision 5.4, available from Sinclair Optics, Inc.
  • the focussing means may include an adjustable aperture. However, more accurate range measurements can be made when the depth of field is small. Accordingly, it is preferable that a wide aperture be used.
  • a particularly suitable focussing means is a 6-element Biotar (also known as double Gauss-type) lens.
  • One embodiment of such a lens is illustrated in Figure 5, and is designed to correct the aberrations created with a beamsplitter system as shown in Figure 2, which are equivalent to those created by a 75 mm plate of BK7 glass.
  • Biotar lens 50 includes lens 51 having surfaces L1 and L2 and thickness d1; lens 52 having surfaces L3 and L4 and thickness d3; lens 53 having surfaces L5 and L6 and thickness d4; lens 54 having surfaces L7 and L8 and thickness d6; lens 55 having surfaces L9 and L10 and thickness d7; and lens 56 having surfaces L11 and L12 and thickness d9.
  • Lenses 51 and 52 are separated by distance d2,
  • lenses 53 and 54 are separated by distance d5, and
  • lenses 55 and 56 are separated by distance d8.
  • Lens pairs 52-53 and 54-55 are cemented doublets. Parameters of this modified lens are summarized in the following table:
  • Image sensors 10a-10h can be any devices that record the incoming image in a manner that permits calculation of a focus metric that can in turn be used to calculate an estimate of range.
  • photographic film can be used, although film is less preferred because range calculations must await film development and determination of the focus metric from the developed film or print.
  • electronic image sensors such as a vidicon tube, complementary metal oxide semiconductor (CMOS) devices or, especially, charge-coupled devices (CCDs) are preferred, as these can provide continuous information from which a focus metric and ranges can be calculated.
  • Suitable CCDs are commercially available and include those types that are used in high-end digital photography or high definition television applications.
  • the CCDs may be color or black-and-white, although color CCDs are preferred as they can provide more accurate range information as well as more information about the scene being photographed.
  • the CCDs may also be sensitive to wavelengths of light that lie outside the visible spectrum. For example, CCDs adapted to work with infrared radiation may be desirable for night vision applications. Long wavelength infrared applications are possible using microbolometer sensors and LWIR optics (such as, for example, germanium prisms in the beamsplitter assembly).
  • Particularly suitable CCDs contain from about 500,000 to about 10 million pixels or more, each having a largest dimension of from about 3 to about 20, preferably about 8 to about 13 µm.
  • a pixel spacing of from about 3-30 µm is preferred, with those having a pixel spacing of 10-20 µm being more preferred.
  • Commercially available CCDs that are useful in this invention include Sony's ICX252AQ CCD, which has an array of 2088X1550 pixels, a diagonal dimension of 8.93 mm and a pixel spacing of 3.45 µm; Kodak's KAF-2001CE CCD, which has an array of 1732X1172 pixels, dimensions of 22.5X15.2 mm and a pixel spacing of 13 µm; and the Thomson-CSF TH7896M CCD, which has an array of 1024X1024 pixels and a pixel size of 19 µm.
  • the camera will also include a housing to exclude unwanted light and hold the components in the desired spatial arrangement.
  • the optics of the camera may include various optional features, such as a zoom lens; an adjustable aperture; an adjustable focus; filters of various types; connections to a power supply; light meters; various displays; and the like.
  • Ranges of objects are estimated in accordance with the invention by developing focus metrics from the images projected onto two or more of the image sensors that represent the same angular sector in object space. An estimate of the range of one or more objects within the field of view of the camera is then calculated from the focus metrics.
  • Focus metrics of various types can be used, with several suitable types being described in Krotkov, "Focusing", Int. J. Computer Vision 1:223-237 (1987), incorporated herein by reference, as well as in U.S. Patent No. 5,151,609.
  • a focus metric is developed by examining patches of the various images for their high spatial frequency content. Spatial frequencies up to about 25 lines/mm are particularly useful for developing the focus metric.
  • the preferred method develops a focus metric and range calculation based on blur diameters or blur radii, which can be understood with reference to Figure 6.
  • Distances in Figure 6 are not to scale.
  • B represents a point on a remote object that is at distance x from the focussing means. Light from that object passes through focussing means 2, and is projected onto image sensor 60, which is shown at alternative positions a, b, c and d.
  • when image sensor 60 is at position b,
  • point B is in focus on image sensor 60, and appears essentially as a point.
  • otherwise, point B is imaged as a circle, as shown on the image sensors at positions a, c and d.
  • the radius of this circle is the blur radius, and is indicated for positions a, c and d as rBa, rBc and rBd, respectively. Twice this value is the blur diameter.
  • point objects such as point object B in Figure 6 will appear on the various image sensors as blurred circles of varying radii.
  • This effect is illustrated in Figure 7, which is somewhat idealized for purposes of illustration.
  • an 8 X 8 block of pixels from each of 3 CCDs is represented as 71, 72 and 73, respectively. These three CCDs are adjacent to each other in terms of being at consecutive optical path lengths from the focussing means, with the CCD containing pixel block 72 being intermediate to the others.
  • Each of these 8 X 8 blocks of pixels receives light from the same angular sector in object space.
  • the object is a point source of light that is located at the best focus distance for the CCD containing pixel block 72, in a direction corresponding to the center of the pixel block.
  • Pixel block 72 has an image nearly in sharp focus, whereas the same point image is one step out of focus in pixel blocks 71 and 73.
  • Pixel blocks 74 and 75 represent pixel blocks on image sensors that are one-half step out of focus.
  • the density of points 76 on a particular pixel indicates the intensity of light that pixel receives.
  • when the image is sharply focussed, the light is imaged as high intensities on relatively few pixels.
  • as the focus becomes less sharp, more pixels receive light, but the intensity on any single pixel decreases. If the image is too far out of focus, as in pixel block 71, some of the light is lost to adjoining pixel blocks (points 77).
  • a point object will be imaged as a circle having some minimum blur circle diameter due to imperfections in the equipment and physical limitations related to the wavelength of the light, even when in sharp focus.
  • This limiting spot size can be added to equation (1) as a sum of squares to yield the following relationship:
  • [Equation (2), giving the blur diameter DB with the limiting spot size included, is not legible in this text.]
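  • Since equation (2) is not legible here, the following sketch substitutes the standard thin-lens geometric-blur formula for the blur diameter and adds the limiting spot size in quadrature, as the text describes; the formula and all numeric values are assumptions rather than quotations from the patent:
```python
import math

def blur_diameter_um(x_m, x_focus_m, f_mm, f_number, d_min_um):
    """Approximate blur diameter (micrometres) of a point object at range x_m,
    seen through a lens of focal length f_mm and aperture f/f_number whose
    sensor is in best focus for objects at x_focus_m.  The geometric term is
    the standard thin-lens blur-circle formula (a stand-in for equation (2));
    the limiting spot size d_min_um is added as a sum of squares, as described
    above."""
    aperture_mm = f_mm / f_number
    x_mm, xf_mm = x_m * 1000.0, x_focus_m * 1000.0
    # Geometric blur circle of a thin lens focused at xf_mm, viewing an object at x_mm.
    geometric_um = 1000.0 * aperture_mm * f_mm * abs(x_mm - xf_mm) / (x_mm * (xf_mm - f_mm))
    return math.sqrt(geometric_um ** 2 + d_min_um ** 2)

print(blur_diameter_um(x_m=7.0, x_focus_m=7.5, f_mm=75.0, f_number=2.0, d_min_um=10.0))
```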
  • the distance x of the point object can be calculated from the blur diameters on two image sensors, DBj and DBk, using equation (3).
  • the range x of the object can be determined.
  • the range of an object is determined by identifying on at least two image sensors an area of an image corresponding to a point on said object, calculating the difference in the squares of the blur diameters of the image on each of the image sensors, and calculating the range x from the blur diameters, such as according to equation (3).
  • this invention preferably includes the step of identifying the two image sensors upon which the object is most nearly in focus, and calculating the range of the object from the blur radii on those two image sensors.
  • Electronic image sensors such as CCDs image points as brightness functions.
  • these brightness functions can be modeled as Gaussian functions of the radius of the blur circle.
  • a blur circle can be modeled as a Gaussian peak having a width (σ) equal to the radius of the blur circle divided by the square root of 2 (or the diameter divided by twice the square root of 2). This is illustrated in Figure 6, where blur circles on the image sensors at positions a, c and d are represented as Gaussian peaks.
  • Figure 8 demonstrates how, by using a number of image sensors located at different optical path lengths, point objects at different ranges appear as blur circles of varying diameters on different image sensors.
  • Curves 81-88 represent the values of σ for each of eight image sensors as the distance of the imaged object increases.
  • the data in Figure 8 are calculated for a system of lens and image sensors having focus distances xi in meters of 4.5, 5, 6, 7.5, 10, 15, 30 and ∞, respectively, for the eight image sensors.
  • An object at any distance x within the range of about 4 meters to infinity will be best focussed on the one of the image sensors (or in some cases, two of them) on which the value of σ is least.
  • Line 80 indicates the ⁇ value on each image sensor for an object at a range of 7 meters.
  • a point object at a distance x of 7 meters is best focussed on image sensor 4, where σ is about 14 µm.
  • the same point object is next best focused on image sensor 3, where σ is about 24 µm.
  • any point object located at a distance x of about 4.5 meters to infinity will appear on at least one image sensor with a σ value of between about 7.9 and 15 µm.
  • the image sensor next best in focus will image the object with a σ value of from about 16 to about 32 µm.
  • using equation (4), it is possible to determine the range x of an object by measuring σj and σk, or by measuring σj² - σk².
  • the value of σj² - σk² can be estimated by identifying blocks of pixels on two CCDs that each correspond to a particular angular sector in space containing a given point object, and comparing the brightness information from the blocks of pixels on the two CCDs.
  • a signal can then be produced that is representative of, or can be used to calculate, σj and σk or σj² - σk².
  • a preferred method of comparing the brightness information is through the use of a Discrete Cosine Transformation (DCT) function, such as is commonly used in JPEG, MPEG and Digital Video compression methods.
  • νn,m represents the spatial frequency corresponding to coefficient n,m and L is the length of the square block of pixels.
  • the first of these coefficients (0,0) is the so-called DC term. Except in the unusual case where σ >> L (i.e., the image is far out of focus), the DC term is not used for calculating σj² - σk², except perhaps as a normalizing value.
  • each of the last 63 DCT coefficients can provide an estimate of σj² - σk².
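  • The following sketch shows one way such an estimate can be formed, assuming the Gaussian blur model described above (under which a spatial frequency ν is attenuated by roughly exp(-2π²σ²ν²)) and the usual DCT frequency convention νn,m = sqrt(n² + m²)/(2L); these are standard depth-from-defocus assumptions and are not quotations of the patent's equations (6) and (7):
```python
import math

def dct_frequency(n, m, block_length_mm):
    """Spatial frequency (cycles per mm on the sensor) associated with DCT
    coefficient (n, m) for a square block of side L, using the usual DCT
    convention sqrt(n**2 + m**2) / (2 * L).  (Assumed convention.)"""
    return math.sqrt(n ** 2 + m ** 2) / (2.0 * block_length_mm)

def sigma_sq_difference(S_j, S_k, nu):
    """Estimate sigma_j**2 - sigma_k**2 from one pair of corresponding DCT
    coefficients taken from the same block of pixels on image sensors j and k.
    Assumes Gaussian blur, which attenuates frequency nu by
    exp(-2 * pi**2 * sigma**2 * nu**2), so the log-ratio of the two
    coefficients isolates the difference of the squared blur widths.
    Returns None when the estimate is undefined."""
    if S_j == 0 or S_k == 0 or nu == 0:
        return None
    return math.log(abs(S_k) / abs(S_j)) / (2.0 * math.pi ** 2 * nu ** 2)
```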
  • the most useful DCT coefficients Sn,m are those in which n and m range from 0 to 4, more preferably 0 to 3, provided that n and m are not both 0.
  • the remaining DCT coefficients may be, and preferably are, disregarded in calculating the ranges.
  • each of these DCT coefficients can be used to determine σj² - σk² and calculate the range of the object.
  • multiple DCT coefficients are available for each block of pixels, each of which can be used to provide a separate estimate of σj² - σk².
  • the various values of σj² - σk² are determined and these values are weighted to determine a weighted value for σj² - σk² that is used to compute a range estimate.
  • Various weighting methods can be used. Weighting by the DCT coefficients themselves is preferred, because the ones for which the scene has high contrast will dominate, and these high contrast coefficients are the ones that are most effective for estimating ranges.
  • a particular DCT coefficient is represented by the term S(k,n,m,c), where k designates the particular image sensor, n and m designate the spatial frequency (in terms of the DCT matrix) and c represents the color (red, blue or green).
  • each of the DCT coefficients for image sensor 1 is normalized in block 1002 by dividing it by the absolute value of the DC coefficient for that block of pixels and that color of pixels (when color CCDs are used).
  • σj² - σk² is calculated according to equation 7 using the normalized coefficients for that particular spatial frequency and color. If the weighting factor is zero, σj² - σk² is set to zero. Thus, the output of block 1005 is a series of calculations of σj² - σk² for each spatial frequency and color.
  • all of the separate weighting factors are added to form a composite weight.
  • all of the separate calculations of σj² - σk² from block 1005 are multiplied by their corresponding weights. These products are then added and divided by the composite weight to develop a weighted average calculation of σj² - σk². This weighted average calculation is then used in block 1008 to compute the range x of the object imaged in the block of pixels under examination, using equation 4.
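  • A sketch of the weighted combination just described, reusing the dct_frequency and sigma_sq_difference helpers from the sketch above; the normalisation by the DC term, the weighting by the coefficients themselves and the coefficient cut-off follow the description, while the remaining details are assumptions, and converting the weighted value to a range would use the patent's equation (4), which is not reproduced here:
```python
def weighted_sigma_sq_difference(coeffs_j, coeffs_k, block_length_mm):
    """Combine per-coefficient estimates of sigma_j**2 - sigma_k**2 into one
    weighted average, in the spirit of the Figure 10 pipeline described above.

    coeffs_j, coeffs_k : dicts mapping (n, m, colour) -> DCT coefficient for
                         the same block of pixels on image sensors j and k;
                         (0, 0, colour) is the DC term of each colour plane.
    """
    weighted_sum, weight_sum = 0.0, 0.0
    for (n, m, colour), S_j in coeffs_j.items():
        if (n, m) == (0, 0) or n > 4 or m > 4:
            continue                                  # keep low-frequency AC terms only
        S_k = coeffs_k.get((n, m, colour))
        if S_k is None:
            continue
        dc_j = abs(coeffs_j.get((0, 0, colour), 1.0)) or 1.0
        dc_k = abs(coeffs_k.get((0, 0, colour), 1.0)) or 1.0
        # Normalise each coefficient by the DC term of its own block and colour plane.
        s_j, s_k = S_j / dc_j, S_k / dc_k
        estimate = sigma_sq_difference(s_j, s_k, dct_frequency(n, m, block_length_mm))
        if estimate is None:
            continue
        weight = abs(s_j) + abs(s_k)                  # weight by the coefficients themselves
        weighted_sum += weight * estimate
        weight_sum += weight
    return weighted_sum / weight_sum if weight_sum > 0 else None
```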
  • ranges can be calculated for each object within the field of view of the camera. This information is readily compiled to form a range map.
  • the image sensors provide brightness information to an image processor, which converts that brightness information into a set of signals that can be used to calculate σj² - σk² for corresponding blocks of pixels.
  • light passes through focussing means 2 and is split into substantially identical images by beamsplitter system 1.
  • the images are projected onto image sensors 10a-10h.
  • Each image sensor is in electrical connection with a corresponding edge connector, whereby brightness information from each pixel is transferred via connections to a corresponding image processor 1101-1108.
  • These connections can be of any type that permits accurate transfer of the brightness information, with analog video lines being satisfactory.
  • the brightness information from each image sensor is converted by image processors 1101-1108 into a set of signals, such as DCT coefficients or another type of signal as discussed before. These signals are then transmitted to computer 1109, such as over high-speed serial digital cables 1110, where ranges are calculated as described before.
  • image processors 1101-1108 can be combined with computer 1109 into a single device.
  • a display that can be actuated, such as, for example, operation of a mouse or keyboard, to display a range value on command;
  • a synthesized voice indicating the range of one or more objects;
  • the range information can be combined with angle information derived from the pixel indices to produce three-dimensional coordinates of selected parts of objects in the images. This can be done with all or substantially all of the blocks of pixels to produce a 'cloud' of 3D points, in which each point lies on the surface of some object. Instead of choosing all of the blocks for generating 3D points, it may be useful to select points corresponding to edges. This can be done by selecting those blocks of DCT coefficients with a particularly large sum of squares. Alternatively, a standard edge-detection algorithm, such as the Sobel derivative, can be applied to select blocks that contain edges. See, e.
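  • A sketch of turning one block's range estimate into a 3D point, using a simple pinhole model; the focal length, pixel pitch and small-angle treatment of range are illustrative assumptions, not the patent's values:
```python
def block_to_3d_point(range_m, block_row, block_col, image_shape,
                      block=8, f_mm=75.0, pixel_um=10.0):
    """Convert a per-block range estimate into an (X, Y, Z) point in metres,
    with Z along the optical axis.  The block's centre pixel defines the
    bearing; for small angles the range is taken as the Z coordinate."""
    h, w = image_shape
    # Offset of the block centre from the image centre, in millimetres on the sensor.
    u_mm = ((block_col + 0.5) * block - w / 2.0) * pixel_um / 1000.0
    v_mm = ((block_row + 0.5) * block - h / 2.0) * pixel_um / 1000.0
    # Pinhole model: transverse offset in object space scales as offset / focal length.
    return (range_m * u_mm / f_mm, range_m * v_mm / f_mm, range_m)
```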
  • Another application is in microsurgery, where the range information produced in accordance with the invention is used to guide surgical lasers and other targeted medical devices.
  • Yet another application is in the automated navigation of vehicles such as automobiles.
  • a substantial body of literature has been developed pertaining to automated vehicle navigation and can be referred to for specific methods and approaches to incorporating the range information provided by this invention into a navigational system. Examples of this literature include Advanced Guided Vehicles, Cameron et al., eds., World Scientific Press, Singapore, 1994; Advances in Control Systems and Signal Processing, Vol. 7: Contributions to Autonomous Mobile Systems, I.
  • the AI software mimics certain aspects of human thinking in order to construct a "mental" model of the location of the vehicle on the road, the shape of the road ahead and the location and speed of other vehicles, pedestrians, landmarks, etc., on and near the road.
  • Camera 19 provides much of the information needed to create and frequently update this model.
  • the area-based processing can locate and help to classify objects based on colors and textures as well as edges.
  • the MPEG2 algorithm, if used, can provide velocity information for sections of the image that can be used by vehicle navigation computer 1207, in addition to the range and bearing information provided by the invention, to improve the dynamic accuracy of the AI model.
  • Additional inputs into the AI computer might include, for example, speed and mileage information, position sensors for vehicle controls and camera controls, a Global Positioning System receiver, and the like.
  • the AI software should operate the vehicle in a safe and predictable manner, in accordance with the traffic laws, while accomplishing the transportation objective.
  • the range information generated according to this invention can be used to identify portions of the image in which the imaged objects fall within a certain set of ranges.
  • the portion of the digital stream that represents these portions of the image can be identified by virtue of the calculated ranges and used to replace a portion of the digital stream of some other image. The effect is one of superimposing part of one image over another.
  • a composite image of a broadcaster in front of a remote background can be created by recording the video image of the broadcaster in front of a set, using the camera of the invention.
  • portions of the video image that correspond to the broadcaster can be identified because the range of the broadcaster will be different from that of the set.
  • a digital stream of some other background image is separately recorded in digital form.
  • a composite image is made which displays the broadcaster seemingly in front of the remote background. It will be readily apparent that the range information can be used in a similar manner to create a large number of video special effects.
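  • A sketch of the depth-keyed compositing described above; the block size and the use of a per-block range map are assumptions carried over from the earlier sketches:
```python
import numpy as np

def depth_key_composite(foreground, background, block_ranges, near_m, far_m, block=8):
    """Keep the parts of 'foreground' whose estimated range lies between near_m
    and far_m (e.g. the broadcaster) and fill everything else from 'background'.

    foreground, background : arrays of identical shape, e.g. (H, W, 3)
    block_ranges           : per-block range estimates, shape (H//block, W//block)
    """
    out = background.copy()
    rows, cols = block_ranges.shape
    for by in range(rows):
        for bx in range(cols):
            if near_m <= block_ranges[by, bx] <= far_m:
                out[by * block:(by + 1) * block, bx * block:(bx + 1) * block] = \
                    foreground[by * block:(by + 1) * block, bx * block:(bx + 1) * block]
    return out
```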
  • the estimated "corrected" coefficients then can be used to create a deblurred image.
  • the corrected image is assembled from the sections of corrected coefficients that are potentially derived from all the source ranges, where the sharpest images are used in each case. If all the objects in the field of view are at distances greater than or equal to the smallest xi and less than or equal to the largest xi, then the corrected image will be nearly in perfect focus almost everywhere.
  • the only significant departures from perfect focus will be cases where a section of pixels straddles two or more objects that are at very different distances. In such cases at least part of the section will be out of focus.
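  • A simplified sketch of assembling such a corrected image: the patent describes correcting the transform coefficients, but selecting, for every block, the pixels from whichever sensor is sharpest there gives the same flavour of result and is easier to show briefly:
```python
import numpy as np

def all_in_focus(images, block_sharpness, block=8):
    """Assemble a nearly all-in-focus image from the differently focused images
    by copying each block of pixels from the image sensor that is sharpest for
    that block.

    images          : list of K arrays of identical shape (H, W)
    block_sharpness : array of shape (K, H//block, W//block), e.g. a focus
                      metric evaluated per sensor and per block
    """
    out = np.zeros_like(images[0])
    best = np.argmax(block_sharpness, axis=0)          # index of the sharpest sensor
    for by in range(best.shape[0]):
        for bx in range(best.shape[1]):
            k = int(best[by, bx])
            out[by * block:(by + 1) * block, bx * block:(bx + 1) * block] = \
                images[k][by * block:(by + 1) * block, bx * block:(bx + 1) * block]
    return out
```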
  • the invention may be very useful in microscopy, because most microscopes are severely limited in depth of field.
  • the invention permits one to use a long lens to frame a distant subject in a foreground object such as a doorway.
  • the invention permits one to create an image in which the doorway and the subject are both in focus. Note that this can be achieved using a wide aperture, which ordinarily creates a very small depth of field.
  • a specialist called a focus puller has the job of adjusting the focus setting of the lens during the shot to shift the emphasis from one part of the scene to another.
  • the focus is often thrown back and forth between two actors, one in the foreground and one in the background, according to which one is delivering lines.
  • Another example is follow focus, an example of which is an actor walking toward the camera on a crowded city sidewalk. It is desired to keep the actor in focus as the center of attention of the scene.
  • the work of the focus puller is somewhat hit or miss, and once the scene is put onto film or tape, there is little that can be done to change or sharpen the focus.
  • Conventional editing techniques make it possible to artificially blur portions of the image, but not to make them significantly sharper.
  • the invention can be used as a tool to increase creative control by allowing the focus and depth of field to be determined in post-production.
  • These parameters can be controlled by first synthesizing a fully sharp image, as described above, and then computing the appropriate MTF for each part of the image and applying it to the transform coefficients (i.e., DCT coefficients).

Abstract

Range estimates are made using a passive technique. Light is focussed and then split into multiple beams. These beams are projected onto multiple image sensors (10a-g), each of which is located at a different optical path from the focussing system. By measuring the degree to which point objects are blurred on at least two of the image sensors, information is obtained that permits the calculation of the ranges of objects within the field of view of the camera. A unique beamsplitting system permits multiple, substantially identical images to be projected onto multiple image sensors (10a-g) using minimal overall physical distances, thus minimizing the size and weight of the camera. This invention permits ranges to be calculated continuously and in real time, and is suitable for measuring the ranges of objects in both static and nonstatic situations.

Description

APPARATUS AND METHOD FOR DETERMINING THE RANGE OF REMOTE OBJECTS
BACKGROUND OF THE INVENTION
The present invention relates to apparatus and methods for optical image acquisition and analysis. In particular, it relates to passive techniques for measuring the range of objects. In many fields such as robotics, autonomous land vehicle navigation, surveying and virtual reality modeling, it is desirable to rapidly measure the locations of all of the visible objects in a scene in three dimensions. Conventional passive image acquisition and processing techniques are effective for determining the bearings of objects, but do not adequately provide range information. Various active techniques are used for determining the range of objects, including radar, sonar, scanned laser and structured light methods. These techniques all involve transmitting energy to the object and monitoring the reflection of that energy. These methods have several shortcomings. They often fail when the object does not reflect the transmitted energy well or when the ambient energies are too high. Production of the transmitted energy requires special hardware that consumes power and is often expensive and failure prone. When several systems are operating in close proximity, the possibility of mutual interference exists. Scanned systems can be slow. Sonar is prone to errors caused by wind. Most of these active systems do not produce enough information to identify objects.
Range information can be obtained using a conventional camera, if the object or the camera is moving in a known way. The motion of the image in the field of view is compared with motion expected for various ranges in order to infer the range. However, the method is useful only in limited circumstances. Other approaches make use of passive optical techniques. These generally break down into stereo and focus methods. Stereo methods mimic human stereoscopic vision, using images from two cameras to estimate range. Stereo methods can be very effective, but they suffer from a problem in aligning parts of images from the two cameras. In cluttered or repetitive scenes, such as those containing soil or vegetation, the problem of determining which parts of the images from the two cameras to align with each other can be intractable. Image features such as edges that are coplanar with the line segment connecting the two lenses cannot be used for stereo ranging.
Focus techniques can be divided into autofocus systems and range mapping systems. Autofocus systems are used to focus cameras at one or a few points in the field of view. They measure the degree of blur at these points and drive the lens focus mechanism until the blur is minimized. While these can be quite sophisticated, they do not produce the point-by-point range mapping information that is needed in some applications. In focus-based range mapping systems, multiple cameras or multiple settings of a single camera are used to make several images of the same scene with differing focus qualities. Sharpness is measured across the images and a point-by-point comparison of the sharpness between the images is made in a way that the effect of the scene contrast cancels out. The remaining differences in sharpness indicate the distance of the objects at the various points in the images.
The pioneering work in this field is a paper by Pentland. He describes a range mapping system using two or more cameras with differing apertures to obtain simultaneous images. A bulky beamsplitter/mirror apparatus is placed in front of the cameras to ensure that they have the same view of the scene. This multiple camera system is too costly, heavy, and limited in power to find widespread use.
In U. S. Pat. 5,365,597, Holeva describes a system of dual camera optics in which a beamsplitter is used within the lens system to simplify the optical design. This is an improvement on Pentland's use of completely separate optics, but still includes some unnecessary duplication in order to provide for multiple aperture settings as Pentland proposed.
Another improvement of Pentland's multiple camera method is described by Nourbakhsh et al. (U.S. Pat. 5,793,900). Nourbakhsh et al. describe a system using three cameras with different focus distance settings, rather than different apertures as in Pentland's presentation. This system allows for rapid calculation of ranges, but sacrifices range resolution in order to do so. The use of multiple sets of optics tends to make the camera system heavy and expensive. It is also difficult to synchronize the optics if overall focus, zoom, or iris need to be changed. The beamsplitters themselves must be large since they have to be sized to full aperture and field of view of the system. Moreover, the images formed in this way will not be truly identical due to manufacturing variations between the sets of optics.
An alternative method that uses only a single camera is described by Nakagawa et al. in U.S. Pat. No. 5,151,609. This approach is intended for use with a microscope. In this method, the object under consideration rests on a platform that is moved in steps toward or away from the camera. A large number of images can be obtained in this way, which increases the rangefinding power relative to Pentland's method. In a related variation, the camera and the object are kept fixed and the focus setting of the lens is changed step-wise. However, this method is not suitable when the object or camera is moving, since comparison between images taken at different times would be very difficult. Even in a static situation, such as a surveying application, the time to complete the measurement could be excessive. Even if the scene and the camera location and orientation are static, the acquisition of multiple images by changing the camera settings is time consuming and introduces problems of control, measurement, and recording of the camera parameters to associate with the images. Also, changing the focus setting of a lens may cause the image to shift laterally if the lens rotates during the focus change and the optical axis and the rotation axis are not in perfect alignment.
Thus, it would be desirable to provide a simplified method by which ranges of objects can be determined rapidly and accurately under a wide variety of conditions. In particular, it would be desirable to provide a method by which range-mapping for substantially all objects in the field of view of a camera can be provided rapidly and accurately. It would be especially desirable if such range-mapping can be performed continuously and in real time. It is further desirable to perform this range-finding using relatively simple, portable equipment.
SUMMARY OF THE INVENTION
In one aspect, this invention is a camera comprising
(a) a focusing means
(b) multiple image sensors which receive two-dimensional images, said image sensors each being located at different optical path lengths from the focusing means, and
(c) a beamsplitting system for splitting light received through the focusing means into three or more beams and projecting said beams onto multiple image sensors to form multiple, substantially identical images on said image sensors.
The focussing means is, for example, a lens or focussing mirror. The image sensors are, for example, photographic film, a CMOS device, a vidicon tube or a CCD, as described more fully below. The image sensors are adapted (together with optics and beamsplitters) so that each receives an image corresponding to at least about half, preferably most and most preferably substantially all of the field of view of the camera. The camera of the invention can be used as described herein to calculate ranges of objects within its field of view. The camera simultaneously creates multiple, substantially identical images which are differently focussed and thus can be used for range determinations. Furthermore, the images can be obtained without any changes in camera position or camera settings. In a second aspect, this invention is a method for determining the range of an object, comprising
(a) framing the object within the field of view of a camera having a focusing means
(b) splitting light received through and focussed by the focusing means and projecting substantially identical images onto multiple image sensors that are each located at different optical path lengths from the focusing means,
(c) for at least two of said multiple image sensors, identifying a section of said image that includes at least a portion of said object, and for each of said sections, calculating a focus metric indicative of the degree to which said section of said image is in focus on said image sensor, and (d) calculating the range of the object from said focus metrics.
This aspect of the invention provides a method by which ranges of individual objects, or a range map of all objects within the field of view of the camera can be made quickly and, in preferred embodiments, continuously or nearly continuously. The method is passive and allows the multiple images that form the basis of the range estimation to be obtained simultaneously without moving the camera or adjusting camera settings.
In a third aspect, this invention is a beamsplitting system for splitting a focused light beam through n levels of splitting to form multiple, substantially identical images, comprising (a) an arrangement of 2^n − 1 beamsplitters which are each capable of splitting a focused beam of incoming light into two beams, said beamsplitters being hierarchically arranged such that said focussed light beam is divided into 2^n beams, n being an integer of 2 or more.
This beamsplitting system produces multiple, substantially identical images that are useful for range determinations, among other uses. The hierarchical design allows for short optical path lengths as well as small physical dimensions. This permits a camera to frame a wide field of view, and reduces overall weight and size.
In a fourth aspect, this invention is a method for determining the range of an object, comprising
(a) framing the object within the field of view of a camera having a focusing means,
(b) splitting light received through and focussed by the focusing means and projecting substantially identical images onto multiple image sensors that are each located at a different optical path length from the focusing means, (c) for at least two of said multiple image sensors, identifying a section of said image that includes at least a portion of said object, and for each of said sections, determining the difference in squares of the blur radii or blur diameter for a point on said object, and (d) determining the range of the object based on the difference in the squares of the blur radii or blur diameter.
As with the second aspect, this aspect provides a method by which rapid and continuous or nearly continuous range information can be obtained without moving the camera or adjusting camera settings.
In a fifth aspect, this invention is a method for creating a range map of objects within a field of view of a camera, comprising
(a) framing an object space within the field of view of a camera having a focusing means,
(b) splitting light received through and focussed by the focusing means and projecting substantially identical images onto multiple image sensors that are each located at a different optical path length from the focusing means,
(c) for at least two of said multiple image sensors, identifying sections of said images that correspond to substantially the same angular sector of the object space,
(d) for each of said sections, calculating a focus metric indicative of the degree to which said section of said image is in focus on said image sensor,
(e) calculating the range of an object within said angular sector of the object space from said focus metrics, and
(f) repeating steps (c) - (e) for all sections of said images.
This aspect permits the easy and rapid creation of range maps for objects within the field of view of the camera. In a sixth aspect, this invention is a method for determining the range of an object, comprising
(a) forming at least two substantially identical images of at least a portion of said object on one or more image sensors, where said substantially identical images are focussed differently;
(b) for sections of said substantially identical images that correspond to substantially the same angular sector in object space and include an image of at least a portion of said object, analyzing the brightness content of each image at one or more spatial frequencies by performing a discrete cosine transformation to calculate a focus metric, and
(c) calculating the range of the object from the focus metrics.
This aspect of the invention allows range information to be made from substantially identical images of a scene that differ in their focus, using an algorithm of a type that is incorporated into common processing devices such as JPEG, MPEG2 and Digital Video processors. In this aspect, the images are not necessarily taken simultaneously, provided that they differ in focus and the scene is static. Thus, this aspect of the invention is useful with cameras of various designs and allows range estimates to be formed using conveniently available cameras and processors.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is an isometric view of an embodiment of the camera of the invention. Fig. 2 is a cross-section view of an embodiment of the camera of the invention.
Fig. 3 is a cross-section view of a second embodiment of the camera of the invention.
Fig. 4 is a cross-section view of a third embodiment of the camera of the invention.
Fig. 5 is a diagram of an embodiment of a lens system for use in the invention. Fig. 6 is a diagram illustrating the relationship of blur diameters and corresponding Gaussian brightness distributions to focus.
Fig. 7 is a diagram illustrating the blurring of a spot object with decreasing focus.
Fig. 8 is a graph demonstrating, for one embodiment of the invention, the variation of the blur radius of a point object as seen on several image sensors as the distance of the point object changes.
Fig. 9 is a graph illustrating the relationship of Modulation Transfer Function to spatial frequency and focus.
Fig. 10 is a block diagram showing the calculation of range estimates in one embodiment of the invention.
Fig. 11 is a schematic diagram of an embodiment of the invention. Fig. 12 is a schematic diagram showing the operation of a vehicle navigation system using the invention.
DETAILED DESCRIPTION OF THE INVENTION
In this invention, the range of one or more objects is determined by bringing the object within the field of view of a camera. The incoming light enters the camera through a focussing means as described below, and is then passed through a beamsplitter system that divides the incoming light and projects it onto multiple image sensors to form substantially identical images. Each of the image sensors is located at a different optical path length from the focussing means. The "optical path length" is the distance light must travel from the focussing means to a particular image sensor, divided by the refractive index of the medium it traverses along the path. Sections of two or more of the images that correspond to substantially the same angular sector in object space are identified. For each of these corresponding sections, a focus metric is determined that is indicative of the degree to which that section of the image is in focus on that particular image sensor. Focus metrics from at least two different image sensors are then used to calculate an estimate of the range of an object within that angular sector of the object space. By repeating the process of identifying corresponding sections of the images, calculating focus metrics and calculating ranges, a range map can be built up that identifies the range of each object within the field of view of the camera.
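This overall procedure can be summarized in a short sketch. The code below is purely illustrative and not part of the disclosed embodiments; the helper functions focus_metric and range_from_metrics are placeholders standing for any of the specific metrics and range formulas described in the remainder of this section.

def build_range_map(images, focus_metric, range_from_metrics, block_size=8):
    # Compare focus metrics for corresponding blocks of substantially
    # identical images and estimate a range for each block.
    rows, cols = images[0].shape
    range_map = {}
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            # One focus metric per image sensor for this angular sector
            metrics = [focus_metric(img[r:r + block_size, c:c + block_size])
                       for img in images]
            range_map[(r, c)] = range_from_metrics(metrics)
    return range_map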
As used in this application, "substantially identical images" are images that are formed by the same focussing means and are the same in terms of field of view, perspective and optical qualities such as distortion and focal length. Although the images are formed simultaneously when made using the beamsplitting method described herein, images that are not formed simultaneously may also be considered to be "substantially identical", if the scene is static and the images meet the foregoing requirements. The images may differ slightly in overall brightness, color balance and polarization. Images that are different only in that they are reversed (i.e., mirror images) can be considered "substantially identical" within the context of this invention. Similarly, images received by the various image sensors that are focussed differently on account of the different optical path lengths to the respective image sensors, but are otherwise the same (except for reversals and/or small brightness changes, or differences in color balance and polarization as mentioned above), are considered to be "substantially identical" within the context of this invention.
In Figure 1, camera 19 includes an opening 800 through which focussed light enters the camera. A focussing means (not shown) will be located over opening 800 to focus the incoming light. The camera includes a beamsplitting system that projects the focussed light onto image sensors 10a-10g. The camera also includes a plurality of openings, such as opening 803, through which light passes from the beamsplitter system to the image sensors. As is typical with most cameras, the internal light paths and image sensors are shielded from ambient light. Covering 801 in Figure 1 performs this function and can also serve to provide physical protection, hold the various elements together and house other components.
Figure 2 illustrates the placement of the image sensors in more detail, for one embodiment of the invention. Camera 19 includes a beamsplitting system 1, a focussing means represented by box 2 and, in this embodiment, eight image sensors 10a-10h. Light enters beamsplitting system 1 through focussing means 2 and is split as it travels through beamsplitting system 1 so as to project substantially identical images onto image sensors 10a-10h. In the embodiment shown in Figure 2, multiple image generation is accomplished through a number of partially reflective surfaces 3-9 that are oriented at an angle to the respective incident light rays, as discussed more fully below. Each of the images is then projected onto one of image sensors 10a-10h. Each of image sensors 10a-10h is spaced at a different optical path length (Da-Dh, respectively) from focussing means 2. In Figure 2, the paths of the various central light rays through the camera are indicated by dotted lines, whose lengths are indicated as D1 through D25. Intersecting dotted lines indicate places at which beam splitting occurs. Thus, in the embodiment shown, image sensor 10a is located at an optical path length Da, wherein
Da = D1/n12 + D2/n13 + D3/n13 + D4/n16 + D5/n16. Similarly,
Db = D1/n12 + D2/n13 + D3/n13 + D4/n16 + D6/n17 + D7/n11b,
Dc = D1/n12 + D2/n13 + D8/n14 + D9/n18 + D10/n18 + D11/n11c,
Dd = D1/n12 + D2/n13 + D8/n14 + D9/n18 + D12/n19 + D13/n11d,
De = D1/n12 + D14/n12 + D15/n12 + D16/n14 + D17/n11e,
Df = D1/n12 + D14/n12 + D15/n12 + D18/n12 + D19/n11f,
Dg = D1/n12 + D14/n12 + D20/n15 + D21/n20 + D22/n21 + D23/n11g, and
Dh = D1/n12 + D14/n12 + D20/n15 + D21/n20 + D24/n20 + D25/n11h,
where n11b-n11h and n12-n21 are the indices of refraction of spacers 11b-11h and prisms 12-21, respectively. As shown, Da < Db < Dc < Dd < De < Df < Dg < Dh.
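As an arithmetic illustration of these expressions (a sketch only; the segment lengths and the BK7-like index below are assumed values, not dimensions of the Figure 2 embodiment), each optical path length is the sum of the geometric segment lengths, each divided by the refractive index of the medium it crosses:

def optical_path_length(segments):
    # Optical path length as defined above: each geometric segment length is
    # divided by the refractive index of the medium it traverses, then summed.
    return sum(d / n for d, n in segments)

n_glass = 1.517    # assumed index for BK7-type prisms
n_spacer = 1.517   # assumed index for an index-matched spacer

# Hypothetical segment lengths in mm (not the dimensions of Figure 2).
D_a = optical_path_length([(10.0, n_glass), (12.0, n_glass), (8.0, n_glass),
                           (9.0, n_glass), (7.0, n_glass)])                   # D1..D5
D_b = optical_path_length([(10.0, n_glass), (12.0, n_glass), (8.0, n_glass),
                           (9.0, n_glass), (7.0, n_glass), (0.8, n_spacer)])  # D1..D4, D6, D7
# With these illustrative numbers the path up to spacer 11b equals D_a, and the
# spacer term D7 supplies the extra optical path length, so D_b > D_a.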
Typically, the camera of the invention will be designed to provide range information for objects that are within a given set of distances ("operating limits").
The operating limits may vary depending on particular applications. The longest of the optical path lengths (Dh in Figure 2) will be selected in conjunction with the focussing means so that objects located near the lower operating limit (i.e., closest to the camera) will be in focus or nearly in focus at the image sensor located farthest from the focussing means (image sensor 10h in Figure 2). Similarly, the shortest optical path length (Da in Figure 2) will be selected so that objects located near the upper operating limit (i.e., farthest from the camera) will be in focus or nearly in focus at the image sensor located closest to the focussing means (image sensor 10a in Figure 2). Although the embodiment shown in Figure 2 splits the incoming light into eight images, it is sufficient for estimating ranges to create as few as two images and as many as 64 or more. In theory, increasing the number of images (and corresponding image sensors) permits greater accuracy in range calculation.
However, intensity is lost each time a beam is split, so the number of useful images that can be created is limited. In practice, good results can be obtained by creating as few as three images, preferably at least four images, more preferably about 8 images to about 32 images, more preferably about 16 images. Creating about 8 images is most preferred.
Figure 2 illustrates a preferred binary cascading method of generating multiple images. In this method, light entering the beamsplitter system is divided into two substantially identical images, each of which is divided again into two to form a total of four substantially identical images. To make more images, each of the four substantially identical images is again split into two, and so forth until the desired number of images has been created. In this embodiment, the number of times a beam is split before reaching an image sensor is n, and the number of created images is 2^n. The number of individual surfaces at which splitting occurs is 2^n − 1. Thus, in Figure 2, light enters beamsplitter system 1 from focussing means 2 and contacts partially reflective surface 3. As shown, partially reflective surface 3 is oriented at 45° to the path of the incoming light, and is partially reflective so that a portion of the incoming light passes through and most of the remainder of the incoming light is reflected at an angle. In this manner, two beams are created that are oriented at an angle to each other. These two beams contact partially reflective surfaces 4 and 7, respectively, where they are each split a second time, forming four beams. These four beams then contact partially reflective surfaces 5, 6, 8 and 9, where they are each split again to form the eight beams that are projected onto image sensors 10a-10h. The splitting is done such that the images formed on the image sensors are substantially identical as described before. If desired, additional partially reflective surfaces can be used to further subdivide each of these eight beams, and so forth one or more additional times until the desired number of images is created. It is most preferred that each of partially reflective surfaces 3-9 reflect and transmit approximately equal amounts of the incoming light. To minimize overall physical distances, the angle of reflection is in each case preferably about 45°.
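As a check on this counting (an illustrative sketch only, not part of the specification): n levels of splitting give 2^n output images and require 2^n − 1 partially reflective surfaces, and, if every surface divides the light equally and losses are ignored, each image receives roughly 1/2^n of the incoming intensity.

def cascade_counts(n_levels):
    # Counts for an ideal binary beamsplitting cascade, assuming lossless
    # 50/50 splits at every partially reflective surface.
    images = 2 ** n_levels
    surfaces = 2 ** n_levels - 1
    intensity_fraction = 1.0 / images
    return images, surfaces, intensity_fraction

print(cascade_counts(3))  # -> (8, 7, 0.125): eight sensors and seven surfaces, as in Figure 2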
The preferred binary cascading method of producing multiple substantially identical images allows a large number of images to be produced using relatively short overall physical distances. This permits less bulky, lighter weight equipment to be used, which increases the ease of operation. Having shorter path lengths also permits the field of view of the camera to be maximized without using supplementary optics such as a retrofocus lens. Partially reflective surfaces 3-9 are at fixed physical distances and angles with respect to focussing means 2. Two preferred means for providing the partially reflective surfaces are prisms having partially reflective coatings on appropriate faces, and pellicle mirrors. In the embodiment shown in Figure 2, partially reflective surface 3 is formed by a coating on one face of prism 12 or 13. Similarly, partially reflective surface 4 is formed by a coating on a face of prism 13 or 14, partially reflective surface 8 is formed by a coating on a face of prism 12 or 14, and partially reflective surfaces 5, 6, 7 and 9 are formed by coatings on the bases of prisms 16 or 17, 18 or 19, 12 or 15, and 20 or 21, respectively. As shown, prisms 13-21 are right triangular in cross-section and prism 12 is trapezoidal in cross-section. However, two or more of the prisms can be made as a single piece, particularly when no partially reflective coating is present at the interface. For example, prisms 12 and 14 can form a single piece, as can prisms 15 and 20, 13 and 16, and 14 and 18.
To reduce lateral chromatic aberration and standardize the physical path lengths, it is preferred that the refractive index of each of prisms 12-21 be the same. Any optical glass such as is useful for making lenses or other optical equipment is a useful material of construction for prisms 12-21. The most preferred glasses are those with low dispersion. An example of such a low dispersion glass is crown glass BK7. For applications over a wide range of temperatures, a glass with a low thermal expansion coefficient, such as fused quartz, is preferred. Fused quartz also has low dispersion, and does not turn brown when exposed to ionizing radiation, which may be desirable in some applications.
If a particularly wide field of view is required, prisms having relatively high indices of refraction can be used. This has the effect of providing shorter optical path lengths, which permits a shorter focal length while retaining the physical path length and the transverse dimensions of the image sensors. This combination increases the field of view. This tends to increase the overcorrected spherical aberration and may tend to increase the overcorrected chromatic aberration introduced by the materials of manufacture of the prisms. However, these aberrations can be corrected by the design of the focusing means, as discussed below.
Suitable partially reflective coatings include metallic, dielectric and hybrid metallic/dielectric coatings. The preferred type of coating is a hybrid metallic/dielectric coating which is designed to be relatively insensitive to polarization and angle of incidence over the operating range of wavelength. Metallic-type coatings are less suitable because the reflection and transmission coefficients for the two polarization directions are unequal. This causes the individual beams to have significantly different intensities following two or more splittings. In addition, metallic-type coatings dissipate a significant proportion of the light energy as heat. Dielectric-type coatings are less preferred because they are sensitive to the angle of incidence and polarization. When a dielectric coating is used, a polarization rotating device such as a half-wave plate or a circularly polarizing quarter-wave plate can be placed between each pair of partially reflecting surfaces in order to compensate for the polarization effects of the coatings. If desired, a polarization rotating or circularizing device can also be used in the case of metallic-type coatings.
The beamsplitting system will also include a means for holding the individual partially reflective surfaces in position with respect to each other. Suitable such means may be any kind of mechanical means, such as a case, frame or other exterior body that is adapted to hold the surfaces in fixed positions with respect to each other. When prisms are used, the individual prisms may be cemented together using any type of adhesive that is transparent to the wavelengths of light being monitored. A preferred type of adhesive is an ultraviolet-cure epoxy with an index of refraction matched to that of the prisms. Figure 3 illustrates how prism cubes such as are commercially available can be assembled to create a beamsplitter equivalent to that shown in Figure 2. Beamsplitter system 30 is made up of prism cubes 31-37, each of which contains a diagonally oriented partially reflecting surface (38a-g, respectively). Focussing means 2, spacers 11a-11h and image sensors 10a-10h are as described in Figure 2. As before, the individual prism cubes are held in position by mechanical means, cementing, or another suitable method.
Figure 4 illustrates another alternative beamsplitter design, which is adapted from beamsplitting systems that are used for color separations, as described by Ray in Applied Photographic Optics, Second Ed., 1994, p. 560 (Fig. 68.2). In Figure 4, incoming light enters the beamsplitter system through focussing means 2 and impinges upon partially reflective surface 41. A portion of the light (the path of the light being indicated by the dotted lines) passes through partially reflective surface 41 and impinges upon partially reflective surface 43. Again, a portion of this light passes through partially reflective surface 43 and strikes image sensor 45. The portion of the incoming light that is reflected by partially reflective surface 41 strikes reflective surface 42 and is reflected onto image sensor 44. The portion of the light that is reflected by partially reflective surface 43 strikes a reflective portion of surface 41 and is reflected onto image sensor 46. Image sensors 44, 45 and 46 are at different optical path lengths from focussing means 2, i.e.,
D60/n60 + D61/n61 + D62/n62 ≠ D60/n60 + D63/n63 + D64/n64 ≠ D60/n60 + D63/n63 + D65/n65 + D66/n66,
where n60-n66 represent the refractive indices along distances D60-D66, respectively. It is preferred that the proportion of light that is reflected at surfaces 41 and 43 be such that images of approximately equal intensity reach each of image sensors 44, 45 and 46. Although specific beamsplitter designs are provided in Figures 2, 3 and 4, the precise design of the beamsplitter system is not critical to the invention, provided that the beamsplitter system delivers substantially identical images to multiple image sensors located at different path lengths from the focussing means. The embodiment in Figure 2 also incorporates a preferred means by which the image sensors are held at varying distances from the focussing means. In Figure 2, the various image sensors 10b-10h are held apart from beamsplitter system 1 by spacers 11b-11h, respectively. Spacers 11b-11h are transparent to light, thereby permitting the various beams to pass through them to the corresponding image sensor. Thus, the spacer can be a simple air gap or another material that preferably has the same refractive index as the prisms. The use of spacers in this manner has at least two benefits. First, the thickness of the spacers can be changed in order to adjust the operating limits of the camera, if desired. Second, the use of spacers permits the beamsplitter system to be designed so that the optical path length from the focussing means (i.e., the point of entrance of light into the beamsplitting system) to each spacer is the same, with the difference in total optical path length (from focussing means to image sensor) being due entirely to the thickness of the spacer. This allows for simplification in the design of the beamsplitter system.
Thus, in the embodiment shown in Figure 2,
D1/n12 + D2/n13 + D3/n13 + D4/n16 + D5/n16 = D1/n12 + D2/n13 + D3/n13 + D4/n16 + D6/n17 = D1/n12 + D2/n13 + D8/n14 + D9/n18 + D10/n18 = D1/n12 + D2/n13 + D8/n14 + D9/n18 + D12/n19 = D1/n12 + D14/n12 + D15/n12 + D16/n14 = D1/n12 + D14/n12 + D15/n12 + D18/n12 = D1/n12 + D14/n12 + D20/n15 + D21/n20 + D22/n21 = D1/n12 + D14/n12 + D20/n15 + D21/n20 + D24/n20,
and the thicknesses of spacers 11b-11h (D7, D11, D13, D17, D19, D23 and D25, respectively) are all unique values, with the refractive indices of the spacers all being equal values.
Of course, a spacer may be provided for image sensor 10a if desired. An alternative arrangement is to use materials having different refractive indices as spacers 11b-11h. This allows the thicknesses of spacers 11b-11h to be the same or more nearly the same, while still providing different optical path lengths.
In another preferred embodiment, the various optical path lengths (Da - Dh in Figure 2) differ from each other in constant increments. Thus, if the lengths of the shortest two optical path lengths differ by a distance X, then it is preferred that the differences in length between the shortest optical path length and any other optical path length be mX, where m is an integer from 2 to the number of image sensors minus one. In the embodiment shown in Figure 2, this is accomplished by making the thickness of spacer 11b equal to X, and those of spacers 11c-11h from 2X to 7X, respectively. As mentioned before, the thickness of spacer 11h should be such that objects which are at the closest end of the operating range are in focus or nearly in focus on image sensor 10h. Similarly, Da (= D1/n12 + D2/n13 + D3/n13 + D4/n16 + D5/n16) should be such that objects which are at the farthest end of the operating range are in focus or nearly in focus on image sensor 10a.
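The constant-increment arrangement can be illustrated with a brief sketch. The increment X, the spacer index and the common base path below are assumed values, not dimensions taken from the specification.

n_spacer = 1.517   # assumed spacer index, matched to the prisms
X = 0.5            # hypothetical constant increment in optical path length, mm
base_path = 40.0   # hypothetical common optical path length up to each spacer, mm

# Optical path length of a spacer is its geometric thickness divided by its
# index, so a spacer that adds m*X of optical path is m*X*n_spacer thick.
spacer_thicknesses = [m * X * n_spacer for m in range(1, 8)]            # spacers 11b..11h, mm
optical_paths = [base_path] + [base_path + m * X for m in range(1, 8)]  # Da, Db, ..., Dh in mm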
Focussing means 2 is any device that can focus light from a remote object being viewed onto at least one of the image sensors. Thus, focussing means 2 can be a single lens, a compound lens system, a mirror lens (such as a Schmidt-Cassegrain mirror lens), or any other suitable method of focussing the incoming light as desired. If desired, a zoom lens, telephoto or wide angle lens can be used. The lens will most preferably be adapted to correct any aberration introduced by the beamsplitter. In particular, a beamsplitter as described in Figure 2 will function optically much like a thick glass spacer, and when placed in a converging beam, will introduce overcorrected spherical and chromatic aberrations. The focussing means should be designed to compensate for these.
Similarly, it is preferred to use a compound lens that corrects for aberration caused by the individual lenses. Techniques for designing focussing means, including compound lenses, are well known and described, for example, in Smith, "Modern Lens Design", McGraw-Hill, New York (1992). In addition, lens design software programs can be used to design the focussing system, such as OSLO Light (Optics Software for Layout and Optimization), Version 5, Revision 5.4, available from Sinclair Optics, Inc. The focussing means may include an adjustable aperture. However, more accurate range measurements can be made when the depth of field is small. Accordingly, it is preferable that a wide aperture be used. One corresponding to an f-number of about 5.6 or less, preferably 4 or less, more preferably 2 or less, is especially suitable. A particularly suitable focussing means is a 6-element Biotar (also known as double Gauss-type) lens. One embodiment of such a lens is illustrated in Figure 5, and is designed to correct the aberrations created with a beamsplitter system as shown in Figure 2, which are equivalent to those created by a 75 mm plate of BK7 glass. Biotar lens 50 includes lens 51 having surfaces L1 and L2 and thickness d1; lens 52 having surfaces L3 and L4 and thickness d3; lens 53 having surfaces L5 and L6 and thickness d4; lens 54 having surfaces L7 and L8 and thickness d6; lens 55 having surfaces L9 and L10 and thickness d7; and lens 56 having surfaces L11 and L12 and thickness d9. Lenses 51 and 52 are separated by distance d2, lenses 53 and 54 are separated by distance d5, and lenses 55 and 56 are separated by distance d8. Lens pairs 52-53 and 54-55 are cemented doublets. Parameters of this modified lens are summarized in the following table:
(Table of lens parameters not reproduced.)
Image sensors 10a-10h can be any devices that record the incoming image in a manner that permits calculation of a focus metric that can in turn be used to calculate an estimate of range. Thus, photographic film can be used, although film is less preferred because range calculations must await film development and determination of the focus metric from the developed film or print. For this reason, it is more preferred to use electronic image sensors such as a vidicon tube, complementary metal oxide semiconductor (CMOS) devices or, especially, charge-coupled devices (CCDs), as these can provide continuous information from which a focus metric and ranges can be calculated. CCDs are particularly preferred. Suitable CCDs are commercially available and include those types that are used in high-end digital photography or high definition television applications. The CCDs may be color or black-and-white, although color CCDs are preferred as they can provide more accurate range information as well as more information about the scene being photographed. The CCDs may also be sensitive to wavelengths of light that lie outside the visible spectrum. For example, CCDs adapted to work with infrared radiation may be desirable for night vision applications. Long wavelength infrared applications are possible using microbolometer sensors and LWIR optics (such as, for example, germanium prisms in the beamsplitter assembly).
Particularly suitable CCDs contain from about 500,000 to about 10 million pixels or more, each having a largest dimension of from about 3 to about 20, preferably about 8 to about 13 μm. A pixel spacing of from about 3-30 μm is preferred, with those having a pixel spacing of 10-20 μm being more preferred. Commercially available CCDs that are useful in this invention include Sony's ICX252AQ CCD, which has an array of 2088X1550 pixels, a diagonal dimension of 8.93 mm and a pixel spacing of 3.45 μm; Kodak's KAF-2001CE CCD, which has an array of 1732X1172 pixels, dimensions of 22.5X15.2 mm and a pixel spacing of 13 μm; and the Thomson-CSF TH7896M CCD, which has an array of 1024X1024 pixels and a pixel size of 19 μm.
In addition to the components described above, the camera will also include a housing to exclude unwanted light and hold the components in the desired spatial arrangement. The optics of the camera may include various optional features, such as a zoom lens, an adjustable aperture, an adjustable focus, filters of various types, connections to a power supply, light meters, various displays, and the like.
Ranges of objects are estimated in accordance with the invention by developing a focus metric from the images projected onto two or more of the image sensors that represent the same angular sector in object space. An estimate of the range of one or more objects within the field of view of the camera is then calculated from the focus metrics. Focus metrics of various types can be used, with several suitable types being described in Krotov, "Focusing", Int. J. Computer Vision 1:223-237 (1987), incorporated herein by reference, as well as in U. S. Patent No. 5,151,609. In general, a focus metric is developed by examining patches of the various images for their high spatial frequency content. Spatial frequencies up to about 25 lines/mm are particularly useful for developing the focus metric. When an image is out of focus, the high spatial frequency content is reduced. This is reflected in smaller brightness differences between nearby pixels. The extent to which these brightness differences are reduced due to an image being out of focus on a particular image sensor provides an indication of the degree to which the image is out of focus, and allows calculation of range estimates.
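A minimal example of a focus metric of this general kind is sketched below. It is a simple gradient measure, not the preferred DCT-based metric described later; it sums squared brightness differences between adjacent pixels, a quantity that falls as the block becomes defocused.

import numpy as np

def gradient_focus_metric(block):
    # Sum of squared brightness differences between adjacent pixels.
    # Large values indicate strong high-spatial-frequency content,
    # i.e. a sharply focused block.
    block = np.asarray(block, dtype=float)
    dx = np.diff(block, axis=1)
    dy = np.diff(block, axis=0)
    return float(np.sum(dx ** 2) + np.sum(dy ** 2))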
The preferred method develops a focus metric and range calculation based on blur diameters or blur radii, which can be understood with reference to Figure 6. Distances in Figure 6 are not to scale. In Figure 6, B represents a point on a remote object that is at distance x from the focussing means. Light from that object passes through focussing means 2, and is projected onto image sensor 60, which is shown at alternative positions a, b, c and d. When image sensor 60 is at position b, point B is in focus on image sensor 60, and appears essentially as a point. As image sensor 60 is moved so that point B is no longer in focus, point B is imaged as a circle, as shown on the image sensors at positions a, c and d. The radius of this circle is the blur radius, and is indicated for positions a, c and d as rBa, rBc and rBd. Twice this value is the blur diameter. As shown in Figure 6, blur radii (and blur diameters) increase as the image sensor becomes farther removed from having point B in focus. Because the various image sensors in this invention are at different optical path lengths from the focussing means, point objects such as point object B in Figure 6 will appear on the various image sensors as blurred circles of varying radii.
This effect is illustrated in Figure 7, which is somewhat idealized for purposes of illustration. In Figure 7, an 8 X 8 block of pixels from each of 3 CCDs is represented as 71, 72 and 73, respectively. These three CCDs are adjacent to each other in terms of being at consecutive optical path lengths from the focussing means, with the CCD containing pixel block 72 being intermediate to the others. Each of these 8 X 8 blocks of pixels receives light from the same angular sector in object space. For purposes of this illustration, the object is a point source of light that is located at the best focus distance for the CCD containing pixel block 72, in a direction corresponding to the center of the pixel block. Pixel block 72 has an image nearly in sharp focus, whereas the same point image is one step out of focus in pixel blocks 71 and 73. Pixel blocks 74 and 75 represent pixel blocks on image sensors that are one-half step out of focus. The density of points 76 on a particular pixel indicates the intensity of light that pixel receives. When an image is in sharp focus in the center of the pixel block, as in pixel block 72, the light is imaged as high intensities on relatively few pixels. As the focus becomes less sharp, more pixels receive light, but the intensity on any single pixel decreases. If the image is too far out of focus, as in pixel block 71, some of the light is lost to adjoining pixel blocks (points 77).
For any particular image sensor i, objects at a certain distance xi will be in focus. In Figure 6, this is shown with respect to image sensor a, which has point object A at distance xa in focus. The diameter of a blur circle (DB) on image sensor i for an object at distance x is related to this distance xi, the actual distance of the object (x), the focal length of the focussing means (f) and the diameter of the entrance pupil (p) as follows:
DB = f·p·|xi − x| / (x·xi)   (1)
Although equation (1) suggests that the blur diameter will go to zero for an object in sharp focus (xi − x = 0), diffraction and optical aberrations will in practice cause a point to be imaged as a small fuzzy circle even when in sharp focus. Thus, a point object will be imaged as a circle having some minimum blur circle diameter due to imperfections in the equipment and physical limitations related to the wavelength of the light, even when in sharp focus. This limiting spot size can be added to equation (1) as a sum of squares to yield the following relationship:
DB² = {f·p·[|xi − x| / (x·xi)]}² + (Dmin)²   (2)
where Dmin represents the minimum blur circle diameter. An image projected onto any two image sensors Sj and Sk, which are focussed at distances xj and xk, respectively, will appear as blurred circles having blur diameters Dj and Dk, respectively. The distance x of the point object can be calculated from the blur diameters, xj and xk, using the equation
x = 2 / { 1/xj + 1/xk + (Dj² − Dk²) / [(f·p)²·(1/xk − 1/xj)] }   (3)
In equation (3), xj and xk are known from the optical path lengths for image sensors j and k, and f and p are constants for the particular equipment used. Thus, by measuring the diameters of the blur circles for a particular point object imaged on image sensors j and k, the range x of the object can be determined. In this invention, the range of an object is determined by identifying on at least two image sensors an area of an image corresponding to a point on said object, calculating the difference in the squares of the blur diameters of the image on each of the image sensors, and calculating the range x from the blur diameters, such as according to equation (3).
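A short numerical sketch of this calculation follows. It uses equation (3) in the form reconstructed above, and the focal length, pupil diameter, focus distances and blur diameters are hypothetical values chosen only for illustration.

def range_from_blur(Dj, Dk, xj, xk, f, p):
    # Range x from the difference of squared blur diameters measured on two
    # image sensors focused at distances xj and xk (equation (3)).
    diff = (Dj ** 2 - Dk ** 2) / ((f * p) ** 2 * (1.0 / xk - 1.0 / xj))
    return 2.0 / (1.0 / xj + 1.0 / xk + diff)

# Hypothetical example: f = 50 mm, p = 25 mm (f/2), sensors focused at 6 m and 7.5 m,
# blur diameters of 40 um and 25 um; the estimated range comes out a little over 7 m.
x = range_from_blur(Dj=40e-6, Dk=25e-6, xj=6.0, xk=7.5, f=0.050, p=0.025)
print(x)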
It is clear from equation (3) that a measurement of (Dj² − Dk²) is sufficient to calculate the range x of the object. Thus, it is not necessary to measure Dj and Dk directly if the difference of their squares (Dj² − Dk²) can be measured instead.
The accuracy of the range measurement improves significantly when the point object is in sharp focus or nearly in sharp focus on the image sensors upon which the measurement is based. Accordingly, this invention preferably includes the step of identifying the two image sensors upon which the object is most nearly in focus, and calculating the range of the object from the blur radii on those two image sensors.
Electronic image sensors such as CCDs image points as brightness functions. For a point image, these brightness functions can be modeled as Gaussian functions of the radius of the blur circle. A blur circle can be modeled as a Gaussian peak having a width (σ) equal to the radius of the blur circle divided by the square root of 2 (or the diameter divided by twice the square root of 2). This is illustrated in Figure 6, where the blur circles on the image sensors at positions a, c and d are represented as Gaussian peaks. The widths of the peaks (σa, σc and σd, corresponding to the blur circles at positions a, c and d) are taken as equal to rBa/1.414, rBc/1.414 and rBd/1.414, respectively (or DBa/2.828, DBc/2.828 and DBd/2.828). Substituting this relationship into equation (3) yields equation (4):
x = 2 / { 1/xj + 1/xk + 8(σj² − σk²) / [(f·p)²·(1/xk − 1/xj)] }   (4)
Figure 8 demonstrates how, by using a number of image sensors located at different optical path lengths, point objects at different ranges appear as blur circles of varying diameters on different image sensors. Curves 81-88 represent the values of σ on each of eight image sensors as the distance of the imaged object increases. The data in Figure 8 are calculated for a system of lens and image sensors having focus distances xi, in meters, of 4.5, 5, 6, 7.5, 10, 15, 30 and ∞, respectively, for the eight image sensors. An object at any distance x within the range of about 4 meters to infinity will be best focussed on the one of the image sensors (or in some cases, two of them) on which the value of σ is least. Line 80 indicates the σ value on each image sensor for an object at a range of 7 meters. To illustrate, in Figure 8, a point object at a distance x of 7 meters is best focussed on image sensor 4, where σ is about 14 μm. The same point object is next best focused on image sensor 3, where σ is about 24 μm. For the system illustrated by Figure 8, any point object located at a distance x of about 4.5 meters to infinity will appear on at least one image sensor with a σ value of between about 7.9 and 15 μm. Except for objects located at a distance of less than 4.5 meters, the image sensor next best in focus will image the object with a σ value of from about 16 to about 32 μm.
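The behaviour plotted in Figure 8 can be reproduced qualitatively with the brief sketch below. The focus distances are the ones quoted for the figure, but the focal length, pupil diameter and minimum spot size are assumed values, so the numerical σ values will not match the figure exactly.

import math

def sigma_on_sensor(x, xi, f, p, d_min):
    # Gaussian blur width for an object at range x on a sensor whose
    # best-focus distance is xi, following equations (1)-(2) with
    # sigma = DB / (2 * sqrt(2)).
    defocus = f * p / x if math.isinf(xi) else f * p * abs(xi - x) / (x * xi)
    d_blur = math.sqrt(defocus ** 2 + d_min ** 2)
    return d_blur / (2.0 * math.sqrt(2.0))

focus_distances = [4.5, 5.0, 6.0, 7.5, 10.0, 15.0, 30.0, math.inf]   # metres, as in Figure 8
f, p, d_min = 0.050, 0.025, 20e-6                                    # assumed optics
sigmas = [sigma_on_sensor(7.0, xi, f, p, d_min) for xi in focus_distances]
best, next_best = sorted(range(len(sigmas)), key=lambda k: sigmas[k])[:2]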
Using equation (4), it is possible to determine the range x of an object by measuring σj and σk, or by measuring σj² − σk². Using CCDs as the image sensors, the value of σj² − σk² can be estimated by identifying blocks of pixels on two CCDs that each correspond to a particular angular sector in space containing a given point object, and comparing the brightness information from the blocks of pixels on the two CCDs. A signal can then be produced that is representative of, or can be used to calculate, σj and σk or σj² − σk². This can be done using various types of transform algorithms, including various forms of Fourier analysis, wavelets, finite difference approximations to derivatives, and the like, as described by Krotov and U. S. Patent No. 5,151,609, both mentioned above. However, a preferred method of comparing the brightness information is through the use of a Discrete Cosine Transformation (DCT) function, such as is commonly used in JPEG, MPEG and Digital Video compression methods. In this DCT method, the brightness information from a set of pixels
(typically an 8 X 8 block of pixels) is converted into a matrix of typically 64 cosine coefficients (designated as n, m, with n and m usually ranging from 0 to 7). Each of the cosine coefficients corresponds to the light content in that block of pixels at a particular spatial frequency. The relationship is given by
S(m, n) = Σ(i=0 to 7) Σ(j=0 to 7) c(i, j)·cos[(2i + 1)mπ/16]·cos[(2j + 1)nπ/16]   (5)
wherein c(i,j) represents the brightness of pixel i,j. Increasing values of n and m indicate values for increasing spatial frequencies according to the relationship
vn,m = (n² + m²)^(1/2) / (2L)   (6)
where vn,m represents the spatial frequency corresponding to coefficient n,m and L is the length of the square block of pixels.
The first of these coefficients (0,0) is the so-called DC term. Except in the unusual case where σ » L (i.e., the image is far out of focus), the DC term is not used for calculating σj² − σk², except perhaps as a normalizing value. However, each of the remaining coefficients can be used to provide an estimate of σj² − σk², as a given coefficient Sn,m generated by CCDj and the corresponding coefficient Sn,m generated by CCDk are related to σj² − σk² as follows:
σj² − σk² = −[1/(2π²·vn,m²)]·ln[Sn,m(CCDj)/Sn,m(CCDk)]   (7)
Thus, the ratio of corresponding coefficients from the two CCDs provides a direct estimate of σj² − σk², and in principle each of the last 63 DCT coefficients (the so-called "AC" coefficients) can provide such an estimate.
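The following sketch shows how the ratio of one pair of corresponding AC coefficients yields an estimate of σj² − σk². The DCT is the plain unnormalized form of equation (5), the expressions for vn,m and equation (7) follow the reconstructions given above, and the block length L is an assumed value.

import numpy as np

def dct_8x8(block):
    # Unnormalized 8x8 DCT of equation (5): returns S with S[m, n] for m, n = 0..7.
    idx = np.arange(8)
    basis = np.cos((2 * idx[:, None] + 1) * idx[None, :] * np.pi / 16)  # basis[pixel, freq]
    return basis.T @ np.asarray(block, dtype=float) @ basis

def sigma_sq_difference(block_j, block_k, n, m, L):
    # Estimate sigma_j**2 - sigma_k**2 from one AC coefficient pair using
    # equation (7), with v_nm = sqrt(n**2 + m**2) / (2 * L).  Assumes the two
    # coefficients are nonzero and of the same sign (same scene content).
    Sj = dct_8x8(block_j)[m, n]
    Sk = dct_8x8(block_k)[m, n]
    v = np.hypot(n, m) / (2.0 * L)
    return -np.log(Sj / Sk) / (2.0 * np.pi ** 2 * v ** 2)

# Example call with a 13 um pixel pitch, so an 8x8 block has L = 104 um:
# est = sigma_sq_difference(block_j, block_k, n=1, m=0, L=8 * 13e-6)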
In practice, however, relatively few of the DCT coefficients provide meaningful estimates. As a result, it is preferred to use only a portion of the DCT coefficients to determine σj² − σk². Useful DCT coefficients are readily identified by a Modulation Transfer Function (MTF), defined as MTF = exp(−2π²v²σ²), wherein v is the spatial frequency expressed by the particular DCT coefficient and σ is as before. The MTF expresses the ratio of a particular DCT coefficient as measured to the value of the coefficient in the case of an ideal image, i.e., as would be expected if perfectly in focus and with "perfect" optics. When the MTF is about 0.2 or greater, the DCT coefficient is generally useful for calculating estimates of ranges.
When the MTF is below about 0.2, interference effects tend to come into play, making the DCT coefficient a less reliable metric for calculating estimated ranges. This effect is illustrated in Figure 9, in which MTF values are plotted against spatial frequency for a CCD on which an image is in sharp focus (line 90), a CCD on which an image is one-half step out of focus (line 91), and a CCD on which an image is one step out of focus (line 92). As seen from line 90 in Figure 9, the MTF for even a perfectly focussed image departs from 1.0 as the spatial frequency increases, due to diffraction and aberrational effects of the optics. However, the MTF values remain high even at high spatial frequencies. When the image sensor is a step out of focus, as shown by line 92, the MTF falls rapidly with increasing spatial frequency until it reaches a point, indicated by region D in Figure 9, where the MTF value is dominated by interference effects. Thus, DCT coefficients relating to spatial frequencies to the left of region D are useful for calculating σj² − σk². This corresponds to an MTF value of about 0.2 or greater. For an image sensor that is one-half step out of focus, the MTF falls less quickly, but reaches a value below about 0.2 when the spatial frequency reaches about 20 lines/mm, as shown by line 91.
As shown in Figure 9, the most useful DCT coefficients Sn,m are those in which n and m range from 0 to 4, more preferably 0 to 3, provided that n and m are not both 0. The remaining DCT coefficients may be, and preferably are, disregarded in calculating the ranges. Once DCT coefficients are selected for use in calculating a range, ratios of corresponding DCT coefficients from each of two image sensors are determined to estimate σj and σk, which in turn are used to calculate the range of the object.
It will be noted that, due to the relation MTF = exp(−2π²v²σ²), the MTF will be in the desired range of 0.2 or greater when 0.3 > v·σ.
When the preferred color CCDs are used, separate DCT coefficients are preferably generated for each of the colors red, blue and green. Again, each of these DCT coefficients can be used to determine σj² − σk² and calculate the range of the object.
Because a number of DCT coefficients are available for each block of pixels, each of which can be used to provide a separate estimate of σj² − σk², it is preferred to generate a weighted average of these coefficients and use the weighted average to determine σj² − σk² and calculate the range of the object. Alternately, the various values of σj² − σk² are determined and these values are weighted to determine a weighted value for σj² − σk² that is used to compute a range estimate. Various weighting methods can be used. Weighting by the DCT coefficients themselves is preferred, because the ones for which the scene has high contrast will dominate, and these high-contrast coefficients are the ones that are most effective for estimating ranges.
One such weighting method is illustrated in Figure 10. In Figure 10, a particular DCT coefficient is represented by the term S(k,n,m,c), where k designates the particular image sensor, n and m designate the spatial frequency (in terms of the DCT matrix) and c represents the color (red, blue or green). In the weighting method in Figure 10, each of the DCT coefficients for image sensor 1 (k=1) is normalized in block 1002 by dividing it by the absolute value of the DC coefficient for that block of pixels, and that color of pixels (when color CCDs are used). The output of block 1002 is a series of normalized coefficients R(k,n,m,c), where k, n, m and c are as before, each normalized coefficient R representing a particular spatial frequency and color for a particular image sensor k. These normalized coefficients are used in block 1003 to evaluate the overall sharpness of the image on image sensor k, in this case by adding them together to form a total, P(k). Decision block 1009 tests whether the corresponding block in all image sensors has been evaluated; if not, the normalizing and sharpness evaluations of blocks 1002 and 1003 are repeated for all image sensors.
In block 1004, the values of P(k) are compared and used to identify the two image sensors having the greatest overall sharpness. In block 1004, these image sensors are indicated by indices j and k, where k represents the one having the sharpest focus. The normalized coefficients for these two image sensors are then sent to block 1005, where they are weighted. Decision block 1010 tests to be sure that the two image sensors identified in block 1004 have consecutive path lengths. If not, a default range x is calculated from the data from image sensor k alone. In block 1005, a weighting factor is developed for each normalized coefficient by multiplying together the normalized coefficients from the two image sensors that correspond to a particular spatial frequency and color. If the weighting factor is nonzero, then σj² − σk² is calculated according to equation (7) using the normalized coefficients for that particular spatial frequency and color. If the weighting factor is zero, σj² − σk² is set to zero. Thus, the output of block 1005 is a series of calculations of σj² − σk² for each spatial frequency and color.
In block 1006, all of the separate weighting factors are added to form a composite weight. In block 1007, all of the separate calculations of σj² − σk² from block 1005 are multiplied by their corresponding weights. These products are then added and divided by the composite weight to develop a weighted average calculation of σj² − σk². This weighted average calculation is then used in block 1008 to compute the range x of the object imaged in the block of pixels under examination, using equation (4).
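A compact sketch of this weighting scheme is given below. It assumes that DCT coefficient arrays S[k, c, n, m] for the block of pixels under examination are already available for every image sensor k and color channel c, and that an array v[n, m] of spatial frequencies has been precomputed; the array layout, the guard against non-positive weights and the restriction to low-frequency coefficients are illustrative choices, not details recited in the specification.

import numpy as np

def sharpness(S, k):
    # Blocks 1002-1003: normalize by the DC term and sum the normalized
    # AC coefficients to obtain an overall sharpness score P(k).
    P = 0.0
    for c in range(S.shape[1]):
        R = np.abs(S[k, c]) / abs(S[k, c, 0, 0])
        P += R.sum() - R[0, 0]
    return P

def weighted_sigma_sq_diff(S, v, j, k, n_max=4):
    # Blocks 1005-1007: weight each per-coefficient estimate of
    # sigma_j**2 - sigma_k**2 by the product of the normalized coefficients
    # from the two sensors and return the weighted average.
    total_weight, total = 0.0, 0.0
    for c in range(S.shape[1]):
        R_j = S[j, c] / abs(S[j, c, 0, 0])
        R_k = S[k, c] / abs(S[k, c, 0, 0])
        for n in range(n_max):
            for m in range(n_max):
                if n == 0 and m == 0:
                    continue                      # skip the DC term
                w = R_j[n, m] * R_k[n, m]         # block 1005: weighting factor
                if w > 0:                         # guard: log ratio must be defined
                    est = -np.log(R_j[n, m] / R_k[n, m]) / (2 * np.pi ** 2 * v[n, m] ** 2)
                    total_weight += w
                    total += w * est
    return total / total_weight if total_weight else 0.0

# The two sensors with the largest sharpness scores are passed to
# weighted_sigma_sq_diff, and the result feeds equation (4) to give the range x.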
By repeating the process for each block of pixels in the image sensors, ranges can be calculated for each object within the field of view of the camera. This information is readily compiled to form a range map.
Thus, in a preferred embodiment of the invention, the image sensors provide brightness information to an image processor, which converts that brightness information into a set of signals that can be used to calculate σj² − σk² for corresponding blocks of pixels. This arrangement is illustrated in Figure 11. In Figure 11, light passes through focussing means 2 and is split into substantially identical images by beamsplitter system 1. The images are projected onto image sensors 10a-10h. Each image sensor is in electrical connection with a corresponding edge connector, whereby brightness information from each pixel is transferred via connections to a corresponding image processor 1101-1108. These connections can be of any type that permits accurate transfer of the brightness information, with analog video lines being satisfactory. The brightness information from each image sensor is converted by image processors 1101-1108 into a set of signals, such as DCT coefficients or another type of signal as discussed before. These signals are then transmitted to computer 1109, such as over high-speed serial digital cables 1110, where ranges are calculated as described before.
If desired, image processors 1101-1108 can be combined with computer 1109 into a single device.
Because a preferred method of generating signals for calculating σj² − σk² is a discrete cosine transformation, image processors 1101-1108 are preferably programmed to perform this function. JPEG, MPEG2 and Digital Video processors are particularly suitable for use as the image processors, as those compression methods incorporate DCT calculations. Thus a preferred image processor is a JPEG, MPEG2 or Digital Video processor, or equivalent.
If desired, the image processors may compress the data before sending it to computer 1109, using lossy or lossless compression methods. The range calculation can be performed on the noncompressed data, the compressed data, or the decompressed data. JPEG, MPEG2 and Digital Video processors all use lossy compression techniques. Thus, in an especially preferred embodiment, each of the image processors is a JPEG, MPEG2 or Digital Video processor and compressed DCT coefficients are generated and sent to computer 1109 for calculation of ranges. Computer 1109 can either use the compressed coefficients to perform the range calculations, or can decompress the coefficients and use the decompressed coefficients instead. However, any Huffman encoding that is performed must be decoded before performing range calculations. It is also possible to use the DCT coefficients generated by the JPEG processor via the DCT without compression.
The method of the invention is suitable for a wide range of applications. In a simple application, the range information can be used to create displays of various forms, in which the range information is converted to visual or audible form. Examples of such displays include, for example: (a) a visual display of the scene, on which superimposed numerals represent the range of one or more objects in the scene;
(b) a visual display that is color-coded to represent objects of varying distance;
(c) a display that can be actuated, such as, for example, operation of a mouse or keyboard, to display a range value on command; (d) a synthesized voice indicating the range of one or more objects;
(e) a visual or aural alarm that is created when an object is within a predetermined range.
The range information can be combined with angle information derived from the pixel indices to produce three-dimensional coordinates of selected parts of objects in the images. This can be done with all or substantially all of the blocks of pixels to produce a 'cloud' of 3D points, in which each point lies on the surface of some object. Instead of choosing all of the blocks for generating 3D points, it may be useful to select points corresponding to edges. This can be done by selecting those blocks of DCT coefficients with a particularly large sum of squares. Alternatively, a standard edge-detection algorithm, such as the Sobel derivative, can be applied to select blocks that contain edges. See, e.g., Petrou et al., Image Processing, The Fundamentals, Wiley, Chichester, England, 1999. In any case, once a group of 3D points has been established, the information can be converted into a file format suitable for 3D computer-aided design (CAD). Such formats include the "Initial Graphics Exchange Specifications" (IGES) and "Drawing Exchange" (DXF) formats. The information can then be exploited for many purposes using commercially available computer hardware and software. For example, it can be used to construct 3D models for virtual reality games and training simulators. It can be used to create graphic animations for, e.g., entertainment, commercials, and expert testimony in legal proceedings. It can be used to establish as-built dimensions of buildings and other structures such as oil refineries. It can be used as topographic information for designing civil engineering projects. A wide range of surveying needs can be served in this manner. In factory and warehouse settings, it is frequently necessary to measure the locations of objects such as parts and packages in order to control machines that manipulate them. The 3D edge detection and location method described above can be adapted to these purposes. Another factory application is inspection of manufactured items for quality control. In other applications, the range information is used to control a mobile robot. The range information is fed to the controller of the robotic device, which is operated in response to the range information. An example of a method for controlling a robotic device in response to range information is that described in U. S. Patent No. 5,793,900 to Nourbakhsh, incorporated herein by reference. Other methods of robotic navigation into which this invention can be incorporated are described in Borenstein et al., Navigating Mobile Robots, A K Peters, Ltd., Wellesley, Mass., 1996. Examples of robotic devices that can be controlled in this way are automated dump trucks, tractors, orchard equipment like sprayers and pickers, vegetable harvesting machines, construction robots, domestic robots, machines to pull weeds and volunteer corn, mine clearing robots, and robots to sort and manipulate hazardous materials.
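How range and pixel indices combine into a 3D point can be sketched with a simple pinhole-camera model (an illustration only; the focal length, pixel pitch and block size below are assumed values, and this is not a step recited in the claims):

import numpy as np

def block_to_xyz(row, col, rng, image_shape, f=0.050, pixel_pitch=13e-6, block=8):
    # Convert a block's pixel indices and estimated range into a 3D point.
    # The block centre defines a direction through the lens; the estimated
    # range scales that direction into a point on the object surface.
    rows, cols = image_shape
    u = ((col + block / 2.0) - cols / 2.0) * pixel_pitch   # offset on sensor, metres
    v = ((row + block / 2.0) - rows / 2.0) * pixel_pitch
    direction = np.array([u / f, v / f, 1.0])
    return rng * direction / np.linalg.norm(direction)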
Another application is in microsurgery, where the range information produced in accordance with the invention is used to guide surgical lasers and other targeted medical devices. Yet another application is in the automated navigation of vehicles such as automobiles. A substantial body of literature has been developed pertaining to automated vehicle navigation and can be referred to for specific methods and approaches to incorporating the range information provided by this invention into a navigational system. Examples of this literature include Advanced Guided Vehicles, Cameron et al., eds., World Scientific Press, Singapore, 1994; Advances in Control Systems and Signal Processing, Vol. 7: Contributions to Autonomous Mobile Systems, I. Hartman, ed., Vieweg, Braunschweig, Germany, 1992; and Vision and Navigation, Thorpe, ed., Kluwer Academic Publishers, Norwell, Mass., 1990. A simplified block diagram of such a navigation system is shown in Figure 12. In Figure 12, multiple image sensors on camera 19 send signals over connections to image processors 1201, which generate the focus metrics and forward them to computer 1202 for calculation of ranges. Computer 1202 receives tilt and pan information from tilt and pan mechanism 1205, which it uses to adjust the range calculations in response to the field of view of camera 19 at any given time. Computer 1202 forwards the range information to a display means 1206 and/or vehicle navigation computer 1207. Vehicle navigation computer 1207 operates one or more control mechanisms of the vehicle, including, for example, acceleration, braking, or steering, in response to the range information provided by computer 1202. Artificial intelligence (AI) software (see, e.g., Dickmans, "Improvements in Visual Autonomous Road Vehicle Guidance 1987-94", Visual Navigation, From Biological Systems to Unmanned Ground Vehicles, Aloimonos, Ed., Lawrence Erlbaum Associates, Pub., Mahwah, New Jersey, 1997) is used by vehicle navigation computer 1207 to control camera 19 as well as the vehicle. Operating parameters of camera 19 controlled by vehicle navigation computer 1207 may include the tilt and pan angles, the focal length (zoom) and overall focus distance.
The AI software mimics certain aspects of human thinking in order to construct a "mental" model of the location of the vehicle on the road, the shape of the road ahead, and the location and speed of other vehicles, pedestrians, landmarks, etc., on and near the road. Camera 19 provides much of the information needed to create and frequently update this model. The area-based processing can locate and help to classify objects based on colors and textures as well as edges. The MPEG2 algorithm, if used, can provide velocity information for sections of the image that can be used by vehicle navigation computer 1207, in addition to the range and bearing information provided by the invention, to improve the dynamic accuracy of the AI model. Additional inputs into the AI computer might include, for example, speed and mileage information, position sensors for vehicle controls and camera controls, a Global Positioning System receiver, and the like. The AI software should operate the vehicle in a safe and predictable manner, in accordance with the traffic laws, while accomplishing the transportation objective.
Many benefits are possible with this form of driving. These include safety improvements, freeing drivers for more productive activities while commuting, increased freedom for people who are otherwise unable to drive due to disability, age or inebriation, and increased capacity of the road system due to a decrease in the required following distance.
Yet another application is the creation of video special effects. The range information generated according to this invention can be used to identify portions of the image in which the imaged objects fall within a certain set of ranges. The portion of the digital stream that represents these portions of the image can be identified by virtue of the calculated ranges and used to replace a portion of the digital stream of some other image. The effect is one of superimposing part of one image over another. For example, a composite image of a broadcaster in front of a remote background can be created by recording the video image of the broadcaster in front of a set, using the camera of the invention. Using the range estimations provided by this invention, portions of the video image that correspond to the broadcaster can be identified because the range of the broadcaster will be different from that of the set. To provide a background, a digital stream of some other background image is separately recorded in digital form. By replacing a portion of the digital stream of the background image with the digital stream corresponding to the image of the broadcaster, a composite image is made which displays the broadcaster seemingly in front of the remote background. It will be readily apparent that the range information can be used in similar manner to create a large number of video special effects.
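The compositing described above reduces to a per-pixel (or per-block) test once a range map is available: keep the foreground pixels whose estimated range falls inside the band occupied by the broadcaster, and take everything else from the separately recorded background. A minimal sketch, assuming the range map has already been brought to the image resolution and that the array names are illustrative:

```python
import numpy as np

def depth_key(foreground, background, range_map, near, far):
    """Composite two frames: keep foreground pixels whose estimated range
    falls within [near, far]; use the background image everywhere else.
    foreground, background: HxWx3 arrays; range_map: HxW array in the same
    units as near and far."""
    keep = (range_map >= near) & (range_map <= far)        # boolean matte
    return np.where(keep[..., None], foreground, background)
```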
The method of the invention can also be used to construct images with much larger depth of field than the focus means ordinarily would provide. First, images are collected from each image sensor. For each section of the images, the sharpest and second sharpest images are identified, such as by the method shown in Figure 10, and these images are used to estimate the distance of the object corresponding to that section of the images. Equation 1 and the relationship σ = D_B/1.414 permit the calculation of σ. For each DCT coefficient, the factor in the MTF due to defocus is given by exp(-2π²ν²σ²), as described before. To deblur the image, each DCT coefficient is divided by the MTF to provide an estimate of the coefficient that would have been measured for a perfectly focused image. The estimated "corrected" coefficients then can be used to create a deblurred image. The corrected image is assembled from the sections of corrected coefficients that are potentially derived from all the source ranges, where the sharpest images are used in each case. If all the objects in the field of view are at distances greater than or equal to the smallest xᵢ and less than or equal to the largest xᵢ, then the corrected image will be nearly in perfect focus almost everywhere. The only significant departures from perfect focus will be cases where a section of pixels straddles two or more objects that are at very different distances. In such cases at least part of the section will be out of focus. Since the sections of pixels are small (typically 8×8 blocks when the preferred JPEG, MPEG2 or Digital Video algorithms are used to determine a focus metric), this effect should have only a minor impact on the overall appearance of the corrected image.
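A minimal sketch of the per-block correction just described: each coefficient of an 8×8 DCT is divided by the defocus MTF exp(-2π²ν²σ²), with ν taken here as cycles per pixel (index/(2·8)) and σ expressed in pixels. The floor applied to the MTF is an added regularization to avoid amplifying noise in the weakest coefficients, not something taken from the patent.

```python
import numpy as np
from scipy.fft import dctn, idctn

def deblur_block(block, sigma_px, mtf_floor=0.05):
    """Correct one 8x8 pixel block for defocus blur of Gaussian width
    sigma_px (in pixels) by dividing its DCT coefficients by the MTF."""
    n = block.shape[0]
    coeffs = dctn(block, norm='ortho')
    # Spatial frequency of coefficient (u, v): nu = sqrt(u^2 + v^2) / (2n) cycles/pixel.
    u = np.arange(n)[:, None]
    v = np.arange(n)[None, :]
    nu = np.sqrt(u ** 2 + v ** 2) / (2.0 * n)
    mtf = np.exp(-2.0 * np.pi ** 2 * nu ** 2 * sigma_px ** 2)
    corrected = coeffs / np.maximum(mtf, mtf_floor)   # regularized inverse filter
    return idctn(corrected, norm='ortho')
```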
The invention may be very useful in microscopy, because most microscopes are severely limited in depth of field. In addition, there are purely photographic applications of the invention. For example, it permits one to use a long lens to frame a distant subject in a foreground object such as a doorway, and to create an image in which the doorway and the subject are both in focus. Note that this can be achieved using a wide aperture, which ordinarily creates a very small depth of field.
In cinematography, a specialist called a focus puller has the job of adjusting the focus setting of the lens during the shot to shift the emphasis from one part of the scene to another. For example, the focus is often thrown back and forth between two actors, one in the foreground and one in the background, according to which one is delivering lines. Another example is follow focus, as when an actor walks toward the camera on a crowded city sidewalk and it is desired to keep the actor in focus as the center of attention of the scene. The work of the focus puller is somewhat hit or miss, and once the scene is put onto film or tape, there is little that can be done to change or sharpen the focus. Conventional editing techniques make it possible to artificially blur portions of the image, but not to make them significantly sharper.
Thus, the invention can be used as a tool to increase creative control by allowing the focus and depth of field to be determined in post-production. These parameters can be controlled by first synthesizing a fully sharp image, as described above, and then computing the appropriate MTF for each part of the image and applying it to the transform coefficients (i.e., the DCT coefficients).
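Post-production focus control then runs the same block transform in the forward direction: starting from a block of the synthesized fully sharp image, each DCT coefficient is multiplied by the Gaussian MTF corresponding to the blur that part of the scene should receive. In the sketch below, the mapping from a block's measured range and the chosen focus setting to the blur width sigma_px is left to the caller, since it depends on Equation 1 and the lens parameters.

```python
import numpy as np
from scipy.fft import dctn, idctn

def apply_synthetic_focus(sharp_block, sigma_px):
    """Re-blur an already-sharpened 8x8 block to a chosen Gaussian blur
    width sigma_px (in pixels) by multiplying its DCT coefficients by
    the corresponding defocus MTF."""
    n = sharp_block.shape[0]
    coeffs = dctn(sharp_block, norm='ortho')
    u = np.arange(n)[:, None]
    v = np.arange(n)[None, :]
    nu = np.sqrt(u ** 2 + v ** 2) / (2.0 * n)          # cycles per pixel
    mtf = np.exp(-2.0 * np.pi ** 2 * nu ** 2 * sigma_px ** 2)
    return idctn(coeffs * mtf, norm='ortho')
```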
It will be appreciated that many modifications can be made to the invention as described herein without departing from the spirit of the invention, the scope of which is defined by the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A camera comprising (a) a focusing means, (b) multiple image sensors which receive two-dimensional images, said image sensors each being located at different optical path lengths from the focusing means, and
(c) a beamsplitting system for splitting light received through the focusing means into two or more beams and projecting said beams onto multiple image sensors to form multiple, substantially identical images on said image sensors.
2. The camera of claim 1, wherein said image sensors are CMOSs or CCDs.
3. The camera of claim 2, wherein said beamsplitting system projects substantially identical images onto at least three image sensors.
4. The camera of claim 3, wherein said beamsplitting system is a binary cascading system providing n levels of splitting to form 2^n substantially identical images.
5. The camera of claim 4, wherein n is 3, and eight substantially identical images are projected onto eight image sensors.
6. The camera of claim 3, wherein said focusing system is a compound lens.
7. The camera of claim 6, wherein said image sensors are each in electrical connection with a JPEG, MPEG2 or Digital Video processor.
8. The camera of claim 7, wherein said JPEG, MPEG2 or Digital Video processors are in electrical connection with a computer programmed to calculate range estimates from output signals from said JPEG, MPEG2 or Digital Video processors.
9. A method for determining the range of an object, comprising
(a) framing the object within the field of view of a camera having a focusing means, (b) splitting light received through and focused by the focusing means and projecting substantially identical images onto multiple image sensors that are each located at a different optical path length from the focusing means,
(c) for at least two of said multiple image sensors, identifying a section of said image corresponding to substantially the same angular sector in object space and that includes at least a portion of said object, and for each of said sections, calculating a focus metric indicative of the degree to which said section of said image is in focus on said image sensor, and
(d) calculating the range of the object from said focus metrics.
10. The method of claim 9 wherein steps (c) and (d) are repeated for multiple sections of said substantially identical images to provide a range map.
11. A beamsplitting system for splitting a focused light beam through n levels of splitting to form multiple, substantially identical images, comprising an arrangement of 2^n - 1 beamsplitters which are each capable of splitting a focused beam of incoming light into two beams, said beamsplitters being hierarchically arranged such that said focused light beam is divided into 2^n beams, n being an integer of 2 or more.
12. The device of claim 11 wherein said 2^n - 1 beamsplitting means are each a partially reflective surface oriented diagonally to the direction of the incoming light.
13. The device of claim 12 wherein said partially reflective surface is a surface of a prism which is coated with a hybrid metallic/dielectric partially reflective coating.
14. The device of claim 13 wherein n is 3.
15. The device of claim 14 including means for projecting eight substantially identical images onto eight image sensors.
16. A method for determining the range of one or more imaged objects comprising (a) splitting a focused image into a plurality of substantially identical images and projecting each of said substantially identical images onto a corresponding image sensor having an array of light-sensing pixels, wherein each of said image sensors is located at a different optical path length than the other image sensors;
(b) for each image sensor, identifying a set of pixels that detect a given portion of said focused image, said given portion including at least a portion of said imaged object;
(c) identifying two of said image sensors in which said given portion of said focused image is most nearly in focus;
(d) for each of said two image sensors identified in step c), generating a set of one or more signals that can be compared with one or more corresponding signals from the other of said two image sensors to determine the difference in the squares of the blur diameters of a point on said object;
(e) calculating the difference in the squares of the blur diameters of a point on said object from the signals generated in step d), and (f) calculating the range of said object from the difference in the squares of the blur diameters.
17. The method of claim 16 wherein steps c, d, e and f are performed using a computer.
18. The method of claim 17 wherein said blur diameters are expressed as widths of a Gaussian brightness function.
19. The method of claim 18 wherein in step d, said signals are generated using a discrete cosine transformation.
20. The method of claim 19 wherein said signals are in JPEG, MPEG2 or Digital Video format.
21. The method of claim 20 wherein for each of said image sensors, a plurality of signals are generated that can be compared with one or more corresponding signals from the other of said two image sensors to determine the difference in the squares of the blur diameters of a point on said object, and the range of said object is determined using a weighted average of said signals.
22. A method for creating a range map of all objects within the field of view of a camera, comprising
(a) framing an object space within the field of view of a camera having a focusing means, (b) splitting light received through and focused by the focusing means and projecting substantially identical images onto multiple image sensors that are each located at a different optical path length from the focusing means,
(c) identifying a section of said image on at least two of said multiple image sensors that correspond to substantially the same angular sector of the object space,
(d) for each of said sections, calculating a focus metric indicative of the degree to which said section of said image is in focus on said image sensor,
(e) calculating the range of an object within said angular sector of the object space from said focus metrics, and (f) repeating steps (c) - (e) for all sections of said images.
23. A method for determining the range of an object, comprising
(a) forming at least two substantially identical images of at least a portion of said object on one or more image sensors, where said substantially identical images are focused differently; (b) for sections of said substantially identical images that correspond to substantially the same angular sector in object space and include an image of at least a portion of said object, analyzing the brightness content of each image at one or more spatial frequencies by performing a discrete cosine transformation to calculate a focus metric, and
(c) calculating the range of the object from the focus metrics.
PCT/US2001/023535 2000-07-26 2001-07-25 Apparatus and method for determining the range of remote objects WO2002008685A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US10/333,423 US20040125228A1 (en) 2001-07-25 2001-07-25 Apparatus and method for determining the range of remote objects
EP01961741A EP1316222A2 (en) 2000-07-26 2001-07-25 Apparatus and method for determining the range of remote objects

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US22073400P 2000-07-26 2000-07-26
US60/220,734 2000-07-26

Publications (2)

Publication Number Publication Date
WO2002008685A2 true WO2002008685A2 (en) 2002-01-31
WO2002008685A3 WO2002008685A3 (en) 2002-10-03

Family

ID=22824727

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/023535 WO2002008685A2 (en) 2000-07-26 2001-07-25 Apparatus and method for determining the range of remote objects

Country Status (2)

Country Link
EP (1) EP1316222A2 (en)
WO (1) WO2002008685A2 (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5396057A (en) * 1993-09-08 1995-03-07 Northrop Grumman Corporation Method for optimum focusing of electro-optical sensors for testing purposes with a haar matrix transform
US5719702A (en) * 1993-08-03 1998-02-17 The United States Of America As Represented By The United States Department Of Energy Polarization-balanced beamsplitter
US5726709A (en) * 1994-05-31 1998-03-10 Victor Company Of Japan, Ltd. Imaging apparatus including offset pixels for generating vertical high frequency component
US5777673A (en) * 1994-08-24 1998-07-07 Fuji Photo Optical Co., Ltd. Color separation prism
US5777674A (en) * 1994-04-11 1998-07-07 Canon Kabushiki Kaisha Four color separation optical device
US5784202A (en) * 1993-07-08 1998-07-21 Asahi Kogaku Kogyo Kabushika Kaisha Apparatus for isometrically splitting beams
US5793900A (en) * 1995-12-29 1998-08-11 Stanford University Generating categorical depth maps using passive defocus sensing
US5900942A (en) * 1997-09-26 1999-05-04 The United States Of America As Represented By Administrator Of National Aeronautics And Space Administration Multi spectral imaging system
US6020922A (en) * 1996-02-21 2000-02-01 Samsung Electronics Co., Ltd. Vertical line multiplication method for high-resolution camera and circuit therefor
US6088295A (en) * 1998-12-29 2000-07-11 The United States Of America As Represented By The Secretary Of The Navy Feature imaging and adaptive focusing for synthetic aperture processor

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7126098B2 (en) 2002-03-18 2006-10-24 Fujinon Corporation Taking lens having focus determination optical system
EP1347639A1 (en) * 2002-03-18 2003-09-24 Fuji Photo Optical Co., Ltd. Objective lens with focus detecting device
WO2005019770A1 (en) * 2003-08-19 2005-03-03 Autoliv Asp, Inc Range discriminating optical sensor
US7084385B2 (en) 2003-08-19 2006-08-01 Autoliv Asp, Inc. Range discriminating optical sensor having a wide angle lens with a fixed focal length
US7683962B2 (en) 2007-03-09 2010-03-23 Eastman Kodak Company Camera using multiple lenses and image sensors in a rangefinder configuration to provide a range map
WO2008112054A1 (en) * 2007-03-09 2008-09-18 Eastman Kodak Company Multiple lens camera providing a range map
JP2010524279A (en) * 2007-03-09 2010-07-15 イーストマン コダック カンパニー Distance map generation type multi-lens camera
US7626705B2 (en) 2007-03-30 2009-12-01 Mitutoyo Corporation Chromatic sensor lens configuration
EP1975554A1 (en) * 2007-03-30 2008-10-01 Mitutoyo Corporation Lens configuration for a chromatic distance sensor
US8542287B2 (en) 2009-03-19 2013-09-24 Digitaloptics Corporation Dual sensor camera
US8553106B2 (en) 2009-05-04 2013-10-08 Digitaloptics Corporation Dual lens digital zoom
US8134691B2 (en) 2010-03-18 2012-03-13 Mitutoyo Corporation Lens configuration for a thermally compensated chromatic confocal point sensor
US8212997B1 (en) 2011-02-23 2012-07-03 Mitutoyo Corporation Chromatic confocal point sensor optical pen with extended measuring range
EP3168808A1 (en) * 2015-11-11 2017-05-17 General Electric Company Method and system for automated shaped cooling hole measurement
US9760986B2 (en) 2015-11-11 2017-09-12 General Electric Company Method and system for automated shaped cooling hole measurement

Also Published As

Publication number Publication date
WO2002008685A3 (en) 2002-10-03
EP1316222A2 (en) 2003-06-04

Similar Documents

Publication Publication Date Title
US20040125228A1 (en) Apparatus and method for determining the range of remote objects
JP2756803B2 (en) Method and apparatus for determining the distance between a surface section of a three-dimensional spatial scene and a camera device
EP2097715B1 (en) Three-dimensional optical radar method and device which use three rgb beams modulated by laser diodes, in particular for metrological and fine arts applications
US5305092A (en) Apparatus for obtaining three-dimensional volume data of an object
US4965442A (en) System for ascertaining direction of blur in a range-from-defocus camera
CN108171758B (en) Multi-camera calibration method based on minimum optical path principle and transparent glass calibration plate
US10429652B2 (en) Tiled waveguide display with a wide field-of-view
JP3979670B2 (en) 3D color imaging
US20090196491A1 (en) Method for automated 3d imaging
WO2020051338A1 (en) Pixel cell with multiple photodiodes
JP2002525685A (en) Programmable lens assembly and optical system incorporating the same
CN108459417B (en) Monocular narrow-band multispectral stereoscopic vision system and using method thereof
CN108805910A (en) More mesh Train-borne recorders, object detection method, intelligent driving system and automobile
WO2006003867A3 (en) Microscope observation method, microscope, differentiation interference microscope, phase difference microscope, interference microscope, image processing method, and image processing device
EP1316222A2 (en) Apparatus and method for determining the range of remote objects
CN109087395A (en) A kind of method and system of three-dimensional reconstruction
WO2019204479A1 (en) Image reconstruction from image sensor output
JP3805791B2 (en) Method and apparatus for reducing undesirable effects of noise in three-dimensional color imaging systems
US20150302573A1 (en) Method for designing a passive single-channel imager capable of estimating depth of field
US6616347B1 (en) Camera with rotating optical displacement unit
JPH08242469A (en) Image pickup camera
JPH09133873A (en) Optical apparatus for determination of direction of solid object
US6128071A (en) Range data recordation
US11601607B2 (en) Infrared and non-infrared channel blender for depth mapping using structured light
CN116670713A (en) Method for depth sensing using candidate images selected based on epipolar lines

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): CA JP US

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): CA JP US

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 10333423

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2001961741

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2001961741

Country of ref document: EP

NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Ref document number: 2001961741

Country of ref document: EP