WO2015019208A1 - Apparatus and method for correcting perspective distortions of images - Google Patents

Publication number
WO2015019208A1
Authority
WO
WIPO (PCT)
Prior art keywords
point
projection
image
center
plane
Prior art date
Application number
PCT/IB2014/062727
Other languages
French (fr)
Other versions
WO2015019208A9 (en)
Inventor
Pietro Porzio Giusto
Original Assignee
Sisvel Technology S.R.L.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sisvel Technology S.R.L.
Publication of WO2015019208A1
Publication of WO2015019208A9

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformation in the plane of the image
    • G06T 5/80
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106: Processing image signals
    • H04N 13/111: Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10004: Still image; Photographic image
    • G06T 2207/10012: Stereo images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/10021: Stereoscopic video; Stereoscopic image sequence

Definitions

  • the present invention relates to an apparatus and a method for correcting images, so as to reduce the deformations that appear, in both bidimensional and stereoscopic vision, when the images are viewed from a point of view not corresponding to the center of projection of the perspective according to which they have been produced.
  • Linear perspective is also called simply "perspective".
  • The basic criterion of perspective construction, shown in Fig. 1, consists in projecting onto a plane 101, referred to as "projection plane" or "projection frame" or simply "frame", the points of the three-dimensional space as viewed from a "center of projection" C.
  • the straight line extending from the center of projection in the direction towards which the viewer's sight, or the camera lens, is oriented is called “optical axis”.
  • a Cartesian reference is defined, such as the one shown in Fig. 1: the axes originate from the center of projection C, the z-axis coinciding with the optical axis, the y-axis being vertical, oriented upwards, and the x-axis being horizontal, oriented from left to right for the viewer.
  • the following will generally refer to the case wherein the frame is perpendicular to the optical axis (this is the case of the perspective referred to as "vertical frame perspective"), but the man skilled in the art will understand that the method of the present invention is not limited to such a case, but is also applicable to cases wherein the frame is not perpendicular to the optical axis.
  • the z-axis is also called "depth axis", since the "depth" of a point of the three-dimensional space is defined as the distance of that point from the xy-plane.
  • the projection plane 101 is at a distance f from the center of projection C and is perpendicular to the optical axis, which intersects the projection plane (101) at the point Ic.
  • the projection Q of a point A of the space results from the intersection between the projection plane 101 and the "projective straight line", i.e. the straight line that passes through the point to be projected A and the center of projection C.
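For a frame perpendicular to the optical axis, the construction just described reduces to the familiar pinhole relations x_Q = f·x_A/z_A and y_Q = f·y_A/z_A. A minimal sketch (the function name and coordinate conventions are illustrative, not taken from the patent):

```python
def project(point, f):
    """Project a 3D point A = (x, y, z) onto the frame at distance f from the
    center of projection C (taken as the origin, with z along the optical
    axis), following the linear-perspective construction of Fig. 1."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the center of projection")
    # The projective straight line through C and A meets the frame where z == f.
    return (f * x / z, f * y / z)

# A point twice as deep as the focal distance lands at half its lateral offset.
print(project((2.0, 1.0, 4.0), f=2.0))  # -> (1.0, 0.5)
```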
  • Photo and video cameras produce images that are theoretically compliant with linear perspective, but in fact real lenses often introduce more or less visible distortions, e.g. the so-called "barrel" and "pincushion" distortions.
  • the present invention will not deal with such kinds of distortions, which are the subject of many studies and correction techniques (see for example European Patent EP1333498 B1 to Agilent Technologies Inc. and International patent application WO 98/57292 A1 by Apple Computer).
  • the present invention will not tackle either those image deformations which are caused by errors in the positioning of the lenses with respect to the desired position and orientation, such as the tapering of images of tall buildings taken from below and the deformation of images of documents photographed obliquely or from points of view offset from the document axis.
  • such image deformations have been amply discussed in the literature as well (see for example documents US 7,990,412 A1, US 2009/0103808 A1, BR PI0802865 A2, US 2004/0022451 A1, US 2006/0210192 A1, among others).
  • the correction technique of the present invention processes the images as if they were perfectly compliant with linear perspective. Any distortions with respect to the linear perspective, and in particular those mentioned above, will not be taken into account and will be present in the processing results.
  • the present invention relates to deformations that appear in perspective images when said images are viewed from a point of view not corresponding to the center of projection, e.g. if the image of the frame 101 (Fig. 1) is viewed from the point V, not from the center of projection C.
  • An example of such deformations is shown in Fig. 2.
  • This shows the image of a solid 202 drawn by a calculation program rigorously in accordance with the linear perspective rules.
  • the image of the solid 202 is located in the lower right corner of a frame, of which the figure only shows the lower right quadrant (also called fourth quadrant) 201 in order to enlarge the elements of interest (i.e. the solid 202 and the center Ic of the projection plane) compared to the dimensions that they would have if the entire frame were reproduced.
  • if Fig. 2 is observed from a point close to the corresponding center of projection, which is located on the straight line passing through Ic, orthogonal to the figure plane, at a distance from the sheet (i.e. the image plane) equal to 25 times the cube edge, then the solid 202 will appear correctly with a cube shape, not with the deformed parallelepiped shape.
  • Fig. 3 represents the horizon plane of a perspective, i.e. the xz-plane of Fig. 1, in which a square ABDE is drawn, with its front side adjacent to the projection plane.
  • the element 301 represents the projection plane 101 of Fig. 1, as viewed from the direction of the y-axis.
  • here the projection plane and the image plane are assumed to coincide, although in reality the image is normally reproduced on a support which is distinct from the projection plane. If the two planes are distinct, they can be transferred one onto the other, with all their respective geometric elements, by means of a homothetic transformation known to those skilled in the art.
  • linear perspective reproduces reality well if the images are viewed from the point corresponding to the center of projection, but the images will turn out to be deformed if the viewer moves away from such point.
  • Pannini painted suggestive architectural views with wide angles of view, with no visible perspective deformations (see Thomas K. Sharpless, Bruno Postle, and Daniel M. German, "Pannini: A New Projection for Rendering Wide Angle Perspective Images", Computational Aesthetics in Graphics, Visualization, and Imaging (2010), The Eurographics Association 2010, http://vedutismo.net/Pannini/panini.pdf).
  • Denis Zorin and Alan H. Barr have studied the correction of geometric perceptual distortions in pictures (Zorin D., Barr A. H., "Correction of geometric perceptual distortions in pictures", SIGGRAPH '95: Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, 1995).
  • Robert Carroll et al. (Carroll R., Agrawal M., Agarwala A., "Optimizing content-preserving projections for wide-angle images", SIGGRAPH '09: ACM SIGGRAPH 2009 papers (New York, NY, USA, 2009), ACM, pp. 1-9) have proposed to minimize deformations by adapting the projection to the contents.
  • a man-machine interface is provided, through which the user can characterize the areas and elements of the images to be corrected in particular ways, such as straight lines that must remain as such, people's faces, etc. This method is, however, time-consuming and inconvenient, and requires specific calibrations for various types of elements.
  • Image recognition and correction are made easier by determining the orientation of the video cameras by means of sensors (accelerometers, gyroscopes). As can be guessed, this method is very complex and is only applicable in particular circumstances. Furthermore, it does not solve the problem of deformations that arise, in general, when a perspective image is viewed from a point of view not corresponding to the center of projection.
  • the present invention provides an adequate solution to the above-described problem by disclosing a method, and the associated apparatus, for correcting the deformations that appear on images when the latter are viewed from a point not corresponding to the center of projection of the perspective.
  • the apparatus and the associated method are applicable to both the bidimensional reproduction of a single image and the reproduction of a pair of stereoscopic images.
  • Said apparatus comprises suitable means for acquiring a bidimensional image or a pair of stereoscopic images, with sufficient data to determine the coordinates of the point corresponding to the center of projection of the perspective (e.g. image center and focal length) and with the associated depth map.
  • the latter is defined as the set of depths of the points of the scene represented in the image, that is, with reference to Fig. 1, the set of z-coordinates of the points of the three-dimensional scene.
  • alternatively, data may be provided from which the depth map can then be obtained.
  • this apparatus comprises storage means and processing means (e.g. a processor executing a suitable software code) configured to correct the position of the points of the acquired images, according to a technique called "Partial Perspective Gradient" (PPG).
  • the corrections made by this technique tend to represent, in the image plane, the position of each point as if it had been captured with the lens pointed at it.
  • Such technique, numerous variants of which can be defined, is based on the calculation of a gradient of the position of the points in the image plane. By integrating the components of this gradient, the functions can be found according to which the points will be positioned in order to make said correction.
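By way of illustration only, the integration step can be sketched numerically: given samples of one gradient component along an image row, a cumulative (trapezoidal) sum recovers the positioning function up to a constant. The patent's actual gradient expressions are not reproduced in this excerpt, so the derivative samples below are placeholders:

```python
def integrate_gradient(partials, x0, dx):
    """Recover a positioning function from samples of its partial derivative
    by trapezoidal integration along one image row.

    `partials` holds samples of d(x_corrected)/dx at spacing `dx`, and `x0`
    is the corrected coordinate at the first sample.  This only illustrates
    the integration step of the PPG idea; the patent's own gradient formulae
    are not reproduced in this excerpt."""
    xs = [x0]
    for a, b in zip(partials, partials[1:]):
        xs.append(xs[-1] + 0.5 * (a + b) * dx)
    return xs

# With a constant unit derivative the corrected coordinates advance linearly.
print(integrate_gradient([1.0, 1.0, 1.0], x0=0.0, dx=0.5))  # -> [0.0, 0.5, 1.0]
```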
  • each pixel represents, in an approximate manner, a small area of the image; for simplicity, however, in some parts of this description a geometric point will be identified as a pixel, accepting the approximation according to which the pixel's discrete coordinates are assumed to correspond to those of the geometric point it is intended to represent.
  • said gradient is calculated by taking into account a generic point A of the three-dimensional space, corresponding, according to linear perspective, to the point Q of the image plane 401, and to the point P resulting from the intersection between the projective straight line of A and the plane 402, which will be called π (or auxiliary π-plane), and which, in the preferred embodiment of the invention, is orthogonal to the projective straight line of A.
  • An incremental displacement of A corresponds, in the π-plane 402, to an incremental displacement of P (hereafter, an incremental displacement of A is meant to be an infinitesimal virtual increment, whether positive or negative, of one or more coordinates of the point A; such increment is hypothetically applied, according to the mathematical method of infinitesimal calculus, to determine the mathematical functions that bind the position of the points in the neighbourhood of A in the three-dimensional space to the positions of the points in the neighbourhood of the projection of A on the projection planes, as will be explained below). From the components of the incremental displacement the gradient components, i.e. the partial derivatives, of the functions are calculated, with which to represent, in the image plane 401, the correct coordinates of the image of the point A.
  • Fig. 1 geometrically illustrates the operation of linear perspective
  • Fig. 2 illustrates the perspective representation of a cube in an offset position relative to the optical axis
  • Fig. 3 qualitatively illustrates the deformations inherent in linear perspective when viewing images from a point not corresponding to the center of projection
  • Fig. 4 geometrically illustrates one embodiment of the invention
  • Fig. 5 geometrically illustrates the operation of linear perspective in the stereoscopic case
  • Fig. 6 illustrates a plan view of a part of Fig. 4
  • Fig. 7 illustrates a representation of Cartesian references
  • Fig. 8 illustrates the trend of the raised cosine function and its complement that are used in the method and apparatus according to the invention
  • Fig. 9 illustrates stereoscopic images of the cube of Fig. 2;
  • Fig. 10 illustrates a block diagram of an apparatus according to the invention
  • Fig. 11 illustrates a flow chart of a process wherein the perspective correction of the present invention is applied.
  • the present invention relates to the correction of single bidimensional images or pairs of stereoscopic images, aimed at reducing the deformations that appear in perspective images when they are viewed from a point not corresponding to the center of projection of the perspective.
  • the following will describe an apparatus, and the method it implements, with reference to a preferred embodiment and some exemplary but non- limiting variants thereof.
  • the apparatus of the present invention processes images by using a technique called "Partial Perspective Gradient” (PPG), which corrects the position of the points (pixels) of the images in such a way as to locate them as if each point of the scene represented in the image had been captured by a lens pointed at it.
  • said technique uses the coordinates of the point the representation of which has to be corrected, and the data defining the geometry of the perspective according to which the image has been generated.
  • the coordinates of the point A taken into account are obtained from the image, i.e. from the x, y coordinates of the point Q, and from the distance of the point A from the xy-plane.
  • this distance is called “depth”
  • the set of distances of the points of the three-dimensional space from the xy-plane is called "depth map".
  • the geometry of the perspective according to which the image has been generated is essentially defined by the focal length and by the frame dimensions.
  • the interoptical distance b, i.e. the distance between the centers of projection from which the two stereoscopic images have been generated, is also considered in addition to the focal length and the frame dimensions.
  • the data sufficient to determine the depth of the points represented in the image, also known as depth data, may comprise the depth map, and can be obtained by using various methods known to those skilled in the art, whether for an image intended for bidimensional vision or for a pair of stereoscopic images. In the case of drawings or paintings, such data are implicit in the artist's project.
  • the depth map can be obtained from the disparity map, which represents the difference between the horizontal coordinates of homologous points of the two images, as shown in Fig. 5.
  • it is assumed that the two centers of projection are horizontally aligned, and that the horizontal axes of the Cartesian references lie in the horizon plane, i.e. the horizontal plane that contains the centers of projection.
  • Fig. 5 shows one example wherein a point A of the three-dimensional space is projected from two distinct centers of projection C_L and C_R, onto two distinct projection planes, i.e. plane 501 and plane 503, whereon the coordinates are referred to the Cartesian references Ic_L x_L y_L and Ic_R x_R y_R, respectively.
  • the optical axes starting from the centers of projection, i.e. z_L from C_L and z_R from C_R, are parallel to each other and are located at a distance b (interoptical distance) from each other.
  • the planes 501 and 503 are orthogonal to said optical axes and equidistant from the respective centers of projection by a distance f, whereas their horizontal axes, respectively x_L and x_R, lie on one same straight line, which is parallel to the line joining the two centers of projection C_L and C_R, and intersects the optical axes at the points Ic_L and Ic_R, respectively.
  • the homologous points Q_L and Q_R resulting from the projection of A onto the planes x_L y_L and x_R y_R, respectively, have the same vertical coordinate, which for simplicity is not indicated in Fig. 5, and have the horizontal coordinates x_QL and x_QR, respectively.
  • z A "depth" of the point taken into account i.e. the z coordinate of the point A of Fig. 4 or the coordinates of the point A of Fig. 5 on the axes z L and z R ;
  • interoptical distance i.e. the distance between the two centers of projection C L and C R , f focal length, i.e. the distance between the projection center C L or C R and the respective projection plane 501 or 503; disp disparity, i.e. the difference between the horizontal coordinate of one point of the left image (Q L ) and the corresponding coordinate of the homologous point of the right image (Q R ), the coordinates being referred to the centers of the respective images (Ic L and Ic R ) or to other homologous reference points.
  • the equation (2) is defined with reference to the centers of projection and the projection plane, but, as is known to the man skilled in the art, the same equation, mutatis mutandis, can also be used for relating the depth of a point represented on the two images of a stereoscopy with the disparity measured on the pair of stereoscopic images and with the value of the equivalent interoptical distance of the geometrical configuration according to which the stereoscopy has been generated.
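By way of example, and assuming that equation (2) takes the usual pinhole form z = b·f/disp for parallel optical axes (the patent's exact formula is not reproduced in this excerpt), depth recovery from a stereoscopic pair can be sketched as:

```python
def disparity(x_ql, x_qr):
    """Disparity as defined in the text: horizontal coordinate of the left
    projection minus that of the homologous right projection, both referred
    to the respective image centers Ic_L and Ic_R."""
    return x_ql - x_qr

def depth_from_disparity(disp, b, f):
    """Depth of a point from the disparity of its homologous projections,
    assuming equation (2) takes the usual form z = b * f / disp."""
    if disp == 0:
        return float("inf")  # points at infinite depth show no disparity
    return b * f / disp

# With b = 65 mm, f = 50 mm and a disparity of 6.5 mm, the point is 500 mm deep.
print(depth_from_disparity(disparity(3.0, -3.5), b=65.0, f=50.0))  # -> 500.0
```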
  • the center of projection of the perspective according to which an image has been generated can be located by providing the center of the projection plane and the distance between the center of projection and the projection plane (i.e. the focal length in the case of video cameras), or by providing the dimensions of the projection plane and the angle of view, since the angle of view, the image diagonal and the distance of the center of projection from the projection plane are bound by the equation (1).
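Assuming equation (1) is the usual relation tan(angle/2) = (diagonal/2)/f between the angle of view, the image diagonal and the distance f of the center of projection from the projection plane (an assumption, since the formula itself is not reproduced in this excerpt), the two ways of locating the center of projection can be sketched as:

```python
import math

def focal_from_angle_of_view(diagonal, angle_deg):
    """Distance of the center of projection from the projection plane, given
    the frame diagonal and the (diagonal) angle of view, assuming the usual
    relation tan(angle/2) = (diagonal/2) / f."""
    return (diagonal / 2.0) / math.tan(math.radians(angle_deg) / 2.0)

def angle_of_view(diagonal, f):
    """Inverse relation: angle of view from frame diagonal and focal length."""
    return math.degrees(2.0 * math.atan((diagonal / 2.0) / f))

# A 90-degree angle of view puts the center of projection at half the diagonal.
print(round(focal_from_angle_of_view(43.3, 90.0), 2))  # -> 21.65
print(round(angle_of_view(43.3, 21.65), 1))            # -> 90.0
```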
  • the apparatus claimed by the present invention comprises suitable means for acquiring, in numerical form, a bidimensional image or a pair of stereoscopic images, along with depth or disparity data and other data sufficient to determine the coordinates of the center of projection C corresponding to the center of projection of the perspective, as mentioned above.
  • the apparatus claimed by the present invention applies the method ("Partial Perspective Gradient") according to the invention, wherein said method has been developed on the basis of the principle of representing the neighbourhood of each point of an image as if that point had been pointed at while shooting or drawing. This principle takes into account how, in such conditions of sight orientation, small displacements of the point in question (e.g. the point A in the annexed figures) would be perceived.
  • the plane 402 lies at the same distance from the center of projection C as the projection plane 401. This configuration should however only be considered as a non-limiting explanatory example of the preferred embodiment. As the man skilled in the art will guess, and as will be explained below, the plane 402 can be set at any distance from the center of projection and with any orientation.
  • this projection P of A can be defined by using the Cαβγ Cartesian reference.
  • This reference has its origin at C, which is coincident with the origin of the Cxyz reference, and its γ-axis develops along a direction that coincides with that of the straight line passing through C and through A.
  • the π-plane is orthogonal to the γ-axis.
  • the α-axis is defined by the intersection between the π-plane and the xz-plane (the α-axis lies in the xz-plane), whereas the β-axis, also passing through C, is orthogonal to both the α-axis and the γ-axis.
  • a displacement of the point A parallel to the x-axis implies, in the plane 402, a displacement of P with components in both the direction of the α-axis and the direction of the β-axis.
  • when A lies in the horizon plane (the plan view of Fig. 6), the component in the direction of the β-axis is null, and hence only the component in the direction of the α-axis, referred to as δα, will remain to be treated. Since with δα we want to determine the partial derivative that the component along the x-axis of the point Q, also referred to as x_Q, must take, δx_Q is related with δα.
  • the calculation can be made by using the common rules of geometry and mathematics, which are known to the man skilled in the art. They essentially provide for changing the reference system, switching the representation of the displacement from the Cxyz Cartesian reference to the Cαβγ Cartesian reference.
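As a sketch of such a change of reference (an illustration, not the patent's own formulae), the axes of the auxiliary reference can be built directly from the coordinates of A: the third axis along the straight line CA, the first in the xz-plane and the second orthogonal to both; the helper names are assumptions:

```python
import math

def basis_for_point(a_point):
    """Orthonormal axes of the auxiliary (alpha, beta, gamma) reference for a
    point A, expressed in Cxyz coordinates: gamma along CA, alpha in the
    xz-plane and orthogonal to gamma, beta orthogonal to both."""
    ax, ay, az = a_point
    n = math.sqrt(ax * ax + ay * ay + az * az)
    gamma = (ax / n, ay / n, az / n)
    # A direction in the xz-plane (null y component) orthogonal to gamma.
    h = math.sqrt(ax * ax + az * az)
    alpha = (az / h, 0.0, -ax / h)
    # beta = gamma x alpha completes the orthonormal triad.
    beta = (gamma[1] * alpha[2] - gamma[2] * alpha[1],
            gamma[2] * alpha[0] - gamma[0] * alpha[2],
            gamma[0] * alpha[1] - gamma[1] * alpha[0])
    return alpha, beta, gamma

def to_auxiliary(v, axes):
    """Express the Cxyz vector v in the auxiliary reference."""
    return tuple(sum(v[i] * ax[i] for i in range(3)) for ax in axes)

# For A on the optical axis the two references coincide (up to axis order).
axes = basis_for_point((0.0, 0.0, 5.0))
print(to_auxiliary((1.0, 0.0, 0.0), axes))  # -> (1.0, 0.0, 0.0)
```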
  • the formulae for changing the reference systems can be found in textbooks and on various Internet sites, among which, for example, the following:
  • the expression (3.y) indicates that the incremental displacement of the point P in the direction of the α-axis does not depend on the displacement component of the point Q in the direction of the y-axis, because the y-axis is orthogonal to the plane in which the α-axis lies, as aforesaid when commenting on Fig. 7 (the α-axis lies in the xz-plane).
  • the formula (4.xa) provides results that are only slightly different from those of the integral calculation (4.x). For example, in the case of application to the cube shown in Fig. 2, in which there are points corresponding to angles of view of 90°, the calculations made with the formula (4.xa) will differ by less than 1% from those made with the formula (4.x). As can be noticed, the use of the formula (4.xa) does not require, unlike that of the formula (4.x), any steps of numerical integration, thus advantageously reducing the processing time and load.
  • the formulae (4..) constitute the first embodiment of the "Partial Perspective Gradient" technique of the present invention, consisting of representing, in approximate terms in the image plane, what would appear from the center of the perspective, in the neighbourhood of each point of the scene to be reproduced, if the optical axis passed through that point.
  • a certain number of variants of the above technique can be taken into consideration.
  • formulae may be used which represent the dependence on the "disparity" between homologous points of stereoscopic images, coherently with the formula (2).
  • a second variant uses formulae determined by assuming that the point P (Fig. 4, Fig. 6, Fig. 7), instead of being at a fixed distance f from the center of projection C, is located at a distance from C that depends on some parameter. In particular, it can be imposed that this distance equals the distance of Q from C, so that P will coincide with Q.
  • the formulae are determined by imposing that the point P is located on a segment VA, instead of the segment CA, with V distinct from C (Fig. 1, Fig. 3, Fig. 4).
  • the point V may preferably be located on the optical axis passing through the points C and I c , or away from said optical axis.
  • the distance of P from V may either be preset to a constant value or be variable depending on some parameter. For example, it can be imposed that P coincides with Q.
  • a second embodiment of the idea consists of applying partly the formulae (4..) and partly the linear perspective formulae.
  • linear perspective reproduces images well within certain limits, and therefore within such limits it may be profitable to maintain the reproduction provided by linear perspective, combining the formulae (4..) with the linear perspective ones.
  • such combination may be made in such a way as to gradually switch from exclusive application of the formulae (4..) to exclusive application of the linear perspective formulae, but it may also be carried out in other manners that the man skilled in the art will be able to imagine.
  • One way to make such combination is to multiply the results of the formulae (4..) by a first factor, preferably comprised between the unitary value and the null value, and to multiply the results of the linear perspective formulae by a second factor, preferably complementary to the first factor; the results of the products thus obtained are then added up.
  • one example of a function suitable for creating the multiplicative factors is the raised cosine function represented by the curve 801, together with its complement 802.
  • the curve 801 stays at the unitary value for abscissa values between zero and a limit t_s; afterwards, in the interval from t_s to t_f, it gradually decreases to zero, and then it stays at the null value. Its complementary function 802, instead, has the opposite trend.
  • the abscissas of Fig. 8, and hence the limits t_s and t_f, can be related with the offset angle at which the point to be represented is seen from the center of projection, or with the distance of the point from the center of projection, or with other parameters or combinations of parameters allowing the man skilled in the art to meet the requirements of a specific application of the method according to the invention.
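A sketch of such a combination, using a raised cosine as the first multiplicative factor (the trend of curve 801) and its complement as the second (curve 802); the parameter names t_s and t_f follow the text, while the blended quantity is a generic corrected coordinate, an illustrative assumption:

```python
import math

def raised_cosine_weight(t, t_s, t_f):
    """Weight following the trend of curve 801 of Fig. 8: unitary up to t_s,
    raised-cosine roll-off from t_s to t_f, null afterwards."""
    if t <= t_s:
        return 1.0
    if t >= t_f:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (t - t_s) / (t_f - t_s)))

def blend(ppg_coord, linear_coord, t, t_s, t_f):
    """Combine a PPG-corrected coordinate with the linear-perspective one,
    using complementary factors (curves 801 and 802) and summing the products."""
    w = raised_cosine_weight(t, t_s, t_f)
    return w * ppg_coord + (1.0 - w) * linear_coord

# Halfway through the roll-off interval the two projections weigh equally.
print(blend(10.0, 20.0, t=30.0, t_s=20.0, t_f=40.0))  # -> 15.0
```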
  • Said second embodiment of the invention is also liable to all variations that may be conceived for the first embodiment.
  • a_F is the enlarged image of the cube 202 (cube projected from the center of projection Ic);
  • a_E is the image of the same cube 202 projected from a center of projection with an abscissa equal to twice the cube side;
  • b_F is the image obtained by processing the image a_F with the "Partial Perspective Gradient" technique;
  • in order to obtain the stereoscopic vision of the images of Fig. 9, the figure needs to be reproduced in such a way that the distance between the vertical dashes hanging from the upper horizontal line is about equal to, or slightly shorter than, the viewer's interpupillary distance.
  • An adequate dimension is normally obtained by reproducing the sheet containing Fig. 9 in the A4 format (21cm wide). After making sure that the straight line joining the centers of the viewer's pupils is parallel to the straight lines that delimit the figure at the top and at the bottom, it is then necessary to look fixedly at the figure, so as to obtain the merging of the right and left images.
  • Such merging can be facilitated by initially looking fixedly at the arrows running from the lower images to the upper images, or, better still, by placing a card near the viewer's forehead in a position orthogonal to said delimiting horizontal straight lines, so that the right eye will not see the left image, or at least most of it, and vice versa.
  • the image "b" (obtained from the merging of the images b F and b E ) well represents the shapes of a cube
  • the image "a” does not even look like a parallelepipedon, because the dimensions of the rear face appear to be bigger than those of the front face.
  • the horizontal and vertical lines maintain their own directions; in particular, the segments that lie in the projection plane, such as the edges of the front face of the cube reproduced in Fig. 9, maintain their length and their orientation, so that the front face of the cube will appear perfectly square.
  • the result of the processing carried out by using the PPG method, instead, shows a cube which is seen obliquely, coherently with the fact that it is offset from the optical axis by 21° horizontally and by -15° vertically.
  • the PPG technique can be applied to stereoscopic images with even more advantage than to monoscopic images.
  • the PPG method can be applied to each one of the images of the stereoscopic pair, with all the possible variants mentioned above, but in the stereoscopy case there exist additional variants and expedients.
  • An image processing apparatus 1, such as a photo camera, a video camera or the like, comprises image acquisition means 1001, input/output means 1002, a central processing unit (CPU) 1003, a read-only memory 1004, a random access memory (RAM) 1005, and means for producing processed images 1006. All of these elements of the apparatus 1 are in signal communication with one another, so that data can be mutually exchanged between any two of such elements.
  • the image acquisition means 1001 can acquire both bidimensional images and pairs of stereoscopic images. If the images are accompanied by the respective depth map and by data allowing the geometry with which they have been generated to be recovered (e.g. focal length, angle of view, sensor inclination, or the like), the image acquisition means 1001 will acquire such data as well. Otherwise, some data can be set through a user interface (not shown in the drawings) in signal communication with the input/output means 1002, whereas for stereoscopic images the depth map may even be produced by the apparatus 1 itself, as will be explained below.
  • the user interface also allows the user to set options and variants, along with their parameters, that he/she may prefer to use in the specific case. For example, one can choose from the following settings:
  • the point of view relative to which the processing of the method of the invention is applied may be different from the center of projection C (see Fig. 1) relative to which the image to be processed has been generated.
  • the coordinates of Q and the depth of A being known, and the position of A in the three-dimensional space having been calculated from them, one can in fact determine the projection of A onto any plane and from any point of view. For example, assuming as a center of projection a point K (corresponding, for example, to the point of view from which the viewer is supposed to be looking at the image) different from the center of projection C, one can determine the projection of A onto a plane perpendicular to the straight line that joins A to K, or onto a different plane. Processing the image from a point of view other than the center of projection C and onto various planes makes it possible to apply particular corrections to the image and to produce useful artificial images, such as those which can be used in stereoscopy for filling blanks, as will be discussed below.
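A sketch of this reprojection (the coordinate conventions and the choice of a new projection plane parallel to the original frame are illustrative assumptions, not the patent's prescription):

```python
def reproject(q, z_a, f, k):
    """Back-project the image point Q (frame coordinates, focal distance f,
    center of projection C at the origin) to the 3D point A using its depth
    z_a, then project A again from a different center of projection K onto a
    plane at distance f from K, parallel to the original frame."""
    # Back-projection: A lies on the projective straight line of Q at depth z_a.
    xa = q[0] * z_a / f
    ya = q[1] * z_a / f
    kx, ky, kz = k
    dz = z_a - kz
    # Forward projection from the new center of projection K.
    return (f * (xa - kx) / dz, f * (ya - ky) / dz)

# Projecting from the original center (K = C = origin) returns Q unchanged.
print(reproject((1.0, 0.5), 10.0, f=2.0, k=(0.0, 0.0, 0.0)))  # -> (1.0, 0.5)
```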
  • the central processing unit (CPU) 1003 is that part of the apparatus which executes the calculation algorithms, including complementary operations that, after the application of the PPG technique, are useful for completing the correction of the images to be returned. These operations will be discussed while commenting on Fig. 11.
  • the central processing unit 1003 may actually comprise specially developed integrated circuits, one or more microprocessors, programmable logic circuits (e.g. CPLD, FPGA), and the like. These and other implementation possibilities neither anticipate nor make obvious the teachings of the present invention.
  • the read-only memory 1004 is preferably used for permanently storing some instructions for managing the apparatus and the instructions that implement the calculation algorithms, while the random access memory (RAM) 1005 is typically used for temporarily storing the images and the intermediate processing results.
  • the means for producing processed images 1006 return the processed images, e.g. by transferring them from the RAM memory 1005 to the input/output means 1002, so that said means 1002 can save said processed images into a permanent memory (e.g. a hard disk or a type of Flash memory, such as Secure Digital, MMC or the like), display them on one or more screens or other display means (not shown in the annexed drawings), print them, and the like.
  • the process that implements the invention may comprise the following steps:
  • - start step 1101 during which the apparatus 1 is configured for processing at least one image
  • - setting step 1102 for acquiring the settings and the data that the user intends to manually provide for processing at least said image
  • step 1103 during which the image to be processed is acquired by the image acquisition means 1001 and is preferably transferred into the RAM memory 1005 (for simplicity, it is assumed that, downstream of step 1103, the images are in numerical form, and that, in the event that they are available in another form, the man skilled in the art will know how to convert them into numerical form); it is understood that the image acquired at step 1103 is equipped with data defining the geometry according to which the image has been generated, and that it is preferably also equipped with the depth map, or with data allowing it to be created; the subsequent steps 1104 and 1105 will consider the case wherein the depth map is not provided at this step 1103;
  • depth map presence verification step 1104, during which it is determined whether the depth map is available or not;
  • step 1106 during which the apparatus 1, by using the depth map acquired at step 1103 or calculated at step 1105, determines the position in the three-dimensional space of the points corresponding to the pixels of the image acquired at step 1103 and applies thereto the position correction algorithm according to the invention;
  • the result of step 1106 is a matrix indicating the position that the pixels of the image acquired at step 1103 must take after said processing; at this stage no pixel shifting occurs yet, in order to avoid having to repeat this kind of operation after the additional processing carried out in the next steps;
  • the resizing process consists of recalculating, starting from the result of step 1106, the position that the pixels of the processed image must take after the resizing applied at this step 1107;
  • - processed image returning step 1110 during which the image is preferably stored into an area of the RAM memory 1005, or another memory, and is made available for display, printing, transfer to other apparatuses, and other operations;
  • When the apparatus 1 is in an operating condition, after the start step 1101 said apparatus 1 enters the setting step 1102, and then, simultaneously or afterwards, the image acquisition step 1103. At the end of step 1103, the apparatus 1 verifies the availability of a depth map (step 1104) for what has been acquired at step 1103; if the depth map is available, the apparatus 1 enters the processing step 1106; otherwise, if the map is not present, the apparatus enters the depth map calculation step 1105 prior to proceeding with the processing step 1106. After step 1106, the apparatus may optionally carry out the resizing step 1107. Then the apparatus 1 carries out the overlap elimination step 1108, followed by the pixel shifting and blank filling step 1109 and by the processed image returning step 1110. The process ends at the final step 1111. At step 1106 the PPG technique of the present invention is applied in one of the above-described embodiments thereof.
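The flow just described can be summarized in code-like form. The sketch below only mirrors the branching of steps 1101-1111; all step bodies are illustrative placeholders, not taken from the patent:

```python
def run_pipeline(image, depth_map=None, resize=False):
    """Control flow of steps 1101-1111; step bodies are placeholders."""
    trace = ["1101 start", "1102 settings", "1103 acquire"]
    if depth_map is None:                 # step 1104: is a depth map available?
        trace.append("1105 compute depth map")
        depth_map = object()              # placeholder for the computed map
    trace.append("1106 PPG position correction")
    if resize:                            # step 1107 is optional
        trace.append("1107 resize")
    trace += ["1108 overlap elimination",
              "1109 pixel shifting and blank filling",
              "1110 return processed image",
              "1111 end"]
    return trace

# Step 1105 runs only when no depth map was acquired at step 1103:
assert "1105 compute depth map" in run_pipeline(image=None)
assert "1105 compute depth map" not in run_pipeline(image=None, depth_map={})
```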
  • the processing means 1003 and the storage means 1004 and 1005 of the apparatus 1 are configured to correct the image represented in the plane 401, which has been generated in compliance with the linear perspective rules relative to the center of projection C and comprises at least one first point Q, wherein said first point Q is the result of the perspective projection of the second point A of a region of the three-dimensional space from the center of projection C; to do so, the processing means 1003 and the storage means 1004 and 1005 execute the method according to the invention, which comprises the following steps: a) calculating the position of the second point A in the three-dimensional space;
  • some pixels of this area may overlap the pixels of areas that have not been shifted, or that have been shifted less, or that have been shifted in different directions.
  • one may, for example, have the pixel with less depth occupy the contended position (i.e. the pixel located at the smaller distance from the xy- plane will prevent seeing the farther one).
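This rule corresponds to a depth-buffer test. The following sketch is an illustration of that rule (not the patent's prescribed implementation): for each contended target position it keeps the pixel with the smallest depth:

```python
def resolve_overlaps(shifted_pixels):
    """shifted_pixels: iterable of (target_xy, depth, value) triples.
    When several pixels land on the same target position, keep the one
    with the smallest depth (i.e. the one closest to the xy-plane)."""
    buffer = {}  # target position -> (depth, value)
    for xy, depth, value in shifted_pixels:
        if xy not in buffer or depth < buffer[xy][0]:
            buffer[xy] = (depth, value)
    return {xy: value for xy, (depth, value) in buffer.items()}

# Two pixels contend position (3, 4); the nearer one (depth 2.0) prevails.
out = resolve_overlaps([((3, 4), 5.0, "far"), ((3, 4), 2.0, "near")])
assert out[(3, 4)] == "near"
```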
  • stereoscopy uses two images of the same scene viewed from two different points of view.
  • the corrective shift of a point of the left image is generally different from the corrective shift of the homologous point of the right image.
  • the blanks in either one of the two processed images can be at least partially filled by appropriately processing the other image.
  • one way of filling the blanks in the left (right) image is to superimpose it on an "artificial" image obtained by processing the right (left) image, assuming as a center of projection the same center of projection of the left (right) image.
  • It is advantageous to carry out the pixel shifting and blank filling operations during step 1109 on already resized images and after having resolved any conflicts, i.e. downstream of step 1108, because resizing may also cause overlaps and blanks.
  • the application of the PPG method as described in the present invention will improve the vision of the images of objects located in offset positions relative to the pointing direction of the capturing device. Such improvements are evident in monoscopic vision, and are even more evident in stereoscopic vision.
  • the PPG technique, which can be applied in real time while shooting or afterwards on acquired images, makes it possible to capture scenes with angles of view exceeding the currently recommended limits, leading to significant advantages for both stereoscopic and monoscopic shooting.
  • the PPG technique turns out to be very versatile and can be optimized for different types of applications and apparatuses, which may even be characterized by very different processing capabilities.
  • the PPG technique can be used for correcting deformations in video streams, by applying it to every image it is composed of. This applies to both 2D and 3D video streams; in the latter case, the technique will have to be applied to each image of the stereoscopic pairs forming the 3D stream.

Abstract

The present invention relates to an apparatus and a method for correcting the deformations that appear, in both bidimensional and stereoscopic vision, when images generated in compliance with the linear perspective rules are viewed from a point of view not corresponding to the center of projection of the perspective. The corrections are determined by a technique that tends to represent the images as if each point that constitutes them were captured with the lens aimed at it. Under this hypothesis of lens aiming, the position gradient of the points to be represented is calculated, and the coordinates of the positions that the points must take in the image plane are obtained by integrating the components of such gradient. The application range of the method is very wide, due to the various embodiments and numerous variants it allows.

Description

"APPARATUS AND METHOD FOR CORRECTING PERSPECTIVE DEFORMATIONS OF IMAGES"
The present invention relates to an apparatus and a method for correcting images, so as to reduce the deformations that appear, in both bidimensional and stereoscopic vision, when the images are viewed from a point of view not corresponding to the center of projection of the perspective according to which they have been produced.
Linear perspective, also called simply perspective, is known to be a geometrical-mathematical method that makes it possible to reproduce a three-dimensional scene in a plane. The basic criterion of perspective construction, shown in Fig. 1, consists in projecting onto a plane 101, referred to as "projection plane" or "projection frame" or simply "frame", the points of the three-dimensional space as viewed from a "center of projection" C. The straight line extending from the center of projection in the direction towards which the viewer's sight, or the camera lens, is oriented is called "optical axis". Generally a Cartesian reference is defined, such as the one shown in Fig. 1, wherein the axes originate from the center of projection C, the z-axis coinciding with the optical axis, the y-axis being vertical, oriented upwards, and the x-axis being horizontal, oriented from left to right for the viewer. For simplicity's sake, the following will generally refer to the case wherein the frame is perpendicular to the optical axis (this is the case of the perspective referred to as "vertical frame perspective"), but the man skilled in the art will understand that the method of the present invention is not limited to such a case, but is also applicable to cases wherein the frame is not perpendicular to the optical axis.
In the terminology commonly used in this technical field, the z-axis is also called "depth axis", since the "depth" of a point of the three-dimensional space is defined as the distance of that given point from the xy-plane. In Fig. 1, the projection plane 101 is at a distance f from the center of projection C and is perpendicular to the optical axis, which intersects the projection plane (101) at the point Ic. The projection Q of a point of the space A results from the intersection between the projection plane 101 and the "projective straight line", i.e. the straight line that passes through the point to be projected A and the center of projection C.
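In formulas, with a vertical frame at distance f from C, the projection Q of a point A = (xA, yA, zA) has coordinates xQ = f·xA/zA and yQ = f·yA/zA. A minimal sketch, with hypothetical numeric values:

```python
def project(A, f):
    """Linear-perspective projection of A = (xA, yA, zA) from the center of
    projection C (the origin) onto the frame at distance f along the z-axis."""
    xA, yA, zA = A
    return (f * xA / zA, f * yA / zA)

# A point on the optical axis projects onto Ic, the center of the frame.
assert project((0.0, 0.0, 7.0), f=2.5) == (0.0, 0.0)
assert project((4.0, -2.0, 10.0), f=5.0) == (2.0, -1.0)
```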
Photo and video cameras produce images that are theoretically compliant with linear perspective, but in fact real lenses often introduce more or less visible distortions, such as the so-called "barrel" and "pincushion" distortions. The present invention will not deal with such kinds of distortions, which are the subjects of many studies and correction techniques (see for example European Patent EP1333498 B1 to Agilent Technologies Inc. and International patent application WO 98/57292 A1 by Apple Computer).
The present invention will not even tackle those image deformations which are caused by errors in the positioning of the lenses with respect to the desired position and orientation, such as the tapering of images of tall buildings taken from below and the deformation of images of documents photographed obliquely or from points of view offset from the document axis. These deformations have been amply discussed in the literature as well (see for example documents US 7,990,412 A1, US 2009/0103808 A1, BR PI0802865 A2, US 2004/0022451 A1, US 2006/0210192 A1, US 2011/0149094 A1).
The correction technique of the present invention processes the images as if they were perfectly compliant with linear perspective. Any distortions with respect to the linear perspective, and in particular those mentioned above, will not be taken into account and will be present in the processing results.
In fact, the present invention relates to deformations that appear in perspective images when said images are viewed from a point of view not corresponding to the center of projection, e.g. if the image of the frame 101 (Fig. 1) is viewed from the point V, not from the center of projection C.
An example of such deformations is shown in Fig. 2. This shows the image of a solid 202 drawn by a calculation program rigorously in accordance with the linear perspective rules. The image of the solid 202 is located in the lower right corner of a frame, of which the figure only shows the lower right quadrant (also called fourth quadrant) 201 in order to enlarge the elements of interest (i.e. the solid 202 and the center Ic of the projection plane) compared to the dimensions that they would have if the entire frame were reproduced.
If Fig. 2 is observed as if it were a normal photograph, e.g. from a point lying along the perpendicular to the center of the figure, the solid 202 seems to have the shape of a parallelepiped with a square front face and the side edges longer than the front edges; however, the solid 202 is a perfect cube, the front face lying in the projection plane and the coordinates of the center of said face being x = 20 units and y = -14 units, as measured with reference to the length of one edge of the cube. In the same unit of measure, the point corresponding to the center of projection lies 25 units away from the image plane, on the vertical of the figure plane passing through the point Ic.
If Fig. 2 is observed from a point close to the corresponding center of projection, which is located on the straight line passing through Ic, orthogonal to the figure plane, at a distance from the sheet (i.e. the image plane) equal to 25 times the cube edge, then the solid 202 will appear correctly with a cube shape, not with the deformed parallelepiped shape.
A qualitative explanation of the reason for such deformation is given in Fig. 3. It represents the horizon plane of a perspective, i.e. the xz-plane of Fig. 1, in which a square ABDE is drawn, with its front side adjacent to the projection plane. In Fig. 3 the element 301 represents the projection plane 101 of Fig. 1, as viewed from the direction of the y-axis.
Here and below it will be assumed that, for simplicity, the projection plane and the image plane coincide, although in reality the image is normally reproduced on a support which is distinct from the projection plane. If the two planes are distinct, they can be transferred one over the other, with all their respective geometric elements, by means of a homothetic transformation known to those skilled in the art.
In Fig. 3, the projections of the vertices B and D from the center of projection C coincide with the actual vertices B and D, whereas the projection of the vertex A is represented by the point Q. Now, assuming that the projection plane is viewed from the point V, it is found that, by attributing to the point Q the representation of a vertex located on the straight line passing through A and B, the viewer takes the point Q to represent the point A', so that the square ABDE will appear, from the point V, as if it were the rectangle A'BDE'.
This kind of deformation is generally more apparent when viewing stereoscopic images, as is known to those skilled in the art and as will be discussed hereafter.
Therefore, the linear perspective reproduces the reality well if the images are viewed from the point corresponding to the center of projection, but the images will turn out to be deformed if the viewer moves away from such point.
Such deformations increase with the offset of the image viewpoint from the position corresponding to the center of projection. Since images are normally observed from a point which is somewhat offset from the straight line corresponding to the optical axis of the perspective, in practice the component that mainly causes the deformation derives from the distance of the viewer's point of view from the point corresponding to the center of projection. Instead of this distance, the angle of view is often taken into account in practice (i.e. the angle that subtends the diagonal of the projection plane from the center of projection), considering that the following relation exists between the angle of view and the distance of the center of projection from the projection plane:
α = 2 · arctan( d / (2 · f) )     (1)
where the symbols have the following meanings:
α angle of view;
d diagonal of the projection plane (diagonal of the photosensitive element, in the case of a video camera);
f distance between the center of projection and the projection plane (focal length, in the case of a video camera).
By applying the formula (1), it is found that the point of the image of Fig. 2 which is farthest from Ic corresponds to an angle of view of 90°.
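Formula (1), in the form α = 2·arctan(d/(2f)) assumed here, can be checked numerically against this 90° figure (the numeric values follow the Fig. 2 description and are otherwise illustrative):

```python
import math

def angle_of_view(d, f):
    """Formula (1): angle of view (degrees) subtending a diagonal of
    length d, seen from a center of projection at distance f."""
    return math.degrees(2 * math.atan(d / (2 * f)))

# Fig. 2: f = 25 units; the farthest point lies 25 units from Ic, i.e. on a
# half-diagonal of 25 units (full diagonal 50 units) -> 90 degrees.
assert math.isclose(angle_of_view(50, 25), 90.0)

# A viewing distance 20 times the subject size (d/f = 1/20) corresponds
# to an angle of roughly 3 degrees.
assert angle_of_view(1, 20) < 3.0
```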
The deformations illustrated in Fig. 2 and Fig. 3 were first noticed by Leonardo da Vinci. In architecture and painting, in fact, the perspective was introduced in the fifteenth century by Brunelleschi, with contributions from Donatello and Masaccio. In order to prevent such deformations from appearing, Leonardo recommended to paint at a distance at least twenty times greater than the dimensions of the object to be portrayed, corresponding to angles of view of approx. 3° (see "Leonardo da Vinci on Painting - A Lost Book (Libro A)", by Carlo Pedretti, Foreword by Sir Kenneth Clark, University of California Press, Berkeley and Los Angeles, California, © 1964 by the Regents of the University of California, Library of Congress Catalog Card Number: 64-17171).
In the eighteenth century Giovanni Paolo Pannini, aiming at reducing the linear perspective deformations already noticed by Leonardo, conceived a projection method consisting of two projections in succession. With the first projection, the scene is projected from a first center of projection onto a vertical cylindrical surface with the axis passing through the center of projection. With the second projection, from a second center of projection different from the first one, the image obtained on the cylindrical surface is projected onto a planar surface. This technique yields good results when representing scenes containing vertical lines that must remain as such, and wherein there is a central vanishing point towards which the lines of many elements converge. Instead, it does not yield good results when representing generic images. By using this technique, Pannini painted suggestive architectural views with wide angles of view, with no visible perspective deformations (see Thomas K. Sharpless, Bruno Postle, and Daniel M. German, "Pannini: A New Projection for Rendering Wide Angle Perspective Images", Computational Aesthetics in Graphics, Visualization, and Imaging (2010), The Eurographics Association 2010, http://vedutismo.net/Pannini/panini.pdf).

In 1995 Denis Zorin and Alan H. Barr (see Zorin D., Barr A. H., "Correction of geometric perceptual distortions in pictures", SIGGRAPH '95: Proceedings of the 22nd annual conference on Computer graphics and interactive techniques (1995), pp. 257-264) proposed to prevent said deformations from arising by first projecting the images onto a spherical surface, centered at the projection center, and then transferring such spherical surface onto a plane. However, when reproducing a spherical surface on a plane, distortions inevitably arise, and therefore the Zorin-Barr method can only provide some improvement with narrow angles of view.
More recently, Robert Carroll et al. (Carroll R., Agrawal M., Agarwala A., "Optimizing content-preserving projections for wide-angle images", SIGGRAPH '09: ACM SIGGRAPH 2009 papers (New York, NY, USA, 2009), ACM, pp. 1-9) have proposed to minimize deformations by adapting the projection to the contents. To this end, a man-machine interface is provided, through which the user can characterize the areas and elements of the images to be corrected in particular ways, such as straight lines that must remain as such, people's faces, etc. This method is however time-consuming and cumbersome, and requires specific calibrations for various types of elements.
We will finally mention United States patent application US 2011/0090303 A1 by Apple Inc., which, with reference to video conferencing systems, describes a method for correcting deformations in images captured by video cameras. This method is based on locating particular elements contained in the captured images (e.g. faces of people participating in a video conference), and on applying specific corrections to the deformations of such elements. These elements can be located by using recognition algorithms and indications provided by the user through suitable man-machine interfaces. The user can, for example, indicate the central point of a face. Deformations and distortions are then corrected by comparing the captured image with reference images, such as, for example, photos of the participants taken before the start of the video conference. Image recognition and correction are made easier by determining the orientation of the video cameras by means of sensors (accelerometers, gyroscopes). As can be guessed, this method is very complex and is only applicable in particular circumstances. Furthermore, it does not solve the problem of deformations that arise, in general, when a perspective image is viewed from a point of view not corresponding to the center of projection.
The present invention provides an adequate solution to the above-described problem by disclosing a method, and the associated apparatus, for correcting the deformations that appear on images when the latter are viewed from a point not corresponding to the center of projection of the perspective. The apparatus and the associated method are applicable to both the bidimensional reproduction of a single image and the reproduction of a pair of stereoscopic images.
Said apparatus comprises suitable means for acquiring a bidimensional image or a pair of stereoscopic images, with sufficient data to determine the coordinates of the point corresponding to the center of projection of the perspective (e.g. image center and focal length) and with the associated depth map. The latter is defined as the set of depths of the points of the scene represented in the image, that is, with reference to Fig. 1, the set of z-coordinates of the points of the three-dimensional scene. As an alternative to the depth map, data may be provided from which it can then be obtained.
In addition, this apparatus comprises storage means and processing means (e.g. a processor executing a suitable software code) configured to correct the position of the points of the acquired images, according to a technique called "Partial Perspective Gradient" (PPG). The corrections made by this technique tend to represent, in the image plane, the position of each point as if it had been captured with the lens pointed at it. Such technique, numerous variants of which can be defined, is based on the calculation of a gradient of the position of the points in the image plane. By integrating the components of this gradient, the functions can be found according to which the points will be positioned in order to make said correction.
It should be noted that the image representation to be subjected to correction techniques often consists of a limited number of discrete elements having finite dimensions, called "pixels". As a rule, each pixel represents, in an approximate manner, a small area of the image; for simplicity, however, in some parts of this description a geometric point will be identified as a pixel, accepting the approximation according to which the pixel's discrete coordinates are assumed to correspond to those of the geometric point it is intended to represent.
Considering the reference system shown in Fig. 4, said gradient is calculated by taking into account a generic point A of the three-dimensional space, corresponding, according to linear perspective, to the point Q of the image plane 401, and to the point P resulting from the intersection between the projective straight line of A and the plane 402, which will be called π (or auxiliary π-plane), and which, in the preferred embodiment of the invention, is orthogonal to the projective straight line of A.
An incremental displacement
dA = (dxA, dyA, dzA)
of A corresponds, in the π-plane 402, to an incremental displacement
dP = (dxP, dyP)
of P (hereafter, an incremental displacement of A is meant to be an infinitesimal virtual increment, whether positive or negative, of one or more coordinates of the point A; such increment is hypothetically applied, according to the mathematical method of infinitesimal calculation, to determine the mathematical functions that bind the position of the points in the neighbourhood of A in the three-dimensional space to the positions of the points in the neighbourhood of the projection of A on the projection planes, as will be explained below). From the components of the incremental displacement
dP
the gradient components, i.e. the partial derivatives, of the functions
xQ(xA, yA, zA) and yQ(xA, yA, zA)
are calculated, with which to represent, in the image plane 401, the correct coordinates of the image of the point A.
By integrating said partial derivatives, one determines the coordinates where the representation of the point A will be located in the image plane.
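The final integration step can be illustrated numerically in one dimension: sample a partial derivative on a grid and recover the positioning function, up to a constant, by trapezoidal integration. The derivative 2x used below is an arbitrary example function, not one of the invention's actual gradient components:

```python
def integrate(deriv, xs, x0_value=0.0):
    """Trapezoidal integration of a sampled derivative along the grid xs,
    returning the reconstructed function values (up to the constant x0_value)."""
    values = [x0_value]
    for i in range(1, len(xs)):
        h = xs[i] - xs[i - 1]
        values.append(values[-1] + 0.5 * h * (deriv(xs[i - 1]) + deriv(xs[i])))
    return values

xs = [i * 0.01 for i in range(101)]          # grid from 0.0 to 1.0
recovered = integrate(lambda x: 2 * x, xs)   # should approximate x**2
assert abs(recovered[-1] - 1.0) < 1e-3       # reconstructed value at x = 1
```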
These features as well as further advantages of the present invention will become more apparent from the following description of an embodiment thereof as shown in the annexed drawings, which are supplied by way of non-limiting example, wherein:
Fig. 1 geometrically illustrates the operation of linear perspective;
Fig. 2 illustrates the perspective representation of a cube in an offset position relative to the optical axis;
Fig. 3 qualitatively illustrates the deformations inherent in linear perspective when viewing images from a point not corresponding to the center of projection;
Fig. 4 geometrically illustrates one embodiment of the invention;
Fig. 5 geometrically illustrates the operation of linear perspective in the stereoscopic case;
Fig. 6 illustrates a plan view of a part of Fig. 4;
Fig. 7 illustrates a representation of Cartesian references;
Fig. 8 illustrates the trend of the raised cosine function and its complement that are used in the method and apparatus according to the invention;
Fig. 9 illustrates stereoscopic images of the cube of Fig. 2;
Fig. 10 illustrates a block diagram of an apparatus according to the invention;
Fig. 11 illustrates a flow chart of a process wherein the perspective correction of the present invention is applied.
The present invention relates to the correction of single bidimensional images or pairs of stereoscopic images, aimed at reducing the deformations that appear in perspective images when they are viewed from a point not corresponding to the center of projection of the perspective. The following will describe an apparatus, and the method it implements, with reference to a preferred embodiment and some exemplary but non-limiting variants thereof.
The apparatus of the present invention, with the associated method, processes images by using a technique called "Partial Perspective Gradient" (PPG), which corrects the position of the points (pixels) of the images in such a way as to locate them as if each point of the scene represented in the image had been captured by a lens pointed at it. In order to determine this correction, said technique uses the coordinates of the point the representation of which has to be corrected, and the data defining the geometry of the perspective according to which the image has been generated.
With reference to Fig. 4 and to the preferred embodiment, the coordinates of the point A taken into account are obtained from the image, i.e. from the x, y coordinates of the point Q, and from the distance of the point A from the xy-plane. According to the terminology used in the art, this distance is called "depth", and the set of distances of the points of the three-dimensional space from the xy-plane is called "depth map".
The geometry of the perspective according to which the image has been generated is essentially defined by the focal length and by the frame dimensions.
Also with reference to Fig. 5, in the case of stereoscopic images the interoptical distance b, i.e. the distance between the centers of projection from which the two stereoscopic images have been generated, is also considered in addition to the focal length and the frame dimensions. The data sufficient to determine the depth of the points represented in the image, also known as depth data, may comprise the depth map, and can be obtained by using various methods known to those skilled in the art, whether for an image intended for bidimensional vision or for a pair of stereoscopic images. In the case of drawings or paintings, such data are implicit in the artist's project.
In the case of a pair of stereoscopic images, the depth map can be obtained from the disparity map, which represents the difference between the horizontal coordinates of homologous points of the two images, as shown in Fig. 5. In this figure it is assumed that the two centers of projection are horizontally aligned, and that the horizontal axes of the Cartesian references lie in the horizon plane, i.e. the horizontal plane that contains the centers of projection. In fact, Fig. 5 shows that the depth axes zL and zR originate on the straight line that passes through the centers of projection CL and CR, whereas the x-axis and the y-axis are represented in the image planes in order to make the graphics simpler; it is however clear to the man skilled in the art that this representation in the image planes is wholly equivalent to the representation of the same axes in the plane passing through said centers of projection CL and CR and orthogonal to the z-axis.
Fig. 5 shows one example wherein a point A of the three-dimensional space is projected from two distinct centers of projection CL and CR onto two distinct projection planes, i.e. plane 501 and plane 503, whereon the coordinates are referred to the Cartesian references IcL xL yL and IcR xR yR, respectively. The optical axes starting off from the centers of projection, i.e. zL from CL and zR from CR, are parallel to each other and are located at a distance b (interoptical distance) from each other. The planes 501 and 503 are orthogonal to said optical axes and equidistant from the respective centers of projection by a distance f, whereas their horizontal axes, respectively xL and xR, lie on one same straight line, which is parallel to the line joining the two centers of projection CL and CR, and intersects the optical axes at the points IcL and IcR, respectively.
In the respective reference systems, the homologous points QL and QR, resulting from the projection of A onto the planes xL yL and xR yR, respectively, have the same vertical coordinate, which for simplicity is not indicated in Fig. 5, and have the horizontal coordinates XQL and XQR, respectively.
The disparity "disp" of the homologous points QL and QR is given by disp = XQL - XQR.
Between "depth" and "disparity" the following mathematical relation applies:
zA = b · f / disp     (2)
where the symbols have the following meanings:
zA "depth" of the point taken into account, i.e. the z coordinate of the point A of Fig. 4 or the coordinates of the point A of Fig. 5 on the axes zL and zR;
b interoptical distance, i.e. the distance between the two centers of projection CL and CR;
f focal length, i.e. the distance between the projection center CL or CR and the respective projection plane 501 or 503;
disp disparity, i.e. the difference between the horizontal coordinate of one point of the left image (QL) and the corresponding coordinate of the homologous point of the right image (QR), the coordinates being referred to the centers of the respective images (IcL and IcR) or to other homologous reference points.
The equation (2) is defined with reference to the centers of projection and the projection plane, but, as is known to the man skilled in the art, the same equation, mutatis mutandis, can also be used for relating the depth of a point represented on the two images of a stereoscopy with the disparity measured on the pair of stereoscopic images and with the value of the equivalent interoptical distance of the geometrical configuration according to which the stereoscopy has been generated. The center of projection of the perspective according to which an image has been generated can be located by providing the center of the projection plane and the distance between the center of projection and the projection plane (i.e. the focal length in the case of video cameras), or by providing the dimensions of the projection plane and the angle of view, since the angle of view, the image diagonal and the distance of the center of projection from the projection plane are bound by the equation (1).
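As a minimal sketch of the relation (2), zA = b · f / disp, the depth/disparity conversion can be expressed as follows (the function names are illustrative, not part of the invention):

```python
def depth_from_disparity(b: float, f: float, disp: float) -> float:
    """Depth zA of a scene point, given the interoptical distance b,
    the focal length f and the disparity disp (consistent units)."""
    if disp == 0:
        # Zero disparity corresponds to a point at infinite depth.
        return float("inf")
    return b * f / disp

def disparity_from_depth(b: float, f: float, z_a: float) -> float:
    """Inverse relation: disparity of a point at depth z_a."""
    return b * f / z_a

# Example: b = 6.5 cm, f = 5 cm, disp = 0.5 cm gives zA = 65 cm.
```

The same pair of functions covers both directions of the conversion mentioned in the text, i.e. relating the depth of a point to the disparity measured on a stereoscopic pair.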
The apparatus claimed by the present invention comprises suitable means for acquiring, in numerical form, a bidimensional image or a pair of stereoscopic images, along with depth or disparity data and other data sufficient to determine the coordinates of the center of projection C corresponding to the center of projection of the perspective, as mentioned above. Once the geometry of the image has been acquired, the apparatus claimed by the present invention applies the method ("Partial Perspective Gradient") according to the invention, wherein said method has been developed on the basis of the principle of representing the neighbourhood of each point of an image as if it had been pointed at while shooting or drawing. This principle takes into account how, in such conditions of sight orientation, small displacements of the point in question (e.g. the point A in the annexed figures) would be perceived.
Also with reference to Fig. 6, the following will describe this calculation, for simplicity's sake, by considering the representation of a point of the space that lies in the horizon plane (the xz-plane in the annexed figures) and an incremental displacement of said point in a direction parallel to the x-axis. The man skilled in the art will be able to extend the calculation to the case wherein the point A is located in any position in space and wherein the incremental displacement of said point occurs in any direction.
The plane 402 lies at the same distance from the center of projection C as the projection plane 401. This configuration should however only be considered as a non-limiting explanatory example of the preferred embodiment. As the man skilled in the art will guess, and as will be explained below, the plane 402 can be set at any distance from the center of projection and with any orientation.
As the man skilled in the art knows, an incremental displacement ΔA can be defined, in general, from a composition of a plurality of components, wherein these components, which are preferably three (three-dimensional space), are oriented along various directions. The most common decomposition is the one which is made by using the three directions corresponding to the axes of the Cartesian reference, to which reference will be made in this description. Since in the explanatory example ΔA has a direction which is parallel to the x-axis, ΔA is completely characterized by its component along this axis, which will be called ΔxA. Likewise, in this example the incremental displacement ΔQ of Q is completely characterized by its component along the x-axis, which will be called ΔxQ. Note that, at any rate, a displacement parallel to the x-axis of the point A will cause a displacement ΔxQ, also parallel to the x-axis, of the point Q.
In order to deal with the incremental displacement that a viewer would perceive if he/she were looking at the point A, it is necessary to represent a projection ΔP of the incremental displacement ΔA in the plane 402.
Also with reference to Fig. 7, this projection ΔP can be defined by using the Cαβγ Cartesian reference. This reference has its origin at C, which is coincident with the origin of the Cxyz reference, and its γ-axis develops along a direction that coincides with that of the straight line passing through C and through A. The αβ-plane is orthogonal to the γ-axis. The α-axis is defined by the intersection between the αβ-plane and the xz-plane (the α-axis lies in the xz-plane), whereas the β-axis, also passing through C, is orthogonal to both the α-axis and the γ-axis.
In general, a displacement ΔxA parallel to the x-axis of the point A implies, in the plane 402, a displacement ΔP of P with components in both the direction of the α-axis and the direction of the β-axis. In the particular case of Fig. 6, wherein the point A is located in the horizon plane, and consequently the β-axis is orthogonal to the x-axis, the component in the direction of the β-axis is null, and hence only the component in the direction of the α-axis, referred to as ΔαP, will remain to be treated. Since we want to determine the partial derivative of the component along the x-axis of the point Q, also referred to as xQ, ΔαP is related with ΔxQ. The calculation can be made by using the common rules of geometry and mathematics, which are known to the man skilled in the art. They essentially provide for changing the reference system, switching the representation of the displacement ΔxA from the Cxyz Cartesian reference to the Cαβγ Cartesian reference. The formulae for changing the reference systems can be found in school books and on various Internet sites, among which, for example, the following:
http://www.cns.gatech.edu/~predrag/courses/PHYS-4421-10/Lautrup/space.pdf.
The result of this processing is the expression contained in the formula (3αx), which is shown below together with the expressions of the other displacement components, which are derived in a similar manner.
[Formulae (3..) appear as equation images in the original document and are not reproduced here.]
In the formulae (3..) the symbols have the following meanings:
ΔαP    incremental displacement of the point P in the plane 402 in the direction of the α-axis;
ΔβP    incremental displacement of the point P in the plane 402 in the direction of the β-axis;
ΔxQ    incremental displacement of the point Q in the projection plane in the direction of the x-axis;
ΔyQ    incremental displacement of the point Q in the projection plane in the direction of the y-axis;
ΔzA    incremental displacement of the point A in the direction of the z-axis;
f    distance between the center of projection and the projection plane;
xQ    abscissa of the point Q;
yQ    ordinate of the point Q.
The expression (3αy) indicates that the incremental displacement ΔαP of the point P in the direction of the α-axis does not depend on the displacement component ΔyQ of the point Q in the direction of the y-axis, because the y-axis is orthogonal to the plane in which the α-axis lies, as aforesaid when commenting on Fig. 7 (the α-axis lies in the xz-plane).
As is known to the man skilled in the art, as the increments tend to zero the formulae (3..) will provide derivatives corresponding to the respective incremental ratios. The expressions of the second members of the formulae (3..) will be used below as expressions of such derivatives, omitting to rewrite them for simplicity's sake.
By calculating the integrals of said derivatives, the coordinates Fx(xQ, yQ, zA) and Fy(xQ, yQ, zA) will be obtained, which will be used for representing, in the image plane, the position of the point A according to the gradient criterion of the present invention. Such integrals are given by the following formulae:
[Formulae (4..) appear as equation images in the original document and are not reproduced here.]
For yQ tending to zero, the expression (4βx) will tend to zero, while the arguments of the logarithms of the expressions (4αz) and (4βz) will always be positive. The symbols of the formulae (4..) have the same meaning as those of the formulae (3..). The expression (4αx) can be approximated by means of the following formula:
[Formula (4αxa) appears as an equation image in the original document and is not reproduced here.]
having approximated the integrand function (3αx) by means of the following expression:
[The approximated integrand expression appears as an equation image in the original document and is not reproduced here.]
The formula (4αxa) provides results that are only slightly different from the integral calculation (4αx). For example, in the case of application to the cube shown in Fig. 2, in which there are points corresponding to angles of view of 90°, the calculations made with the formula (4αxa) will differ by less than 1% from those made with the formula (4αx). As can be noticed, the use of the formula (4αxa) does not require, unlike that of the formula (4αx), any steps of numerical integration, thus advantageously reducing the processing time and load.
The formulae (4..) constitute the first embodiment of the "Partial Perspective Gradient" technique of the present invention, which consists of representing, in approximate terms in the image plane, what would appear from the center of the perspective, in the neighbourhood of each point of the scene to be reproduced, if the optical axis passed through that point. In accordance with this criterion, a certain number of variants of the above technique can be taken into consideration.
For example, instead of the components (3αz) and (4βz), which represent dependence on the "depth" of the image points, formulae may be used which represent dependence on the "disparity" between homologous points of stereoscopic images, coherently with the formula (2).
A second variant uses formulae determined by assuming that the point P (Fig. 4, Fig. 6, Fig. 7), instead of being at a fixed distance f from the center of projection C, is located at a distance from C that depends on some parameter. In particular, this distance can be imposed to be equal to the distance between C and Q, so that P will coincide with Q.
According to a third variant, the formulae are determined by imposing that the point P is located on a segment VA, instead of the segment CA, with V distinct from C (Fig. 1, Fig. 3, Fig. 4). In this case, the point V may preferably be located on the optical axis passing through the points C and Ic, or away from said optical axis. Furthermore, the distance of P from V may either be preset to a constant value or be variable depending on some parameter. [The example expression given at this point appears as an equation image in the original document and is not reproduced here.]
The above variants of the "Partial Perspective Gradient" technique are illustrated herein as non-exhaustive and non-limiting examples, since the man skilled in the art will be able to imagine many other variants without departing from the teachings of the present invention.
A second embodiment of the idea consists of applying partly the formulae (4..) and partly the linear perspective formulae.
The prior art has shown, in fact, that linear perspective reproduces images well within certain limits, and therefore within such limits it may be profitable to maintain the reproduction provided by linear perspective, combining the formulae (4..) with the linear perspective ones. Typically such combination may be made in such a way as to gradually switch from exclusive application of the formulae (4..) to exclusive application of the linear perspective formulae, but it may also be carried out in other manners that the man skilled in the art will be able to imagine.
One way to make such combination is to multiply the results of the formulae (4..) by a first factor, preferably comprised between the unitary value and the null value, and to multiply the results of the linear perspective formulae by a second factor, preferably complementary to the first factor; the results of the products thus obtained are then added up.
Also with reference to Fig. 8, one example of a function suitable for creating the multiplicative factors is the raised cosine function represented by the curve 801, together with its complement 802. As shown in the drawing, the curve 801 stays at the unitary value for abscissa values between zero and a limit ts; afterwards, in the interval from ts to tf, it gradually decreases to zero, and then it stays at the null value. Instead, its complementary function 802 has the opposite trend.
The abscissas of Fig. 8, and hence the limits ts and tf, can be related with the offset angle at which the point to be represented is seen from the center of projection, or with the distance of the point from the center of projection, or with other parameters or combinations of parameters allowing the man skilled in the art to meet the requirements of a specific application of the method according to the invention.
The simplest way to combine the results of the formulae (4..) with the linear perspective results is to:
a) multiply the function Fx(xQ, yQ, zA) expressed by the formulae (4..) by a first function 801;
b) multiply the formula that provides the abscissas of the image points according to linear perspective by a second function 802, which is preferably complementary to said first function 801;
c) obtain the function that expresses the abscissa of the image points as the sum of the results of the operations a) and b);
d) obtain the function that expresses the ordinate of the image points as was done for the function expressing the abscissas in steps a)-c).
The man skilled in the art will however be able to create this combination in other manners as well. For example, one may:
e) multiply each one of the terms (4α.) and (4β.) by a distinct function;
f) obtain the function Fx(xQ, yQ, zA) as the sum of the terms obtained from the above multiplications;
g) calculate a weighted mean of said terms with which the function Fx(xQ, yQ, zA) was calculated at the previous step f);
h) multiply the formula that provides the abscissas of the image points according to linear perspective by the complement of said weighted mean obtained from the calculation made at step g);
i) obtain the function that expresses the abscissa of the image points as the sum of the results of the operations f) and h);
j) obtain the function that expresses the ordinate of the image points as was done for the function expressing the abscissas at steps e)-i).
These two combination examples are neither exhaustive nor limiting. The man skilled in the art will be able, in fact, to propose other combinations without departing from the teachings of the present invention. Also the functions represented in Fig. 8 are to be considered only as non-exhaustive and non-limiting examples. The man skilled in the art will know, in fact, many other types of functions which could be used for combining the terms of the formulae (4..) with those of linear perspective.
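As one hedged sketch of such a combination, a raised-cosine weight of the kind shown by the curves 801 and 802 of Fig. 8 could be implemented as follows; the parameter t stands for the offset angle or distance mentioned above, and the function names are illustrative assumptions:

```python
import math

def raised_cosine_weight(t: float, ts: float, tf: float) -> float:
    """Weight following curve 801: unitary up to ts, raised-cosine
    decay between ts and tf, null afterwards. Curve 802 is 1 - w."""
    if t <= ts:
        return 1.0
    if t >= tf:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (t - ts) / (tf - ts)))

def blend(ppg_coord: float, linear_coord: float,
          t: float, ts: float, tf: float) -> float:
    """Combine a PPG coordinate (formulae (4..)) with the linear
    perspective coordinate using complementary factors, as in the text."""
    w = raised_cosine_weight(t, ts, tf)
    return w * ppg_coord + (1.0 - w) * linear_coord
```

For t below ts the PPG result is applied exclusively, for t above tf the linear perspective result is applied exclusively, and in between the transition is gradual.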
Said second embodiment of the invention is also amenable to all the variations that may be conceived for the first embodiment.
Also with reference to Fig. 9,
aF is the enlarged image of the cube 202 (cube projected from the center of projection Ic);
aE is the image of the same cube 202 projected from a center of projection with an abscissa equal to twice the cube side;
bF is the image obtained by processing the image aF with the "Partial Perspective Gradient" technique;
bE is the image obtained by processing the image aE with the "Partial Perspective Gradient" technique.
By applying the above-described method according to the invention (PPG) to the image of the cube 202 of Fig. 2, which is shown enlarged in the image "aF" of Fig. 9, one obtains the image "bF" of Fig. 9. A comparison of said images "aF" and "bF" clearly shows the improvement brought to bidimensional images by the present invention. It can also be seen that the image shows a cube with no face perpendicular to the optical axis, as it would appear in reality if the eyes were pointed at it, since said cube is seen with a horizontal offset of 21° and a vertical offset of -15°.
The improvement is even more apparent in stereoscopic vision, which can be obtained by viewing Fig. 9 as indicated below, the images "aF" and "bF" of Fig. 9 being intended for the left eye and their homologous images "aE" and "bE" being intended for the right eye. The latter have been generated from a center of projection located at an interoptical distance ("b" in Fig. 5) equal to twice the cube side from the center of projection of "aF" and "bF".
In order to obtain the stereoscopic vision of the images of Fig. 9, the figure needs to be reproduced in such a way that the distance between the vertical dashes hanging from the upper horizontal line is about equal to, or slightly shorter than, the viewer's interpupillary distance. An adequate dimension is normally obtained by reproducing the sheet containing Fig. 9 in the A4 format (21cm wide). After making sure that the straight line joining the centers of the viewer's pupils is parallel to the straight lines that delimit the figure at the top and at the bottom, it is then necessary to look fixedly at the figure, so as to obtain the merging of the right and left images. Such merging can be facilitated by initially looking fixedly at the arrows running from the lower images to the upper images, or, better still, by placing a card near the viewer's forehead in a position orthogonal to said delimiting horizontal straight lines, so that the right eye will not see the left image, or at least most of it, and vice versa.
While in the stereoscopic vision of Fig. 9 the image "b" (obtained from the merging of the images bF and bE) well represents the shapes of a cube, the image "a" does not even look like a parallelepiped, because the dimensions of the rear face appear to be bigger than those of the front face. Moreover, in linear perspective the horizontal and vertical lines maintain their own directions; in particular, the segments that lie in the projection plane, such as the edges of the front face of the cube reproduced in Fig. 9, maintain their length and their orientation, so that the front face of the cube will appear perfectly square. The result of the processing carried out by using the PPG method, instead, shows a cube which is seen obliquely, coherently with the fact that it is offset from the optical axis by 21° horizontally and by -15° vertically.
Therefore, the PPG technique can be applied to stereoscopic images with even more advantage than to monoscopic images.
In the stereoscopic vision case as well, however, linear perspective images correctly reproduce the shape of a cube when they are viewed from the point of view corresponding to the center of projection. In the case of the images aF and aE (with the sheet containing Fig. 9 being reproduced in the A4 format), the point corresponding to the center of projection is located at approximately 48 cm on the right and 34 cm above the center of the image "aF", and at a distance of 60 cm from the figure plane.
Typically, the PPG method can be applied to each one of the images of the stereoscopic pair, with all the possible variants mentioned above, but in the stereoscopy case there exist additional variants and expedients.
With reference to Fig. 5, it is assumed that the same embodiment, possibly with the same variants, is applied to both images of the pair, by projecting the scene from the centers of projection CL and CR. For simplicity's sake, it is assumed that the two optical axes zL and zR are parallel, but the man skilled in the art will know how to treat stereoscopic images produced with non-parallel optical axes, particularly converging ones, just like those of human eyes in real vision.
The main alternatives dealing with stereoscopic images, to be added to the above-described embodiments, are the following:
1. keeping the centers of projection CL and CR (Fig. 5) in their real position; in this case, also the interoptical distance will remain that of the real position, but the distances of each point of the scene from the two distinct centers of projection will be different and may cause an annoying diversity in the vertical correction of the two images, as the formulae (4βx) and (4βz) indicate; in order to avoid this diversity, in said two formulae, instead of the abscissas XQL and XQR (Fig. 5), which should take the place of XQ for the left image and for the right image, respectively, a common value may be used, e.g. their mean value XQm, which is given by:
XQm = (XQL + XQR) / 2
By applying the formulae (4βx) and (4βz) with XQm replacing XQ, a negligible error is made as concerns the value of the vertical coordinate, because in practice the square of the difference between XQL or XQR and their mean value XQm is much smaller than f², to which said square is always added in the formulae (4β.);
2. rotating the segment CLCR, together with the centers of projection CL and CR, about its mean point C, keeping it in the plane to which the axes zL and zR belong, so that it will take a direction orthogonal to the projective straight line that joins C to A; in this case, the distances of each point of the scene from the two centers of projection will be made even, but the projective straight lines will no longer correspond to the original ones and the interoptical distance will increase;
3. rotating the segment CLCR as in the preceding paragraph, moving the centers of projection CL and CR towards the center of the segment CLCR, so as to keep the projective straight lines unchanged; this alternative provides better approximation to the criterion of representing the neighbourhood of each point of the scene as if it had been pointed at while shooting or drawing.
These three alternatives are to be considered as non-exhaustive and non-limiting examples of the ways in which, in practice, one can set the reference geometry for the calculation method of the present invention. The man skilled in the art will be able to propose other equally applicable alternatives. In the example shown in Fig. 9 the first one of the above three alternatives has been applied.
A simplified block diagram is shown in Fig. 10, which relates to an apparatus that implements the correction method according to the present invention. An image processing apparatus 1 according to the invention, like for example a photo camera or a video camera or the like, comprises image acquisition means 1001, input/output means 1002, a central processing unit (CPU) 1003, a read-only memory 1004, a random access memory (RAM) 1005, and means for producing processed images 1006. All of these elements of the apparatus 1 are in signal communication with one another, so that data can be mutually exchanged between any two of such elements.
The image acquisition means 1001 can acquire both bidimensional images and pairs of stereoscopic images. If the images are equipped with the respective depth map and with data from which the geometry with which they have been generated can be reconstructed (e.g. focal length, angle of view, sensor inclination, or the like), the image acquisition means 1001 will acquire such data as well. Otherwise, some data can be set through a user interface (not shown in the drawings) in signal communication with the input/output means 1002, whereas for stereoscopic images the depth map may even be produced by the apparatus 1 itself, as will be explained below. The user interface also allows the user to set the options and variants, along with their parameters, that he/she may prefer to use in the specific case. For example, one can choose from the following settings:
- the type of combination between the PPG method and linear perspective, assigning the parameters that characterize it (ts and tf in the case of the function shown in Fig. 8);
- the alternative between the formula (4αxa) and the formula (4αx);
- if the formula (4αx) is used, one can define the precision to be achieved in the numerical integration;
- the position of the point of view relative to which the processing of the PPG method is applied;
- if the position of the point of view relative to which the processing of the PPG method is applied is variable, the parameters that characterize its variability;
- in the stereoscopy case, the chosen alternative in regard to the positioning of the two centers of projection.
The point of view relative to which the processing of the method of the invention is applied may be different from the center of projection C (see Fig. 1) relative to which the image to be processed has been generated. The coordinates of Q and the depth of A being known, and the position of A in the three-dimensional space having been calculated from them, one can in fact determine the projection of A onto any plane and from any point of view. For example, assuming as a center of projection a point K (corresponding, for example, to the point of view from which the viewer is supposed to be looking at the image) different from the center of projection C, one can determine the projection of A onto a plane perpendicular to the straight line that joins A to K, or onto a different plane. Processing the image from a point of view other than the center of projection C and onto various planes makes it possible to apply particular corrections to the image and to produce useful artificial images, such as those which can be used in stereoscopy for filling blanks, as will be discussed below.
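This reprojection can be sketched as follows, under the simplifying assumptions of a pinhole model, parallel optical axes, the original center of projection C at the origin and the image plane at distance f from it; all function names are illustrative:

```python
def backproject(x_q, y_q, f, z_a):
    """Recover the 3-D point A from its image coordinates (x_q, y_q),
    the focal length f and the depth z_a (center of projection C at
    the origin, image plane at z = f)."""
    return (x_q * z_a / f, y_q * z_a / f, z_a)

def project_from(a, k, f):
    """Project the 3-D point a from a new center of projection k onto
    a plane parallel to the original one, at distance f from k."""
    ax, ay, az = a
    kx, ky, kz = k
    return (f * (ax - kx) / (az - kz), f * (ay - ky) / (az - kz))
```

Projecting the same point from two centers separated by the interoptical distance b along x reproduces, as a consistency check, the disparity b · f / zA of the formula (2).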
The central processing unit (CPU) 1003 is that part of the apparatus which executes the calculation algorithms, including complementary operations that, after the application of the PPG technique, are useful for completing the correction of the images to be returned. These operations will be discussed while commenting on Fig. 11.
As is known to those skilled in the art, the central processing unit 1003 may actually comprise specially developed integrated circuits, one or more microprocessors, programmable logic circuits (e.g. CPLD, FPGA), and the like. These and other implementation possibilities neither anticipate nor make obvious the teachings of the present invention.
The read-only memory 1004 is preferably used for permanently storing some instructions for managing the apparatus and the instructions that implement the calculation algorithms, while the random access memory (RAM) 1005 is typically used for temporarily storing the images and the intermediate processing results.
Finally, the means for producing processed images 1006 return the processed images, e.g. by transferring them from the RAM memory 1005 to the input/output means 1002, so that said means 1002 can save said processed images into a permanent memory (e.g. a hard disk or a type of Flash memory, such as Secure Digital, MMC or the like), display them on one or more screens or other display means (not shown in the annexed drawings), print them, and the like.
With reference to Fig. 11, the following will describe the process for correcting a bidimensional image or a pair of stereoscopic images. For simplicity, in Fig. 11 no distinction is made between monoscopic images and stereoscopic images because, in light of the above description, it will be apparent to the man skilled in the art how the method of the invention will have to be applied to both cases. In the case of stereoscopic images, the processing can take place either distinctly for each image, in successive processing stages, or in parallel on both images. No substantial difference exists between these two alternatives, considering that in any case the method of the present invention is applied distinctly to single images.
The process that implements the invention may comprise the following steps:
- start step 1101, during which the apparatus 1 is configured for processing at least one image;
- setting step 1102, for acquiring the settings and the data that the user intends to manually provide for processing at least said image;
- image acquisition step 1103, during which the image to be processed is acquired by the image acquisition means 1001 and is preferably transferred into the RAM memory 1005 (for simplicity, it is assumed that, downstream of step 1103, the images are in numerical form, and that, in the event that they should be available in another form, the man skilled in the art will know how to convert them into numerical form); it is understood that the image acquired at step 1103 is equipped with data defining the geometry according to which the image has been generated, and that it is preferably also equipped with the depth map, or with data allowing it to be created; the subsequent steps 1104 and 1105 will consider the case wherein the depth map is not provided at this step 1103;
- depth map presence verification step 1104, during which it is determined whether the depth map is available or not;
- depth map calculation step 1105, during which the apparatus 1 generates the depth map according to one of the methods known to those skilled in the art;
- processing step 1106, during which the apparatus 1, by using the depth map acquired at step 1103 or calculated at step 1105, determines the position in the three-dimensional space of the points corresponding to the pixels of the image acquired at step 1103 and applies thereto the position correction algorithm according to the invention; the result of step 1106 is a matrix indicating the position that the pixels of the image acquired at step 1103 must take after said processing; at this stage no pixel shifting occurs yet, in order to avoid having to repeat this kind of operation after the additional processing carried out in the next steps;
- resizing step 1107, during which the apparatus 1 can enlarge or reduce the image processed at step 1106; the resizing process consists of recalculating, starting from the result of step 1106, the position that the pixels of the processed image must take after the resizing applied at this step 1107;
- overlap elimination step 1108, during which, as will be explained below, any conflicts between overlapping pixels are resolved;
- pixel shifting and blank filling step 1109, during which the pixels are shifted and, as will be explained below, blanks are filled;
- processed image returning step 1110, during which the image is preferably stored into an area of the RAM memory 1005, or another memory, and is made available for display, printing, transfer to other apparatuses, and other operations;
- final step 1111, denoting the end of the process.
When the apparatus 1 is in an operating condition, after the start step 1101 said apparatus 1 enters the setting step 1102, and then, simultaneously or afterwards, the image acquisition step 1103. At the end of step 1103, the apparatus 1 verifies the availability of a depth map (step 1104) for what has been acquired at step 1103; if the depth map is available, the apparatus 1 enters the processing step 1106; otherwise, if the map is not present, the apparatus enters the depth map calculation step 1105 prior to proceeding with the processing step 1106. After step 1106, the apparatus may optionally carry out the resizing step 1107. Then the apparatus 1 carries out the overlap elimination step 1108, followed by the pixel shifting and blank filling step 1109 and by the processed image returning step 1110. The process ends at the final step 1111. At step 1106 the PPG technique of the present invention is applied in one of the above- described embodiments thereof.
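As a non-authoritative sketch, the flow of steps 1103-1110 described above can be outlined in Python; every helper below is a trivial placeholder (uniform depth, identity mapping) standing in for the corresponding processing step of the apparatus 1, and all names are illustrative:

```python
def estimate_depth_map(image):
    # Step 1105 (placeholder): a uniform depth map, one value per pixel.
    return [[1.0] * len(row) for row in image]

def ppg_new_positions(image, depth_map, settings):
    # Step 1106 (placeholder): the identity mapping; the real apparatus
    # computes here the corrected target position of every pixel.
    return [[(r, c) for c in range(len(row))] for r, row in enumerate(image)]

def resolve_overlaps(targets, depth_map):
    # Step 1108 (placeholder): with the identity mapping no conflicts arise.
    return targets

def shift_pixels_and_fill(image, targets):
    # Step 1109 (placeholder): move each pixel to its target position.
    out = [[None] * len(row) for row in image]
    for r, row in enumerate(targets):
        for c, (tr, tc) in enumerate(row):
            out[tr][tc] = image[r][c]
    return out

def correct_image(image, settings, depth_map=None):
    # Flow of Fig. 11 (the optional resizing step 1107 is omitted here).
    if depth_map is None:                                    # step 1104
        depth_map = estimate_depth_map(image)                # step 1105
    targets = ppg_new_positions(image, depth_map, settings)  # step 1106
    targets = resolve_overlaps(targets, depth_map)           # step 1108
    return shift_pixels_and_fill(image, targets)             # steps 1109-1110
```

With the placeholder helpers the output equals the input; the structure, not the arithmetic, is the point of the sketch.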
In summary, the processing means 1003 and the storage means 1004 and 1005 of the apparatus 1 are configured to correct the image represented in the plane 401, which has been generated in compliance with the linear perspective rules relative to the center of projection C and comprises at least one first point Q, wherein said first point Q is the result of the perspective projection of the second point A of a region of the three- dimensional space from the center of projection C; to do so, the processing means 1003 and the storage means 1004 and 1005 execute the method according to the invention, which comprises the following steps: a) calculating the position of the second point A in the three-dimensional space;
b) calculating, based on a first incremental displacement (Δα) of the second point A in the three-dimensional space, a second incremental displacement (Δρ) of a third point P, wherein said third point P is the result of the perspective projection of said second point A onto an auxiliary π-plane from the point of view V, and wherein said auxiliary π-plane is different from the plane in which the image 401 lies; c) obtaining, based on said second incremental displacement (Δρ) calculated at step b), a gradient of a new position where said second point A will be represented in the image plane 401;
d) obtaining, by calculating an integral of at least one of the components of said gradient obtained at step c), a new position where the second point A will be represented in the image plane 401;
e) moving the first point Q, in the image plane 401, from the position resulting from the perspective projection relative to the center of projection C to said new position obtained at step d).
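Under simplifying assumptions not mandated by the patent (the point of view V coinciding with the center of projection C, and the auxiliary plane kept orthogonal to the ray V-A at a constant distance d), the integration of steps b)-d) admits a closed form: the radial image coordinate r = d·tan(θ) of linear perspective is replaced by r' = d·θ. The sketch below illustrates only this special case:

```python
import math

def correct_point(x, y, d):
    """Map an image point (x, y), measured from the principal point, from its
    linear-perspective position (radius r = d*tan(theta)) to the corrected
    radial position r' = d*theta, assuming V coincides with C and the
    auxiliary plane stays orthogonal to the ray V-A at constant distance d
    (d is the distance of C from the image plane)."""
    r = math.hypot(x, y)
    if r == 0.0:
        return (0.0, 0.0)            # point on the optical axis: no shift
    theta = math.atan2(r, d)         # angle between the ray and the optical axis
    scale = d * theta / r            # ratio r' / r applied to both components
    return (x * scale, y * scale)
```

For a point 45° off axis (x = d), the corrected radius is d·π/4 ≈ 0.785·d, i.e. the point moves toward the center of the frame, consistent with the behaviour of the PPG corrections described below.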
It must be pointed out that the corrections made by the PPG method produce pixel shifts that typically move those parts of the image which are significantly offset from the optical axis closer to the center of the frame, which may create blanks and pixel overlaps.
In fact, when an area of the image is shifted after perspective corrections or a resizing operation, some pixels of this area may overlap the pixels of areas that have not been shifted, that have been shifted less, or that have been shifted in different directions. In order to resolve these conflicts, one may, for example, have the pixel with the smaller depth occupy the contended position (i.e. the pixel located at the smaller distance from the xy-plane will hide the farther one).
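The depth-priority rule just described can be sketched as a z-buffer-like pass; the data layout below (one tuple per shifted pixel) is hypothetical:

```python
def resolve_overlaps(shifted_pixels):
    """shifted_pixels: list of (target_xy, depth, color) tuples produced by
    the corrective shifts. When several pixels land on the same target
    position, keep the one with the smallest depth (nearest to the
    xy-plane), so the nearer pixel hides the farther one."""
    winners = {}
    for target, depth, color in shifted_pixels:
        if target not in winners or depth < winners[target][0]:
            winners[target] = (depth, color)
    return {pos: color for pos, (depth, color) in winners.items()}
```

For example, if a pixel at depth 1.0 and a pixel at depth 2.0 are both shifted onto position (5, 5), the nearer one (depth 1.0) occupies the contended position.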
The person skilled in the art will be able to resolve pixel conflicts by using techniques other than the one described herein, without however departing from the teachings of the present invention.
Just like a correction of a pixel's position may create an overlap, the same correction, or another correction, may generate a blank, if the pixel leaving a position to occupy another one is not replaced by another pixel moving into the position left available. Although blank filling is a problem for which the person skilled in the art will know appropriate solutions (e.g. interpolation according to "inpainting" methods), in the stereoscopic case the method of the present invention can advantageously provide a new solution. In fact, stereoscopy uses two images of the same scene viewed from two different points of view. When the two images are processed with the PPG method during step 1109, the corrective shift of a point of the left image is generally different from the corrective shift of the homologous point of the right image. Moreover, the contents of the two images are different (consider, for example, that some areas of the scene may be covered by objects in front of them, and that areas invisible from one of the two points of view may be visible from the other). Therefore, the blanks in either one of the two processed images can be at least partially filled by appropriately processing the other image. For example, one way of filling the blanks in the left (right) image is to superimpose it on an "artificial" image obtained by processing the right (left) image, assuming as a center of projection the same center of projection of the left (right) image. Once the blanks have been treated with this technique, if any empty pixel positions remain, prior-art methods (e.g. "inpainting" methods) can be applied to complete the filling by starting from known pixels of the image.
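The stereo-assisted filling described above can be sketched as follows: blanks in the processed left image are first taken from an "artificial" image derived from the right view reprojected to the left center of projection, and any pixels still empty are left for a prior-art inpainting pass. Here `None` marks a blank, and the row-of-pixels layout is a hypothetical illustration:

```python
def fill_blanks(processed_left, artificial_from_right):
    """Superimpose the processed left image on an artificial image obtained
    by processing the right image with the left image's center of
    projection. Returns the filled image and the positions still blank
    (to be completed by inpainting)."""
    filled, still_blank = [], []
    for y, row in enumerate(processed_left):
        new_row = []
        for x, pixel in enumerate(row):
            if pixel is None:                        # blank left by the shifts
                pixel = artificial_from_right[y][x]  # may itself be None
                if pixel is None:
                    still_blank.append((x, y))
            new_row.append(pixel)
        filled.append(new_row)
    return filled, still_blank
```

The symmetric call, with the roles of the two images exchanged, fills the blanks of the right image.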
It is advantageous to carry out the pixel shifting and blank filling operations of step 1109 on already resized images and after any conflicts have been resolved, i.e. downstream of step 1108, because resizing may also cause overlaps and blanks.
In the common circumstances wherein images reproduced on screens or printouts are viewed from points not corresponding to the center of projection from which they were generated, the application of the PPG method described in the present invention improves the rendering of objects located in offset positions relative to the pointing direction of the capturing device. Such improvements are evident in monoscopic vision, and even more so in stereoscopic vision. In painting, photography and cinematography, in the absence of the PPG method of the present invention, the deformations typical of linear perspective are kept within acceptable limits by confining the angles of view within stringent values. However, this implies undesirable constraints, because it is not always possible to shoot scenes with telephoto lenses having a narrow angle of view. The PPG technique, which can be applied in real time while shooting or afterwards on acquired images, makes it possible to capture scenes with angles of view exceeding the currently recommended limits, leading to significant advantages for both stereoscopic and monoscopic shooting.
The approximation of integrals for which no closed-form primitive exists, such as the approximation of the integral 4 αx with the formula 4 αxa, substantially reduces the processing load required for the application of the PPG method. This produces significant advantages in terms of applicability, making the method particularly interesting for applications where images need to be processed in real time (e.g. live television broadcasts).
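The principle invoked here, replacing an integrand that has no elementary primitive with an approximant that can be integrated in closed form, can be illustrated on a neutral example (sin t / t, chosen only for illustration; the patent's specific integral is not reproduced here):

```python
import math

def si_taylor(x):
    """Closed-form approximation of Si(x), the integral of sin(t)/t from 0
    to x, obtained by integrating the truncated Taylor series of the
    integrand term by term: 1 - t^2/6 + t^4/120 -> x - x^3/18 + x^5/600."""
    return x - x**3 / 18 + x**5 / 600

def si_numeric(x, n=10000):
    """Reference value by trapezoidal quadrature of the original integrand."""
    h = x / n
    total = 0.5 * (1.0 + math.sin(x) / x)   # sin(t)/t tends to 1 as t -> 0
    for i in range(1, n):
        t = i * h
        total += math.sin(t) / t
    return total * h
```

At x = 1 the closed-form value differs from the quadrature reference by an error of the order of 10⁻⁵, while replacing a per-sample numerical integration with a single polynomial evaluation, which is the kind of saving the text refers to.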
Thanks to the numerous variants that can be applied in full compliance with the novelty principles inherent in the inventive idea of the present invention, the PPG technique is very versatile and can be optimized for different types of applications and apparatuses, even ones characterized by very different processing capabilities.
For example, the PPG technique can be used for correcting deformations in video streams, by applying it to every image the stream is composed of. This applies to both 2D and 3D video streams; in the latter case, the technique must be applied to each image of the stereoscopic pairs forming the 3D stream.
The present description has tackled some of the possible variants, but it will be apparent to the person skilled in the art that other embodiments may also be implemented, wherein some elements may be replaced with other technically equivalent elements. The present invention is therefore not limited to the explanatory examples described herein, but may be subject to many modifications, improvements or replacements of equivalent parts and elements without departing from the basic inventive idea, as set out in the following claims.

Claims

1. An apparatus (1) comprising
- image acquisition means (1001) adapted to acquire an image (401) lying in a given projection plane, wherein said image (401) has been generated in compliance with the linear perspective rules by projecting a region of the three-dimensional space onto said projection plane from a center of projection (C), and wherein said image (401) comprises at least one first point (Q) representing the projection, according to said linear perspective, of a second point (A) comprised in said three-dimensional region and having a depth with respect to said center of projection (C),
- storage means (1004, 1005),
- processing means (1003),
- data acquisition means adapted to acquire a set of data,
and wherein said processing means (1003) and storage means (1004, 1005) are configured to determine, by processing said set of data, the position of said center of projection (C) with respect to said projection plane and the depth of said second point (A) in the three-dimensional space,
characterized in that
said processing means (1003) and storage means (1004, 1005) are also configured to correct the image (401) so as to reduce the deformations of the representation of said three-dimensional region appearing when said image (401) is viewed from a point of view not corresponding to said center of projection (C), by carrying out the phases of a) calculating the position of said second point (A) in the three-dimensional space on the basis of the position of the first point (Q), the position of the center of projection (C), and the depth of said second point (A),
b) calculating, based on a first incremental displacement (Δα) which, in the three-dimensional space, could be perceived by a viewer of said second point (A), a second incremental displacement (Δρ) of a third point (P), wherein said third point (P) is the result of the perspective projection of said second point (A) onto an auxiliary plane (402) from a point of view (V) of said viewer,
c) obtaining, based on said second incremental displacement (Δρ) calculated during phase b), a gradient of a new position where said second point (A) will be represented in said projection plane,
d) obtaining, by calculating an integral of at least one of the components of said gradient obtained during phase c), a new position where the second point (A) will be represented in said projection plane,
e) moving the first point (Q), in the image (401), into the new position obtained during phase d).
2. An apparatus (1) according to claim 1, wherein, in the calculation of at least one integral carried out during phase d), the integral function is approximated through a function of the integration variable of said integral function.
3. An apparatus (1) according to claim 1 or 2, wherein said auxiliary plane (402) is orthogonal to the straight line passing through the second point (A) and the point of view (V).
4. An apparatus (1) according to any one of the preceding claims, wherein the position of said point of view (V) is constant and coincides with the position of said center of projection (C).
5. An apparatus (1) according to any one of claims 1 to 3, wherein the position of said point of view (V) is variable as a function of the position of the second point (A), or as a function of the position of the first point (Q), or as a function of the distance of the first point (Q) from the center of projection (C), or as a function of the angle formed by a straight line, passing through the center of projection (C) and the second point (A), and an optical axis perpendicular to said projection plane and passing through the center of projection (C).
6. An apparatus (1) according to any one of the preceding claims, wherein the distance of said point of view (V) from said auxiliary plane (402) is preset to a constant value.
7. An apparatus (1) according to any one of claims 1 to 5, wherein the distance of said point of view (V) from said auxiliary plane (402) is variable as a function of the distance of said point of view (V) from the first point (Q), or as a function of the distance of said point of view (V) from said second point (A), or as a function of the angle formed by a straight line, passing through the center of projection (C) and the second point (A), and an optical axis perpendicular to said projection plane and passing through the center of projection (C).
8. An apparatus (1) according to any one of the preceding claims, wherein at least one of the components of the new position where the second point (A) will be represented, calculated during phase d), is multiplied by a variable which is a function of one or more parameters that may include the distance of the second point (A), or of said first point (Q), from the center of projection (C), the angle formed by a straight line, passing through the center of projection (C) and the second point (A), and an optical axis perpendicular to said projection plane and passing through the center of projection (C), other parameters belonging to the space context represented in said image (401) and/or the context in which said image (401) is generated and/or the reproduction context for which said image (401) is intended.
9. An apparatus (1) according to any one of the preceding claims, wherein at least one of the components of the new position where the second point (A) will be represented, calculated during phase d), is combined with at least one component of the previous position of said second point (A).
10. An apparatus (1) according to any one of the preceding claims, wherein the image (401) comprises a pair of stereoscopic images, and wherein phases a)-e) are applied to the two images of said pair of stereoscopic images.
11. An apparatus (1) according to claim 10, wherein at least one of the components of the gradients calculated during phase b) is calculated by assuming a common horizontal coordinate.
12. An apparatus (1) according to claim 11, wherein said common horizontal coordinate corresponds to the mean of the horizontal coordinates of at least two points, which are respectively comprised in the two images of the pair of stereoscopic images, and wherein said at least two points represent the same part of the three-dimensional region.
13. An apparatus (1) according to any one of claims 10 to 12, wherein said processing means (1003) and storage means (1004,1005) are configured to correct the image (401) by also carrying out a blank filling step, wherein, if the execution of phase e) on a first image of said pair of stereoscopic images generates a first processed image that comprises at least one blank, this blank filling step will attempt to fill said at least one blank by superimposing said first processed image on a second processed image, wherein the second processed image is obtained, at least in the area of said blank, by applying phases a)-e) to a second image of said pair of stereoscopic images, assuming as a center of projection the same center of projection of the first image of the pair of stereoscopic images.
14. A method for correcting an image (401) lying in a projection plane and generated in compliance with the linear perspective rules by projecting onto said projection plane a region of the three-dimensional space from a center of projection (C), so as to reduce the deformations of the representation of said three-dimensional region appearing when said image (401) is viewed from a point of view not corresponding to said center of projection (C), the image (401) comprising at least one first point (Q) and the three-dimensional region comprising at least one second point (A), wherein said first point (Q) is the result of the perspective projection of the second point (A) onto said projection plane from the center of projection (C),
characterized in that it comprises the phases of:
a) calculating the position of said second point (A) in the three-dimensional space on the basis of the position of the first point (Q), the position of the center of projection (C), and the depth of said second point (A),
b) calculating, based on a first incremental displacement (Δα) of said second point (A) in the three-dimensional space, a second incremental displacement (Δρ) of a third point (P), wherein said third point (P) is the result of the perspective projection of said second point (A) onto an auxiliary plane (402) from a point of view (V), and wherein said auxiliary plane (402) is different from the plane in which the image (401) lies,
c) obtaining, based on said second incremental displacement (Δρ) calculated during phase b), a gradient of a new position where said second point (A) will be represented in said projection plane,
d) obtaining, by calculating an integral of at least one of the components of said gradient obtained during phase c), a new position where the second point (A) will be represented in said projection plane,
e) moving the first point (Q), in the image (401), into the new position obtained during phase d).
15. A method according to claim 14, wherein, in the calculation of at least one integral carried out during phase d), the integral function is approximated through a function of the integration variable of said integral function.
16. A method according to claim 14 or 15, wherein said auxiliary plane (402) is orthogonal to the straight line passing through the second point (A) and the point of view (V).
17. A method according to any one of claims 14 to 16, wherein the position of said point of view (V) is constant and coincides with the position of said center of projection (C).
18. A method according to any one of claims 14 to 16, wherein the position of said point of view (V) is variable as a function of the position of the second point (A), or as a function of the position of the first point (Q), or as a function of the distance of the first point (Q) from the center of projection (C), or as a function of the angle formed by a straight line, passing through the center of projection (C) and the second point (A), and an optical axis perpendicular to said projection plane and passing through the center of projection (C).
19. A method according to any one of claims 14 to 18, wherein the distance of said point of view (V) from said auxiliary plane (402) is preset to a constant value.
20. A method according to any one of claims 14 to 18, wherein the distance of said point of view (V) from said auxiliary plane (402) is variable as a function of the distance of said point of view (V) from the first point (Q), or as a function of the distance of said point of view (V) from said second point (A), or as a function of the angle formed by a straight line, passing through the center of projection (C) and the second point (A), and an optical axis perpendicular to said projection plane and passing through the center of projection (C).
21. A method according to any one of claims 14 to 20, wherein at least one of the components of the new position where the second point (A) will be represented, calculated during phase d), is multiplied by a variable which is a function of one or more parameters that may include the distance of the second point (A), or of said first point (Q), from the center of projection (C), the angle formed by a straight line, passing through the center of projection (C) and the second point (A), and an optical axis perpendicular to said projection plane and passing through the center of projection (C), other parameters belonging to the space context represented in said image (401) and/or the context in which said image (401) is generated and/or the reproduction context for which said image (401) is intended.
22. A method according to any one of claims 14 to 21, wherein at least one of the components of the new position where the second point (A) will be represented, calculated during phase d), is combined with at least one component of the previous position of said second point (A).
23. A method according to any one of claims 14 to 22, wherein the image (401) comprises a pair of stereoscopic images, and wherein phases a)-e) are applied to the two images of said pair of stereoscopic images.
24. A method according to claim 23, wherein at least one of the components of the gradients calculated during phase b) is calculated by assuming a common horizontal coordinate.
25. A method according to claim 24, wherein said common horizontal coordinate corresponds to the mean of the horizontal coordinates of at least two points, which are respectively comprised in the two images of the pair of stereoscopic images, and wherein said at least two points represent the same part of the three-dimensional region.
26. A method according to any one of claims 23 to 25, further comprising a blank filling step wherein, if the execution of phase e) on a first image of said pair of stereoscopic images generates a first processed image that comprises at least one blank, this blank filling step will attempt to fill said at least one blank by superimposing said first processed image on a second processed image, wherein the second processed image is obtained, at least in the area of said blank, by applying phases a)-e) to a second image of said pair of stereoscopic images, assuming as a center of projection the same center of projection of the first image of the pair of stereoscopic images.
27. A computer program product which can be loaded into the memory of a computer, comprising portions of software code for executing the phases of the method according to any one of claims 14 to 26.
PCT/IB2014/062727 2013-08-08 2014-06-30 Apparatus and method for correcting perspective distortions of images WO2015019208A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IT000683A ITTO20130683A1 (en) 2013-08-08 2013-08-08 APPARATUS AND METHOD FOR THE CORRECTION OF PROSPECTIVE DEFORMATIONS OF IMAGES
ITTO2013A000683 2013-08-08

Publications (2)

Publication Number Publication Date
WO2015019208A1 true WO2015019208A1 (en) 2015-02-12
WO2015019208A9 WO2015019208A9 (en) 2015-07-16

Family

ID=49354843

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2014/062727 WO2015019208A1 (en) 2013-08-08 2014-06-30 Apparatus and method for correcting perspective distortions of images

Country Status (2)

Country Link
IT (1) ITTO20130683A1 (en)
WO (1) WO2015019208A1 (en)


Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0583060A2 (en) * 1992-07-24 1994-02-16 The Walt Disney Company Method and system for creating an illusion of three-dimensionality
WO1998057292A1 (en) 1997-06-12 1998-12-17 Apple Computer, Inc. A method and system for creating an image-based virtual reality environment utilizing a fisheye lens
WO2000035200A1 (en) * 1998-12-07 2000-06-15 Universal City Studios, Inc. Image correction method to compensate for point of view image distortion
US20040022451A1 (en) 2002-07-02 2004-02-05 Fujitsu Limited Image distortion correcting method and apparatus, and storage medium
US20060210192A1 (en) 2005-03-17 2006-09-21 Symagery Microsystems Inc. Automatic perspective distortion detection and correction for document imaging
EP1333498B1 (en) 2002-01-31 2007-08-29 Agilent Technologies, Inc. Solid state image sensor array for correcting curvilinear distortion of a camera lens system and method for fabricating the image sensor array
US20090103808A1 (en) 2007-10-22 2009-04-23 Prasenjit Dey Correction of distortion in captured images
US20110090303A1 (en) 2009-10-16 2011-04-21 Apple Inc. Facial Pose Improvement with Perspective Distortion Correction
US20110149094A1 (en) 2009-12-22 2011-06-23 Apple Inc. Image capture device having tilt and/or perspective correction
US7990412B2 (en) 2004-11-01 2011-08-02 Hewlett-Packard Development Company, L.P. Systems and methods for correcting image perspective
BRPI0802865A2 (en) 2008-08-14 2011-11-22 Audaces Automacao E Informatica Ind Ltda image file generation process with automated optical distortion correction method


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"Rendering Techniques '98", 1 January 2000, SPRINGER VIENNA, Vienna, ISBN: 978-3-70-916453-2, ISSN: 0946-2767, article MANEESH AGRAWALA ET AL: "Artistic Multiprojection Rendering", pages: 125 - 136, XP055119830, DOI: 10.1007/978-3-7091-6303-0_12 *
CARLO PEDRETTI: "Leonardo da Vinci on Painting - A lost book (Libro A", 1964, UNIVERSITY OF CALIFORNIA PRESS
CARROLL R.; AGRAWAL M.; AGARWALA A.: "SIGGRAPH '09: ACM SIGGRAPH 2009 papers", 2009, ACM, article "Optimizing content-preserving projections for wide-angle images", pages: 1 - 9
THOMAS K. SHARPLESS; BRUNO POSTLE; DANIEL M. GERMAN: "Pannini: A New Projection for Rendering Wide Angle Perspective Images", COMPUTATIONAL AESTHETICS IN GRAPHICS, VISUALIZATION, AND IMAGING, 2010, Retrieved from the Internet <URL:http://vedutismo.net/Pannini/panini.pdf>
ZORIN D.; BARR A. H.: "Correction of geometric perceptual distortions in pictures", SIGGRAPH '95: PROCEEDINGS OF THE 22ND ANNUAL CONFERENCE ON COMPUTER GRAPHICS AND INTERACTIVE TECHNIQUES, 1995, pages 257 - 264, XP000546235, DOI: doi:10.1145/218380.218449

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3072812A1 (en) 2015-03-27 2016-09-28 Airbus Helicopters A method and a device for marking the ground for an aircraft in flight, and an aircraft including the device
KR101825571B1 (en) 2015-03-27 2018-02-05 에어버스 헬리콥터스 A method and a device for marking the ground for an aircraft in flight, and an aircraft including the device
US9944405B2 (en) 2015-03-27 2018-04-17 Airbus Helicopters Method and a device for marking the ground for an aircraft in flight, and an aircraft including the device
CN110246169A (en) * 2019-05-30 2019-09-17 华中科技大学 A kind of window adaptive three-dimensional matching process and system based on gradient
DE102021103323A1 (en) 2021-02-12 2022-08-18 Carl Zeiss Ag Pannini lens and imaging optical device

Also Published As

Publication number Publication date
ITTO20130683A1 (en) 2015-02-09
WO2015019208A9 (en) 2015-07-16

Similar Documents

Publication Publication Date Title
US10609282B2 (en) Wide-area image acquiring method and apparatus
US10460459B2 (en) Stitching frames into a panoramic frame
CA3018965C (en) Efficient determination of optical flow between images
US9774837B2 (en) System for performing distortion correction and calibration using pattern projection, and method using the same
WO2020063100A1 (en) Augmented reality image display method and apparatus, and device
US6204876B1 (en) Stereoscopic computer graphics moving image generating apparatus
JP2020061137A (en) Transition between binocular and monocular views
US20160301868A1 (en) Automated generation of panning shots
US20070248260A1 (en) Supporting a 3D presentation
WO2015048694A2 (en) Systems and methods for depth-assisted perspective distortion correction
JP6810873B2 (en) Systems, methods, and software for creating virtual 3D images that appear projected in front of or above an electronic display.
KR20110015452A (en) Blur enhancement of stereoscopic images
WO2006100805A1 (en) Stereoscopic image display unit, stereoscopic image displaying method and computer program
EP3526639A1 (en) Display of visual data with a virtual reality headset
EP3189493B1 (en) Depth map based perspective correction in digital photos
CA2540538C (en) Stereoscopic imaging
WO2015019208A1 (en) Apparatus and method for correcting perspective distortions of images
JP5233868B2 (en) Image cutting device
US20180213215A1 (en) Method and device for displaying a three-dimensional scene on display surface having an arbitrary non-planar shape
US20120081520A1 (en) Apparatus and method for attenuating stereoscopic sense of stereoscopic image
CN107798703B (en) Real-time image superposition method and device for augmented reality
JP5931062B2 (en) Stereoscopic image processing apparatus, stereoscopic image processing method, and program
JP2011211551A (en) Image processor and image processing method
US9602708B2 (en) Rectified stereoscopic 3D panoramic picture
KR20110025020A (en) Apparatus and method for displaying 3d image in 3d image system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14752400

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14752400

Country of ref document: EP

Kind code of ref document: A1