WO2008147999A1 - Shear displacement depth of field - Google Patents

Shear displacement depth of field

Info

Publication number: WO2008147999A1
Authority: WO (WIPO / PCT)
Application number: PCT/US2008/064709
Prior art keywords: image sample, sample point, geometric, depth, sheared
Other languages: French (fr)
Inventor: Robert Cook
Original Assignee: Pixar
Application filed by Pixar
Priority to GB0918780.8A (patent GB2460994B)
Publication of WO2008147999A1

Classifications

    • G - PHYSICS
    • G02 - OPTICS
    • G02B - OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B 27/00 - Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B 27/0075 - Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for altering, e.g. increasing, the depth of field or depth of focus
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 - 3D [Three Dimensional] image rendering
    • G06T 15/10 - Geometric effects


Abstract

Depth of field effects for computer generated images are generated by assigning lens positions to image sample points. For each image sample point, scene geometry is sheared towards the center of the sample point's aperture to account for its assigned lens position. The sheared scene geometry is sampled from a single point of the aperture, such as the aperture center. Image samples with different assigned lens positions sample different sheared versions of the scene, producing a depth of field effect. Scene geometry is sheared according to a function of its depth and the image sample point's assigned lens position. The depth of field effect can be characterized by any arbitrary function of depth, including a static or varying aperture size and focal length, which allows for depth of field effects not possible with typical real-world optical systems. Image sample points and lens positions may be specified in a pseudo-random and/or stratified manner.

Description

SHEAR DISPLACEMENT DEPTH OF FIELD
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application 60/940,379, filed May 25, 2007, which is incorporated by reference herein for all purposes.
BACKGROUND
[0002] The present invention relates to the field of computer graphics, and in particular to methods and apparatus for creating, modifying, and using components to create computer graphics productions. Many computer graphic images are created by mathematically modeling the interaction of light with a three dimensional scene from a given viewpoint. This process, called rendering, generates a two-dimensional image of the scene from the given viewpoint, and is analogous to taking a photograph of a real-world scene. Animated sequences can be created by rendering a sequence of images of a scene as the scene is gradually changed over time. A great deal of effort has been devoted to making realistic looking and artistically compelling rendered images and animations.
[0003] In conventional photography and cinematography, artists often use the physical limitations of conventional optical and photographic equipment to achieve aesthetic effects. For example, depth of field is a visual effect in which some portions of an image are in focus and other portions of the image are out of focus. Depth of field is the result of the aperture size of a camera and the focal length of its lens system. Typical cameras can only focus on objects within a range of distances, referred to as the depth of field. Objects that are within the camera's depth of field appear in focus, while objects in front of or behind the depth of field appear out of focus.
[0004] Although a camera with an infinite depth of field, in which everything in an image is in focus regardless of its distance from the camera, may be ideal from a technical perspective, artists often use the depth of field of conventional cameras to emphasize or deemphasize objects in a scene. Also, the out of focus objects in a scene may be aesthetically pleasing, enhancing the artistic impact of images.
[0005] For these reasons, there have been many attempts to add depth of field effects to real-time and non-real-time computer graphics images and animations. One prior approach places a virtual lens and/or virtual camera aperture in front of each image sample, such as a pixel or sub-pixel sample. Each image sample is assigned to a different portion of the aperture. Rendering algorithms such as ray tracing can be used to trace the path of light through the aperture into the scene, thereby simulating depth of field effects. Because this approach requires light rays to be traced from the image samples, through different portions of the aperture, and into the scene, it can only be used with ray tracing and other similar rendering algorithms modeling light transport. Ray tracing consumes large amounts of computational time and resources, which makes it unsuitable for many real-time and non-real-time applications where computational time and resources are limited.
[0006] Another prior approach renders the scene as multiple depth layers. The images from these depth layers are selectively blurred and combined to approximate the effect of depth of field. Although this approach is applicable to many rendering algorithms, it requires rendering many depth layers to produce high quality images, which dramatically increases the time and memory required for rendering. The blurring and blending of images to produce a composite image with depth of field adds further processing overhead to rendering. Additionally, the amount of blurring due to depth of field cannot change continuously within an object with this prior approach. Each object must reside entirely within a single depth layer or visual discontinuities will occur at the boundaries between depth layers.
[0007] It is therefore desirable for a system and method to provide improved depth of field effects for computer generated images and animations. It is desirable for the system and method to be suitable for use with a wide variety of rendering algorithms. It is further desirable for the system and method to provide artists with greater aesthetic control over depth of field effects than what is possible with typical real-world optical systems.
BRIEF SUMMARY
[0008] An embodiment of the invention generates depth of field effects for computer generated images by assigning lens positions to image sample points, so that at least some image sample points view the scene from different parts of the aperture. For each image sample point, the embodiment shears scene geometry towards the center of the sample point's aperture to account for its assigned lens position. The sheared scene geometry can then be sampled from a single point of the aperture, such as the aperture center. Image samples with different assigned lens positions sample different sheared versions of the scene geometry, producing a depth of field effect.
[0009] In an embodiment, the scene geometry is sheared according to a function of the image sample point's assigned lens position and the depth of the scene geometry. The depth of field effect can be characterized by any arbitrary function of depth. In an embodiment, the depth of field effect can be characterized by a virtual aperture size and focal length that is constant or changes with depth. The latter characterization allows users to specify depth of field effects that are not possible with typical real-world optical systems.
[0010] In an embodiment, image sample points and lens positions are specified in a pseudo-random manner, so as to minimize the appearance of aliasing. In a further embodiment, image sample points and lens positions are assigned stratified positions to minimize variance-related noise. In still a further embodiment, lens positions are mapped from a square region of the aperture plane to a circular aperture using a transformation that preserves angular position.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The invention will be described with reference to the drawings, in which:
Figures 1A and 1B illustrate the cause of depth of field effects in real-world optical systems; Figures 2A-2E illustrate the application of an embodiment of the invention;
Figure 3 illustrates a method of generating depth of field effects according to an embodiment of the invention;
Figure 4 illustrates the specification of depth of field parameters according to an embodiment of the invention;
Figures 5A-5C illustrate the distribution of image samples for depth of field effects according to an embodiment of the invention; and
Figure 6 illustrates a computer system suitable for implementing an embodiment of the invention.
DETAILED DESCRIPTION
[0012] Figures 1A and 1B illustrate the cause of depth of field effects in real-world optical systems. Figure 1A illustrates an example optical system 100 including an image plane 105 for receiving light that forms an image and an aperture plane 110 to admit light from a scene onto the image plane 105. Aperture plane 110 includes an aperture or opening 115 to admit light from the scene. In example system 100, aperture 115 is a pinhole aperture, which ideally is infinitesimally small. System 100 has an infinite depth of field. The light from any object, regardless of its distance from the aperture plane 110, will pass through a single point, the pinhole aperture 115, and be focused perfectly on the image plane 105. For example, light from objects 125a and 125b is focused to a single point 130 on the image plane. An optical axis 135 is perpendicular to the image plane 105 and passes through the aperture 115.
[0013] In contrast, Figure 1B illustrates an optical system 150 with a finite depth of field. System 150 includes an image plane 155 and an aperture plane 160. Aperture plane 160 includes an aperture 165 to admit light onto the image plane 155. Aperture 165 is relatively large compared to the pinhole aperture 115 of system 100. Although omitted for clarity from Figure 1B, system 150 also includes lenses, mirrors, or other optical elements to focus light onto the image plane 155. An optical axis 190 is perpendicular to the image plane 155 and passes through the center of the aperture 165.
[0014] In system 150, light from any point in the scene may pass through different points of the aperture 165. For objects at or near focal distance 170, such as object 175A, light will be focused onto a single point of the image plane 155 (or a small, relatively imperceptible circle). For example, light from object 175A passes through many points of aperture 165 and is focused to point 185A on image plane 155.
[0015] Light from objects further away from focal distance 170 (either in front of or behind the focal distance 170) will be focused to points in front of or behind the image plane 155. As a result, the light hitting the image plane 155 will form a circle or other shape, rather than a point. This out of focus light on the image plane 155, referred to as a circle of confusion, appears as a blur in the image. For example, the light from object 175B passes through many points of the aperture 165 and is focused to point 180, which is behind image plane 155. The light from object 175B forms the circle of confusion 185B on the image plane 155. Thus, object 175B appears blurred or out of focus in the image. Typically, the size of the circle of confusion or the amount of blurring increases as the size of the aperture increases relative to the focal distance 170. Thus, increasing the aperture size decreases the depth of field, or range of distances in which objects are in focus.
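For reference, the size of this blur can be quantified with the standard thin-lens circle of confusion relation (a textbook optics result, not part of the patent text): for a lens of focal length f with aperture diameter A focused at distance S1, an object at distance S2 produces a circle of confusion of diameter

$$ c \;=\; A \,\frac{|S_2 - S_1|}{S_2}\,\frac{f}{S_1 - f}. $$

The diameter vanishes when S2 equals S1 and grows with the aperture diameter, matching the qualitative behavior described above.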
[0016] Figures 2A-2D illustrate the problem with rendering depth of field with prior techniques. Figure 2A illustrates an example scene 200 in world space viewed through a pinhole aperture. Scene 200 includes an image plane 202 including image sample points 203, such as image sample points 203a-203e. Scene 200 also includes an object 210. Image plane 202 views the scene 200 through an aperture plane 205 that includes a pinhole aperture 207. Light passes through only one point in pinhole aperture 207. Thus, the image sample points 203a-203e view the scene along rays 209a-209e, respectively. Because light can only pass through the pinhole aperture 207 at one point, there will be no depth of field effects when rendering scene 200 with pinhole aperture 207.
[0017] Figure 2B illustrates an example scene 215 in screen space, which corresponds to the scene 200 after a perspective transformation. A perspective transformation typically divides each geometric primitive's vertices' coordinates by their distance from the aperture plane. Scene 215 includes image plane 217, which is a transformed version of image plane 202. Image plane 217 includes image sample points 219a-219e, which correspond with image sample points 203a-203e. Following the perspective transformation, the rays 209a-209e in scene 200 are transformed into the set of parallel rays 221a-221e, respectively. Rays 221a-221e are perpendicular to the image plane 217 and transformed aperture plane 220. Object 223 is a perspective-transformed version of object 210.
[0018] Figure 2C illustrates an example scene 225 in world space viewed through a lens aperture. Scene 225 includes an image plane 227 including image sample points 228, such as image sample points 228a and 228b. Scene 225 includes object 235 and an aperture plane 229, which includes a lens aperture 230. Each image sample is assigned a lens position within the lens aperture 230 and views the scene 225 along a ray passing through its assigned lens position.
[0019] For example, image sample point 228a is assigned a lens position 231, which corresponds to the center of the lens aperture 230. Image sample point 228a views the scene 225 along ray 232a. Image sample point 228b is assigned lens position 233 and views the scene 225 along ray 232b, which is bent by the lens at lens position 233. It should be noted that regardless of the lens location assigned to an image sample point, the image sample point's ray will intersect a focal plane 236 at the same point. For example, ray 232a intersects the focal plane 236 at point 237a, while ray 232b intersects the focal plane 236 at point 237b. If image sample point 228b were assigned lens position 231 rather than lens position 233, image sample point 228b would view the scene 225 along ray 234 instead of ray 232b. However, ray 234 still intersects focal plane 236 at point 237b.
[0020] Figure 2D illustrates an example scene 240 in screen space, which corresponds to scene 225 after a perspective transformation. Scene 240 includes an image plane 242 including image sample points 243, such as image sample points 243a and 243b. Scene 240 includes perspective-transformed object 250 and an aperture plane 244. After the perspective transformation, lens aperture 230 in world space is transformed into multiple lens apertures in screen space, each of which is centered over a different image sample point. For example, lens aperture 245a is centered over image sample point 243a and lens aperture 245b is centered over image sample point 243b.
[0021] Image sample point 243a is assigned a lens position 246a, which corresponds to the center of the lens aperture 245a. Image sample point 243a views the scene 240 along ray 247a. Similarly, image sample point 243b is assigned lens position 248 and views the scene 240 along ray 247b. Ray 247b intersects the transformed focal plane 251 at point 252. Regardless of the location of assigned lens position 248, the ray 247b associated with image sample point 243b will always intersect the focal plane 251 at point 252. For example, if image sample point 243b were assigned the lens location 246b instead of lens location 248, the ray 249 defined by image sample point 243b and lens location 246b still intersects the focal plane 251 at point 252.
[0022] Figures 2A-2D illustrate the problem with rendering depth of field with prior techniques. In example scene 200, all of the rays 209 representing the image sample points' views of the scene pass through the same point in the pinhole aperture. As a result, a perspective transformation of scene 200 results in scene 215, in which all of these rays 221 are parallel. Scene 215 can be easily rendered using a Z-buffer or painter's algorithm to determine occlusion. For example, assume that the screen space includes a coordinate system 224 with the X and Y axes parallel to the image plane and the Z axis perpendicular to the image plane. Because the rays 221 are parallel to each other and the Z axis, determining the portions of the scene that intersect each of the rays 221 is trivial. The scene geometry can be projected into the screen space defined by the image plane. Any geometric primitive in the scene that overlaps the X and Y coordinates in screen space of one of the rays 221 intersects that ray. In this example, determining the intersection of a three-dimensional ray with objects in a three-dimensional scene is reduced to a two-dimensional test. Occlusion can be determined by comparing Z coordinate values of geometric primitives at each ray intersection. Additionally, the scene geometry can be easily bounded and subdivided along any axis, which allows memory optimizations, such as bucketing, and parallel processing to be easily implemented.
[0023] In contrast to the examples of scenes 200 and 215, the rays 232 representing the image sample points' views of the scene 225 may pass through different points of a lens aperture. Because of this, scene 240, which is the perspective transformation of scene 225, includes rays 247 that are not parallel with each other. Under prior approaches, rendering techniques such as a Z-buffer cannot be used with scene 240. Instead, the renderer must determine the intersection of different three-dimensional rays with objects in three-dimensional space for each image sample. This is a complex and time-consuming operation. Additionally, it is difficult to bound scene geometry, making it difficult to optimize memory and processor usage.
[0024] For example, determining the intersection of rays 247 with objects in scene 240 is not trivial, because the rays 247 are not parallel. To determine the value of image sample point 243b, a renderer must determine how ray 247b intersects objects in scene 240. Because ray 247b is not perpendicular to the image plane 242, the X and Y coordinates in screen space of ray 247b may vary along the length of ray 247b as a function of the Z coordinate. Thus, a renderer must determine a three-dimensional line equation with three variables corresponding with ray 247b and then determine if this line equation intersects any geometric primitives in the scene 240. This determination requires extensive computing resources.
[0025] Figure 2E illustrates the application of an embodiment of the invention to an example scene 260. Example scene 260 represents a scene in screen space, similar to scene 240 discussed above. Scene 260 includes an image plane 262 with at least one image sample point 263. Image sample point 263 is assigned a lens position 266 in lens aperture plane 264 and views the scene 260 along ray 265. If image sample point 263 had been assigned a lens position in the center of the lens aperture plane 264, then the image sample point would instead view the scene 260 along ray 267.
[0026] An embodiment of the invention allows all of the image samples to view the scene along parallel rays in screen space, while shearing the scene differently for the image samples to account for each image sample's assigned lens position. Because all of the image samples view the scene along parallel rays in screen space, a wide variety of different rendering techniques, such as rasterization, scanline rendering, z-buffer rendering, painter's algorithm, and micropolygon or REYES rendering, can be used to render depth of field effects without relying on computationally expensive three-dimensional ray / three-dimensional object intersection tests.
[0027] As discussed above, it is much easier to render a scene if image sample points view the scene along rays that are parallel to each other and perpendicular to the image plane 262, such as ray 267, rather than rays with arbitrary orientations, such as ray 265. This reduces the intersection test between image sample points and scene geometry to a two-dimensional test. In an embodiment of the invention, this can be accomplished by shearing the scene so that portions of the scene that originally intersect ray 265 will, after shearing, intersect ray 267 instead. The image sample point 263 will have the same value when viewing the sheared version of scene 260 along ray 267 as when viewing the original version of scene 260 along ray 265.
[0028] Because ray 267 is perpendicular to the image plane 262, determining which portions of the sheared version of the scene 260 intersect ray 267 is trivial. The sheared version of the scene 260 can be projected into the screen space defined by the image plane 262. Assuming a screen space coordinate system with the X and Y axes parallel to the image plane and the Z axis perpendicular to the image plane, then any geometric primitive in the scene that overlaps the X and Y screen space coordinates of ray 267 intersects ray 267, and thus intersects the corresponding image sample point 263. Thus, determining the intersection of the three-dimensional ray 265 with objects in a three-dimensional scene is reduced to a two-dimensional intersection test between image sample point 263 and the sheared version of the objects in a scene. Occlusion can be determined by comparing Z coordinate values of geometric primitives at each intersection with ray 267, using a hit test, or by drawing primitives in order of decreasing depth.
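To illustrate why this reduction matters, the following minimal sketch (helper names are illustrative, not code from the patent) resolves an image sample against already-sheared triangles using only a two-dimensional point-in-triangle test plus a depth comparison for occlusion:

```python
def edge(ax, ay, bx, by, px, py):
    # Signed area of triangle (a, b, p); its sign tells which side of edge a->b the point lies on.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def sample_sheared_triangles(sample_x, sample_y, sheared_triangles):
    """Return (depth, attributes) of the nearest sheared triangle covering the sample.

    Each triangle is ((x0, y0, z0), (x1, y1, z1), (x2, y2, z2), attributes),
    already sheared for this image sample and projected to screen space.
    """
    nearest_z, nearest_attrs = float("inf"), None
    for (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), attrs in sheared_triangles:
        area = edge(x0, y0, x1, y1, x2, y2)
        if area == 0.0:
            continue  # degenerate triangle
        # Barycentric coordinates of the sample point; a purely 2D test in x and y.
        w0 = edge(x1, y1, x2, y2, sample_x, sample_y) / area
        w1 = edge(x2, y2, x0, y0, sample_x, sample_y) / area
        w2 = 1.0 - w0 - w1
        if w0 < 0.0 or w1 < 0.0 or w2 < 0.0:
            continue  # sample lies outside this triangle in screen space
        z = w0 * z0 + w1 * z1 + w2 * z2  # interpolated depth for the Z comparison
        if z < nearest_z:
            nearest_z, nearest_attrs = z, attrs
    return nearest_z, nearest_attrs
```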
[0029] To determine how to shear the scene 260, an embodiment determines the relative distance between ray 265 and ray 267 at a given depth and then shears geometric primitives at that depth by the same relative amount. The rays 265 and 267 will always intersect at point 268 of focal plane 269, regardless of the lens position associated with ray 265. Because of this, an embodiment of the invention can express the shear applied to the scene for an image sample point as a function of the distance from the focal plane and the lens position assigned to that image sample point.
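One consistent way to write this relationship, under the assumption that depth z is measured from the image plane, the aperture plane sits at depth z_a, the focal plane at depth z_f, and the lens offset from the aperture center is l (symbols introduced here for illustration; the patent does not use this notation), is

$$ d(z) \;=\; \ell \,\frac{z_f - z}{z_f - z_a}, $$

which equals the lens offset at the aperture plane and vanishes at the focal plane, where rays 265 and 267 meet.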
[0030] For example, object 270a includes vertices 271a and 274a. The renderer would like to determine if ray 265 intersects object 270a, and if so, the exact point of intersection. At the depth of vertex 271a, ray 265 has a distance of 275 from ray 267. Thus, vertex 271a should be sheared by distance 275, forming sheared vertex 271b. Similarly, at the depth of vertex 274a, ray 265 has a distance of 277 from ray 267. Thus, vertex 274a should be sheared by distance 277 to form sheared vertex 274b.
[0031] An embodiment of the invention shears all of the points or vertices in a portion of the scene potentially visible to an image sample point by the amount specified by the depth of the points and the selected lens position for the image sample point. For example, all of the vertices of object 270a may be sheared according to their respective depths to form a sheared version 270b of the object. The sheared object 270b is then tested for intersection with ray 267. As discussed above, the intersection test between ray 267 and the sheared version 270b of the object is a trivial two-dimensional intersection test. From this intersection test with ray 267, it is determined that points 272b and 274b intersect ray 267. In this example, points 272a and 274a of the unsheared object 270a correspond with points 272b and 274b. Because points 272b and 274b intersect ray 267, points 272a and 274a must intersect corresponding ray 265. Thus, the renderer has determined the intersection of ray 265 with unsheared object 270a by determining the intersection of ray 267 with the sheared version of the object.
[0032] Moreover, although the relative shear displacement is in the direction from ray 265 to ray 267, not all objects will move toward ray 267 after shearing. For example, object 280 includes points 282a and 284a. In this example, points 282a and 284a are at the same depth as points 271a and 274a, respectively. Thus, points 282a and 284a should be sheared according to shear displacements 275 and 277, respectively, to form a sheared version 280b of the object 280 that includes sheared points 282b and 284b. Similarly, although an unsheared object may intersect ray 267, after shearing the scene geometry, the corresponding sheared version of the object may not intersect ray 267 and thus not be visible to that image sample point.
[0033] It should be noted that because different image sample points may be assigned different lens positions, the scene must be sheared by different amounts for each different lens position. However, the increased computational resources required to apply different amounts of shear to the scene for each lens position is typically much smaller than determining the intersections of geometric primitives with rays of arbitrary orientations.
[0034] Figure 3 illustrates a method 300 of generating depth of field effects according to an embodiment of the invention. Step 305 receives three-dimensional scene data. The scene data can be defined using any three-dimensional modeling and/or animation technique known in the art. Scene data can include programs and data describing the geometry, lighting, shading, motion, and animation of a scene to be rendered.
[0035] Step 310 selects an image sample point. In an embodiment, a renderer produces one or more images from scene data by sampling optical attributes of the scene data at a number of different image sample points located on a virtual image plane. Image sample points can correspond with pixels in the rendered image or with sub-pixel samples that are combined by the renderer to form pixels in the rendered image. Image samples such as pixels and sub-pixel samples may be distributed in regular locations on the image plane or at random or pseudo-random locations on the image plane to reduce aliasing and other sampling artifacts.
[0036] Step 315 assigns a lens position to the selected image sample point. In an embodiment, a single virtual aperture is centered over the image plane in world space. After a perspective transformation, this arrangement corresponds with each image sample point having its own virtual aperture located in screen space. In screen space, the virtual aperture is centered around the selected image sample point and is aligned with an optical axis passing through the center of the image sample point and perpendicular to the image plane. The virtual aperture plane is typically positioned (within the screen space defining the scene) at a focal distance from the virtual image plane.
[0037] An embodiment of step 315 selects an arbitrary point within the aperture on the virtual aperture plane as a lens location for the selected image sample point. In a further embodiment, step 315 selects lens locations randomly or pseudo-randomly within the virtual aperture, so that the distribution of lens positions assigned to the set of image sample points covers the entire aperture without any undesirable visually discernable patterns. In other embodiments, other low discrepancy sampling methods such as Halton or Hammersley sampling can be used to select the lens locations. In still further embodiments, step 315 can select lens positions using a regular pattern or sequence, if aliasing is not a concern.
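A minimal sketch of this assignment step, assuming a circular aperture of radius `aperture_radius` centered on the optical axis (the names are illustrative, not from the patent):

```python
import math
import random

def assign_lens_position(aperture_radius, rng=random):
    """Pick a pseudo-random lens position inside a circular aperture.

    Sampling the radius as sqrt(u) * R distributes points uniformly over
    the disk rather than clustering them near the center.
    """
    r = aperture_radius * math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)

# Example: assign a lens position to each of 16 image sample points.
lens_positions = [assign_lens_position(1.0) for _ in range(16)]
```

A low-discrepancy sequence such as Halton or Hammersley could be substituted for the random number generator, as the paragraph above notes.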
[0038] Typically, renderers form images by determining the portion of the scene, such as portions of objects, intersecting or projecting onto each image sample and determining the optical attributes of these intersecting portions. Renderers typically identify these intersecting portions by casting or tracing rays from image samples into the scene or by projecting the geometry of the scene onto the image plane.
[0039] Step 320 identifies and selects a portion of the scene that, when projected into the image plane, potentially intersects or overlaps the selected image sample point. This is done to reduce the amount of data to be evaluated for each image pixel. However, alternate embodiments of method 300 may omit step 320 and process the entire scene for each image sample point.
[0040] In an embodiment, geometric primitives within the scene data, such as polygons, micropolygons, polygon fragments, splines and/or other curved surfaces, subdivision surfaces, and particles, are associated with bounding boxes or other bounding volumes. Step 320 projects these bounding volumes in the image plane to identify bounding volumes intersecting with the area associated with the selected image sample (or with an area associated with multiple image samples), and hence the geometric primitives potentially intersecting the selected image sample. In a further embodiment of step 320, discussed in detail below, the depth of geometric primitives and the lens position modify the bounding volumes used to identify a portion of the scene potentially intersecting the selected image sample.
[0041] Step 325 shears the position of the selected portion of the scene according to its depth values and the lens position of the selected image sample. In an embodiment, the selected portion of the scene typically includes numerous geometric primitives located at various depths and positions relative to the image sample point.
[0042] For each geometric primitive, an embodiment of step 325 moves the geometric primitive in a plane parallel to the image plane by a shear displacement associated with the selected image sample. An embodiment of step 325 moves each vertex of the geometric primitive by a distance equal to the lens position divided by the focal distance and multiplied by the depth of the vertex of the geometric primitive relative to the image plane. The direction of this shear displacement is the direction from the lens position to the center of the virtual aperture. Alternatively, this shear displacement may be defined by similar triangles, with a first ratio, between the shear displacement parallel to the image plane and the depth, equal to a second ratio, between the lens position and the focal distance. This movement corresponds to the distance at the depth of a vertex of the geometric primitive between a first ray, defined from the selected image sample point and passing through the selected lens position, and a second ray, defined from the selected image sample point and passing through the center of the virtual aperture (i.e. the optical axis of the virtual aperture). This second ray is equivalent to the image sample's field of view through a pinhole aperture.
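A sketch of the displacement rule as stated in this paragraph, with the shear directed from the lens position toward the aperture center and scaled by depth over focal distance (variable names and the coordinate conventions are illustrative assumptions):

```python
def shear_vertex(vertex, lens_offset, focal_distance):
    """Shear one vertex for an image sample, following the rule described above.

    vertex: (x, y, z) in screen space, with z measured from the image plane.
    lens_offset: (lx, ly), the sample's lens position relative to the center
        of its virtual aperture.
    The vertex moves parallel to the image plane, in the direction from the lens
    position toward the aperture center, by |lens_offset| * z / focal_distance.
    """
    x, y, z = vertex
    lx, ly = lens_offset
    scale = z / focal_distance
    # Subtracting lens_offset * scale displaces the vertex toward the aperture center.
    return (x - lx * scale, y - ly * scale, z)

def shear_primitive(vertices, lens_offset, focal_distance):
    """Shear every vertex of a geometric primitive (step 325)."""
    return [shear_vertex(v, lens_offset, focal_distance) for v in vertices]
```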
[0043] In another embodiment, step 325 can shear the geometric primitive in a plane that is not parallel to the image plane. In this embodiment, the amount of shear displacement is equal to the distance from a first ray, defined from the selected image sample point and passing through the selected lens position, and a second ray, defined from the selected image sample point and passing through the center of the virtual aperture, within the specified plane.
[0044] Step 325 is repeated for all of the geometric primitives of the selected portion of the scene. Because the selected portion of the scene typically includes numerous geometric primitives located at various depths and positions relative to the image sample point, step 325 will move each geometric primitive of the selected portion of the scene towards the center of the virtual aperture by a different amount, according to each geometric primitive's vertices' depths and the selected image sample's assigned lens position.
[0045] After shearing all or a portion of the scene in step 325, the portions of the unsheared version of the scene that intersected a first ray from the selected image sample point and passing through the selected lens position will now be intersected by a second ray from the selected image sample point and passing through the center of the virtual aperture. After a perspective transformation, the second ray will be perpendicular to the image plane. As a result, the sheared version of the scene intersected by the second ray can be evaluated using a z-buffer.
[0046] Step 330 samples the sheared scene to determine the value of one or more attributes of the selected sample point. In an embodiment, step 330 identifies sheared geometric primitives, which when projected onto the image plane, intersect the selected sample point. Step 330 determines the attributes of the geometric primitives intersecting the image sample point, which can include color and transparency and attributes used for rendering special effects, including depth values, normal vectors, and other arbitrary shading attributes; and combines these values according to the depth values of the geometric primitives. Step 330 may also store depth information for each image sample to use with Z-buffering or any other depth compositing technique known in the art.
[0047] Step 330 can use any rendering technique known in the art to determine attributes of geometric primitives, including applying texture, environment, lighting and shadow maps and executing lighting and shading programs. In an embodiment, the attributes of geometric primitives are determined and stored prior to performing method 300. As a result, step 330 can retrieve and combine the stored optical attribute values of the intersected geometric primitives to determine one or more attribute values of the selected image sample.
[0048] Following step 330, method 300 may optionally proceed back to step 310 to select another image sample point. Steps 310 to 330 may be repeated as often as necessary to evaluate any number of image sample points to produce one or more rendered images from scene 305. In an embodiment of method 300, for each iteration of steps 310 to 330, the location of the image sample, its virtual aperture, and often the assigned lens position are different. As a result, the scene or a selected portion thereof is sheared by a different amount and/or in a different direction for each image sample point. However, once the scene or a portion thereof is sheared for a given image sample point, the image sample point can sample its sheared version of the scene along a ray from the image sample point and passing through the center of the virtual aperture.
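Putting steps 310 through 330 together, a high-level sketch of the per-sample loop might look as follows; the helper callables `assign_lens_position`, `select_candidates`, `shear_primitive`, and `resolve_sample` stand in for steps 315, 320, 325, and 330 and are assumptions, not the patent's own code:

```python
def render_depth_of_field(scene, image_samples, aperture_radius, focal_distance,
                          assign_lens_position, select_candidates,
                          shear_primitive, resolve_sample):
    """Sketch of method 300: shear and sample the scene once per image sample."""
    results = {}
    for sample in image_samples:                                    # step 310
        lens_offset = assign_lens_position(aperture_radius)         # step 315
        candidates = select_candidates(scene, sample, lens_offset)  # step 320
        sheared = [shear_primitive(prim, lens_offset, focal_distance)
                   for prim in candidates]                          # step 325
        results[sample] = resolve_sample(sample, sheared)           # step 330
    return results
```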
[0049] As discussed above, step 320 selects a portion of the scene potentially intersecting an image sample to be sheared and evaluated. In an embodiment, step 320 uses the shear value of a geometric primitive, which is based on the geometric primitive's depth and the image sample's lens position, to determine the size and/or orientation of a bounding volume of the geometric primitive. For example, a bounding box of a geometric primitive may be generated using any bounding technique known in the art. An embodiment of step 320 can then scale the bounding box of the geometric primitive by the magnitude of the shear displacement to be applied to geometric primitives at that depth for the image sample. This expanded bounding box can then be projected against the image plane to determine if the geometric primitive potentially intersects the image sample.
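One conservative way to realize this bound expansion, assuming an axis-aligned bounding box and a `max_shear` magnitude already computed for the primitive's depth and the sample's lens position (names are illustrative):

```python
def expand_bounds_for_shear(bounds, max_shear):
    """Grow an axis-aligned bounding box by the largest possible shear displacement.

    bounds: (xmin, ymin, xmax, ymax) of the primitive projected onto the image plane.
    max_shear: magnitude of the shear displacement applied at the primitive's depth.
    """
    xmin, ymin, xmax, ymax = bounds
    return (xmin - max_shear, ymin - max_shear, xmax + max_shear, ymax + max_shear)

def potentially_intersects(bounds, max_shear, sample_x, sample_y):
    """Step 320: conservative test of whether a primitive may cover the image sample."""
    xmin, ymin, xmax, ymax = expand_bounds_for_shear(bounds, max_shear)
    return xmin <= sample_x <= xmax and ymin <= sample_y <= ymax
```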
[0050] Figure 4 illustrates the specification of depth of field parameters according to an embodiment of the invention. In an embodiment, users can specify depth of field parameters using traditional camera parameters, such as the focal length and aperture size, or as the ratio between the focal length and aperture size, commonly referred to as an f-number or f-stop.
[0051] In another embodiment, depth of field may be specified by an arbitrary function of shear and depth. For example, graph 400 illustrates a first function 405 and a second function 410 of shear as a function of depth. For functions 405 and 410, the shear of a geometric primitive for an image sample is determined by scaling the value of the function 405 by the ratio of the image sample's lens position from the aperture center to the aperture radius.
[0052] In function 405, the maximum possible shear of a geometric primitive towards the optical axis or center of the virtual aperture increases linearly as the depth increases, up to a maximum blur limit. The maximum shear specified by the function at a given depth corresponds with the amount of depth of field blurring applied to objects at that depth. Function 405 corresponds to a typical optical system. It should be noted that the maximum shear specified by function 405 is approximately zero at the depth of the focal plane 415. In contrast, function 410 specifies that the maximum shear of geometric primitives may increase, decrease, or stay constant as a function of depth in a continuous or discontinuous manner.
[0053] Functions such as function 410 allow users to specify any arbitrary depth of field. For example, users can specify that very distant objects and near objects are in focus, while objects in between are out of focus. Function 410 corresponds with using a different aperture size for objects at different depths, which is not possible with conventional cameras and optical systems, but is sometimes aesthetically or cinematically desirable. In the example function 410, there is a wide depth range 420 in which the maximum shear is at or close to zero; thus, objects or portions thereof within range 420 will appear in focus. Objects or portions thereof outside of range 420 will have depth of field blurring.
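A sketch of how such a function could drive the shear for a particular image sample, combining a curve in the spirit of function 405 (zero at the focal plane, linear growth, clamped at a blur limit) with the lens-position scaling described in paragraph [0051]; all parameter names are illustrative:

```python
def max_shear_linear(depth, focal_depth, slope, blur_limit):
    """A curve in the spirit of function 405: zero at the focal plane, linear growth, clamped."""
    return min(slope * abs(depth - focal_depth), blur_limit)

def shear_for_sample(depth, lens_offset, aperture_radius, max_shear_fn):
    """Scale the maximum shear at this depth by how far the lens position is
    from the aperture center, relative to the aperture radius."""
    lx, ly = lens_offset
    lens_ratio = (lx * lx + ly * ly) ** 0.5 / aperture_radius
    return max_shear_fn(depth) * lens_ratio

# Example: an arbitrary user-specified curve like function 410 could be supplied instead,
# e.g. one that is zero over a wide depth range to keep both near and far objects in focus.
shear = shear_for_sample(depth=12.0, lens_offset=(0.3, 0.4), aperture_radius=1.0,
                         max_shear_fn=lambda d: max_shear_linear(d, 10.0, 0.5, 3.0))
```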
[0054] Figure 4 illustrates the specification of depth of field as maximum shear as a function of depth. In other embodiments, similar functions can be used to specify depth of field in terms of other parameters. For example, a function can specify depth of field in terms of the aperture size or f-number as a function of depth.
[0055] As discussed above, image samples in the image plane can be distributed in a random or pseudo-random manner to prevent unwanted visual artifacts and aliasing. Figures 5A-5C illustrate the distribution of image samples for depth of field effects according to an embodiment of the invention. Figure 5A illustrates a stratified assignment of lens positions to image samples. In an embodiment, a region of the image plane 505, which may correspond with the region of a pixel, is subdivided into a number of sub-regions 507A-507I. Each sub-region 507 includes one image sample. Within a sub-region 507, the image sample is assigned a random, pseudo-random, or other arbitrary or irregular location. For different pixels, image samples may be assigned different locations within corresponding sub-regions. By stratifying image samples into sub-regions, the variance of the distribution of image samples in region 505 decreases. This reduces the amount of visual noise in images.
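A sketch of this stratification, placing one jittered image sample in each of the 3x3 sub-regions of a pixel (the 3x3 grid mirrors the nine sub-regions 507A-507I; names are illustrative):

```python
import random

def stratified_pixel_samples(pixel_x, pixel_y, grid=3, rng=random):
    """One jittered image sample per pixel sub-region, in pixel coordinates."""
    samples = []
    for row in range(grid):
        for col in range(grid):
            sx = pixel_x + (col + rng.random()) / grid  # jitter within the sub-region
            sy = pixel_y + (row + rng.random()) / grid
            samples.append((sx, sy))
    return samples
```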
[0056] Similarly, a region of the aperture plane 510 is subdivided into a number of sub-regions 512A-512I. Each sub-region 512 of the aperture plane includes one or more lens positions. Within an aperture sub-region 512, an embodiment assigns lens positions to random, pseudo-random, or other arbitrary or irregular locations. By stratifying lens positions into sub-regions, the variance of the distribution of lens positions in region 510 decreases. This reduces the amount of visual noise in images.
[0057] Each sub-region 507 of the image plane is associated with one of the sub-regions of the aperture plane. For example, sub-region 507A is associated with aperture plane sub-region 512H, as indicated by ray 515B. Similarly, sub-region 507B is associated with the aperture plane sub-region 512E, as indicated by ray 515A. The number of sub-regions 512 of the aperture plane may be the same or different from the number of sub-regions 507 of the image plane. Thus, there may be zero, one, or more image samples associated with each aperture plane sub-region 512.
[0058] In an embodiment, there is a correlation between sub-regions of each pixel on the image plane and the assigned aperture plane sub-region. Figure 5B illustrates an example of this correlation with two adjacent pixels, 530A and 530B. Each pixel includes a number of sub-regions, such as sub-regions 535A and 540A in pixel 530A and sub-regions 535B and 540B in pixel 530B. Each sub-region is associated with a sub-region of the aperture plane. In this example, corresponding sub-regions of pixels are associated with the same sub-region of the aperture plane. For example, both sub-regions 535A and 535B are associated with the same aperture plane sub-region, referred to here as L2. Similarly, both sub-regions 540A and 540B are associated with the same aperture plane sub-region, referred to here as L7.
[0059] Although corresponding sub-regions of pixels are associated with the same aperture plane sub-region, in an embodiment, image samples are assigned different lens positions within the sub-region. For example, the image sample associated with sub-region 535A is assigned to lens position 545A, located in the bottom right corner of the aperture plane sub-region L2. In contrast, the image sample associated with sub-region 535B is assigned lens position 545B, which is located in the upper left corner of the aperture plane sub-region L2. By displacing or jittering lens positions and image sample positions within aperture plane and image plane sub-regions, respectively, aliasing and other unwanted visual artifacts are reduced. Moreover, as discussed above, the stratification of image samples and lens positions into sub-regions reduces variance-related visual noise.
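As a purely illustrative aid, the sketch below shows one way the stratified, jittered sampling described above could be realized: image samples are jittered within image plane sub-regions, the k-th sub-region of every pixel is paired with the same k-th aperture sub-region, and the lens position is jittered independently within that aperture sub-region for each pixel. The helper name jittered_grid and the 3 x 3 grid size are assumptions for the example, not details from the patent.

    # Illustrative sketch only: stratified, jittered image samples and lens
    # positions, with corresponding sub-regions of different pixels mapped to
    # the same aperture sub-region but given different jittered lens positions.
    import random

    def jittered_grid(n, rng):
        # One jittered point per cell of an n x n grid over [0, 1) x [0, 1).
        return [((i + rng.random()) / n, (j + rng.random()) / n)
                for i in range(n) for j in range(n)]

    rng = random.Random(0)
    n = 3  # 3 x 3 sub-regions per pixel and per aperture, as in Figure 5A
    for pixel in [(0, 0), (1, 0)]:               # two adjacent pixels (cf. Figure 5B)
        image_samples = jittered_grid(n, rng)    # jittered within pixel sub-regions
        lens_positions = jittered_grid(n, rng)   # jittered within aperture sub-regions
        # The k-th image sample of every pixel uses the k-th aperture
        # sub-region, but with a different jittered lens position per pixel.
        pairs = list(zip(image_samples, lens_positions))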
[0060] Pixels, pixel sub-regions, and lens apertures may be any shape. As shown in Figures 5A and 5B, lens positions are distributed over a square region of the aperture plane. However, real-world apertures are often circular or closely approximate circles using mechanical irises. Figure 5C illustrates an example transformation of a lens position in a square aperture to a corresponding lens position in a circular aperture. A lens position 585 is defined within a square aperture 575. A circular aperture 580 is defined within the square aperture. To map the lens position 585 to the circular aperture, an embodiment of the invention measures the distance L1 587 from the center of the apertures to the lens position 585. The distance L2 589 from the center of the apertures to the edge of square aperture 575, passing through lens position 585, is also measured. In an embodiment, L1, the radial distance between the center of the aperture 575 and lens position 585, is scaled by the ratio between the radius R of the circular aperture 580 and L2 589. Thus, lens position 585 is moved to a location L1 * R/L2 591.
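The square-to-circular aperture mapping above, which rescales the radial distance L1 by R/L2, can be sketched as follows. The helper name square_to_circle and its parameters are hypothetical, and the geometry assumes both apertures are centered at the origin.

    # Illustrative sketch only: radially rescale a lens position from a square
    # aperture of side 2 * half_width into a concentric circular aperture of
    # radius R, following the L1 * R / L2 construction described above.
    import math

    def square_to_circle(x, y, half_width, radius):
        l1 = math.hypot(x, y)                 # center-to-lens-position distance (L1)
        if l1 == 0.0:
            return (0.0, 0.0)                 # the center maps to itself
        # Distance from the center to the square's edge along the same ray (L2).
        l2 = half_width * l1 / max(abs(x), abs(y))
        new_radius = l1 * radius / l2         # L1 * R / L2
        return (x * new_radius / l1, y * new_radius / l1)

    # A corner of the square maps onto the circle's boundary.
    print(square_to_circle(1.0, 1.0, half_width=1.0, radius=1.0))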
[0061] Figure 6 illustrates a computer system 2000 suitable for implementing an embodiment of the invention. Computer system 2000 typically includes a monitor 2100, a computer 2200, a keyboard 2300, a user input device 2400, and a network interface 2500. User input device 2400 includes a computer mouse, a trackball, a track pad, graphics tablet, touch screen, and/or other wired or wireless input devices that allow a user to create or select graphics, objects, icons, and/or text appearing on the monitor 2100. Embodiments of network interface 2500 typically provide wired or wireless communication with an electronic communications network, such as a local area network, a wide area network, for example the Internet, and/or virtual networks, for example a virtual private network (VPN).
[0062] Computer 2200 typically includes components such as one or more processors 2600 and memory storage devices, such as a random access memory (RAM) 2700, disk drives 2800, and a system bus 2900 interconnecting the above components. Processors 2600 can include one or more general purpose processors and optional special purpose processors for processing video data, audio data, or other types of data. RAM 2700 and disk drive 2800 are examples of tangible media for storage of data, audio/video files, computer programs, applet interpreters or compilers, virtual machines, and embodiments of the herein described invention. Other types of tangible media include floppy disks; removable hard disks; optical storage media such as DVD-ROM, CD-ROM, and bar codes; non-volatile memory devices such as flash memories; read-only memories (ROMs); battery-backed volatile memories; and networked storage devices.
[0063] Further embodiments can be envisioned by one of ordinary skill in the art after reading the attached documents. In other embodiments, combinations or sub-combinations of the above disclosed invention can be advantageously made. The block diagrams of the architecture and flow charts are grouped for ease of understanding. However, it should be understood that combinations of blocks, additions of new blocks, re-arrangement of blocks, and the like are contemplated in alternative embodiments of the present invention.
[0064] The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims.

Claims

WHAT IS CLAIMED IS:
1. A method of rendering an image with depth of field, the method comprising: defining first and second image sample points in an image plane; receiving scene data including geometric primitives; assigning a first lens position to the first image sample point and a second lens position to the second image sample point, wherein the second lens position is different from the first lens position; shearing at least a first portion of the geometric primitives to form first sheared scene data, wherein each of the first portion of geometric primitives is sheared by a first shear amount in response to the first lens position and a depth of each of the first portion of geometric primitives; sampling the first sheared scene data to determine an attribute value of the first image sample point; shearing at least a second portion of the geometric primitives to form second sheared scene data, wherein each of the second portion of geometric primitives is sheared by a second shear amount in response to the second lens position and a depth of each of the second portion of geometric primitives; and sampling the second sheared scene data to determine an attribute value of the second image sample point.
2. The method of claim 1, wherein sampling the first sheared scene data and the second sheared scene data is performed using a scanline renderer.
3. The method of claim 1, wherein sampling the first sheared scene data comprises: identifying at least one geometric primitive projected from its sheared position in the first sheared scene data onto the image plane that intersects the first image sample point; determining an attribute value for the identified geometric primitive; and combining the attribute value for the identified geometric primitive with the attribute value of the first image sample point; and sampling the second sheared scene data comprises: identifying at least a second geometric primitive projected from its sheared position in the second sheared scene data onto the image plane that intersects the second image sample point; determining an attribute value for the second identified geometric primitive; and combining the attribute value for the second identified geometric primitive with the attribute value of the second image sample point.
4. The method of claim 3, wherein combining the attribute value for the identified geometric primitive with the attribute value of the first image sample point comprises: performing a visibility hit test to determine if the identified geometric primitive is visible to the first image sample point; combining the attribute value for the identified geometric primitive with the attribute value of the first image sample point in response to the determination that the identified geometric primitive is visible to the first image sample point; and ignoring the attribute value for the identified geometric primitive in response to the determination that the identified geometric primitive is not visible to the first image sample point.
5. The method of claim 3, wherein combining the attribute value for the identified geometric primitive with the attribute value of the first image sample point comprises: comparing a depth value of the identified geometric primitive at a point of intersection with the first image sample point with a stored depth value associated with the first image sample point to determine if the identified geometric primitive is visible to the first image sample point; combining the attribute value for the identified geometric primitive with the attribute value of the first image sample point in response to the determination that the identified geometric primitive is visible to the first image sample point; and ignoring the attribute value for the identified geometric primitive in response to the determination that the identified geometric primitive is not visible to the first image sample point.
6. The method of claim 3, wherein determining the attribute values for the identified and second identified geometric primitives includes: retrieving stored attribute values of the identified and second identified geometric primitives.
7. The method of claim 1, wherein shearing at least some of the geometric primitives to form the first sheared scene data comprises: selecting a portion of the geometric primitives potentially intersecting the first sample point; and shearing the selected portion of the geometric primitives to form the first sheared scene data.
8. The method of claim 7, wherein selecting the portion of the geometric primitives comprises: determining a bounding volume for at least a portion of the geometric primitives; projecting the bounding volume onto the image plane; determining if the bounding volume intersects the first image sample point; and selecting the geometric primitive in response to the determination that the bounding volume intersects the first image sample point.
9. The method of claim 8, wherein determining the bounding volume comprises: determining an initial bounding volume of the geometric primitive; and scaling the initial bounding volume by an amount in response to the first lens position and a depth of the initial bounding volume to form the bounding volume.
10. The method of claim 1, wherein a ratio of each of the first shear amounts to the distance of each of the first portion of geometric primitives from a focal plane is equal to a ratio of the first lens position to a distance from the first lens position to the focal plane; and wherein a ratio of each of the second shear amounts to the distance of each of the second portion of geometric primitives from the focal plane is equal to a ratio of the second lens position to a distance from the second lens position to the focal plane.
11. The method of claim 1, wherein each of the first and second shear amounts are in a direction parallel to the image plane in a screen space coordinate system.
12. The method of claim 1, wherein the first and second lens positions are selected pseudo-randomly.
13. The method of claim 1, wherein the geometric primitives are selected from a group consisting of: polygons, micropolygons, polygon fragments, splines, curved surfaces, subdivision surfaces, and particles.
14. The method of claim 1, wherein the value of the attribute of the first image sample point is selected from a group consisting of: color; transparency; texture coordinate; shading attribute; and depth.
15. A method of rendering an image with depth of field, the method comprising: defining a first image sample point in an image plane; receiving scene data including first and second geometric primitives, wherein the first geometric primitive includes first vertices and the second geometric primitive includes second vertices; determining first shear vectors from a function based on depths of the first vertices of the first geometric primitive; shearing the first vertices of the first geometric primitive by the first shear vectors to form a first sheared geometric primitive; determining second shear vectors from a function based on depths of the second vertices of the second geometric primitive; shearing the second vertices of the second geometric primitive by the second shear vectors to form a second sheared geometric primitive; and sampling the first and second sheared geometric primitives to determine an attribute value of the first image sample point.
16. The method of claim 15, wherein sampling the first and second sheared geometric primitives is performed using a scanline renderer.
17. The method of claim 15, wherein the depths of the first vertices of the first geometric primitive are different from the depths of the second vertices of the second geometric primitive and the first shear vectors are different from the second shear vectors.
18. The method of claim 15, wherein the depths of the first and second vertices of the first and second geometric primitives are expressed as distances from a focal plane.
19. The method of claim 15, wherein the first and second shear vectors are parallel to the image plane in a screen space coordinate system.
20. The method of claim 15, wherein the function specifies a maximum blur value as a function of depth.
21. The method of claim 15, wherein defining the first image sample point comprises: assigning a first lens position to the first image sample point.
22. The method of claim 21, wherein the function scales the first lens position by a scaling value based on a depth of one of the vertices of the geometric primitive.
23. The method of claim 22, wherein the scaling value is based on at least one attribute selected from a group consisting of: virtual aperture size; focal distance; f-number; and a non-linear function of the depth.
24. The method of claim 21, wherein each of the first shear vectors is equal to the distance at the depth of one of the first vertices of the first geometric primitive between a first ray passing through the first lens position and a second ray passing through the first image sample point and perpendicular to the image plane.
25. The method of claim 15, wherein sampling the first and second sheared geometric primitives comprises: projecting the first and second sheared geometric primitives onto the image plane; determining if the first and second projected sheared geometric primitives intersect the first image sample point; and in response to at least one of the first and second projected sheared geometric primitives intersecting the first image sample point, determining an attribute value for each of the intersecting geometric primitives, and combining the attribute value of each of the intersecting geometric primitives with the attribute value of the first image sample point.
26. The method of claim 25, wherein combining the attribute value for each of the intersecting geometric primitives with the attribute value of the first image sample point comprises: performing a visibility hit test to determine if each of the intersecting geometric primitives is visible to the first image sample point; combining the attribute value for each of the intersecting geometric primitives with the attribute value of the first image sample point in response to the determination that the intersecting geometric primitive is visible to the first image sample point; and ignoring the attribute value for the intersecting geometric primitive in response to the determination that the intersecting geometric primitive is not visible to the first image sample point.
27. The method of claim 25, wherein combining the attribute value for each of the intersecting geometric primitives with the attribute value of the first image sample point comprises: comparing a depth value of each of the intersecting geometric primitives at a point of intersection with the first image sample point with a stored depth value associated with the first image sample point to determine if the intersecting geometric primitive is visible to the first image sample point; combining the attribute value for each of the intersecting geometric primitives with the attribute value of the first image sample point in response to the determination that the intersecting geometric primitive is visible to the first image sample point; and ignoring the attribute value for each of the intersecting geometric primitives in response to the determination that the intersecting geometric primitive is not visible to the first image sample point.
28. The method of claim 25, wherein determining the attribute value for each of the intersecting geometric primitives includes: retrieving a stored attribute value of the intersecting geometric primitive.
29. The method of claim 22, wherein the first lens position is selected pseudo-randomly.
30. The method of claim 15, wherein the first and second geometric primitives are selected from a group consisting of: polygons, micropolygons, polygon fragments, splines, curved surfaces, subdivision surfaces, and particles.
31. The method of claim 15, wherein the value of the attribute of the first image sample point is selected from a group consisting of: a color; a transparency; a depth; a texture coordinate; and a shading attribute.
32. A computer program product comprising computer executable instructions operable to cause a computer to be configured to perform a method in accordance with any one of claims 1 to 31.
33. A computer readable carrier medium bearing a computer program product in accordance with claim 32.
34. A computer readable storage medium storing a computer program product in accordance with claim 32.
35. A computer receivable signal bearing information defining a computer program product in accordance with claim 32.
36. A video signal generated by way of performance of the method of any one of claims 1 to 31.
PCT/US2008/064709 2007-05-25 2008-05-23 Shear displacement depth of field WO2008147999A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0918780.8A GB2460994B (en) 2007-05-25 2008-05-23 Shear displacement depth of field

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US94037907P 2007-05-25 2007-05-25
US60/940,379 2007-05-25

Publications (1)

Publication Number Publication Date
WO2008147999A1 true WO2008147999A1 (en) 2008-12-04

Family

ID=40075518

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/064709 WO2008147999A1 (en) 2007-05-25 2008-05-23 Shear displacement depth of field

Country Status (2)

Country Link
GB (2) GB2460994B (en)
WO (1) WO2008147999A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010050713A1 (en) * 2000-01-28 2001-12-13 Naoki Kubo Device and method for generating timing signals of different kinds
US20040257375A1 (en) * 2000-09-06 2004-12-23 David Cowperthwaite Occlusion reducing transformations for three-dimensional detail-in-context viewing
US20050248569A1 (en) * 2004-05-06 2005-11-10 Pixar Method and apparatus for visibility determination and processing

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8194995B2 (en) 2008-09-30 2012-06-05 Sony Corporation Fast camera auto-focus
EP3557532A1 (en) * 2018-04-20 2019-10-23 Thomson Licensing Device and method for scene rendering with a depth-of-field effect
CN112037363A (en) * 2019-10-22 2020-12-04 刘建 Driving record big data auxiliary analysis system
WO2022108609A1 (en) * 2020-11-18 2022-05-27 Leia Inc. Multiview display system and method employing multiview image convergence plane tilt

Also Published As

Publication number Publication date
GB2460994B (en) 2012-05-16
GB2483386A (en) 2012-03-07
GB2483386B (en) 2012-08-15
GB0918780D0 (en) 2009-12-09
GB2460994A (en) 2009-12-23
GB201119388D0 (en) 2011-12-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08769703

Country of ref document: EP

Kind code of ref document: A1

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
ENP Entry into the national phase

Ref document number: 0918780

Country of ref document: GB

Kind code of ref document: A

Free format text: PCT FILING DATE = 20080523

WWE Wipo information: entry into national phase

Ref document number: 0918780.8

Country of ref document: GB

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 582312

Country of ref document: NZ

122 Ep: pct application non-entry in european phase

Ref document number: 08769703

Country of ref document: EP

Kind code of ref document: A1

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)