WO1997012480A2 - Method and apparatus for implanting images into a video sequence - Google Patents


Publication number
WO1997012480A2
Authority
WO
WIPO (PCT)
Prior art keywords
frame
segments
graph
topology graph
background
Prior art date
Application number
PCT/IL1996/000110
Other languages
French (fr)
Other versions
WO1997012480A3 (en)
Inventor
Adir Pridor
Yakov Vainberg
Lev Bregman
Alexander Rubchinsky
Haim Kreitman
Dan Bar-El
Original Assignee
Scidel Technologies Ltd.
Priority date
Filing date
Publication date
Application filed by Scidel Technologies Ltd. filed Critical Scidel Technologies Ltd.
Priority to AU69422/96A priority Critical patent/AU6942296A/en
Priority to EP96930339A priority patent/EP0850536A4/en
Priority to BR9610721-9A priority patent/BR9610721A/en
Priority to JP9513270A priority patent/JPH11512894A/en
Priority to CA002231849A priority patent/CA2231849A1/en
Publication of WO1997012480A2 publication Critical patent/WO1997012480A2/en
Publication of WO1997012480A3 publication Critical patent/WO1997012480A3/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272Means for inserting a foreground image in a background image, i.e. inlay, outlay

Definitions

  • the present invention relates generally to orientation of an image within a video sequence of a scene, to such orientation which determines the locations of elements in a scene and to replacing certain elements with a prepared image within the video sequence.
  • Sports arenas typically include a game area where the game occurs, a seating area where the spectators sit and a wall of some kind separating the two areas.
  • the wall is at least partially covered with advertisements from the companies which sponsor the game.
  • the advertisements on the wall are filmed as part of the sports arena.
  • the advertisements cannot be presented to the public at large unless they are filmed by the television cameras.
  • Systems are known which merge predefined advertisements onto surfaces in a video of a sports arena. One system has an operator define a target surface in the arena. The system then locks on the target surface and merges a predetermined advertisement with the portion of the video stream corresponding to the surface. When the camera ceases to look at the surface, the system loses the target surface and the operator has to indicate again which surface is to be utilized.
  • PCT Application PCT/FR91/00296 describes a procedure and device for modifying a zone in successive images.
  • the images show a non-deformable target zone which has register marks nearby.
  • the system searches for the register marks and uses them to determine the location of the zone.
  • a previously prepared image can then be superimposed on the zone.
  • the register marks are any easily identifiable marks (such as crosses or other "graphemes") within or near the target zone.
  • the system of PCT/FR91/00296 produces the captured image at many resolutions and utilizes the many resolutions in its identification process.
  • PCT Application PCT/US94/01649 describes a system and method for electronically exchanging the physical images on designated targets with preselected virtual ones.
  • the physical image to be substituted is detected, recognized and located automatically.
  • the a priori knowledge includes knowledge of the rules regarding the playing area itself (its shape and the lines and curves thereon) and elements of the entire arena which form part of the background and is utilized to orient the current frame within the background arena.
  • a two-dimensional picture of a three- dimensional scene will have certain characteristics which change due to the projection of the three-dimensional world onto the two-dimensional surface and certain characteristics which do not change. For example, angles of lines change as do the shapes of curves.
  • the textures of neighboring fixed objects, such as signs along the edge of a professional soccer field, will not change with perspective, nor will the relationship of one neighbor to the next (e.g. a COCA-COLA sign will remain next to a SPRITE sign, no matter the perspective).
  • lines may be shown differently, but they remain lines.
  • the present invention first maps the background arena by listing the minimally changing or invariant characteristics (adjacency relationships, the locations of lines and well-defined curves, etc.) and the topology of the static objects in the arena.
  • the present invention determines which static objects are being viewed in each frame of the video sequence of the action. Since the action is of interest, the static objects of the arena will form part of the background. Thus, the frame of the video sequence will include objects not in the empty arena which may or may not occlude some of the static objects.
  • the present invention determines the minimally changing characteristics of the arena and then attempts to match the topology of the frame to that of the arena. The occluding objects are determined and then not considered during the matching operation.
  • Upon matching the current frame to the map of the background arena, the frame has been oriented with respect to the background arena. Many processing actions can occur with the orientation information. For example, implantation can occur.
  • the map includes indications of which objects are to be replaced with a desired image.
  • the orientation system includes a background topology graph, a graph creator and a graph correlator.
  • the background topology graph graphs the relationships of the background objects of the arena with each other, wherein the relationships are those which change minimally from view to view of the arena.
  • the graph creator creates a frame topology graph of relationships of segments of the frame.
  • the graph correlator correlates the frame topology graph with the background topology graph to determine which objects of the arena, if any, are represented by segments of the frame.
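The topology graphs described above can be represented as labeled adjacency structures. The following sketch is purely illustrative (class and function names are assumptions, not from the patent); it also shows a first-pass candidate filter a correlator might apply, exploiting the fact that a frame views only part of the arena, so a frame segment can have at most as many neighbors as the arena object it represents:

```python
from dataclasses import dataclass, field

@dataclass
class TopologyGraph:
    # node -> texture label, and undirected adjacency edges
    textures: dict = field(default_factory=dict)
    edges: set = field(default_factory=set)

    def add_object(self, node, texture):
        self.textures[node] = texture

    def add_adjacency(self, a, b):
        self.edges.add(frozenset((a, b)))

    def neighbors(self, node):
        return {next(iter(e - {node})) for e in self.edges if node in e}

def candidate_matches(frame_graph, background_graph):
    """For each frame segment, list background objects with the same
    texture whose neighborhood is large enough to contain the segment's
    neighborhood (the frame sees only part of the arena)."""
    out = {}
    for seg in frame_graph.textures:
        out[seg] = [
            obj for obj, tex in background_graph.textures.items()
            if tex == frame_graph.textures[seg]
            and len(background_graph.neighbors(obj)) >= len(frame_graph.neighbors(seg))
        ]
    return out
```

A full correlator would then search among these candidates for a pairing whose adjacencies are mutually consistent.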
  • the orientation system additionally includes a background topology graph creation unit which includes a standards topology graph, a second graph creator, a second graph correlator and a background topology graph creator.
  • the standards topology graph graphs standard elements known to be present in arenas of a type similar to the arena.
  • the second graph creator creates a frame topology graph of relationships of segments of each frame of an initial video sequence.
  • the second graph correlator correlates the frame topology graph, for each frame of the initial video sequence, with the standards topology graph to determine which standard elements, if any, are represented by which segments of the frame and which objects of the frame should be added to the background topology graph.
  • the background topology graph creator creates the background topology graph from the output of the graph correlator.
  • the graph correlator includes an occlusion evaluator, a graph matcher and a perspective evaluator.
  • the occlusion evaluator determines which segments of the frame might represent occluding objects.
  • the graph matcher matches the frame topology graph to the background topology graph and includes apparatus for removing one or more of the segments representing possibly occluding objects from the frame topology graph to create a reduced frame topology graph and apparatus for matching the reduced frame topology graph to the background topology graph and for producing, on output, the objects of the background and the segments matched to the objects.
  • the perspective evaluator determines a perspective transformation between the matched objects and the matched segments.
  • the relationships include at least one of: the adjacency relationships of neighboring segments, the textures of each segment, and the boundary equations of each segment.
  • a frame description unit for describing a frame viewing an arena.
  • the frame description unit includes a texture segmenter which segments the frame into segments of uniform texture and an adjacency determiner which creates a graph listing which segments are neighbors of which segments.
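As a concrete illustration of the adjacency determiner (a minimal sketch; the function name and data layout are assumptions), segment adjacency can be read directly off a label image produced by the texture segmenter, by scanning each pixel's right and down neighbors:

```python
def adjacency_graph(labels):
    """labels: 2-D list where each pixel holds its segment id.
    Returns {segment: set of adjacent segments}."""
    rows, cols = len(labels), len(labels[0])
    graph = {labels[r][c]: set() for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            here = labels[r][c]
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors suffice
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols and labels[nr][nc] != here:
                    graph[here].add(labels[nr][nc])
                    graph[labels[nr][nc]].add(here)
    return graph
```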
  • the frame description unit also includes a boundary analyzer which determines which pixels of each segment form its borders and which determines if the border pixels generally form one of a straight line and a quadratic curve and what their coefficients are.
  • an implantation unit for implanting an image into a frame on a surface within an arena in which action occurs.
  • the implantation unit includes an orientation unit, as described hereinabove, for orienting the frame within the arena and for indicating where in the frame the surface is and an implanter for implanting the image into the portion of the frame indicated by the orientation unit.
  • the orientation unit additionally includes an implantation location determiner for determining which of the matched segments corresponds to said surface to be implanted upon and the implanter includes a transformer, a permission mask creator and a mixer. The transformer transforms said image in accordance with said perspective transformation, thereby creating a transformed image.
  • the permission mask creator creates a permission mask from said matched segments corresponding to said surface to be implanted upon.
  • the mixer mixes said frame with said transformed image in accordance with said permission mask.
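The mixing step can be sketched as a per-pixel selection (an illustrative toy on nested lists, not the patent's implementation; a production system would operate on full video frames and typically blend along mask edges):

```python
def mix(frame, implant, mask):
    """Where the permission mask is 1, take the transformed implant pixel;
    elsewhere keep the original frame pixel."""
    return [
        [implant[r][c] if mask[r][c] else frame[r][c]
         for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]
```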
  • the method includes the steps of a) providing an initial model, independent of the plurality of video frames, of a selected one of the fixed surfaces, the initial model comprising a graph of the relationships of the background objects of the selected fixed surface with each other, wherein the relationships are those which change minimally from view to view of the background space, b) generating a background model of objects of the background space from initial frames of the video frames which view only the background space and the initial model, the background model comprising a graph of the relationships of the background objects of the background space with each other, c) utilizing the background model for identifying the objects viewed in each video frame and d) perspectively implanting the image into the portion of the frame viewing a previously selected one of the fixed planar surfaces.
  • Fig. 1 A is an isometric illustration of a soccer stadium useful in understanding the present invention
  • Fig. 1B is a two-dimensional illustration of a section of the soccer stadium of Fig. 1A;
  • Fig. 2A is an isometric illustration of the soccer stadium of Fig. 1A with the objects therein labeled;
  • Fig. 2B is the same two-dimensional illustration of the section shown in Fig. 1B, with the objects therein labeled;
  • Figs. 3A and 3B are graph illustrations of the objects of Figs. 2A and 2B, respectively;
  • Fig. 4 is a block diagram illustration of an orientation and implantation system utilizing the graphs of Figs. 3A and 3B, constructed and operative in accordance with a preferred embodiment of the present invention
  • Fig. 5 is a block diagram illustration of the elements of a mapper forming part of the system of Fig. 4, wherein the mapper creates the graph of Fig. 3A;
  • Fig. 6 is an illustration of pixels on the boundary of two segments, useful in understanding the operations of the mapper of Fig. 5;
  • Fig. 7 is a block diagram illustration of the elements of an orientation system, forming part of the system of Fig. 4, which matches the graph of Fig. 3B to that of Fig. 3A;
  • Fig. 8A is an illustration of an exemplary background scene having 11 segments, useful in understanding the operation of the system of the present invention.
  • Fig. 8B is a graph of the topology of the scene of Fig. 8A;
  • Fig. 9A is an illustration of a portion of the scene of Fig. 8A with action occurring therein;
  • Fig. 9B is a graph of the topology of the scene of Fig. 9A;
  • Fig. 10 is a flow chart illustration of the process of matching graphs, useful in understanding the operations of the orientation system of Fig. 7 and the mapper of Fig. 5;
  • Fig. 11 is a block diagram illustration of the implantation unit forming part of the system of Fig. 4.
  • the present invention will be described in the context of a televised soccer game, it being understood that the present invention is effective for all action occurring within any relatively fixed arena, such as a sports event, an evening of entertainment, etc.
  • Fig. 1A illustrates an exemplary professional soccer stadium 58.
  • the stadium includes a field 60 on which are painted lines 62 and curves 63 which mark the various boundaries of interest in a soccer game.
  • the lines and curves are typically in accordance with the official rules of soccer.
  • Also on the field are two goals 64 and a series of signs 66 onto which various advertisers place their advertisements, indicated by the many different patterns on signs 66.
  • An advertiser can utilize one or many signs; for example, Fig. 1A shows two signs with the interconnecting circles on them.
  • the stadium 58 includes bleachers 70, a fence 72 marking the borders of the field 60, flagpoles 74 for supporting flags 76 and a camera viewing stand 77.
  • Fence 72 often includes many posts 78.
  • Fig. 1B illustrates such a frame view of the arena, taken from camera stand 77 and viewing one of the areas near the left goal 64.
  • the curve labeled 63a of Fig. 1A and the lines labeled 62A, 62B, 62C and 62D are visible as are a few of the signs 66, labeled 66A and 66B.
  • Other elements which are partially visible are the posts 78 of the fence.
  • the present invention maps the stadium 58 by mapping its static objects, their adjacencies and other characteristics of the objects which change minimally when viewing them at different angles.
  • Each object is defined as a planar domain having a single texture and having a shape defined through a listing of the edges.
  • Examples of the labeling of the objects within the stadium 58 and the frame 80 are provided in Figs. 2A and 2B, respectively. For simplicity's sake, only the objects on the field of the stadium 58 are mapped. Each separate object on the field is labeled with a number from 1 to 53, where a field marking line or curve, being, in reality, a band and not a line, is considered as an object. The grass bordered by the marking lines is also considered to be an object. Each sign is shown as a single object, labeled 41 - 53, though the patterns on the signs can, alternatively, be divided into separate objects, one per portion of the pattern thereon.
  • Fig. 2A shows the full set of objects, since Fig. 2A illustrates the entire stadium 58, while Fig. 2B, which illustrates only frame 80, has only a few objects.
  • The corresponding topological graphs are illustrated in Figs. 3A and 3B, where an open circle indicates an object with the texture of a marking line, a dotted square indicates an object with the texture of grass, an open square indicates an object with a texture other than grass or marking lines, a thin line indicates that the object is bounded by a straight line and a thick line indicates that the object is bounded by a curve.
  • Each object is labeled with its number from the corresponding Figure.
  • Object 4 is the grass in front of the leftmost goal. It is bounded by marking lines 1, 3, 5 and 11, each of which is a straight band. Thus, object 4 is marked with a dotted square (grass texture) and connected with thin lines to each of the other objects, all of which are open circles.
  • Object 40 is the grass outside of the playing field. It borders each of the signs 41 - 53 with straight line borders and also borders the outer marking lines of the field, labeled 1, 20, 21 and 39.
  • the map of Fig. 3A reflects this.
  • Object 16 is the left field grass. It borders curved marking lines 9, 7, 15 and 17 (connected with thick lines) and straight marking lines 1, 2, 6, 13, 19, 20 and 21 (connected with thin lines).
  • the topology graph for frame 80 is much smaller, as frame 80 has far fewer objects within it. In frame 80 we view only a portion of object 16. Thus, the graph of Fig. 3B has only some of the connections for object 16, those to objects 1, 2, 9, 13 and 15. Similarly, object 40 (the grass outside of the field) is only connected to three of the signs, those labeled 43, 44 and 45.
  • the present invention can orient the view of frame 80 within the world of stadium 58. This orientation is performed by using the information in the topology graphs of Figs. 3A and 3B; it does not require pattern recognition as in the prior art.
  • the topology graphs can also include information regarding which signs are to have their advertisements replaced. If so, once the topology graph of Fig. 3B is matched to that of Fig. 3A, the graph of Fig. 3A can be reviewed to determine if any of the matched objects are to be replaced. For example, in Fig. 3A sign 44 is marked with an X, indicating that it is to be replaced. Since, after matching, it is determined that frame 80 includes sign 44, the advertisement on sign 44 can be readily replaced, as will be described in more detail hereinbelow. Alternatively, if Fig. 3A indicated that only sign 50 is to be replaced (which is not present in the graph of Fig. 3B), then, for frame 80, no signs would have their advertisements replaced.
  • Fig. 4 illustrates, in partial block diagram format, a system which implements the concepts outlined hereinabove for replacing advertisements seen in a video stream.
  • the system comprises a video digitizer 100, such as the Targa2000 manufactured by Truevision Inc. of Indianapolis, Indiana, USA, an orientation unit 102, an implantation unit 104, and a host computer 106, all connected together via a bus 105, such as a peripheral component interconnect (PCI) bus.
  • the host computer 106 typically also is associated with input devices 116, such as a keyboard and/or a mouse and/or a tablet, and a monitor 108.
  • the video digitizer 100 receives incoming video frames for television broadcasting to many countries.
  • the video frames can be formatted in any one of many formats, such as NTSC (National Television System Committee) or PAL (Phase Alternating Line) formats, and can be either analog or digital signals.
  • the video digitizer 100 processes the video frames as necessary on input and on output.
  • the output signals are those altered by the present system, as will be described in more detail hereinbelow, in the same format as the incoming video frames.
  • the orientation unit 102 determines, per frame, what static objects appear in the frame and where they are, if and where there are occluding objects, such as players, which static objects are to have new advertising images implanted thereon and the perspective view of the frame with which the implantation can occur.
  • the implantation unit 104 utilizes the information from the orientation unit 102 to transform an advertisement, input through the host computer 106, to the correct perspective and to implant it in the proper location within the present frame. The altered frame is then provided to the video digitizer 100 for output.
  • orientation unit 102 and implantation unit 104 are implemented on one or more parallel, high speed processing units, such as the HyperShark Board, manufactured by HyperSpeed Technology Inc. of San Diego, California, USA.
  • the orientation unit 102 and implantation unit 104 can alternatively be implemented on standard platforms, whether with a single or multiple processors, such as personal computers (PCs) or workstations running the Unix or Windows NT (of Microsoft Corporation of the USA) operating systems.
  • the host computer 106 controls the operations of the units 100, 102 and 104 and, in addition, provides user commands, received from input devices 116, to the units 100 - 104.
  • the orientation unit 102 is divided into two processing units which share similar operations.
  • the first unit is a mapper 110 which receives the initial video sequence in which, typically, the stadium 58 is scanned. Mapper 110 determines which objects in the stadium 58 must be part of the stadium in accordance with the official rules of the game and which objects are present in this particular stadium.
  • the standard elements are available in a standards database 118 and the remaining objects are placed into a stadium database 114. Both the standards database 118 and the stadium database 114 store the invariant properties of the objects of the stadium in the form of a topology graph such as is described hereinabove.
  • the second unit is a frame orientation unit 112 which receives the video stream during the exemplary soccer game, determines the objects in a current frame, and utilizes the topology information in database 114 to determine which static objects of the stadium 58 occur in the current frame and what the projection transform is for the current frame.
  • orientation unit 102 determines which foreground objects, such as players or the ball, are in the current frame and how and where they occlude the sign to be replaced. This information is provided as output to the implantation unit 104.
  • For each object, stadium database 114 lists the texture, neighbors, boundary pixels and boundary equations. Objects to be replaced (e.g. some or all of the signs) are so marked.
  • Mapper 110 comprises a segmenter 120, a boundary analyzer 124, two correlators 128 and 130, a graph builder 132 and a database updater 134.
  • the segmenter 120 segments the current frame into its component segments, in accordance with any suitable segmentation operation, such as those described in the article "Application of the Gibbs Distribution to Image Segmentation", in the book Statistical Image Processing and Graphics, edited by Edward J. Wegman and Douglas J. DePriest, Marcel Dekker, Inc., New York and Basel, 1986, pp. 3 - 11. The book is incorporated herein by reference.
  • the segmenter 120 divides the frame into segments by determining which connected sections of neighboring pixels have approximately the same texture, where "texture" is an externally defined quality which describes a group of neighboring pixels.
  • texture can be color and thus, each object is one with a single color or with colors near to a central color. Texture can also be luminance level or a complex description of the color range of an object, where the color range can be listed as a color vector.
  • the color of grass is a combination of green, yellow and brown pixels. The average color of a group of pixels of grass will be relatively constant, as is the covariance of all components of the color vector.
  • the texture definition must be robust with respect to lighting conditions; otherwise, as the sun changes brightness, the objects in the arena will change.
  • the average color and the covariance of the colors within a group of pixels provide this robustness, as does the consideration that textures are equal if they differ within a prescribed tolerance.
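A minimal sketch of such a tolerance-based texture comparison follows (an illustration only: per-channel variance stands in for the full covariance, and the function names and tolerance value are assumptions):

```python
def texture_stats(pixels):
    """pixels: list of (r, g, b) tuples -> (mean vector, variance vector)."""
    n = len(pixels)
    mean = tuple(sum(p[i] for p in pixels) / n for i in range(3))
    var = tuple(sum((p[i] - mean[i]) ** 2 for p in pixels) / n for i in range(3))
    return mean, var

def same_texture(stats_a, stats_b, tol=10.0):
    """Textures are equal if means and variances agree within a tolerance."""
    (ma, va), (mb, vb) = stats_a, stats_b
    return all(abs(ma[i] - mb[i]) <= tol for i in range(3)) and \
           all(abs(va[i] - vb[i]) <= tol ** 2 for i in range(3))
```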
  • the segmenter 120 searches for "parasite" segments, which do not correspond to real objects but result from noisy pixels of the frame. Criteria for identifying such segments are: the size of the segments and an extraordinary texture (i.e. one not previously seen, one previously defined as extraordinary, or a texture out of place, such as a few pixels of one texture within a segment of another, completely different texture).
  • the boundary analyzer 124 reviews the segment data and produces mathematical equations describing the imaginary curve which approximates the boundaries between neighboring segments. To do so, boundary analyzer 124, for each segment, identifies the bordering pixels, namely those pixels belonging to the texture of the segment but having neighboring pixels which belong to different textures. This is illustrated in Fig. 6, to which reference is now briefly made. Pixels of two segments 136 and 138 are illustrated where each segment has a different texture. The bordering pixels of segments 136 and 138 are labeled 135 and 137, respectively.
  • Boundary analyzer 124 typically utilizes contour following techniques to determine the locations of the bordering pixels. One such technique is described on pages 290 - 293 in the book Pattern Classification and Scene Analysis, by Richard O. Duda and Peter E. Hart.
  • the boundary analyzer 124 attempts to fit straight lines or quadratic curves to varying-length sections of the bordering pixels, where the section length is a function of the quality of the fit of the bordering pixels to the straight or quadratic curves. Boundaries which match straight lines or quadratic curves are so marked.
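The line/quadratic fitting can be sketched with an ordinary least-squares polynomial fit via the normal equations (a self-contained illustration: the patent does not specify the fitting method, and the quality threshold on the residual would be chosen empirically):

```python
def polyfit(xs, ys, degree):
    """Least-squares fit of y = c0 + c1*x + ... via normal equations,
    solved by Gaussian elimination with partial pivoting."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    for col in range(n):                       # forward elimination
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * n                         # back substitution
    for r in range(n - 1, -1, -1):
        coeffs[r] = (b[r] - sum(A[r][c] * coeffs[c] for c in range(r + 1, n))) / A[r][r]
    return coeffs

def residual(xs, ys, coeffs):
    """Sum of squared errors; a quality measure for accepting the fit."""
    return sum((y - sum(c * x ** i for i, c in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))
```

A boundary section would be marked as a straight line if the degree-1 residual is small, as a quadratic if only the degree-2 residual is small, and as non-simple otherwise.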
  • boundary analyzer 124 indicates to the segmenter 120 to repeat the segmentation in an attempt to smooth the boundary.
  • Boundary analyzer 124 produces the coefficients describing the boundaries on output.
  • Graph builder 132 creates the topology graph for the current frame, such as is shown in Fig. 3B, from the segments and the boundary equations. Since the boundary analyzer 124 also provides the adjacency information for each segment, graph builder 132 can create the topology graph and add to it the information regarding the texture and boundary specification of each segment. The graph typically is not drawn as shown in Figs. 3A and 3B but is appropriately represented, with each segment as a node; for each segment, its neighbors, texture type and boundary equations are represented by connected nodes, node labels and edge labels. It is noted that segmenter 120, boundary analyzer 124 and graph builder 132 form a graph creator 140 which produces a topology graph for a frame.
  • Correlator 128 is a previous frame correlator which matches the current frame with the previous frame or frames to determine which segments are common to the two frames, thereby determining which segments are new segments in the current frame.
  • the present frame may show many field markings, only one of which has not yet been seen but three of which were seen in a frame which is three frames previous to the present one.
  • Correlator 128 operates by comparing the topology graph of the current frame with those of the previous frames. The comparison involves first matching the topologies of textures between the current frame and each of the previous frames. For the matched portions of the graph, the correlator 128 then matches the topologies of the boundary types. If the texture topology and the boundary type topology match, then the two portions of the graphs match. It will be appreciated that changes in perspective do not affect the graph matching operation since the graph lists the relatively perspective-invariant elements of the arena.
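The two-stage comparison can be sketched as a consistency check on a candidate pairing of frame segments to arena objects (a simplification: a real correlator must also search over pairings and tolerate occlusions, and the data layouts here are assumptions):

```python
def pairing_consistent(pairing, frame, arena):
    """pairing: {frame_segment: arena_object}. frame/arena are dicts with
    'texture' (node -> label) and 'edges' ((a, b) -> boundary type)."""
    # Stage 1: texture topology must agree node by node.
    for seg, obj in pairing.items():
        if frame['texture'][seg] != arena['texture'][obj]:
            return False
    # Stage 2: every frame adjacency between paired segments must exist
    # between the paired objects with the same boundary type.
    for (a, b), btype in frame['edges'].items():
        if a in pairing and b in pairing:
            pair = (pairing[a], pairing[b])
            if arena['edges'].get(pair, arena['edges'].get(pair[::-1])) != btype:
                return False
    return True
```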
  • the graph matching operation can be performed in accordance with any suitable graph matching techniques.
  • a pseudo-isomorphism operation can be performed as described hereinbelow with respect to Figs. 7, 8A, 8B, 9A and 9B.
  • the correlator 128 determines which of the segments of the current frame have not been matched to any of the graphs of the previous frames. These segments indicate new segments which may have to be added to the stadium database 114 and are so marked on the graph for the current frame.
  • Correlator 128 provides the marked graph for the current frame to the correlator 130 which determines which, if any, of the segments form part of the standard objects of the playing field. To do so, correlator 130 matches the topology graph of the standard elements of the playing field, as stored in standards database 118, with the graph of the current frame. The correlation operation is similar to that described hereinabove whereby first the texture topology is matched and then the boundary topology is matched, for the matched texture topology sections.
  • the standards database 118 provides the present invention with a set of already known objects in the field from which to begin to map the stadium 58.
  • the database updater 134 receives the marked graph, marked with segments found in previous frames and segments conforming to standard elements of the playing field, and determines which segments are new. Updater 134 then determines which of the new segments are likely to be "interesting" objects, such as by not including segments having non-simple boundaries, and which segments, of the segments conforming to the standard elements of the playing field, have not already been added into the stadium database 114. Updater 134 then includes only the selected segments as objects in the stadium database 114. The process involves providing the selected segments corresponding to non-standard objects with object numbers, defining the adjacency relationships of the new segments with the segments already in the stadium database 114 and listing the boundary equations and textures of the new segments.
  • the updater 134 also marks the new segments for replacement if the user indicates as such.
  • updater 134 creates a perspective transformation from the perspective of the standard elements (which are typically provided as a top view) to that of the corresponding segments in the current frame.
  • the perspective transformation is created utilizing the boundary equations of the standard objects in the standards database 118 and the boundary equations of the corresponding segments in the current frame.
  • the transformation can be produced in any suitable way. Pages 386 - 441 of the book Pattern Classification and Scene Analysis, referenced hereinabove, describe how to determine perspective transformations.
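For instance, a perspective transformation between two views of a planar surface (a 3x3 homography) can be recovered from four point correspondences by solving a linear system. This is a generic sketch, not the specific procedure of the referenced text; it fixes the bottom-right matrix entry at 1, which assumes that entry is nonzero:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting; returns x with A x = b."""
    n = len(b)
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x

def homography(src, dst):
    """src, dst: four (x, y) point pairs -> 3x3 matrix as list of rows."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_h(H, p):
    """Apply the homography to a point, dividing out the projective depth."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

The updater could then push the non-standard segments through `apply_h` to bring them into the standard top-view perspective.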
  • updater 134 then transforms these other segments with the perspective transformation thus determined, thereby to provide these other segments with the same perspective as that of the standard objects of the field. Since the perspective transformation relates the actual field to the field as defined by the official bodies, updater 134 can optionally determine, from the transformation information, the sizes of the standard objects and, of course, of the non-standard objects.
  • the orientation unit 112 operates on frames of the video sequence of the game and comprises a graph creator 140, an occlusion evaluator 142, a graph matcher 144, a perspective evaluator 146 and an implantation identifier 148.
  • the operations of occlusion evaluator 142, graph matcher 144 and perspective evaluator 146 iterate until a match of a desired quality is reached.
  • the graph creator 140 reviews the current, game frame and creates a topology graph for it.
  • the output is a topology graph indicating the segments in the current, game frame, their boundary equations, neighbors and textures.
• Fig. 8A illustrates a simple arena having 11 objects therein, labeled 1 - 11.
  • Fig. 8B provides their topology graph where the circles, x's, squares and triangles indicate different textures.
• Fig. 9A illustrates a frame view of the arena of Fig. 8A having five objects therein, labeled A - E.
  • Fig. 9B is the corresponding graph to Fig. 9A. It is noted that objects A - D match objects 2 - 5 and object E is an occluding object which occludes objects 2 and 4.
• Graph creator 140 provides the occlusion evaluator 142 with the segments of the game frame, their textures and boundary types. Occlusion evaluator 142 reviews each segment and determines which of them fulfills any of the following occluding object criteria: a) some of the boundaries of the object are non-simple (i.e. not straight lines or quadratic curves) as shown in Fig. 9A for object E; or b) its texture is one not seen in previous frames or one defined in previous frames as being of an occluding object.
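The criteria above can be sketched as a simple check; the per-segment data layout (a dictionary with an identifier, texture label and list of boundary types) is an assumption of this sketch, not part of the method:

```python
def possible_occluders(segments, known_textures, occluder_textures):
    """Flag segments satisfying either occluding-object criterion:
    (a) some boundary is non-simple (not a straight line or quadratic), or
    (b) the texture was never seen in previous frames, or was previously
        classified as belonging to an occluding object.

    Each segment is a dict with 'id', 'texture' and 'boundary_types',
    where a boundary type is 'line', 'quadratic' or 'non-simple'."""
    flagged = []
    for seg in segments:
        non_simple = any(b == 'non-simple' for b in seg['boundary_types'])
        new_texture = seg['texture'] not in known_textures
        occ_texture = seg['texture'] in occluder_textures
        if non_simple or new_texture or occ_texture:
            flagged.append(seg['id'])
    return flagged
```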
  • the occlusion evaluator 142 provides the list of segments which are possible occluding segments to the graph matcher 144.
• Graph matcher 144 also receives the graph for the current game frame from the graph creator 140. Graph matcher 144 attempts to match this small graph of the current, game frame to the topology graph of the stadium database 114, operating in a number of ways depending on the state of the video sequence. For this matching, graph matcher 144 operates similarly to the previous frame correlator 128 of Fig. 5.
• graph matcher 144 just matches the current graph to that of the stadium database 114 and produces a match quality measurement. Subsequently, since the graph of the current game frame includes occluding objects, graph matcher 144 removes the suspected occluding segments one at a time and, if desired, a group at a time, and produces a match quality measurement for the graph with the removed segment. It will be appreciated that graph matcher 144 performs a matching operation similar to that of correlator 128. Specifically, the matcher 144 first attempts to match the topologies of textures between the current frame and the stadium database 114. The number of segments matched out of the total number of segments in the graph indicates the quality of the match. For the matched portions of the graph, if any, the graph matcher 144 then matches the topologies of the boundary types. If the texture topology and the boundary type topology match, then the two portions of the graphs match.
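The remove-and-rescore strategy can be sketched as follows, using a deliberately simplified match-quality score (the fraction of frame segments whose texture and neighboring-texture signature also occurs in the stadium database graph); the real matcher additionally checks the boundary-type topology:

```python
from itertools import combinations

def match_quality(frame_graph, db_graph):
    """Fraction of frame segments whose (texture, neighbor-textures)
    signature also occurs in the database graph.
    Graphs are {node: (texture, set_of_neighbor_nodes)}."""
    def signature(graph, node):
        tex, nbrs = graph[node]
        return (tex, tuple(sorted(graph[n][0] for n in nbrs if n in graph)))
    db_sigs = {signature(db_graph, n) for n in db_graph}
    matched = sum(1 for n in frame_graph if signature(frame_graph, n) in db_sigs)
    return matched / len(frame_graph)

def best_match_without_occluders(frame_graph, suspects, db_graph):
    """Remove suspected occluding segments one at a time (and, here, also
    in pairs) and keep the removal yielding the best match quality."""
    best = (match_quality(frame_graph, db_graph), frozenset())
    trials = [frozenset([s]) for s in suspects]
    trials += [frozenset(c) for c in combinations(suspects, 2)]
    for removed in trials:
        reduced = {n: (t, nbrs - removed)
                   for n, (t, nbrs) in frame_graph.items() if n not in removed}
        best = max(best, (match_quality(reduced, db_graph), removed),
                   key=lambda p: p[0])
    return best
```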
  • the graph matcher 144 identifies which segments of the current frame are part of the background and which segments occlude the objects of the background.
  • the graph matcher 144 provides this information, as well as the boundary equations of the segments as output.
  • the perspective evaluator 146 operates similarly to part of updater 134 and determines, from the boundary equations of the segments corresponding to objects of the background (received from graph creator 140), the perspective transformation for the current game frame.
  • the perspective evaluator 146 can utilize all of the boundary equations or only some of them, for example, the boundary equations corresponding to objects forming part of the standard elements of the field.
• the evaluation produces a transformation matrix M whose parameters are provided as the output of the perspective evaluator 146.
  • the transformation matrix M will be utilized to transform the image to be implanted from the perspective of the standard field elements (i.e. top view) to the perspective of the current, game frame.
• the perspective evaluator 146 typically tests the perspective transform on the objects of the arena which have been identified. The result should be a frame which closely matches the game frame. However, since the perspective transformation describes the transformation of the background elements of the frame (which occurs only due to the movement of the camera viewing them), it will not successfully describe the movements of a human being who moves of his own accord. Thus, any segment not well matched via the transformation is considered a possibly occluding object and this information is provided to the occlusion evaluator 142 for the next iteration.
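This residual test can be sketched as follows; the assumption of corresponding boundary points per segment and the pixel tolerance are illustrative choices of the sketch, not fixed by the method:

```python
def flag_unmatched_segments(apply_M, db_points, frame_points, tol=2.0):
    """Project each identified object's database boundary points into the
    frame with the perspective transform and measure the mean distance to
    the observed boundary points.  Objects that move of their own accord
    are not explained by the camera transform, so segments whose error
    exceeds `tol` pixels are returned as possible occluders.

    apply_M: function mapping an (x, y) database point into frame
    coordinates.  db_points / frame_points: {segment_id: [(x, y), ...]}
    with corresponding point order (an assumption of this sketch)."""
    suspects = []
    for seg_id, pts in db_points.items():
        obs = frame_points.get(seg_id)
        if obs is None:
            continue
        err = sum(
            ((u - ox) ** 2 + (v - oy) ** 2) ** 0.5
            for (u, v), (ox, oy) in zip((apply_M(p) for p in pts), obs)
        ) / len(obs)
        if err > tol:
            suspects.append(seg_id)
    return suspects
```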
  • the occlusion evaluator 142, graph matcher 144 and perspective evaluator 146 iterate until the graph of the current, game frame, less the occluding elements, perfectly matches a section of the graph of the stadium database 114.
  • the perfect match indicates that the current, game frame has been oriented with respect to the stadium 58.
  • a lack of a perfect match indicates that the camera is showing something which is not part of the stadium 58, such as a video of an advertisement.
  • the orientation information is provided to the implantation identifier 148 which reviews the matched objects, provided by graph matcher 144, and determines if any of them are marked for implantation in the stadium database 114.
  • implantation identifier 148 provides, on output, the segments which are to be implanted and their boundary equations. It will be appreciated that signs with patterns on them are formed of many connected segments, all of which are marked for implantation and all of which are marked as being part of the same sign.
  • the implantation identifier 148 determines the outer boundary of the collection of segments forming the sign and provides the boundary equations of the outer boundary of the sign on output.
  • Fig. 7 indicates that the output to the orientation unit 112 is the transformation matrix for transforming the image to be implanted into the current, game frame, and the areas to be implanted.
• Reference is now made to Fig. 10, which illustrates the operations of graph matcher 144, and back to Figs. 8A and 8B, which are useful in understanding Fig. 10.
• In step 150, the current game frame (or other current frame) is reviewed to enumerate its interior cycles and isthmus edges.
• A cycle is interior if it is the union of segments in the frame corresponding to nodes which form a 1-connected set.
  • Isthmus edges are edges (i.e. connections between nodes) whose removal increases the number of connected components of the graph.
  • An isthmus edge provides an isthmus between two parts of the graph.
  • Step 150 involves determining all cycles of length 3 and selecting those which are interior.
• Steps 152 - 159 form the method of searching through the graph to find matching graph sections and are performed per interior cycle and per isthmus edge of the graph.
  • the current interior cycle is compared (step 152) to all interior cycles of the database. If, for one interior cycle of the database, the textures at the nodes of the two cycles match (step 154), adjacent cycles to the current interior cycle are compared (step 156) to the adjacent cycles of the interior cycle of the database. If the adjacent cycles of the current interior cycle match the adjacent cycles of the interior cycle of the database (step 158), then the set of adjacent cycles of the current frame are marked (step 159) as having matched the database. In any case, the next interior cycle of the current frame is now considered. The process is repeated until all interior cycles have been reviewed. Analogous operations are performed for all of the isthmuses of the current frame and the database. The matching of adjacent nodes to both nodes of the considered isthmus is checked.
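The enumeration of length-3 cycles (step 150), the texture test of step 154, and the isthmus-edge definition can be sketched as follows; the adjacent-cycle comparison of steps 156 - 158 is omitted for brevity, and the bridge test by edge removal is one simple way, among others, of finding isthmus edges:

```python
def triangles(graph):
    """Enumerate all cycles of length 3 (step 150).
    graph: {node: set_of_neighbor_nodes}, undirected, no self loops."""
    tris = set()
    for a in graph:
        for b in graph[a]:
            for c in graph[a] & graph[b]:   # common neighbor closes a 3-cycle
                tris.add(frozenset((a, b, c)))
    return tris

def match_triangles(frame_graph, frame_tex, db_graph, db_tex):
    """Step 154 in miniature: a frame triangle matches if some database
    triangle has the same multiset of node textures; matched nodes are
    marked (step 159)."""
    db_sigs = {tuple(sorted(db_tex[n] for n in t)) for t in triangles(db_graph)}
    marked = set()
    for t in triangles(frame_graph):
        if tuple(sorted(frame_tex[n] for n in t)) in db_sigs:
            marked |= t
    return marked

def isthmus_edges(graph):
    """Edges whose removal increases the number of connected components."""
    def n_components(g):
        seen, comps = set(), 0
        for start in g:
            if start in seen:
                continue
            comps += 1
            stack = [start]
            while stack:
                n = stack.pop()
                if n not in seen:
                    seen.add(n)
                    stack.extend(g[n])
        return comps
    base = n_components(graph)
    bridges = []
    for e in {frozenset((a, b)) for a in graph for b in graph[a]}:
        a, b = tuple(e)
        g2 = {n: (nbrs - {b} if n == a else nbrs - {a} if n == b else nbrs)
              for n, nbrs in graph.items()}
        if n_components(g2) > base:
            bridges.append(e)
    return bridges
```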
  • Unit 104 comprises a segment filler 160, a transformer 164 and a mixer 166.
  • Segment filler 160 receives the information of the implantation areas from the orientation unit 112 and determines the pixels of the current, game frame which are included therein. It is noted that the implantation areas include in them information of where the occluding areas are. This is illustrated in Fig. 9A. If segment A is an object to be replaced, its shape is not a triangle but a triangle less most of occluding object E. Thus, the pixels of segment A do not include any pixels of occluding object E.
• segment filler 160 produces a permission mask which, for the current, game frame, masks out all but the areas of the frame in which the implantation will occur. This involves placing a '1' value in all pixels of the filled implantation areas and a '0' value at all other pixels. The image will be implanted onto the pixels of value '1'.
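The mask construction can be sketched as follows, assuming a point-in-implantation-area predicate supplied by the segment filler (the predicate interface is an assumption of the sketch):

```python
def permission_mask(width, height, inside_implant_area):
    """Build the permission mask for one frame: 1 at every pixel of the
    filled implantation areas (pixels of occluding objects having already
    been excluded by the segment filler), 0 elsewhere."""
    return [[1 if inside_implant_area(x, y) else 0 for x in range(width)]
            for y in range(height)]
```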
• the transformer 164 utilizes the transformation matrix M to distort the advertising image into the plane of the video frame.
  • a blending mask can be provided for the advertising image. If so, transformer 164 transforms the blending mask also.
  • the mixer 166 combines the distorted advertising image with the video frame in accordance with the blending and permission masks.
• the formula which is implemented for each pixel (x,y) is typically: output(x,y) = P(x,y)*image(x,y) + (1 - P(x,y))*video(x,y), where:
• output(x,y) is the value of the pixel of the output frame
• image(x,y) and video(x,y) are the values in the transformed, advertising image and the current, game frame, respectively
• P(x,y) is the value of the permission mask multiplied by the blending mask.
• the output, output(x,y), is a video signal into which the advertising image has been implanted onto a desired surface.
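The blend implied by the definitions of output(x,y), image(x,y), video(x,y) and P(x,y) is the standard linear mix, sketched here per pixel and per frame (the 2-D list representation of frames is an assumption of the sketch):

```python
def mix_pixel(image_val, video_val, p):
    """output(x,y) = P(x,y)*image(x,y) + (1 - P(x,y))*video(x,y),
    where P is the permission mask multiplied by the blending mask."""
    return p * image_val + (1 - p) * video_val

def mix_frames(image, video, P):
    """Apply the blend over whole frames given as 2-D lists of values."""
    return [[mix_pixel(image[y][x], video[y][x], P[y][x])
             for x in range(len(video[0]))]
            for y in range(len(video))]
```

Where P(x,y) is 1 the implanted image replaces the video pixel outright; fractional values at sign edges let the blending mask soften the transition.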
  • the present invention is an orientation system for orienting a frame of activity data within a mapped background scene.
  • the orientation system operates through creation of a topology graph of the relatively invariant elements of the background scene.
  • This orientation system can be utilized in many systems, one embodiment of which, shown herein, is an implantation system. Other systems which can utilize the present orientation
system are systems for highlighting, changing the color or deleting background objects.

Abstract

A method and system for orienting a frame viewing an arena in which action occurs is provided. The orientation system includes a background topology graph, a graph creator and a graph correlator. The background topology graph graphs the relationships of the background objects of the arena with each other, wherein the relationships are those which change minimally from view to view of the arena. The graph creator creates a frame topology graph of relationships of segments of the frame. The graph correlator correlates the frame topology graph with the background topology graph to determine which objects of the arena, if any, are represented by segments of the frame.

Description

METHOD AND APPARATUS FOR IMPLANTING IMAGES INTO A VIDEO
SEQUENCE
FIELD OF THE INVENTION
The present invention relates generally to orientation of an image within a video sequence of a scene, to such orientation which determines the locations of elements in a scene and to replacing certain elements with a prepared image within the video sequence.
BACKGROUND OF THE INVENTION
Sports arenas typically include a game area where the game occurs, a seating area where the spectators sit and a wall of some kind separating the two areas. Typically, the wall is at least partially covered with advertisements from the companies which sponsor the game. When the game is filmed, the advertisements on the wall are filmed as part of the sports arena. The advertisements cannot be presented to the public at large unless they are filmed by the television cameras. Systems are known which merge predefined advertisements onto surfaces in a video of a sports arena. One system has an operator define a target surface in the arena. The system then locks on the target surface and merges a predetermined advertisement with the portion of the video stream corresponding to the surface. When the camera ceases to look at the surface, the system loses the target surface and the operator has to indicate again which surface is to be utilized.
The above-described system operates in real-time. Other systems are known which perform essentially the same operation but not in real-time. Other systems for merging data onto a video sequence are known. These include inserting an image between video scenes, superposition of image data at a fixed location of the television frame (such as of television station logos) and even electronic insertion of image data as a "replacement" of a specific targeted billboard. The latter is performed using techniques such as color keying. US 5,264,933 describes an apparatus and method of altering video images to enable the addition of advertising images to be part of the image originally displayed. The operator selects where in the captured image the advertising image is to be implanted. The system of U.S. 5,264,933 can also implant images, in selected main broadcasting areas, in response to audio signals, such as typical expressions of commentators.
PCT Application PCT/FR91/00296 describes a procedure and device for modifying a zone in successive images. The images show a non-deformable target zone which has register marks nearby. The system searches for the register marks and uses them to determine the location of the zone. A previously prepared image can then be superimposed on the zone. The register marks are any easily identifiable marks (such as crosses or other "graphemes") within or near the target zone. The system of PCT/FR91/00296 produces the captured image at many resolutions and utilizes the many resolutions in its identification process.
PCT Application PCT/US94/01649 describes a system and method for electrically exchanging the physical images on designated targets by preselected virtual ones. In its preferred embodiment, the physical image to be substituted is detected, recognized and located automatically.
All of the prior art systems utilize pattern recognition techniques to identify the target to be replaced where the target is selected or predetermined.
SUMMARY OF THE PRESENT INVENTION
It is an object of the present invention to provide a further image implantation system which utilizes a priori knowledge of the area in which an action occurs to determine where to implant the image. The a priori knowledge includes knowledge of the rules regarding the playing area itself (its shape and the lines and curves thereon) and elements of the entire arena which form part of the background and is utilized to orient the current frame within the background arena.
Applicants have realized that a two-dimensional picture of a three-dimensional scene will have certain characteristics which change due to the projection of the three-dimensional world onto the two-dimensional surface and certain characteristics which do not change. For example, angles of lines change as do the shapes of curves. However, the textures of neighboring fixed objects, such as signs along the edge of a professional soccer field, will not change with perspective, nor will the relationship of one neighbor to the next (e.g. a COCA-COLA sign will remain next to a SPRITE sign, no matter the perspective). In addition, lines may be shown differently, but they remain lines.
Applicants utilize this realization for orientation of a current frame of an action, such as a soccer game, within the background arena. The present invention first maps the background arena by listing the minimally changing or invariant characteristics (adjacency relationships, the locations of lines and well-defined curves, etc.) and the topology of the static objects in the arena.
When the action occurs (i.e. during the game), the present invention determines which static objects are being viewed in each frame of the video sequence of the action. Since the action is of interest, the static objects of the arena will form part of the background. Thus, the frame of the video sequence will include objects not in the empty arena which may or may not occlude some of the static objects. The present invention determines the minimally changing characteristics of the arena and then attempts to match the topology of the frame to that of the arena. The occluding objects are determined and then not considered during the matching operation. Upon matching the current frame to the map of the background arena, the frame has been oriented with respect to the background arena. Many processing actions can occur with the orientation information. For example, implantation can occur. In this embodiment, the map includes indications of which objects are to be replaced with a desired image. Once the current frame has been oriented, the objects within it which are to be replaced can be determined, the objects which occlude the object to be replaced can also be determined and an implantation can, accordingly, occur.
There is therefore provided, in accordance with a preferred embodiment of the present invention, a method and system for orienting a frame viewing an arena in which action occurs. The orientation system includes a background topology graph, a graph creator and a graph correlator. The background topology graph graphs the relationships of the background objects of the arena with each other, wherein the relationships are those which change minimally from view to view of the arena. The graph creator creates a frame topology graph of relationships of segments of the frame. The graph correlator correlates the frame topology graph with the background topology graph to determine which objects of the arena, if any, are represented by segments of the frame.
Additionally, in accordance with a preferred embodiment of the present invention, the orientation system additionally includes a background topology graph creator which includes a standards topology graph, a second graph creator, a second graph correlator and a background topology graph creator. The standards topology graph graphs standard elements known to be present in arenas of a type similar to the arena. The second graph creator creates a frame topology graph of relationships of segments of each frame of an initial video sequence. The second graph correlator correlates the frame topology graph, for each frame of the initial video sequence, with the standards topology graph to determine which standard elements, if any, are represented by which segments of the frame and which objects of the frame should be added to the background topology graph. The background topology graph creator creates the background topology graph from the output of the graph correlator. Moreover, in accordance with a preferred embodiment of the present invention, the graph correlator includes an occlusion evaluator, a graph matcher and a perspective evaluator. The occlusion evaluator determines which segments of the frame might represent occluding objects. The graph matcher matches the frame topology graph to the background topology graph and includes apparatus for removing one or more of the segments representing possibly occluding objects from the frame topology graph to create a reduced frame topology graph and apparatus for matching the reduced frame topology graph to the background topology graph and for producing, on output, the objects of the background and the segments matched to the objects. The perspective evaluator determines a perspective transformation between the matched objects and the matched segments.
Further, in accordance with a preferred embodiment of the present invention, the relationships include at least one of: the adjacency relationships of neighboring segments, the textures of each segment, and the boundary equations of each segment.
There is also provided, in accordance with a second preferred embodiment of the present invention, a frame description unit for describing a frame viewing an arena. The frame description unit includes a texture segmenter which segments the frame into segments of uniform texture and an adjacency determiner which creates a graph listing which segments are neighbors of which segments.
Additionally, in accordance with the second preferred embodiment of the present invention, the frame description unit also includes a boundary analyzer which determines which pixels of each segment form its borders and which determines if the border pixels generally form one of a straight line and a quadratic curve and what their coefficients are.
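The boundary analyzer's decision — do the border pixels generally form a straight line or a quadratic curve, and with what coefficients — can be sketched by least-squares fitting. The tolerance value and the y-as-a-function-of-x parameterization (which ignores vertical borders) are simplifications of this sketch; the patent does not fix a particular fitting method:

```python
import numpy as np

def classify_border(pixels, tol=0.5):
    """Decide whether border pixels approximately form a straight line or a
    quadratic curve, returning the kind and its least-squares coefficients.
    Pixels matching neither fit within `tol` are reported 'non-simple'."""
    xs = np.array([p[0] for p in pixels], dtype=float)
    ys = np.array([p[1] for p in pixels], dtype=float)
    for kind, degree in (('line', 1), ('quadratic', 2)):
        coeffs = np.polyfit(xs, ys, degree)          # least-squares fit
        residual = np.abs(np.polyval(coeffs, xs) - ys).max()
        if residual <= tol:
            return kind, coeffs.tolist()
    return 'non-simple', None
```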
There is further provided, in accordance with a third preferred embodiment of the present invention, an implantation unit for implanting an image into a frame on a surface within an arena in which action occurs. The implantation unit includes an orientation unit, as described hereinabove, for orienting the frame within the arena and for indicating where in the frame the surface is and an implanter for implanting the image into the portion of the frame indicated by the orientation unit. Additionally, in accordance with the third preferred embodiment of the present invention, the orientation unit additionally includes an implantation location determiner for determining which of the matched segments corresponds to said surface to be implanted upon and the implanter includes a transformer, a permission mask creator and a mixer. The transformer transforms said image in accordance with said perspective transformation thereby creating a transformed image. The permission mask creator creates a permission mask from said matched segments corresponding to said surface to be implanted upon. The mixer mixes said frame with said transformed image in accordance with said permission mask. There is further provided, in accordance with a further preferred embodiment of the present invention, a method for implanting an image into a selected one at a time of a plurality of video frames representing a stream of action occurring within a background space, the space having fixed planar surfaces and being scanned by at least one video camera.
The method includes the steps of a) providing an initial model, independent of the plurality of video frames, of a selected one of the fixed surfaces, the initial model comprising a graph of the relationships of the background objects of the selected fixed surface with each other, wherein the relationships are those which change minimally from view to view of the background space, b) generating a background model of objects of the background space from initial frames of the video frames which view only the background space and the initial model, the background model comprising a graph of the relationships of the background objects of the background space with each other, c) utilizing the background model for identifying the objects viewed in each video frame and d) perspectively implanting the image into the portion of the frame viewing a previously selected one of the fixed planar surfaces.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the drawings in which:
Fig. 1 A is an isometric illustration of a soccer stadium useful in understanding the present invention;
Fig. 1B is a two-dimensional illustration of a section of the soccer stadium of Fig. 1A;
Fig. 2A is an isometric illustration of the soccer stadium of Fig. 1A with the objects therein labeled;
Fig. 2B is the same two-dimensional illustration of the section shown in Fig. 1B with the labels of Fig. 2A;
Figs. 3A and 3B are graph illustrations of the objects of Figs. 2A and 2B, respectively;
Fig. 4 is a block diagram illustration of an orientation and implantation system utilizing the graphs of Figs. 3A and 3B, constructed and operative in accordance with a preferred embodiment of the present invention;
Fig. 5 is a block diagram illustration of the elements of a mapper forming part of the system of Fig. 4, wherein the mapper creates the graph of Fig. 3A;
Fig. 6 is an illustration of pixels on the boundary of two segments, useful in understanding the operations of the mapper of Fig. 5;
Fig. 7 is a block diagram illustration of the elements of an orientation system, forming part of the system of Fig. 4, which matches the graph of Fig. 3B to that of Fig. 3A;
Fig. 8A is an illustration of an exemplary background scene having 11 segments, useful in understanding the operation of the system of the present invention;
Fig. 8B is a graph of the topology of the scene of Fig. 8A;
Fig. 9A is an illustration of a portion of the scene of Fig. 8A with action occurring therein;
Fig. 9B is a graph of the topology of the scene of Fig. 9A;
Fig. 10 is a flow chart illustration of the process of matching graphs, useful in understanding the operations of the orientation system of Fig. 7 and the mapper of Fig. 5; and
Fig. 11 is a block diagram illustration of the implantation unit forming part of the system of Fig. 4.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
Reference is now made to Figs. 1A, 1B, 2A, 2B, 3A and 3B which are useful in understanding the orientation concept of the present invention.
The present invention will be described in the context of a televised soccer game, it being understood that the present invention is effective for all action occurring within any relatively fixed arena, such as a sports event, an evening of entertainment, etc.
Fig. 1A, to which reference is now made, illustrates an exemplary professional soccer stadium 58. The stadium includes a field 60 on which are painted lines 62 and curves 63 which mark the various boundaries of interest in a soccer game. The lines and curves are typically in accordance with the official rules of soccer. Also on the field are two goals 64 and a series of signs 66 onto which various advertisers place their advertisements, indicated by the many different patterns on signs 66. An advertiser can utilize one or many signs; for example, Fig. 1A shows two signs with the interconnecting circles on them.
In addition, the stadium 58 includes bleachers 70, a fence 72 marking the borders of the field 60, flagpoles 74 for supporting flags 76 and a camera viewing stand 77. Fence 72 often includes many posts 78.
When a camera views the stadium 58, it rarely sees all of the arena of Fig. 1A. Instead, it sees a small portion thereof. Fig. 1B illustrates such a frame view of the arena, taken from camera stand 77 and viewing one of the areas near the left goal 64. In frame 80 of Fig. 1B, the curve labeled 63a of Fig. 1A and the lines labeled 62A, 62B, 62C and 62D are visible as are a few of the signs 66, labeled 66A and 66B. Other elements which are partially visible are the posts 78 of the fence.
As mentioned hereinabove, Applicants have realized that there are certain characteristics of the static objects in stadium 58 which are preserved in frame 80.
These characteristics are the "invariant" characteristics of the arena. In both the stadium 58 and in frame 80, curve 63A meets line 62A at exactly two points; however, from one Figure to the next, the shapes of curve 63A and of line 62A have changed. Nonetheless, curve 63A remains a curve and line 62A remains a straight line. In both the stadium 58 and in frame 80, sign 66A touches sign 66B.
In accordance with a preferred embodiment of the present invention, prior to the start of the game, the present invention maps the stadium 58 by mapping its static objects, their adjacencies and other characteristics of the objects which change minimally when viewing them at different angles. Each object is defined as a planar domain having a single texture and having a shape defined through a listing of the edges.
Examples of the labeling of the objects within the stadium 58 and the frame 80 are provided in Figs. 2A and 2B, respectively. For simplicity's sake, only the objects on the field of the stadium 58 are mapped. Each separate object on the field is labeled with a number from 1 to 53, where a field marking line or curve, being, in reality, a band and not a line, is considered as an object. The grass bordered by the marking lines is also considered to be an object. Each sign is shown as a single object, labeled 41 - 53, though the patterns on the signs can, alternatively, be divided into separate objects, one per portion of the pattern thereon. Fig. 2A shows the full set of objects, since Fig. 2A illustrates the entire stadium 58, while Fig. 2B, which illustrates only frame 80, has only a few objects.
The corresponding topological graphs are illustrated in Figs. 3A and 3B where an open circle indicates an object with the texture of a marking line, a dotted square indicates an object with the texture of grass, an open square indicates an object with a texture other than grass or marking lines, a thin line indicates that the object is bounded by a straight line and a thick line indicates that the object is bounded by a curve. Each object is labeled with its number from the corresponding Figure.
For example, consider object 4, the grass in front of the leftmost goal. It is bounded by marking lines 1, 3, 5 and 11, each of which is a straight band. Thus, object 4 is marked with a dotted square (grass texture), and connected with thin lines to each of the other objects, all of which are open circles. Consider object 40, the grass outside of the playing field. It borders each of the signs 41 - 53 with straight line borders and also borders the outer marking lines of the field, labeled 1, 20, 21 and 39. The map of Fig. 3A reflects this. Finally, consider object 16, the left field grass. It borders curved marking lines 9, 7, 15 and 17 (connected with thick lines) and straight marking lines 1, 2, 6, 13, 19, 20 and 21 (connected with thin lines). Note also that most of the signs have two neighboring signs. The topology graph for frame 80 is much smaller as frame 80 has far fewer objects within it. In frame 80 we view only a portion of object 16. Thus, the graph of Fig. 3B has only some of the connections for object 16, those to objects 1, 2, 9, 13 and 15. Similarly, object 40 (the grass outside of the field) is only connected to three of the signs, those labeled 43, 44 and 45. By matching the topology graph of Fig. 3B with the topology graph of Fig. 3A, the present invention can orient the view of frame 80 within the world of stadium 58. This orientation is performed by using the information in the topology graphs of Figs. 3A and 3B; it does not require pattern recognition as in the prior art.
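The invariant check at the heart of this matching — textures agree and adjacencies are preserved — can be sketched as a verification of a candidate assignment of frame segments to stadium objects. The dictionary representation of the topology graphs is an assumption of the sketch; finding the assignment itself is the matcher's search problem:

```python
def is_subtopology(frame_graph, frame_tex, db_graph, db_tex, mapping):
    """Check that a proposed assignment of frame segments to stadium
    objects respects the invariants: each pair of mapped nodes has equal
    textures, and every adjacency between mapped frame segments also
    exists between the corresponding stadium objects.
    Graphs are {node: set_of_neighbors}; mapping is {frame_node: db_node}."""
    for f, d in mapping.items():
        if frame_tex[f] != db_tex[d]:
            return False
        for g in frame_graph[f]:
            if g in mapping and mapping[g] not in db_graph[d]:
                return False
    return True
```

For the miniature example below, object 4 (grass) bounded by marking lines 1, 3, 5 and 11 is correctly matched by a frame showing the grass and two of its bounding lines, while a texture-swapped assignment is rejected.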
For the purpose of the implantation embodiment, the topology graphs can also include information regarding which signs are to have their advertisements replaced. If so, once the topology graph of Fig. 3B is matched to that of Fig. 3A, the graph of Fig. 3A can be reviewed to determine if any of the matched objects are to be replaced. For example, in Fig. 3A sign 44 is marked with an X, indicating that it is to be replaced. Since, after matching, it is determined that frame 80 includes sign 44, the advertisement on sign 44 can be readily replaced, as will be described in more detail hereinbelow. Alternatively, if Fig. 3A indicated that only sign 50 is to be replaced (which is not present in the graph of Fig. 3B), then, for frame 80, no signs would have their advertisements replaced.
Reference is now made to Fig. 4 which illustrates, in partial block diagram format, a system which implements the concepts outlined hereinabove for replacing advertisements seen in a video stream. The system comprises a video digitizer 100, such as the Targa2000 manufactured by Truevision Inc. of Indianapolis, Indiana, USA, an orientation unit 102, an implantation unit 104, and a host computer 106, all connected together via a bus 105, such as a peripheral component interconnect (PCI) bus. The host computer 106 typically also is associated with input devices 116, such as a keyboard and/or a mouse and/or a tablet, and a monitor 108.
The video digitizer 100 receives incoming video frames for television broadcasting to many countries. The video frames can be formatted in any one of many formats, such as the NTSC (National Television System Committee) or PAL (Phase Alternating Line) formats, and can be either analog or digital signals. The video digitizer 100 processes the video frames as necessary on input and on output. The output signals, which are those altered by the present system as will be described in more detail hereinbelow, are in the same format as the incoming video frames. The orientation unit 102 determines, per frame, what static objects appear in the frame and where they are, if and where there are occluding objects, such as players, which static objects are to have new advertising images implanted thereon and the perspective view of the frame with which the implantation can occur. The implantation unit 104 utilizes the information from the orientation unit 102 to transform an advertisement, input through the host computer 106, to the correct perspective and to implant it in the proper location within the present frame. The altered frame is then provided to the video digitizer 100 for output. For real-time operation, orientation unit 102 and implantation unit 104 are implemented on one or more parallel, high speed processing units, such as the HyperShark Board, manufactured by HyperSpeed Technology Inc. of San Diego, California, USA. For non-real-time operation, such as might occur when the sports game is not televised live, the orientation unit 102 and implantation unit 104 can be implemented on standard platforms, whether with a single processor or multiple processors, such as personal computers (PCs), or workstations running the Unix operating system or the Windows NT operating system of Microsoft Corporation of the USA.
The host computer 106 controls the operations of the units 100, 102 and 104 and, in addition, provides user commands, received from input devices 116, to the units 100 - 104.
The orientation unit 102 is divided into two processing units which share similar operations. The first unit is a mapper 110 which receives the initial video sequence in which, typically, the stadium 58 is scanned. Mapper 110 determines which objects in the stadium 58 must be part of the stadium in accordance with the official rules of the game and which objects are present in this particular stadium. The standard elements are available in a standards database 118 and the remaining objects are placed into a stadium database 114. Both the standards database 118 and the stadium database 114 store the invariant properties of the objects of the stadium in the form of a topology graph such as is described hereinabove.
The second unit is a frame orientation unit 112 which receives the video stream during the exemplary soccer game, determines the objects in a current frame, and utilizes the topology information in database 114 to determine which static objects of the stadium 58 occur in the current frame and what the projection transform is for the current frame. In addition, frame orientation unit 112 determines which foreground objects, such as players or the ball, are in the current frame and how and where they occlude the sign to be replaced. This information is provided as output to the implantation unit 104.
For every object of interest in the stadium, stadium database 114 lists the texture, neighbors, boundary pixels and boundary equations. Objects to be replaced (e.g. some or all of the signs) are so marked.
The elements of mapper 110 are illustrated in Fig. 5, to which reference is now made. Mapper 110 comprises a segmenter 120, a boundary analyzer 124, two correlators 128 and 130, a graph builder 132 and a database updater 134.
The segmenter 120 segments the current frame into its component segments, in accordance with any suitable segmentation operation, such as those described in the article "Application of the Gibbs Distribution to Image Segmentation", in the book Statistical Image Processing and Graphics, edited by Edward J. Wegman and Douglas J. DePriest, Marcel Dekker, Inc., New York and Basel, 1986, pp. 3 - 11. The book is incorporated herein by reference.
The segmenter 120 divides the frame into segments by determining which connected sections of neighboring pixels have approximately the same texture, where "texture" is an externally defined quality which describes a group of neighboring pixels. For example, texture can be color and thus, each object is one with a single color or with colors near to a central color. Texture can also be luminance level or a complex description of the color range of an object, where the color range can be listed as a color vector. For example, the color of grass is a combination of green, yellow and brown pixels. The average color of a group of pixels of grass will be relatively constant, as is the covariance of all components of the color vector.
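A segmenter of the kind described can be sketched as a flood fill that groups 4-connected pixels whose color components stay within a prescribed tolerance of the segment's seed color. This is an illustrative sketch, not the patent's exact (Gibbs-distribution) segmentation; the function name and the tolerance parameter are assumptions:

```python
from collections import deque

def segment_by_texture(image, tol):
    """Group 4-connected pixels whose color components lie within `tol`
    of the segment's seed color; returns a label array (-1 = unvisited)."""
    h, w = len(image), len(image[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue
            seed = image[sy][sx]          # central color of the new segment
            queue = deque([(sy, sx)])
            labels[sy][sx] = next_label
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and labels[ny][nx] == -1
                            and all(abs(a - b) <= tol
                                    for a, b in zip(image[ny][nx], seed))):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels
```

On a tiny frame with a green region beside a grey region, all green pixels receive one label and all grey pixels another, mirroring the "colors near to a central color" criterion.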
Since the present invention operates on frames of outdoor and/or indoor activity, the texture definition must be robust with respect to lighting conditions; otherwise, as the sun changes brightness, the apparent textures of the objects in the arena will change. The average color and the covariance of the colors within a group of pixels provide this robustness, as does the consideration that textures are equal if they differ within a prescribed tolerance.
It is also noted that, if the signs 44 have patterns of more than one texture, the signs will be separated into separate segments which, as described hereinbelow, are later marked as belonging to the same object. The segmenter 120 provides, on output, the texture of the segments and which pixels belong to each segment.
It is noted that, prior to output, the segmenter 120 searches for "parasite" segments, which do not correspond to real objects but result from noisy pixels of the frame. Criteria for identifying such segments are the size of the segment and an extraordinary texture (i.e. one not previously seen, one previously defined as extraordinary or a texture out of place, such as a few pixels of one texture within a segment of another, completely different texture). When the segmenter 120 determines the presence of such noise, it smooths it out of the original frame and reperforms the segmentation.
The boundary analyzer 124 reviews the segment data and produces mathematical equations describing the imaginary curve which approximates the boundaries between neighboring segments. To do so, boundary analyzer 124, for each segment, identifies the bordering pixels, namely those pixels belonging to the texture of the segment but having neighboring pixels which belong to different textures. This is illustrated in Fig. 6, to which reference is now briefly made. Pixels of two segments 136 and 138 are illustrated where each segment has a different texture. The bordering pixels of segments 136 and 138 are labeled 135 and 137, respectively.
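The identification of bordering pixels illustrated in Fig. 6 follows directly from the segment labels: a pixel borders another segment if any of its 4-connected neighbours carries a different label. The helper below is an illustrative sketch (not the contour-following technique cited next); its name is an assumption:

```python
def border_pixels(labels, label):
    """Return the pixels of segment `label` that have at least one
    4-connected neighbour belonging to a different segment."""
    h, w = len(labels), len(labels[0])
    border = set()
    for y in range(h):
        for x in range(w):
            if labels[y][x] != label:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != label:
                    border.add((y, x))
                    break
    return border
```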
Boundary analyzer 124 typically utilizes contour following techniques to determine the locations of the bordering pixels. One such technique is described on pages 290 - 293 of the book Pattern Classification and Scene Analysis, by Richard O. Duda and Peter E. Hart, John Wiley and Sons, New York, 1973. The book is incorporated herein by reference.
By using regression techniques, the boundary analyzer 124 attempts to fit straight lines or quadratic curves to varying-length sections of the bordering pixels, where the section length is a function of the quality of the fit of the bordering pixels to the straight or quadratic curves. Boundaries which match straight lines or quadratic curves are so marked.
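A regression of this kind can be sketched with ordinary least squares on the monomial basis 1, x, x&sup2;, with the RMS residual serving as the fit-quality measure that governs the section length (a straight line is the special case c = 0). This is an illustrative sketch under those assumptions, not the patent's exact procedure:

```python
def fit_quadratic(points):
    """Least-squares fit of y = a + b*x + c*x**2 to (x, y) points;
    returns the coefficients (a, b, c) and the RMS residual."""
    n = len(points)
    # Normal equations for the basis 1, x, x^2: A[i][j] = sum x^(i+j).
    sx = [sum(x ** k for x, _ in points) for k in range(5)]
    sy = [sum(y * x ** k for x, y in points) for k in range(3)]
    A = [[sx[i + j] for j in range(3)] for i in range(3)]
    b = sy[:]
    # Gaussian elimination with partial pivoting on the 3x3 system.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, 3))) / A[r][r]
    rms = (sum((coef[0] + coef[1] * x + coef[2] * x * x - y) ** 2
               for x, y in points) / n) ** 0.5
    return tuple(coef), rms
```

A small RMS residual marks the section as well described by a straight line or quadratic curve; a large one triggers splitting the section or flagging the boundary as non-simple.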
If there are a large number (for example, more than 10) of varying-length sections and/or many of them are quite short (describing the shape of just a few pixels), the segment either is poorly defined or has a non-simple boundary. For non-simple boundaries, boundary analyzer 124 indicates to the segmenter 120 to repeat the segmentation in an attempt to smooth the boundary.
Boundary analyzer 124 produces the coefficients describing the boundaries on output.
Graph builder 132 creates the topology graph for the current frame, such as is shown in Fig. 3B, from the segments and the boundary equations. Since the boundary analyzer 124 also provides the adjacency information for each segment, graph builder 132 can create the topology graph and add to it the information regarding the texture and boundary specification for each segment. The graph typically is not drawn as shown in Figs. 3A and 3B but appropriately represented with each segment as a node and, for each segment, its neighbors, texture type and boundary equations are represented by connected nodes, node labels and edge labels. It is noted that segmenter 120, boundary analyzer 124 and graph builder 132 form a graph creator 140 which produces a topology graph for a frame.
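The representation just described — each segment a node carrying its texture, each adjacency an edge carrying the boundary specification — can be sketched as a small adjacency structure. The class and field names below are illustrative, not taken from the patent:

```python
class TopologyGraph:
    """Topology graph: nodes are segments (with a texture label),
    edges are adjacencies (with a boundary-type label)."""

    def __init__(self):
        self.texture = {}     # segment id -> texture label
        self.neighbors = {}   # segment id -> {neighbor id: boundary type}

    def add_segment(self, seg_id, texture):
        self.texture[seg_id] = texture
        self.neighbors.setdefault(seg_id, {})

    def add_border(self, a, b, boundary_type):
        # Adjacency is undirected: record the boundary on both nodes.
        self.neighbors[a][b] = boundary_type
        self.neighbors[b][a] = boundary_type
```

For instance, object 4 of Fig. 3A would be a node of texture "grass" joined to nodes 1, 3, 5 and 11 by edges labeled "straight".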
Correlator 128 is a previous frame correlator which matches the current frame with the previous frame or frames to determine which segments are common to the two frames, thereby determining which segments are new segments in the current frame. For example, the present frame may show many field markings, only one of which has not yet been seen but three of which were seen in a frame which is three frames previous to the present one.
Correlator 128 operates by comparing the topology graph of the current frame with those of the previous frames. The comparison involves first matching the topologies of textures between the current frame and each of the previous frames. For the matched portions of the graph, the correlator 128 then matches the topologies of the boundary types. If the texture topology and the boundary type topology both match, then the two portions of the graphs match. It will be appreciated that changes in perspective do not affect the graph matching operation since the graph lists the relatively perspective-invariant elements of the arena.
The graph matching operation can be performed in accordance with any suitable graph matching techniques. A pseudo-isomorphism operation can be performed as described hereinbelow with respect to Figs. 7, 8A, 8B, 9A and 9B.
From the matched portions, the correlator 128 then determines which of the segments of the current frame have not been matched to any of the graphs of the previous frames. These segments indicate new segments which may have to be added to the stadium database 114 and are so marked on the graph for the current frame.
Correlator 128 provides the marked graph for the current frame to the correlator 130 which determines which, if any, of the segments form part of the standard objects of the playing field. To do so, correlator 130 matches the topology graph of the standard elements of the playing field, as stored in standards database 118, with the graph of the current frame. The correlation operation is similar to that described hereinabove whereby first the texture topology is matched and then the boundary topology is matched, for the matched texture topology sections.
Those segments of the current frame which match the standard elements in the standards database 118 are so marked. It will be appreciated that the standards database 118 provides the present invention with a set of already known objects in the field from which to begin to map the stadium 58.
The database updater 134 receives the marked graph, marked with segments found in previous frames and segments conforming to standard elements of the playing field, and determines which segments are new. Updater 134 then determines which of the new segments are likely to be "interesting" objects, such as by not including segments having non-simple boundaries, and which segments, of the segments conforming to the standard elements of the playing field, have not already been added into the stadium database 114. Updater 134 then includes only the selected segments as objects in the stadium database 114. The process involves providing the selected segments corresponding to non-standard objects with object numbers, defining the adjacency relationships of the new segments with the segments already in the stadium database 114 and listing the boundary equations and textures of the new segments. The updater 134 also marks the new segments for replacement if the user indicates as such. In addition, there might be objects, such as signs, which are a combination of a plurality of segments. If the user indicates that these segments all belong to one object, then the database updater 134 indicates as such to the stadium database 114.
If some of the segments of the current frame form part of the standard elements of the playing field (for which exact relationships, sizes and lengths of the various elements are known), updater 134 creates a perspective transformation from the perspective of the standard elements (which are typically provided as a top view) to that of the corresponding segments in the current frame. The perspective transformation is created utilizing the boundary equations of the standard objects in the standards database 118 and the boundary equations of the corresponding segments in the current frame. The transformation can be produced in any suitable way. Pages 386 - 441 of the book Pattern Classification and Scene Analysis, referenced hereinabove, describes how to determine perspective transformations.
If any of the segments of the current frame are not of standard objects (for example, they might be signs), the updater 134 then transforms these other segments with the perspective transformation thus determined, thereby to provide these other segments with the same perspective as that of the standard objects of the field. Since the perspective transformation relates the actual field to the field as defined by the official bodies, updater 134 can optionally determine, from the transformation information, the sizes of the standard objects and, of course, of the non-standard objects.
It will be appreciated that the entire operation is repeated per frame, for the entire initial video sequence. By the end of the initial video sequence, the arena has been fully mapped.
Reference is now made to Fig. 7 which generally illustrates the elements of the orientation unit 112. The orientation unit 112 operates on frames of the video sequence of the game and comprises a graph creator 140, an occlusion evaluator 142, a graph matcher 144, a perspective evaluator 146 and an implantation identifier 148. The operations of occlusion evaluator 142, graph matcher 144 and perspective evaluator 146 iterate until a match of a desired quality is reached. As described hereinabove with respect to Fig. 5, the graph creator 140 reviews the current, game frame and creates a topology graph for it. As described hereinabove, the output is a topology graph indicating the segments in the current, game frame, their boundary equations, neighbors and textures.
It will be appreciated that, in game frames, there are additional objects (segments) in the frame due to the activity of the game. Thus, there might be players, umpires, balls, spectators, etc. which occlude, partially or completely, the static objects of the background whose information is stored in the stadium database 114. This is illustrated in Figs. 8A, 8B, 9A and 9B to which reference is now briefly made. Fig. 8A illustrates a simple arena having 11 objects therein, labeled 1 - 11 , and Fig. 8B provides their topology graph where the circles, x's, squares and triangles indicate different textures. Fig. 9A illustrates a frame view of the arena of Fig. 8A having five objects therein, labeled A - E. Fig. 9B is the corresponding graph to Fig. 9A. It is noted that objects A - D match objects 2 - 5 and object E is an occluding object which occludes objects 2 and 4.
Reference is now made back to Fig. 7. Graph creator 140 provides the occlusion evaluator 142 with the segments of the game frame, their textures and boundary types. Occlusion evaluator 142 reviews each segment and determines which of them fulfill one or more of the following occluding object criteria: a) some of the boundaries of the object are non-simple (i.e. not straight lines or quadratic curves), as shown in Fig. 9A for object E; and b) its texture is one not seen in previous frames or one defined in previous frames as being of an occluding object. The occlusion evaluator 142 provides the list of segments which are possible occluding segments to the graph matcher 144. Graph matcher 144 also receives the graph for the current game frame from the graph creator 140. Graph matcher 144 attempts to match the small graph of the current, game frame to the topology of the stadium database. To do so, it operates in a number of ways, depending on the state of the video sequence.
Graph matcher 144 attempts to match the current game frame to the topology of the stadium database 114. For this matching, graph matcher 144 operates similarly to the previous frame correlator 128 of Fig. 5.
Initially, graph matcher 144 just matches the current graph to that of the stadium database 114 and produces a match quality measurement. Subsequently, since the graph of the current game frame includes occluding objects, graph matcher 144 removes the suspected occluding segments one at a time and, if desired, a group at a time, and produces a match quality measurement for the graph with the removed segment. It will be appreciated that graph matcher 144 performs a matching operation similar to that of correlator 128. Specifically, the matcher 144 first attempts to match the topologies of textures between the current frame and the stadium database 114. The number of segments matched out of the total number of segments in the graph indicates the quality of the match. For the matched portions of the graph, if any, the graph matcher 144 then matches the topologies of the boundary types. If the texture topology and the boundary type topology both match, then the two portions of the graphs match.
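The removal-and-rescore loop can be sketched as follows, taking the match quality to be the fraction of segments matched, as stated above. Here `match_fn` stands in for the full topology-matching operation and is an assumed callback, not an element of the patent:

```python
def best_match(frame_nodes, suspects, match_fn):
    """Score the full node set, then each suspected occluder removed,
    and keep whichever trial yields the highest match quality.
    `match_fn(nodes)` returns the subset of `nodes` matched to the
    stadium database; quality = matched / total."""
    def quality(nodes):
        return len(match_fn(nodes)) / len(nodes) if nodes else 0.0

    best_nodes, best_q = frame_nodes, quality(frame_nodes)
    for s in suspects:
        trial = frame_nodes - {s}       # drop one suspected occluder
        q = quality(trial)
        if q > best_q:
            best_nodes, best_q = trial, q
    return best_nodes, best_q
```

Removing a true occluder raises the matched fraction, so the loop converges toward the background-only portion of the frame graph.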
Once the graph, with none or only some of the occlusions included, has been matched, the graph matcher 144 identifies which segments of the current frame are part of the background and which segments occlude the objects of the background.
The graph matcher 144 provides this information, as well as the boundary equations of the segments as output.
The perspective evaluator 146 operates similarly to part of updater 134 and determines, from the boundary equations of the segments corresponding to objects of the background (received from graph creator 140), the perspective transformation for the current game frame. The perspective evaluator 146 can utilize all of the boundary equations or only some of them, for example, the boundary equations corresponding to objects forming part of the standard elements of the field. The transformation produces a transformation matrix M whose parameters are provided as output of the perspective evaluator 146. The transformation matrix M will be utilized to transform the image to be implanted from the perspective of the standard field elements (i.e. top view) to the perspective of the current, game frame.
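Applying the resulting 3x3 matrix M to a point follows the usual homogeneous-coordinate convention: multiply, then divide through by the third component. A minimal sketch (the function name is an illustrative assumption):

```python
def apply_homography(M, x, y):
    """Map the point (x, y) through the 3x3 perspective matrix M
    in homogeneous coordinates, dividing out the w term."""
    xh = M[0][0] * x + M[0][1] * y + M[0][2]
    yh = M[1][0] * x + M[1][1] * y + M[1][2]
    wh = M[2][0] * x + M[2][1] * y + M[2][2]
    return xh / wh, yh / wh
```

Each pixel of the top-view image to be implanted is mapped this way into the current, game frame's perspective.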
The perspective evaluator 146 typically tests the perspective transform on the objects of the arena which have been identified. The result should be a frame which closely matches the game frame. However, since the perspective transformation describes the transformation of the background elements of the frame (which occurs only due to the movement of the camera viewing them), it will not successfully describe the movements of a human being who moves of his own accord. Thus, any segment not well matched via the transformation is considered a possibly occluding object and this information is provided to the occlusion evaluator 142 for the next iteration.
As mentioned hereinabove, the occlusion evaluator 142, graph matcher 144 and perspective evaluator 146 iterate until the graph of the current, game frame, less the occluding elements, perfectly matches a section of the graph of the stadium database 114. The perfect match indicates that the current, game frame has been oriented with respect to the stadium 58. A lack of a perfect match indicates that the camera is showing something which is not part of the stadium 58, such as a video of an advertisement. For the implantation embodiment, the orientation information is provided to the implantation identifier 148 which reviews the matched objects, provided by graph matcher 144, and determines if any of them are marked for implantation in the stadium database 114. For those that are, implantation identifier 148 provides, on output, the segments which are to be implanted and their boundary equations. It will be appreciated that signs with patterns on them are formed of many connected segments, all of which are marked for implantation and all of which are marked as being part of the same sign. The implantation identifier 148 determines the outer boundary of the collection of segments forming the sign and provides the boundary equations of the outer boundary of the sign on output. Fig. 7 indicates that the output of the orientation unit 112 is the transformation matrix for transforming the image to be implanted into the current, game frame, and the areas to be implanted.
Reference is now made to Fig. 10 which illustrates the operations of graph matcher 144 and back to Figs. 8A and 8B which are useful for understanding Fig. 10.
In step 150, the current game frame (or other current frame) is reviewed to enumerate its interior cycles and isthmus edges. A cycle is interior if it is the union of segments in the frame corresponding to nodes which are a 1-connected set.
Thus, for the graph of Fig. 8B, the interior cycles are (2,3,4), (2,4,5), (3,4,5), (3,5,6), (3,6,7) and (6,8,9). Isthmus edges are edges (i.e. connections between nodes) whose removal increases the number of connected components of the graph. An isthmus edge provides an isthmus between two parts of the graph.
The complete list of interior cycles and isthmus edges determines the graph uniquely. It is noted that, when creating database 114, the interior cycles of the database are also determined and listed in accordance with a step similar to step 150. Step 150 involves determining all cycles of length 3 and selecting those which are interior.
Steps 152 - 159 form the method of searching through the graph to find matching graph sections and are performed per interior cycle and per isthmus edges of the graph.
The current interior cycle is compared (step 152) to all interior cycles of the database. If, for one interior cycle of the database, the textures at the nodes of the two cycles match (step 154), adjacent cycles to the current interior cycle are compared (step 156) to the adjacent cycles of the interior cycle of the database. If the adjacent cycles of the current interior cycle match the adjacent cycles of the interior cycle of the database (step 158), then the set of adjacent cycles of the current frame are marked (step 159) as having matched the database. In any case, the next interior cycle of the current frame is now considered. The process is repeated until all interior cycles have been reviewed. Analogous operations are performed for all of the isthmuses of the current frame and the database. The matching of adjacent nodes to both nodes of the considered isthmus is checked.
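Assuming, as step 150 states, that interior cycles are selected from among cycles of length 3, and that isthmus edges are what graph theory calls bridges (edges whose removal disconnects the graph), both enumerations can be sketched as follows; the interiority test itself is omitted:

```python
def triangles(adj):
    """All 3-cycles in an undirected graph {node: set of neighbours}."""
    tris = set()
    for a in adj:
        for b in adj[a]:
            for c in adj[a] & adj[b]:   # c adjacent to both a and b
                tris.add(frozenset((a, b, c)))
    return tris

def isthmus_edges(adj):
    """Bridges via the classic DFS low-link method: edge (u, v) is a
    bridge when the subtree under v cannot reach u or an ancestor."""
    index, low, bridges = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        index[u] = low[u] = counter[0]
        counter[0] += 1
        for v in adj[u]:
            if v == parent:
                continue
            if v not in index:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > index[u]:
                    bridges.add(frozenset((u, v)))
            else:
                low[u] = min(low[u], index[v])

    for u in adj:
        if u not in index:
            dfs(u, None)
    return bridges
```

For the graph of Fig. 8B, `triangles` would enumerate exactly the candidate cycles (2,3,4), (2,4,5) and so on, from which the interior ones are selected.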
Reference is now made to Fig. 11 which illustrates the elements of the implantation unit 104 and to Fig. 9A which is useful in understanding parts of its operation. Unit 104 comprises a segment filler 160, a transformer 164 and a mixer 166.
Segment filler 160 receives the information of the implantation areas from the orientation unit 112 and determines the pixels of the current, game frame which are included therein. It is noted that the implantation areas include in them information of where the occluding areas are. This is illustrated in Fig. 9A. If segment A is an object to be replaced, its shape is not a triangle but a triangle less most of occluding object E. Thus, the pixels of segment A do not include any pixels of occluding object E.
From the filled segments, segment filler 160 produces a permission mask which, for the current, game frame, masks out all but the areas of the frame in which the implantation will occur. This involves placing a '1' value in all pixels of the filled implantation areas and a '0' value at all other pixels. The image will be implanted onto the pixels of value '1'.
The transformer 164 utilizes the transformation matrix M to distort the advertising image into the plane of the video frame. Optionally, a blending mask can be provided for the advertising image. If so, transformer 164 transforms the blending mask also.
The mixer 166 combines the distorted advertising image with the video frame in accordance with the blending and permission masks. The formula which is implemented for each pixel (x,y) is typically:
Output(x,y) = P(x,y)*image(x,y) + (1 - P(x,y))*video(x,y) (1)
where Output(x,y) is the value of the pixel of the output frame, image(x,y) and video(x,y) are the values in the transformed, advertising image and the current, game frame, respectively, and P(x,y) is the value of the permission mask multiplied by the blending mask. The output, Output(x,y), is a video signal into which the advertising image has been implanted onto a desired surface.
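Equation (1) can be sketched per pixel as below, with the permission and blending masks combined into P(x,y). This is a minimal grayscale illustration; the function names are assumptions:

```python
def mix_pixel(image_val, video_val, permission, blend=1.0):
    """Equation (1): Output = P*image + (1 - P)*video, where P is the
    permission mask value times the blending mask value."""
    p = permission * blend
    return p * image_val + (1.0 - p) * video_val

def mix_frame(image, video, permission, blend):
    """Apply the per-pixel mix over whole frames given as 2D lists."""
    h, w = len(video), len(video[0])
    return [[mix_pixel(image[y][x], video[y][x], permission[y][x], blend[y][x])
             for x in range(w)] for y in range(h)]
```

Where the permission mask is 0, the game frame passes through untouched; where it is 1, the blending mask controls how softly the advertisement edges merge into the frame.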
It will be appreciated by persons skilled in the art that the present invention is an orientation system for orienting a frame of activity data within a mapped background scene. The orientation system operates through creation of a topology graph of the relatively invariant elements of the background scene. This orientation system can be utilized in many systems, one embodiment of which, shown herein, is an implantation system. Other systems which can utilize the present orientation system are systems for highlighting, changing the color of or deleting background objects.
It will further be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described hereinabove. Rather the scope of the present invention is defined by the claims which follow:

Claims

1. An orientation unit for orienting a frame viewing an arena in which action occurs, the orientation unit comprising: a. a background topology graph of relationships of the background objects of said arena with each other, wherein said relationships are those which change minimally from view to view of said arena; b. a graph creator which creates a frame topology graph of relationships of segments of said frame; and c. a graph correlator which correlates said frame topology graph with said background topology graph to determine which objects of said arena, if any, are represented by segments of said frame.
2. An orientation unit according to claim 1 and also comprising a background topology graph creator which includes: a. a standards topology graph of standard elements known to be present in arenas of a type similar to said arena; b. means for receiving an initial video sequence of frames viewing said arena; c. a second graph creator which creates a frame topology graph of relationships of segments of each frame of said initial video sequence; d. a second graph correlator which correlates said frame topology graph, for each frame of said initial video sequence, with said standards topology graph to determine which standard elements, if any, are represented by which segments of said frame; and e. a background topology graph creator for determining, at least from output of said graph correlator, which of said segments of said frame are to be added to said background topology graph.
3. An orientation unit according to any of claims 1 - 2 and wherein said graph correlator includes: a. an occlusion evaluator for determining which segments of said frame might represent occluding objects; b. a graph matcher for matching said frame topology graph to said background topology graph including: i. means for removing one or more of said segments representing possibly occluding objects from said frame topology graph to create a reduced frame topology graph; and ii. means for matching said reduced frame topology graph to said background topology graph and for producing, on output, the objects of said background and the segments matched to said objects; and c. a perspective evaluator for determining a perspective transformation between said matched objects and said matched segments.
4. An orientation unit according to any of the previous claims and wherein said relationships include at least one of: the adjacency relationships of neighboring segments, the textures of each segment, and the boundary equations of each segment.
5. A frame description unit for describing a frame viewing an arena, the unit comprising: a. a texture segmenter which segments said frame into segments of uniform texture; and b. an adjacency determiner which creates a graph listing which segments are neighbors of which segments.
6. A frame description unit according to claim 5 and also including a boundary analyzer which determines which pixels of each segment form its borders, which determines if the border pixels generally form one of a straight line and a quadratic curve and which determines the coefficients of the equations of said straight line or quadratic curve which said border pixels describe.
7. An implantation unit for implanting an image into a frame on a surface within an arena in which action occurs, the implantation unit comprising: a. an orientation unit for orienting said frame within said arena and for indicating where in said frame said surface is; and b. an implanter for implanting said image into the portion of said frame indicated by said orientation unit, wherein said orientation unit includes: i. a background topology graph of relationships of the background objects of said arena with each other, wherein said relationships are those which change minimally from view to view of said arena; ii. a graph creator which creates a frame topology graph of relationships of objects of said frame; and iii. a graph correlator which correlates said frame topology graph with said background topology graph to determine which objects of said arena, if any, are viewed by said frame and when said surface is present in said frame.
8. An implantation unit according to claim 7 wherein said orientation unit also includes a background topology graph creator which includes: a. a standards topology graph of standard elements known to be present in arenas of a type similar to said arena; b. means for receiving an initial video sequence of frames viewing said arena; c. a second graph creator which creates a frame topology graph of relationships of segments of each frame of said initial video sequence; d. a second graph correlator which correlates said frame topology graph, for each frame of said initial video sequence, with said standards topology graph to determine which standard elements, if any, are represented by which segments of said frame; and e. a background topology graph creator for determining, at least from output of said graph correlator, which of said segments of said frame are to be added to said background topology graph.
9. An implantation unit according to any of claims 7 - 8 and wherein said graph correlator includes: a. an occlusion evaluator for determining which segments of said frame might represent occluding objects; b. a graph matcher for matching said frame topology graph to said background topology graph including: i. means for removing one or more of said segments representing possibly occluding objects from said frame topology graph to create a reduced frame topology graph; and ii. means for matching said reduced frame topology graph to said background topology graph and for producing, on output, the objects of said background and the segments matched to said objects; c. a perspective evaluator for determining a perspective transformation between said matched objects and said matched segments; and d. an implantation location determiner for determining which of the matched segments corresponds to said surface to be implanted upon.
10. An implantation unit according to any of claims 7 - 9 and wherein said relationships include at least one of: the adjacency relationships of neighboring segments, the textures of each segment, and the boundary equations of each segment.
11. An implantation unit according to any of claims 9 - 10 and wherein said implanter includes: a. a transformer for transforming said image in accordance with said perspective transformation thereby creating a transformed image; b. a permission mask creator for creating a permission mask from said matched segments corresponding to said surface to be implanted upon; and c. a mixer for mixing said frame with said transformed image in accordance with said permission mask.
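Purely as an illustration of the kind of processing recited in claims 9 and 11 (not part of the claimed subject matter; the function and parameter names are hypothetical), the perspective transformation and permission-mask mixing can be sketched as follows. The homography is assumed to be given as a 3x3 matrix; the mask is binary, with 1 marking pixels of the surface to be implanted upon.

```python
def apply_homography(h, x, y):
    """Map pixel (x, y) through a 3x3 perspective (homography) matrix h."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

def mix(frame, image, mask):
    """Blend per the permission mask: where mask is 1 the transformed
    image replaces the frame pixel, elsewhere the frame is kept."""
    return [[image[r][c] if mask[r][c] else frame[r][c]
             for c in range(len(frame[0]))]
            for r in range(len(frame))]
```

In an actual system the transformed image would first be produced by resampling the implant through the inverse homography; the sketch above only shows the coordinate mapping and the masked mix.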
12. A method for implanting an image into a selected one at a time of a plurality of video frames representing a stream of action occurring within a background space, the space having fixed planar surfaces and being scanned by at least one video camera, the method comprising the steps of: a. providing an initial model, independent of said plurality of video frames, of a selected one of said fixed surfaces, said initial model comprising a graph of the relationships of the background objects of said selected fixed surface with each other, wherein said relationships are those which change minimally from view to view of said background space; b. generating a background model of objects of said background space from initial frames of said video frames which view only said background space and said initial model, said background model comprising a graph of the relationships of the background objects of said background space with each other; c. utilizing said background model for identifying the objects viewed in each video frame; and d. perspectively implanting said image into the portion of said frame viewing a previously selected one of said fixed planar surfaces.
13. A method for orienting a frame viewing an arena in which action occurs, the method comprising the steps of: a. previously generating a background topology graph of relationships of the background objects of said arena with each other, wherein said relationships are those which change minimally from view to view of said arena; b. generating a frame topology graph of relationships of segments of said frame; and c. correlating said frame topology graph with said background topology graph to determine which objects of said arena, if any, are represented by segments of said frame.
14. A method according to claim 13 and wherein said step of previously generating a background topology graph includes the steps of: a. having a standards topology graph of standard elements known to be present in arenas of a type similar to said arena; b. receiving an initial video sequence of frames viewing said arena; c. generating a frame topology graph of relationships of segments of each frame of said initial video sequence; d. correlating said frame topology graph, for each frame of said initial video sequence, with said standards topology graph to determine which standard elements, if any, are represented by which segments of said frame; and e. determining, at least from output of said step of correlating, which of said segments of said frame are to be added to said background topology graph.
15. A method according to any of claims 13 - 14 and wherein said step of correlating includes the steps of: a. determining which segments of said frame might represent occluding objects; b. matching said frame topology graph to said background topology graph including the steps of: i. removing one or more of said segments representing possibly occluding objects from said frame topology graph to create a reduced frame topology graph; and ii. matching said reduced frame topology graph to said background topology graph and generating, thereby, the objects of said background and the segments matched to said objects; and c. determining a perspective transformation between said matched objects and said matched segments.
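As an illustration of the reduced-graph matching recited in claims 9 and 15 (a brute-force sketch for tiny graphs, with hypothetical names; the patent does not disclose this particular algorithm, and a practical matcher would prune candidates using the textures and boundary equations of claim 16), topology graphs can be held as adjacency maps:

```python
from itertools import permutations

def remove_segments(graph, suspects):
    """Drop possibly occluding segments, yielding the reduced frame graph."""
    return {n: {m for m in nbrs if m not in suspects}
            for n, nbrs in graph.items() if n not in suspects}

def match(frame_graph, background_graph):
    """Find a mapping frame segment -> background object that preserves
    adjacency both ways (induced-subgraph isomorphism), or None."""
    f_nodes = list(frame_graph)
    for b_nodes in permutations(background_graph, len(f_nodes)):
        m = dict(zip(f_nodes, b_nodes))
        if all((m[b] in background_graph[m[a]]) == (b in frame_graph[a])
               for a in f_nodes for b in f_nodes if a != b):
            return m
    return None
```

For example, a frame with a player segment touching everything is matched by first removing the player and then mapping the three remaining segments onto the background's wall-floor-advertisement chain.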
16. A method according to any of claims 13 - 15 and wherein said relationships include at least one of: the adjacency relationships of neighboring segments, the textures of each segment, and the boundary equations of each segment.
17. A method for describing a frame viewing an arena, the method comprising the steps of: a. segmenting said frame into segments of uniform texture; and b. generating a graph listing which segments are neighbors of which segments.
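The neighbor graph of claim 17 can be derived from a segment label map by scanning 4-adjacent pixel pairs. The sketch below is illustrative only (names hypothetical); the segmentation into uniform-texture segments that produces the label map is assumed to have been done already.

```python
def neighbor_graph(labels):
    """Build an adjacency graph over segment labels: two segments are
    neighbors if any of their pixels touch horizontally or vertically."""
    graph = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors
                if r + dr < rows and c + dc < cols:
                    b = labels[r + dr][c + dc]
                    if b != a:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph
```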
18. A method according to claim 17 and also including the steps of: a. determining which pixels of each segment form its borders; b. determining if the border pixels generally form one of a straight line and a quadratic curve; and c. determining the coefficients of the equations of said straight line or quadratic curve which said border pixels describe.
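Steps b and c of claim 18 amount to least-squares curve fitting. The following sketch (pure Python, hypothetical names and tolerance; one of many possible realizations) tries a straight line first and falls back to a quadratic when the line's residuals exceed a pixel tolerance:

```python
def polyfit(points, degree):
    """Least-squares fit of y = c[0] + c[1]*x + ... + c[degree]*x**degree,
    via the normal equations and Gaussian elimination with pivoting."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x, _ in points) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in points) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        coeffs[r] = (b[r] - sum(a[r][c] * coeffs[c] for c in range(r + 1, n))) / a[r][r]
    return coeffs

def classify_border(points, tol=0.5):
    """Return ("line", coeffs) if border pixels fit a line within tol,
    else ("quadratic", coeffs) for the quadratic fit."""
    for degree, name in ((1, "line"), (2, "quadratic")):
        c = polyfit(points, degree)
        if all(abs(sum(ck * x ** k for k, ck in enumerate(c)) - y) <= tol
               for x, y in points):
            return name, c
    return "quadratic", c
```

Note this sketch assumes the border can be expressed as y as a function of x; near-vertical borders would be fitted with the roles of x and y exchanged.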
19. A method for implanting an image into a frame on a surface within an arena in which action occurs, the method comprising the steps of: a. orienting said frame within said arena and indicating where in said frame said surface is; and b. implanting said image into the portion of said frame indicated by said step of orienting, wherein said step of orienting includes the steps of: i. previously generating a background topology graph of relationships of the background objects of said arena with each other, wherein said relationships are those which change minimally from view to view of said arena; ii. generating a frame topology graph of relationships of objects of said frame; and iii. correlating said frame topology graph with said background topology graph to determine which objects of said arena, if any, are viewed by said frame and when said surface is present in said frame.
20. A method according to claim 19 wherein said step of previously generating includes the steps of: a. having a standards topology graph of standard elements known to be present in arenas of a type similar to said arena; b. receiving an initial video sequence of frames viewing said arena; c. generating a frame topology graph of relationships of segments of each frame of said initial video sequence; d. correlating said frame topology graph, for each frame of said initial video sequence, with said standards topology graph to determine which standard elements, if any, are represented by which segments of said frame; and e. determining, at least from output of said step of correlating, which of said segments of said frame are to be added to said background topology graph.
21. A method according to any of claims 19 - 20 and wherein said step of correlating includes the steps of: a. determining which segments of said frame might represent occluding objects; b. matching said frame topology graph to said background topology graph including the steps of: i. removing one or more of said segments representing possibly occluding objects from said frame topology graph to create a reduced frame topology graph; and ii. matching said reduced frame topology graph to said background topology graph and generating, thereby, the objects of said background and the segments matched to said objects; c. determining a perspective transformation between said matched objects and said matched segments; and d. determining which of the matched segments corresponds to said surface to be implanted upon.
22. A method according to any of claims 19 - 21 and wherein said relationships include at least one of: the adjacency relationships of neighboring segments, the textures of each segment, and the boundary equations of each segment.
23. A method according to any of claims 21 - 22 and wherein said step of implanting includes the steps of: a. transforming said image in accordance with said perspective transformation thereby creating a transformed image; b. generating a permission mask from said matched segments corresponding to said surface to be implanted upon; and c. mixing said frame with said transformed image in accordance with said permission mask.
24. Apparatus according to any of claims 1 - 11 substantially as shown and described hereinabove.
25. Apparatus according to any of claims 1 - 11 substantially as illustrated in any of the drawings.
26. A method according to any of claims 12 - 23 substantially as shown and described hereinabove.
27. A method according to any of claims 12 - 23 substantially as illustrated in any of the drawings.
PCT/IL1996/000110 1995-09-13 1996-09-12 Method and apparatus for implanting images into a video sequence WO1997012480A2 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
AU69422/96A AU6942296A (en) 1995-09-13 1996-09-12 Method and apparatus for implanting images into a video sequence
EP96930339A EP0850536A4 (en) 1995-09-13 1996-09-12 Method and apparatus for implanting images into a video sequence
BR9610721-9A BR9610721A (en) 1995-09-13 1996-09-12 guidance units for orienting a video image viewing a stadium, frame description for describing a frame viewing a stadium and implantation for deploying an image in a frame on a surface inside a stadium and processes for implanting an image in a picture on a surface inside a stadium, and to orient and describe a picture visualizing a stadium
JP9513270A JPH11512894A (en) 1995-09-13 1996-09-12 Method and apparatus for inserting an image into a sequence of videos
CA002231849A CA2231849A1 (en) 1995-09-13 1996-09-12 Method and apparatus for implanting images into a video sequence

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IL11528895A IL115288A (en) 1995-09-13 1995-09-13 Method and apparatus for implanting images into a video sequence
IL115288 1995-09-13

Publications (2)

Publication Number Publication Date
WO1997012480A2 true WO1997012480A2 (en) 1997-04-03
WO1997012480A3 WO1997012480A3 (en) 1997-06-12

Family

ID=11067984

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL1996/000110 WO1997012480A2 (en) 1995-09-13 1996-09-12 Method and apparatus for implanting images into a video sequence

Country Status (7)

Country Link
EP (1) EP0850536A4 (en)
JP (1) JPH11512894A (en)
AU (1) AU6942296A (en)
BR (1) BR9610721A (en)
CA (1) CA2231849A1 (en)
IL (1) IL115288A (en)
WO (1) WO1997012480A2 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3667217B2 (en) * 2000-09-01 2005-07-06 日本電信電話株式会社 System and method for supplying advertisement information in video, and recording medium recording this program
JP4600793B2 (en) * 2000-09-20 2010-12-15 株式会社セガ Image processing device
CN110276839B (en) * 2019-06-20 2023-04-25 武汉大势智慧科技有限公司 Bottom fragment removing method based on live-action three-dimensional data

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5123057A (en) * 1989-07-28 1992-06-16 Massachusetts Institute Of Technology Model based pattern recognition
US5276789A (en) * 1990-05-14 1994-01-04 Hewlett-Packard Co. Graphic display of network topology
US5323321A (en) * 1990-06-25 1994-06-21 Motorola, Inc. Land vehicle navigation apparatus
US5435554A (en) * 1993-03-08 1995-07-25 Atari Games Corporation Baseball simulation system
US5513130A (en) * 1990-02-22 1996-04-30 Redmond Productions, Inc. Methods and apparatus for generating and processing synthetic and absolute real time environments
US5526479A (en) * 1990-06-25 1996-06-11 Barstow; David Method and apparatus for broadcasting live events to another location and producing a computer simulation of the events at that location
US5566251A (en) * 1991-09-18 1996-10-15 David Sarnoff Research Center, Inc Video merging employing pattern-key insertion

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0595808B1 (en) * 1991-07-19 1999-06-23 Princeton Video Image, Inc. Television displays having selected inserted indicia


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
IEEE INTERNATIONAL CONFERENCE ON SYSTEMS ENGINEERING, September 1992, TSUJIO et al., "Computer-Aided Drawing Check for CAD Systems - A Method for the Checking of Dimensions in Multiview Mechanical Drawings", pages 234-237. *
PROCEEDINGS 1992 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, June 1992, MANJUNATH et al., "A Feature Based Approach to Face Recognition", pages 373-378. *
PROCEEDINGS OF THE FIRST IEEE CONFERENCE ON VISUALIZATION, October 1990, HSU et al., "Superposing Images With Shadow Casting", pages 298-306. *
See also references of EP0850536A2 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1051843A1 (en) * 1998-01-23 2000-11-15 Princeton Video Image, Inc. Event linked insertion of indicia into video
EP1051843A4 (en) * 1998-01-23 2002-05-29 Princeton Video Image Inc Event linked insertion of indicia into video
US7230653B1 (en) 1999-11-08 2007-06-12 Vistas Unlimited Method and apparatus for real time insertion of images into video
US7206434B2 (en) 2001-07-10 2007-04-17 Vistas Unlimited, Inc. Method and system for measurement of the duration an area is included in an image stream
DE102016119640A1 (en) * 2016-10-14 2018-04-19 Uniqfeed Ag System for generating enriched images
US10740905B2 (en) 2016-10-14 2020-08-11 Uniqfeed Ag System for dynamically maximizing the contrast between the foreground and background in images and/or image sequences
US10805558B2 (en) 2016-10-14 2020-10-13 Uniqfeed Ag System for producing augmented images
US10832732B2 (en) 2016-10-14 2020-11-10 Uniqfeed Ag Television broadcast system for generating augmented images
CN109635769A (en) * 2018-12-20 2019-04-16 天津天地伟业信息系统集成有限公司 A kind of Activity recognition statistical method for ball-shaped camera

Also Published As

Publication number Publication date
WO1997012480A3 (en) 1997-06-12
EP0850536A2 (en) 1998-07-01
EP0850536A4 (en) 1998-12-02
JPH11512894A (en) 1999-11-02
CA2231849A1 (en) 1997-04-03
BR9610721A (en) 1999-12-21
AU6942296A (en) 1997-04-17
IL115288A0 (en) 1995-12-31
IL115288A (en) 1999-06-20

Similar Documents

Publication Publication Date Title
EP0750819B1 (en) A system for implanting an image into a video stream
JP4370387B2 (en) Apparatus and method for generating label object image of video sequence
EP0595808B1 (en) Television displays having selected inserted indicia
US7894669B2 (en) Foreground detection
JP2021511729A (en) Extension of the detected area in the image or video data
WO2012094959A1 (en) Method and apparatus for video insertion
WO1997003517A1 (en) Methods and apparatus for producing composite video images
US8538233B2 (en) Automatic camera identification from a multi-camera video stream
CN110300316A (en) Method, apparatus, electronic equipment and the storage medium of pushed information are implanted into video
WO1997012480A2 (en) Method and apparatus for implanting images into a video sequence
JPH11507796A (en) System and method for inserting still and moving images during live television broadcasting
KR20010025404A (en) System and Method for Virtual Advertisement Insertion Using Camera Motion Analysis
Han et al. A real-time augmented-reality system for sports broadcast video enhancement
EP1418561A1 (en) An advertisement print and a method of generating an advertisement print
KR20050008246A (en) An apparatus and method for inserting graphic images using camera motion parameters in sports video
CN111986133A (en) Virtual advertisement implanting method applied to bullet time
Tan Virtual imaging in sports broadcasting: an overview
Shah et al. Automated billboard insertion in video
JPH08511392A (en) Device and method for detection, identification and incorporation of advertisements in video
MXPA96004084A (en) A system for implanting an image into a video stream
IL104725A (en) System for exchanging sections of video background with virutal images

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE HU IL IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG US UZ VN AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM

AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE HU IL IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG US UZ VN AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 1996930339

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2231849

Country of ref document: CA

Ref country code: CA

Ref document number: 2231849

Kind code of ref document: A

Format of ref document f/p: F

ENP Entry into the national phase

Ref country code: JP

Ref document number: 1997 513270

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1019980701888

Country of ref document: KR

WWR Wipo information: refused in national office

Ref document number: 1019980701888

Country of ref document: KR

WWW Wipo information: withdrawn in national office

Ref document number: 1019980701888

Country of ref document: KR

WWP Wipo information: published in national office

Ref document number: 1996930339

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 1996930339

Country of ref document: EP