WO2008097051A1 - Method for searching specific person included in digital data, and method and apparatus for producing copyright report for the specific person
- Publication number: WO2008097051A1 (PCT/KR2008/000757)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/18—Legal services; Handling legal documents
- G06Q50/184—Intellectual property management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
- G06V40/173—Classification, e.g. identification face re-identification, e.g. recognising unknown faces across different face tracks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/179—Human faces, e.g. facial parts, sketches or expressions metadata assisted face recognition
Definitions
- character strings associated with the specific person are retrieved from the moving picture (S810), as shown in the embodiments of Figs. 6 and 7. Referring to Fig. 8, however, the retrieved character strings are applied to determine the face_search_candidate_sections, unlike the embodiments of Figs. 6 and 7.
- first temporal sections including the character strings or the specific person's voice are determined as the face_search_candidate_sections (S840). This is because the first temporal sections including the character strings associated with the specific person or the specific person's voice are considered as time slots in which the specific person is highly likely to appear.
- the embodiments of the present invention described with reference to Figs. 4 to 8 may be embodied by using metadata such as Electronic Program Guide (EPG).
- the name of the specific person may first be retrieved from the EPG, which may include information on a plurality of performers; in case the specific person is listed in the EPG, the attempt to retrieve the specific person from the corresponding moving picture may be made efficiently, resulting in a highly accurate retrieval.
- the EPG is available for moving pictures provided by broadcasting stations such as KBS, MBC, and the like.
- the EPG may be unavailable for illegally distributed moving pictures, which naturally lack corresponding EPG data.
Abstract
A specific person can be rapidly retrieved from a moving picture by an automated system. A report indicating whether or not a copyright infringement is committed is automatically produced by the automated system, thus allowing a copyright holder to easily check whether his or her copyright is infringed. A method for automatically searching for the specific person in the moving picture includes the steps of: determining face_search_candidate_sections of the moving picture based on a voice recognition technique; and retrieving sections including the specific person's face from the face_search_candidate_sections.
Description
METHOD FOR SEARCHING SPECIFIC PERSON INCLUDED IN DIGITAL DATA, AND METHOD AND APPARATUS FOR PRODUCING COPYRIGHT REPORT FOR THE SPECIFIC PERSON

Technical Field
[1] The present invention relates to a method for searching for a specific person included in digital data, and a method and an apparatus for producing a copyright report for the specific person.

Background Art
[2] In recent years, User Created Content ("UCC") has been skyrocketing in popularity. The number of websites providing UCC to users is also increasing.
[3] The UCC refers to various kinds of media content produced by ordinary people rather than companies. In detail, the UCC includes a variety of content, such as music, photographs, flash animations, moving pictures, and the like.

Disclosure of Invention

Technical Problem
[4] The increasing popularity of UCC has diversified the range of people who create it. In former days, when content was produced by only a few parties, a copyright holder could protect his or her copyright without difficulty. Nowadays, however, this diversification often brings about copyright infringement issues.
[5] Checking whether copyright infringement has been committed may require much time and cost. For example, time and cost may be required to determine whether and when a specific person appears without permission in UCC such as a moving picture. Accordingly, there is a need for a scheme capable of easily determining whether the specific person appears in the moving picture, thereby effectively protecting copyrights.
[6] Further, the scheme may be used to produce a copyright report without requiring much time and cost.

Technical Solution
[7] It is, therefore, one object of the present invention to provide a method for searching a specific person in a moving picture.
[8] It is another object of the present invention to provide a method for producing a copyright report for the specific person included in the moving picture, in order to easily check whether a copyright infringement against the specific person is committed.
[9] It is yet another object of the present invention to provide an apparatus for producing the copyright report for the specific person included in the moving picture without requiring much time and cost.
Advantageous Effects
[10] In accordance with the present invention, a specific person can be rapidly searched for in the moving picture without a vexatious manual search.

[11] Further, in accordance with the present invention, a copyright report for the specific person is automatically produced so that a copyright holder can easily check whether infringement is committed against his or her copyright.

[12] Furthermore, in accordance with the present invention, since the face of the specific person is rapidly retrieved from the moving picture, the copyright holder can easily check whether the copyright infringement is committed, thereby protecting his or her copyright effectively.
Brief Description of the Drawings

[13] The above and other objects and features of the present invention will become apparent from the following description of preferred embodiments given in conjunction with the accompanying drawings, in which:

[14] Fig. 1 shows a block diagram illustrating an apparatus for producing a copyright report for a specific person who appears in a moving picture in accordance with the present invention;

[15] Fig. 2 provides a block diagram showing a person search unit in accordance with the present invention;

[16] Fig. 3 depicts a flowchart illustrating a process of producing a copyright report for the specific person included in the moving picture in accordance with the present invention;

[17] Fig. 4 offers a flowchart illustrating a method for searching the specific person in the moving picture in accordance with an example embodiment of the present invention;

[18] Fig. 5 illustrates a method for searching the specific person in the moving picture in accordance with another example embodiment of the present invention;

[19] Fig. 6 illustrates a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention;

[20] Fig. 7 shows a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention; and

[21] Fig. 8 provides a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.
Best Mode for Carrying Out the Invention
[22] In accordance with one aspect of the present invention, there is provided a method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method including the steps of: (a) extracting the audio components from the moving picture and determining first sections as face_search_candidate_sections, the first sections including temporal sections, in which person's voices among the extracted audio components are included, by using voice recognition technique; and (b) comparing faces of one or more persons appearing in a part of the video components included in the face_search_candidate_sections with the specific person's face by using face recognition technique, and determining second sections as results, the second sections including temporal sections, in which a certain individual among the persons appearing in the part of the video components is included, there being a degree of similarity over a predetermined threshold value between face of the certain individual and the specific person's face.
[23] In accordance with another aspect of the present invention, there is provided a method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method including the steps of: (a) extracting the video components from the moving picture and determining first sections as voice_search_candidate_sections, the first sections including temporal sections, in which person's faces are included among the extracted video components, by face detection technique; and (b) determining second sections as results, the second sections including temporal sections, in which the specific person's voice is included among the audio components in the voice_search_candidate_sections, by voice recognition technique.
[24] In accordance with yet another aspect of the present invention, there is provided an apparatus for producing a copyright report for a specific person appearing in a moving picture, the apparatus including: a moving picture acquiring unit for acquiring the moving picture; a component extracting unit for extracting video components and audio components from the acquired moving picture; a character string search unit for determining first sections so as to include temporal sections in which one or more character strings associated with the specific person are included, in case the character strings are retrieved from the extracted video components by character recognition technique; a voice search unit for determining second sections so as to include temporal sections in which the specific person's voice is included, in case the specific person's voice is retrieved from the extracted audio components by voice recognition
technique; an image search unit for comparing faces of one or more persons appearing in the extracted video components with the specific person's face by using face recognition technique, and determining third sections so as to include temporal sections in which a certain individual among the persons appearing in the extracted video components is included, there being a degree of similarity over a predetermined threshold value between face of the certain individual and the specific person's face; and a copyright report producing unit for automatically producing a copyright report including information on time slots corresponding to at least one of the first sections, the second sections, and the third sections and the name of the specific person appearing in at least one of the first sections, the second sections, and the third sections.

Mode for the Invention
[25] In the following detailed description, reference is made to the accompanying drawings that show, by way of illustration, specific embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present invention. It is to be understood that the various embodiments of the present invention, although different, are not necessarily mutually exclusive. For example, a particular feature, structure, or characteristic described herein in connection with one embodiment may be implemented within other embodiments without departing from the spirit and scope of the present invention. In addition, it is to be understood that the location or arrangement of individual elements within each disclosed embodiment may be modified without departing from the spirit and scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims, appropriately interpreted, along with the full range of equivalents to which the claims are entitled. In the drawings, like numerals refer to the same or similar functionality throughout the several views.
[26] The present invention will now be described in more detail, with reference to the accompanying drawings.
[27] Fig. 1 shows a block diagram illustrating an apparatus 100 for producing a copyright report for a specific person included in digital data, e.g., a moving picture, in accordance with the present invention.
[28] In detail, the apparatus 100 includes a moving picture acquiring unit 110 for acquiring a moving picture, a person search unit 120 for checking whether a specific person appears in the moving picture provided by the moving picture acquiring unit 110, and a copyright report producing unit 130 for producing a copyright report based on the information provided by the person search unit 120.
[29] The moving picture acquiring unit 110 may acquire the moving picture through wired or wireless networks or from a broadcast.
[30] The person search unit 120 checks rapidly whether an image, e.g., a facial image, of the specific person is included in the moving picture without permission.
[31] The copyright report producing unit 130 generates a copyright report based on the result obtained by the person search unit 120.
[32] Fig. 2 provides a block diagram showing the person search unit 120 in accordance with the present invention.
[33] The person search unit 120 includes a voice search unit 220 for searching voice included in the moving picture, and an image search unit 230 for retrieving the facial image of the specific person from the moving picture. Moreover, the person search unit 120 may further include a character string search unit 210 for retrieving character strings, such as a name, a nickname, and the like, associated with the specific person from the moving picture.
[34] In case the moving picture is obtained from a digital broadcast, the character string search unit 210 may retrieve the character strings associated with the specific person from additional data, e.g., a caption, included in the moving picture. The character string search unit 210 may be embodied by means of one or more character recognition techniques well known in the art.
[35] The voice search unit 220 may retrieve one or more temporal sections, which may be considered as so-called face_search_candidate_sections, from the moving picture, where the specific person's voice is included. By determining the face_search_candidate_sections, temporal range of the moving picture to be searched for the specific person may shrink substantially, thereby reducing the time and the cost needed for the search process.
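How the candidate sections shrink the range to be searched can be illustrated with a small interval-merging sketch; the merging helper and all numbers below are illustrative assumptions, not part of the patent:

```python
# Sketch of how face_search_candidate_sections reduce the range the
# image search unit must scan: overlapping or nearly adjacent voiced
# sections are merged, and the remaining coverage is compared with the
# full running time.

def merge_sections(sections, gap=0.0):
    """Merge temporal sections that overlap or sit within `gap` seconds."""
    merged = []
    for start, end in sorted(sections):
        if merged and start <= merged[-1][1] + gap:
            merged[-1][1] = max(merged[-1][1], end)   # extend the last section
        else:
            merged.append([start, end])               # open a new section
    return [tuple(s) for s in merged]

candidates = [(10, 40), (35, 60), (300, 320)]
merged = merge_sections(candidates)
covered = sum(end - start for start, end in merged)
print(merged)                # [(10, 60), (300, 320)]
print(covered, "of 3600 s")  # only 70 s of a one-hour picture remain
```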
[36] However, in case the specific person's voice is retrieved from the moving picture by the voice search unit 220, a part of the temporal sections is likely to be skipped during the search process although the specific person appears therein, due to a failure in recognizing the specific person's voice.
[37] Accordingly, the voice search unit 220 may be embodied so as to search extended temporal sections including any person's voice (instead of only the temporal sections including the specific person's voice). A person's voice may be detected by referring to the periodicity of vocal cord vibration, periodic acceleration resulting from the presence of Glottal Closure Instants (GCI), the peculiar shape of the spectral envelope of a person's voice, and the like. The voice search unit 220 may be embodied by means of one or more voice recognition techniques well known in the art.
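The periodicity cue mentioned above can be illustrated with a minimal autocorrelation-based voiced-frame detector; this is a rough sketch on synthetic signals, not the patent's method, and the 0.5 threshold and 60-400 Hz pitch range are assumed values:

```python
# Minimal illustration of the periodicity cue: voiced speech shows a
# strong normalized autocorrelation peak at the pitch period, while
# aperiodic noise does not. Pure Python on synthetic signals; a real
# detector would also use spectral-envelope and GCI cues.
import math
import random

def is_voiced(frame, rate, fmin=60.0, fmax=400.0, threshold=0.5):
    """True if the frame autocorrelates strongly at some pitch lag."""
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]
    energy = sum(s * s for s in x) or 1e-12
    lo, hi = int(rate / fmax), int(rate / fmin)
    best = 0.0
    for lag in range(lo, min(hi, len(x) - 1)):
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy
        best = max(best, r)
    return best >= threshold

rate = 8000
tone = [math.sin(2 * math.pi * 150 * n / rate) for n in range(400)]  # 150 Hz "voice"
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(400)]                 # aperiodic
print(is_voiced(tone, rate), is_voiced(noise, rate))  # True False
```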
[38] After the face_search_candidate_sections are determined, the image search unit 230 retrieves the specific person's face from the face_search_candidate_sections to check
whether the specific person's facial image is included in the moving picture without permission or not. The image search unit 230 may be embodied by means of one or more face detection techniques and/or face recognition techniques well known in the art.
[39] In accordance with another example embodiment of the present invention, so-called voice_search_candidate_sections may be retrieved from the moving picture in the first place, by checking whether at least one candidate who is likely to be recognized as the specific person by the image search unit 230 appears in the moving picture, and then finalized temporal sections where the specific person's voice is included may be retrieved from the voice_search_candidate_sections by the voice search unit 220.
[40] Moreover, in accordance with yet another example embodiment of the present invention, both the retrieval of first sections where the specific person's face is included by the image search unit 230 and the retrieval of second sections where the specific person's voice is included by the voice search unit 220 may be performed at the same time.
[41] It is to be noted that the above-mentioned various embodiments may also be applied to the other embodiments expounded hereinafter, even when no specific description thereof is presented there.
[42] Fig. 3 depicts a flowchart illustrating a process of producing a copyright report for the specific person who appears in the moving picture in accordance with the present invention.
[43] Referring to Fig. 3, the method for producing the copyright report for the specific person includes the steps of acquiring the moving picture (S310), retrieving temporal sections of the moving picture in which the specific person appears (S320), and producing the copyright report based on the retrieved sections (S330).
[44] In detail, at step S310, the moving picture may be acquired through wired or wireless networks or digital broadcasts.

[45] After the temporal sections where the facial image of the specific person is included without permission are retrieved from the moving picture at step S320, the copyright report may be produced at step S330, the copyright report being used as supporting evidence of copyright infringement.
[46] The copyright report may contain information on the specific person. For example, the copyright report may say that the specific person "Peter" appears at "three scenes" of the moving picture "A", a total time of the sections where "Peter" appears is "10 minutes", and time slots at which "Peter" appears are "from xx seconds to xxx seconds" and the like.
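Such a report could be assembled mechanically from the retrieved sections; the function name, report wording, and section lists below are illustrative assumptions, and overlapping sections from different searches are not merged in this sketch:

```python
# Sketch of assembling the copyright report text from the retrieved
# temporal sections found by the character string, voice, and image
# search units.

def produce_report(name, title, first, second, third):
    """first/second/third: (start_sec, end_sec) sections found by the
    character string, voice, and image search units respectively."""
    slots = sorted(set(first) | set(second) | set(third))
    total = sum(end - start for start, end in slots)
    lines = [f'Copyright report for "{name}" in moving picture "{title}":']
    for start, end in slots:
        lines.append(f"  appears from {start} s to {end} s")
    lines.append(f"  total appearance time: {total} s over {len(slots)} section(s)")
    return "\n".join(lines)

print(produce_report("Peter", "A", [(5, 15)], [(5, 15), (40, 60)], [(90, 100)]))
```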
[47] By referring to the copyright report, a copyright holder can determine whether to cope with the copyright infringement.
[48] Fig. 4 offers a flowchart illustrating a method for searching the specific person in the moving picture in accordance with an example embodiment of the present invention.
[49] First, voices which have high probabilities of being determined as the specific person's voice are retrieved from the moving picture (S410).
[50] A determination is then made on whether the specific person's voice is included in the moving picture (S420).
[51] In case the specific person's voice is considered to be included in the moving picture, first temporal sections including the specific person's voice (e.g., scenes at which the specific person's voice is inserted) are determined as the face_search_candidate_sections (S430).
[52] After the face_search_candidate_sections are determined, second temporal sections including persons who have high probabilities of being determined as the specific person are retrieved from the face_search_candidate_sections (S440).
[53] Fig. 5 illustrates a method for searching the specific person in the moving picture in accordance with another example embodiment of the present invention.
[54] First, persons' voices are searched for in the moving picture (S510). Unlike the embodiment of Fig. 4, voices of unspecified persons are searched for instead of the specific person's voice, because a part of the temporal sections including the specific person is likely to be skipped due to the inaccuracy of the technique for recognizing a certain voice.
[55] A determination is then made on whether the person's voices are included in the moving picture (S520).
[56] In case the person's voices are considered to be included in the moving picture, first temporal sections including the person's voices (e.g., scenes where the person's voices are inserted) are determined as the face_search_candidate_sections (S530).
[57] After the face_search_candidate_sections are determined, second temporal sections including persons who have high probabilities of being determined as the specific person are retrieved from the face_search_candidate_sections (S540).
[58] Fig. 6 illustrates a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.
[59] Referring to Fig. 6, character strings associated with the specific person are first retrieved from the moving picture (S610), unlike the embodiments of Figs. 4 and 5. In case the moving picture is obtained from a digital broadcast, the character strings may include data, e.g., a caption inserted into the moving picture, as previously mentioned.

[60] A determination is then made on whether the character strings associated with the specific person are included in the moving picture (S620). In case the character strings associated with the specific person are not included in the moving picture, it is determined that the specific person does not appear in the moving picture.
[61] However, in case the character strings associated with the specific person are included in the moving picture, voices which have high probabilities of being determined as the specific person's voice are retrieved from the moving picture (S630).
[62] A determination is then made on whether the specific person's voice is included in the moving picture (S640).
[63] In case the specific person's voice is considered to be included in the moving picture, first temporal sections including the specific person's voice (e.g., scenes at which the specific person's voice is inserted) are determined as the face_search_candidate_sections (S650).
[64] Then, second temporal sections including persons who have high probabilities of being determined as the specific person are retrieved from the face_search_candidate_sections (S660).
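The determination at steps S630 and S640 (claim 3 below characterizes it as comparing the shape of a unique spectral envelope of the specific person's voice) might be illustrated with a toy sketch. The feature vectors and the cosine-similarity decision rule here are illustrative assumptions, not the patented method; a real system would derive envelopes from the audio signal itself.

```python
# Toy speaker check: accept a segment as the specific person's voice if its
# "spectral envelope" (here a precomputed feature vector) is close enough
# to the enrolled one.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_target_voice(segment_envelope, target_envelope, threshold=0.95):
    """S630/S640: compare envelope shapes against a similarity threshold."""
    return cosine_similarity(segment_envelope, target_envelope) >= threshold

target = [0.9, 0.4, 0.1, 0.05]           # enrolled envelope of the person
same_speaker = [0.85, 0.45, 0.12, 0.04]  # similar shape
other_speaker = [0.1, 0.3, 0.9, 0.6]     # different shape

print(is_target_voice(same_speaker, target))   # -> True
print(is_target_voice(other_speaker, target))  # -> False
```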
[65] Fig. 7 shows a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.
[66] Referring to Fig. 7, character strings associated with the specific person are first retrieved from the moving picture (S710) like the embodiment of Fig. 6 (unlike the embodiments of Figs. 4 and 5). Since the examples of the character strings were previously mentioned, a detailed description thereabout will be omitted.
[67] A determination is then made on whether the character strings associated with the specific person are included in the moving picture (S720). In case the character strings associated with the specific person are not included in the moving picture, it is determined that the specific person does not appear in the moving picture.
[68] However, in case the character strings associated with the specific person are considered to be included in the moving picture, persons' voices are retrieved from the moving picture (S730). Unlike the embodiment of Fig. 6, unspecified persons' voices are retrieved instead of the specific person's voice because a part of the temporal sections including the specific person is likely to be skipped due to the inaccuracy of the technique for recognizing a certain voice. Since this was mentioned above, a detailed description thereabout will be omitted.
[69] A determination is then made on whether persons' voices are included in the moving picture (S740).
[70] In case persons' voices are included in the moving picture, first temporal sections including the persons' voices (e.g., scenes in which the persons' voices are inserted) are determined as the face_search_candidate_sections (S750).
[71] Thereafter, second temporal sections including the specific person's facial image are retrieved from the face_search_candidate_sections (S760).
[72] Fig. 8 provides a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.
[73] First, character strings associated with the specific person are retrieved from the moving picture (S810), as shown in the embodiments of Figs. 6 and 7. Referring to Fig. 8, however, the retrieved character strings are applied to determine the face_search_candidate_sections, unlike the embodiments of Figs. 6 and 7.
[74] Thereafter, voices which have high probabilities of being determined as the specific person's voice are searched for in the moving picture (S820). Herein, it should be noted that steps S810 and S820 can be performed in reverse order or at the same time.
[75] After the retrieval of the character strings and the voices is completed, a determination is made on whether the character strings associated with the specific person or the specific person's voice are included in the moving picture (S830).
[76] When the character strings associated with the specific person or the specific person's voice are included in the moving picture, first temporal sections including the character strings or the specific person's voice are determined as the face_search_candidate_sections (S840). This is because the first temporal sections including the character strings associated with the specific person or the specific person's voice are considered as time slots in which the specific person is highly likely to appear.
[77] After the face_search_candidate_sections are determined, second temporal sections including the specific person's facial image are retrieved from the face_search_candidate_sections (S850).
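The candidate-section construction of Fig. 8 (S830/S840) takes the union of the sections containing a matching character string and the sections containing the specific person's voice. A minimal interval-merging sketch, on hypothetical section lists:

```python
# S840 sketch: merge caption-match sections and voice-match sections into
# a single set of face_search_candidate_sections.

def merge_sections(sections):
    """Union possibly-overlapping (start, end) intervals into disjoint ones."""
    merged = []
    for start, end in sorted(sections):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

caption_sections = [(10, 20), (40, 50)]  # character strings found (S810)
voice_sections = [(15, 30), (60, 70)]    # voice matches found (S820)

candidates = merge_sections(caption_sections + voice_sections)
print(candidates)  # -> [(10, 30), (40, 50), (60, 70)]
```

Face recognition (S850) then runs only inside `candidates`, the time slots in which the specific person is highly likely to appear.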
[78] As described above, the embodiments of the present invention described with reference to Figs. 4 to 8 may be embodied by using metadata such as an Electronic Program Guide (EPG). For example, the name of the specific person may first be retrieved from the EPG, which may include information on a plurality of performers; in case the specific person is included in the EPG, the attempt to retrieve the specific person from the corresponding moving picture may be made efficiently, resulting in a highly accurate retrieval.
[79] However, since the EPG generally includes only key performers, it is undesirable, in terms of accuracy, to skip checking the content of the moving picture simply because the name of the specific person has not been found in the EPG.
[80] Meanwhile, the EPG is available for moving pictures provided by broadcasting stations such as KBS, MBC, and the like. However, the EPG may be unavailable for illegally distributed moving pictures, which naturally lack corresponding EPG data.
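The EPG reasoning above (a name hit raises confidence, but a miss must not end the search, since the EPG lists only key performers) can be sketched as a simple decision helper; the function and its return values are hypothetical, not part of the claimed apparatus.

```python
# Sketch of the EPG pre-check: the EPG only adjusts the prior that the
# person appears; content analysis proceeds either way.

def plan_search(epg_performers, person):
    """Return (search_content, prior): whether to analyse the moving
    picture and the prior likelihood that the person appears."""
    if person in epg_performers:
        return True, "high"  # name confirmed by EPG metadata
    # The EPG generally lists only key performers, so absence from it
    # must not terminate the search.
    return True, "low"

print(plan_search({"Alice", "Bob"}, "Alice"))  # -> (True, 'high')
print(plan_search({"Alice", "Bob"}, "Carol"))  # -> (True, 'low')
```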
[81] While the present invention has been shown and described with respect to the preferred embodiments, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and the scope of the present invention as defined in the following claims.
Claims
[1] A method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method comprising the steps of:
(a) extracting the audio components from the moving picture and determining first sections as face_search_candidate_sections, the first sections including temporal sections, in which persons' voices among the extracted audio components are included, by using a voice recognition technique; and
(b) comparing faces of one or more persons appearing in a part of the video components included in the face_search_candidate_sections with the specific person's face by using a face recognition technique, and determining second sections as results, the second sections including temporal sections, in which a certain individual among the persons appearing in the part of the video components is included, there being a degree of similarity over a predetermined threshold value between the face of the certain individual and the specific person's face.
[2] The method of claim 1, wherein, at the step (a), the first sections are determined by detecting temporal sections of the moving picture in which the specific person's voice among the extracted audio components is included.
[3] The method of claim 2, wherein a shape of a unique spectral envelope of the specific person's voice is compared with that of the extracted audio components in order to determine whether the specific person's voice is included in the extracted audio components.
[4] The method of claim 1, wherein the step (a) includes the steps of: extracting the video components and the audio components from the moving picture; extracting character strings from the video components; and determining the first sections as the face_search_candidate_sections in case one or more character strings associated with the specific person are retrieved from the extracted character strings.
[5] The method of claim 4, wherein the first sections are determined by detecting temporal sections of the moving picture in which the specific person's voice among the extracted audio components is included.
[6] The method of claim 1, wherein the step (a) includes the steps of: extracting the video components and the audio components from the moving picture; determining third sections including temporal sections of the moving picture in
which character strings associated with the specific person are included in the extracted video components; determining the first sections including temporal sections of the moving picture in which the specific person's voice among the extracted audio components is included; and determining the third sections and the first sections as the face_search_candidate_sections.
[7] The method of any one of claims 4 to 6, wherein the character strings include captions of the moving picture.
[8] The method of claim 1, wherein the step (a) includes the step of determining whether a name of the specific person is included in an Electronic Program
Guide (EPG) associated with the moving picture; and determining the first sections as the face_search_candidate_sections in case the name of the specific person is included in the EPG.
[9] A method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method comprising the steps of:
(a) extracting the video components from the moving picture and determining first sections as voice_search_candidate_sections, the first sections including temporal sections, in which person's faces are included among the extracted video components, by face detection technique; and
(b) determining second sections as results, the second sections including temporal sections, in which the specific person's voice is included among the audio components in the voice_search_candidate_sections, by voice recognition technique.
[10] The method of claim 9, wherein the step (a) includes the steps of: extracting the video components from the moving picture; and comparing faces of one or more persons appearing in the video components with the specific person's face by using face recognition technique, and determining the first sections so as to include temporal sections, in which a certain individual among the persons appearing in the video components is included, there being a degree of similarity over a predetermined threshold value between face of the certain individual and the specific person's face.
[11] The method of claim 9, wherein the step (a) includes the steps of: extracting the video components and the audio components from the moving picture; extracting character strings from the video components; and determining the first sections as the voice_search_candidate_sections in case one
or more character strings associated with the specific person are retrieved from the extracted character strings.
[12] The method of claim 9, wherein the step (a) includes the step of determining whether a name of the specific person is included in an Electronic Program Guide (EPG) associated with the moving picture; and determining the first sections as the voice_search_candidate_sections in case the name of the specific person is included in the EPG.
[13] The method of any one of claims 1 to 6 and 8 to 12, further comprising the step of: automatically producing a copyright report including information on time slots corresponding to the second sections and the name of the specific person appearing in the second sections, in case the second sections exist.
[14] A medium recording a computer readable program to execute the method of any one of claims 1 to 6 and 8 to 12.
[15] An apparatus for producing a copyright report for a specific person appearing in a moving picture, the apparatus comprising: a moving picture acquiring unit for acquiring the moving picture; a component extracting unit for extracting video components and audio components from the acquired moving picture; a character string search unit for determining first sections so as to include temporal sections in which one or more character strings associated with the specific person are included, in case the character strings are retrieved from the extracted video components by character recognition technique; a voice search unit for determining second sections so as to include temporal sections in which the specific person's voice is included, in case the specific person's voice is retrieved from the extracted audio components by voice recognition technique; an image search unit for comparing faces of one or more persons appearing in the extracted video components with the specific person's face by using face recognition technique, and determining third sections so as to include temporal sections in which a certain individual among the persons appearing in the extracted video components is included, there being a degree of similarity over a predetermined threshold value between face of the certain individual and the specific person's face; and a copyright report producing unit for automatically producing a copyright report including information on time slots corresponding to at least one of the first sections, the second sections, and the third sections and the name of the specific person appearing in at least one of the first sections, the second sections, and the third sections.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2007-0013040 | 2007-02-08 | ||
KR1020070013040A KR100865973B1 (en) | 2007-02-08 | 2007-02-08 | Method for searching certain person and method and system for generating copyright report for the certain person |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2008097051A1 true WO2008097051A1 (en) | 2008-08-14 |
Family
ID=39681894
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2008/000757 WO2008097051A1 (en) | 2007-02-08 | 2008-02-05 | Method for searching specific person included in digital data, and method and apparatus for producing copyright report for the specific person |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR100865973B1 (en) |
WO (1) | WO2008097051A1 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101079180B1 (en) * | 2010-01-22 | 2011-11-02 | 주식회사 상상커뮤니케이션 | Query Based Image Search System for Searching Specific Person |
CN106569946B (en) * | 2016-10-31 | 2021-04-13 | 惠州Tcl移动通信有限公司 | Mobile terminal performance test method and system |
KR101684273B1 (en) * | 2016-11-17 | 2016-12-08 | 주식회사 엘지유플러스 | Movie managing server, movie playing apparatus, and method for providing charater information using thereof |
KR101689195B1 (en) * | 2016-11-17 | 2016-12-23 | 주식회사 엘지유플러스 | Movie managing server, movie playing apparatus, and method for providing charater information using thereof |
KR101686425B1 (en) * | 2016-11-17 | 2016-12-14 | 주식회사 엘지유플러스 | Movie managing server, movie playing apparatus, and method for providing charater information using thereof |
KR102433393B1 (en) | 2017-12-12 | 2022-08-17 | 한국전자통신연구원 | Apparatus and method for recognizing character in video contents |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6546185B1 (en) * | 1998-07-28 | 2003-04-08 | Lg Electronics Inc. | System for searching a particular character in a motion picture |
KR20040071369A (en) * | 2003-02-05 | 2004-08-12 | (주)에어스파이더 | Digital Image Data Search System |
KR20050051857A (en) * | 2003-11-28 | 2005-06-02 | 삼성전자주식회사 | Device and method for searching for image by using audio data |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100474848B1 (en) * | 2002-07-19 | 2005-03-10 | 삼성전자주식회사 | System and method for detecting and tracking a plurality of faces in real-time by integrating the visual ques |
- 2007-02-08: KR application KR1020070013040A, published as KR100865973B1 (not active: IP Right Cessation)
- 2008-02-05: WO application PCT/KR2008/000757, published as WO2008097051A1 (active: Application Filing)
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8065290B2 (en) | 2005-03-31 | 2011-11-22 | Google Inc. | User interface for facts query engine with snippets from information sources that include query terms and answer terms |
US8650175B2 (en) | 2005-03-31 | 2014-02-11 | Google Inc. | User interface for facts query engine with snippets from information sources that include query terms and answer terms |
US8224802B2 (en) | 2005-03-31 | 2012-07-17 | Google Inc. | User interface for facts query engine with snippets from information sources that include query terms and answer terms |
US7953720B1 (en) | 2005-03-31 | 2011-05-31 | Google Inc. | Selecting the best answer to a fact query from among a set of potential answers |
US9530229B2 (en) | 2006-01-27 | 2016-12-27 | Google Inc. | Data object visualization using graphs |
US7925676B2 (en) | 2006-01-27 | 2011-04-12 | Google Inc. | Data object visualization using maps |
US8954426B2 (en) | 2006-02-17 | 2015-02-10 | Google Inc. | Query language |
US8055674B2 (en) | 2006-02-17 | 2011-11-08 | Google Inc. | Annotation framework |
US9892132B2 (en) | 2007-03-14 | 2018-02-13 | Google Llc | Determining geographic locations for place names in a fact repository |
WO2011001002A1 (en) * | 2009-06-30 | 2011-01-06 | Nokia Corporation | A method, devices and a service for searching |
US10031927B2 (en) | 2009-08-07 | 2018-07-24 | Google Llc | Facial recognition with social network aiding |
US9208177B2 (en) | 2009-08-07 | 2015-12-08 | Google Inc. | Facial recognition with social network aiding |
US10515114B2 (en) | 2009-08-07 | 2019-12-24 | Google Llc | Facial recognition with social network aiding |
US10534808B2 (en) | 2009-08-07 | 2020-01-14 | Google Llc | Architecture for responding to visual query |
US9087059B2 (en) | 2009-08-07 | 2015-07-21 | Google Inc. | User interface for presenting search results for multiple regions of a visual query |
US8670597B2 (en) | 2009-08-07 | 2014-03-11 | Google Inc. | Facial recognition with social network aiding |
US9135277B2 (en) | 2009-08-07 | 2015-09-15 | Google Inc. | Architecture for responding to a visual query |
WO2011017557A1 (en) * | 2009-08-07 | 2011-02-10 | Google Inc. | Architecture for responding to a visual query |
US9183224B2 (en) | 2009-12-02 | 2015-11-10 | Google Inc. | Identifying matching canonical documents in response to a visual query |
US9405772B2 (en) | 2009-12-02 | 2016-08-02 | Google Inc. | Actionable search results for street view visual queries |
US9087235B2 (en) | 2009-12-02 | 2015-07-21 | Google Inc. | Identifying matching canonical documents consistent with visual query structural information |
US8977639B2 (en) | 2009-12-02 | 2015-03-10 | Google Inc. | Actionable search results for visual queries |
US8811742B2 (en) | 2009-12-02 | 2014-08-19 | Google Inc. | Identifying matching canonical documents consistent with visual query structural information |
US8805079B2 (en) | 2009-12-02 | 2014-08-12 | Google Inc. | Identifying matching canonical documents in response to a visual query and in accordance with geographic information |
US9852156B2 (en) | 2009-12-03 | 2017-12-26 | Google Inc. | Hybrid use of location sensor data and visual query to return local listings for visual query |
US10346463B2 (en) | 2009-12-03 | 2019-07-09 | Google Llc | Hybrid use of location sensor data and visual query to return local listings for visual query |
US9372920B2 (en) | 2012-08-08 | 2016-06-21 | Google Inc. | Identifying textual terms in response to a visual query |
US8935246B2 (en) | 2012-08-08 | 2015-01-13 | Google Inc. | Identifying textual terms in response to a visual query |
WO2019095221A1 (en) * | 2017-11-16 | 2019-05-23 | 深圳前海达闼云端智能科技有限公司 | Method for searching for person, apparatus, terminal and cloud server |
WO2019240434A1 (en) * | 2018-06-15 | 2019-12-19 | Samsung Electronics Co., Ltd. | Electronic device and method of controlling thereof |
US11561760B2 (en) | 2018-06-15 | 2023-01-24 | Samsung Electronics Co., Ltd. | Electronic device and method of controlling thereof |
Also Published As
Publication number | Publication date |
---|---|
KR20080074266A (en) | 2008-08-13 |
KR100865973B1 (en) | 2008-10-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2008097051A1 (en) | Method for searching specific person included in digital data, and method and apparatus for producing copyright report for the specific person | |
CN1774717B (en) | Method and apparatus for summarizing a music video using content analysis | |
KR100915847B1 (en) | Streaming video bookmarks | |
US7949207B2 (en) | Video structuring device and method | |
JP5029030B2 (en) | Information grant program, information grant device, and information grant method | |
Huang et al. | Automated generation of news content hierarchy by integrating audio, video, and text information | |
US20030123712A1 (en) | Method and system for name-face/voice-role association | |
US20140245463A1 (en) | System and method for accessing multimedia content | |
US20050228665A1 (en) | Metadata preparing device, preparing method therefor and retrieving device | |
US20030131362A1 (en) | Method and apparatus for multimodal story segmentation for linking multimedia content | |
JP5218766B2 (en) | Rights information extraction device, rights information extraction method and program | |
JP2004526373A (en) | Parental control system for video programs based on multimedia content information | |
US8453179B2 (en) | Linking real time media context to related applications and services | |
WO2007004110A2 (en) | System and method for the alignment of intrinsic and extrinsic audio-visual information | |
CN101137986A (en) | Summarization of audio and/or visual data | |
JP2005512233A (en) | System and method for retrieving information about a person in a video program | |
RU2413990C2 (en) | Method and apparatus for detecting content item boundaries | |
JP2009027428A (en) | Recording/reproduction system and recording/reproduction method | |
JP4192703B2 (en) | Content processing apparatus, content processing method, and program | |
JP2004520756A (en) | Method for segmenting and indexing TV programs using multimedia cues | |
US7349477B2 (en) | Audio-assisted video segmentation and summarization | |
JP2002354391A (en) | Method for recording program signal, and method for transmitting record program control signal | |
JP2004514350A (en) | Program summarization and indexing | |
JP2007060606A (en) | Computer program comprised of automatic video structure extraction/provision scheme | |
EP2811416A1 (en) | An identification method |
Legal Events
Date | Code | Title | Description
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 08712408; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE
| 122 | Ep: pct application non-entry in european phase | Ref document number: 08712408; Country of ref document: EP; Kind code of ref document: A1