WO2008097051A1 - Method for searching specific person included in digital data, and method and apparatus for producing copyright report for the specific person - Google Patents


Info

Publication number
WO2008097051A1
WO2008097051A1 (PCT/KR2008/000757)
Authority
WO
WIPO (PCT)
Prior art keywords
sections
specific person
moving picture
face
voice
Prior art date
Application number
PCT/KR2008/000757
Other languages
French (fr)
Inventor
Jung-Hee Ryu
Junhwan Kim
Original Assignee
Olaworks, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Olaworks, Inc. filed Critical Olaworks, Inc.
Publication of WO2008097051A1 publication Critical patent/WO2008097051A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/18Legal services; Handling legal documents
    • G06Q50/184Intellectual property management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • G06V40/173Classification, e.g. identification face re-identification, e.g. recognising unknown faces across different face tracks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/179Human faces, e.g. facial parts, sketches or expressions metadata assisted face recognition

Definitions

  • first temporal sections including the character strings or the specific person's voice are determined as the face_search_candidate_sections (S840). This is because the first temporal sections including the character strings associated with the specific person or the specific person's voice are considered as time slots in which the specific person is highly likely to appear.
  • the embodiments of the present invention described with reference to Figs. 4 to 8 may be embodied by using metadata such as an Electronic Program Guide (EPG).
  • the name of the specific person may first be searched for in the EPG, which may include information on a plurality of performers; in case the specific person is included in the EPG, the attempt to retrieve the specific person from the corresponding moving picture may be made efficiently, resulting in a highly accurate retrieval.
  • the EPG is available for moving pictures provided by broadcasting stations such as KBS, MBC, and the like.
  • the EPG may be unavailable for illegally distributed moving pictures, since such copies naturally lack the corresponding EPG data.
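The EPG pre-filter described above can be sketched as follows — a minimal illustration in which the EPG dictionary schema (`"performers"` key) and the `should_search` helper are assumptions, not part of the patent:

```python
def should_search(person_name, epg_entry):
    """Decide whether a costly face/voice search is worthwhile.

    If EPG metadata exists and the person is not listed among the
    performers, the search can be skipped; when no EPG is available
    (e.g. an illegally redistributed copy), fall back to a full search.
    """
    if epg_entry is None:  # no EPG: must search the whole moving picture
        return True
    return person_name in epg_entry.get("performers", [])

# hypothetical EPG record for one broadcast programme
epg = {"title": "Evening Show", "performers": ["Peter", "Mary"]}
print(should_search("Peter", epg))   # listed performer: search the video
print(should_search("Alice", epg))   # not listed: skip the search
print(should_search("Alice", None))  # no EPG at all: search anyway
```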

Abstract

A specific person can be rapidly retrieved from a moving picture by an automated system. A report indicating whether or not a copyright infringement has been committed is automatically produced by the automated system, thus allowing a copyright holder to easily check whether his or her copyright is infringed. A method for automatically searching for the specific person in the moving picture includes the steps of: determining face_search_candidate_sections of the moving picture based on a voice recognition technique; and retrieving sections including the specific person's face from the face_search_candidate_sections.

Description

METHOD FOR SEARCHING SPECIFIC PERSON INCLUDED IN DIGITAL DATA, AND METHOD AND APPARATUS FOR PRODUCING COPYRIGHT REPORT FOR THE SPECIFIC PERSON

Technical Field
[1] The present invention relates to a method for searching a specific person included in digital data, and a method and an apparatus for producing a copyright report for the specific person.

Background Art
[2] In recent years, User Created Content ("UCC") has been skyrocketing in popularity. The number of websites providing UCC to users is also increasing.
[3] The UCC refers to various kinds of media content produced by ordinary people rather than by companies. In detail, the UCC includes a variety of content, such as music, photographs, flash animations, moving pictures, and the like.

Disclosure of Invention

Technical Problem
[4] The increasing popularity of UCC has diversified the range of people who create it. In former days, when content was produced by only a few parties, a copyright holder could protect his or her copyright without difficulty. Nowadays, however, this diversification often brings about copyright infringement issues.
[5] In order to check whether the copyright infringement is committed, much time and cost may be required. For example, time and cost may be required to determine when a specific person appears in the UCC such as a moving picture without permission. Accordingly, there is a need for a scheme capable of easily determining whether the specific person appears in the moving picture, thereby effectively protecting copyrights.
[6] Further, the scheme may be used to produce a copyright report without requiring much time and cost.

Technical Solution
[7] It is, therefore, one object of the present invention to provide a method for searching a specific person in a moving picture.
[8] It is another object of the present invention to provide a method for producing a copyright report for the specific person included in the moving picture in order to easily check whether a copyright infringement for the specific person is committed or not.
[9] It is yet another object of the present invention to provide an apparatus for producing the copyright report for the specific person included in the moving picture without having to require much time and cost.
Advantageous Effects
[10] In accordance with the present invention, a specific person can be rapidly searched for in the moving picture without a vexatious manual search.

[11] Further, in accordance with the present invention, a copyright report for the specific person is automatically produced so that a copyright holder can easily check whether infringement is committed against his or her copyright.

[12] Furthermore, in accordance with the present invention, since the face of the specific person is rapidly retrieved from the moving picture, the copyright holder can easily check whether copyright infringement is committed, thereby protecting his or her copyright effectively.
Brief Description of the Drawings

[13] The above and other objects and features of the present invention will become apparent from the following description of preferred embodiments given in conjunction with the accompanying drawings, in which:

[14] Fig. 1 shows a block diagram illustrating an apparatus for producing a copyright report for a specific person who appears in a moving picture in accordance with the present invention;

[15] Fig. 2 provides a block diagram showing a person search unit in accordance with the present invention;

[16] Fig. 3 depicts a flowchart illustrating a process of producing a copyright report for the specific person included in the moving picture in accordance with the present invention;

[17] Fig. 4 offers a flowchart illustrating a method for searching the specific person in the moving picture in accordance with an example embodiment of the present invention;

[18] Fig. 5 illustrates a method for searching the specific person in the moving picture in accordance with another example embodiment of the present invention;

[19] Fig. 6 illustrates a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention;

[20] Fig. 7 shows a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention; and

[21] Fig. 8 provides a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.

Best Mode for Carrying Out the Invention
[22] In accordance with one aspect of the present invention, there is provided a method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method including the steps of: (a) extracting the audio components from the moving picture and determining first sections as face_search_candidate_sections, the first sections including temporal sections in which persons' voices are included among the extracted audio components, by using a voice recognition technique; and (b) comparing faces of one or more persons appearing in a part of the video components included in the face_search_candidate_sections with the specific person's face by using a face recognition technique, and determining second sections as results, the second sections including temporal sections in which a certain individual among the persons appearing in the part of the video components is included, there being a degree of similarity over a predetermined threshold value between the face of the certain individual and the specific person's face.
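The two-step method claimed above can be sketched as follows — a minimal illustration, where `detect_voice_sections` and `face_similarity` are hypothetical stand-ins for real voice and face recognition back ends, and the toy `video` representation (timestamped face observations) is an assumption:

```python
def search_person(video, target_face, detect_voice_sections, face_similarity,
                  threshold=0.8):
    # Step (a): voice recognition narrows the video to candidate sections.
    candidates = detect_voice_sections(video["audio"])
    # Step (b): face recognition confirms the person inside those sections,
    # keeping a section when some face matches above the similarity threshold.
    results = []
    for start, end in candidates:
        for t, face in video["faces"]:
            if start <= t <= end and face_similarity(face, target_face) >= threshold:
                results.append((start, end))
                break
    return results

# toy video with stand-in recognizers
video = {"audio": None, "faces": [(12, "peter_face"), (40, "other_face")]}
sections = search_person(
    video, "peter_face",
    detect_voice_sections=lambda audio: [(10, 20), (35, 45)],
    face_similarity=lambda a, b: 1.0 if a == b else 0.0)
print(sections)  # -> [(10, 20)]
```

Restricting the face comparison to voice-bearing sections is the cost saving the patent targets: face recognition runs only over the candidate time range instead of the whole video.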
[23] In accordance with another aspect of the present invention, there is provided a method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method including the steps of: (a) extracting the video components from the moving picture and determining first sections as voice_search_candidate_sections, the first sections including temporal sections in which persons' faces are included among the extracted video components, by a face detection technique; and (b) determining second sections as results, the second sections including temporal sections in which the specific person's voice is included among the audio components in the voice_search_candidate_sections, by a voice recognition technique.
[24] In accordance with yet another aspect of the present invention, there is provided an apparatus for producing a copyright report for a specific person appearing in a moving picture, the apparatus including: a moving picture acquiring unit for acquiring the moving picture; a component extracting unit for extracting video components and audio components from the acquired moving picture; a character string search unit for determining first sections so as to include temporal sections in which one or more character strings associated with the specific person are included, in case the character strings are retrieved from the extracted video components by a character recognition technique; a voice search unit for determining second sections so as to include temporal sections in which the specific person's voice is included, in case the specific person's voice is retrieved from the extracted audio components by a voice recognition technique; an image search unit for comparing faces of one or more persons appearing in the extracted video components with the specific person's face by using a face recognition technique, and determining third sections so as to include temporal sections in which a certain individual among the persons appearing in the extracted video components is included, there being a degree of similarity over a predetermined threshold value between the face of the certain individual and the specific person's face; and a copyright report producing unit for automatically producing a copyright report including information on time slots corresponding to at least one of the first sections, the second sections, and the third sections, and the name of the specific person appearing in at least one of the first sections, the second sections, and the third sections.

Mode for the Invention
[25] In the following detailed description, reference is made to the accompanying drawings that show, by way of illustration, specific embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present invention. It is to be understood that the various embodiments of the present invention, although different, are not necessarily mutually exclusive. For example, a particular feature, structure, or characteristic described herein in connection with one embodiment may be implemented within other embodiments without departing from the spirit and scope of the present invention. In addition, it is to be understood that the location or arrangement of individual elements within each disclosed embodiment may be modified without departing from the spirit and scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims, appropriately interpreted, along with the full range of equivalents to which the claims are entitled. In the drawings, like numerals refer to the same or similar functionality throughout the several views.
[26] The present invention will now be described in more detail, with reference to the accompanying drawings.
[27] Fig. 1 shows a block diagram illustrating an apparatus 100 for producing a copyright report for a specific person included in digital data, e.g., a moving picture, in accordance with the present invention.
[28] In detail, the apparatus 100 includes a moving picture acquiring unit 110 for acquiring a moving picture, a person search unit 120 for checking whether a specific person appears in the moving picture provided by the moving picture acquiring unit 110, and a copyright report producing unit 130 for producing a copyright report based on the information provided by the person search unit 120.

[29] The moving picture acquiring unit 110 may acquire the moving picture through wired or wireless networks or from a broadcast.
[30] The person search unit 120 checks rapidly whether an image, e.g., a facial image, of the specific person is included in the moving picture without permission.
[31] The copyright report producing unit 130 generates a copyright report based on the result obtained by the person search unit 120.
[32] Fig. 2 provides a block diagram showing the person search unit 120 in accordance with the present invention.
[33] The person search unit 120 includes a voice search unit 220 for searching voice included in the moving picture, and an image search unit 230 for retrieving the facial image of the specific person from the moving picture. Moreover, the person search unit 120 may further include a character string search unit 210 for retrieving character strings, such as a name, a nickname, and the like, associated with the specific person from the moving picture.
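The unit composition described in the paragraph above can be sketched as follows; the three search callables are hypothetical stand-ins for the character string search unit 210, voice search unit 220, and image search unit 230, and the report dictionary shape is illustrative, not the patent's format:

```python
class CopyrightReportApparatus:
    """Composition of the search units (stand-in callables injected)."""

    def __init__(self, string_search, voice_search, image_search):
        self.string_search = string_search  # character string search unit 210
        self.voice_search = voice_search    # voice search unit 220
        self.image_search = image_search    # image search unit 230

    def produce_report(self, video, person):
        # Collect temporal sections found by each unit and merge duplicates.
        sections = sorted(set(self.string_search(video, person)
                              + self.voice_search(video, person)
                              + self.image_search(video, person)))
        return {"person": person, "sections": sections,
                "appears": bool(sections)}

app = CopyrightReportApparatus(
    string_search=lambda v, p: [(5, 8)],
    voice_search=lambda v, p: [(10, 20)],
    image_search=lambda v, p: [(10, 20), (30, 33)])
report = app.produce_report("clip.mp4", "Peter")
print(report["sections"])  # -> [(5, 8), (10, 20), (30, 33)]
```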
[34] In case the moving picture is obtained from a digital broadcast, the character string search unit 210 may retrieve the character strings associated with the specific person from additional data, e.g., a caption, included in the moving picture. The character string search unit 210 may be embodied by means of one or more character recognition techniques well known in the art.
[35] The voice search unit 220 may retrieve one or more temporal sections, which may be considered as so-called face_search_candidate_sections, from the moving picture, where the specific person's voice is included. By determining the face_search_candidate_sections, the temporal range of the moving picture to be searched for the specific person may shrink substantially, thereby reducing the time and cost needed for the search process.
[36] However, in case the specific person's voice is retrieved from the moving picture by the voice search unit 220, a part of the temporal sections is likely to be skipped during the search process even though the specific person appears therein, due to a failure in recognizing the specific person's voice.
[37] Accordingly, the voice search unit 220 may be embodied so as to search extended temporal sections including all persons' voices (instead of only the temporal sections including the specific person's voice). A person's voice may be detected by referring to the periodicity of vocal cord vibration, the periodic acceleration resulting from the presence of a Glottal Closure Instant (GCI), the peculiar shape of the spectral envelope of a person's voice, and the like. The voice search unit 220 may be embodied by means of one or more voice recognition techniques well known in the art.
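As a toy illustration of the periodicity cue mentioned above: voiced speech shows a strong normalized autocorrelation peak at a plausible pitch-period lag, while noise does not. The sampling rate, lag range, and threshold here are illustrative choices; a real detector would also use GCI and spectral-envelope cues:

```python
import numpy as np

def is_voiced(frame, sr=8000, fmin=80, fmax=300, thresh=0.5):
    """Flag a frame as containing a person's voice via pitch periodicity."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] == 0:
        return False                     # silent frame
    ac = ac / ac[0]                      # normalized autocorrelation
    lo, hi = sr // fmax, sr // fmin      # lags of plausible pitch periods
    return ac[lo:hi].max() >= thresh

sr = 8000
t = np.arange(sr // 10) / sr                  # one 100 ms frame
voiced = np.sin(2 * np.pi * 150 * t)          # periodic, pitch-like signal
noise = np.random.default_rng(0).standard_normal(len(t))
print(is_voiced(voiced))  # True: strong peak at the 150 Hz pitch lag
print(is_voiced(noise))   # False: no consistent periodicity
```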
[38] After the face_search_candidate_sections are determined, the image search unit 230 retrieves the specific person's face from the face_search_candidate_sections to check whether the specific person's facial image is included in the moving picture without permission or not. The image search unit 230 may be embodied by means of one or more face detection techniques and/or face recognition techniques well known in the art.
[39] In accordance with another example embodiment of the present invention, so-called voice_search_candidate_sections may first be retrieved from the moving picture by checking whether at least one candidate who is likely to be recognized as the specific person appears in the moving picture, using the image search unit 230; finalized temporal sections in which the specific person's voice is included may then be retrieved from the voice_search_candidate_sections by the voice search unit 220.
[40] Moreover, in accordance with yet another example embodiment of the present invention, both the retrieval of first sections where the specific person's face is included by the image search unit 230 and the retrieval of second sections where the specific person's voice is included by the voice search unit 220 may be performed at the same time.
[41] It is to be noted that the above-mentioned embodiments may also be applied to the other embodiments described hereinafter, even where no specific description thereof is presented.
[42] Fig. 3 depicts a flowchart illustrating a process of producing a copyright report for the specific person who appears in the moving picture in accordance with the present invention.
[43] Referring to Fig. 3, the method for producing the copyright report for the specific person includes the steps of acquiring the moving picture (S310), retrieving temporal sections of the moving picture in which the specific person appears (S320), and producing the copyright report based on the retrieved sections (S330).
[44] In detail, at the step 310, the moving picture may be acquired through wired or wireless networks or digital broadcasts.
[45] After the temporal sections where the facial image of the specific person is included without permission are retrieved from the moving picture at the step 320, the copyright report may be produced at the step 330, the copyright report being used as a supporting evidence for copyright infringement.
[46] The copyright report may contain information on the specific person. For example, the copyright report may say that the specific person "Peter" appears at "three scenes" of the moving picture "A", a total time of the sections where "Peter" appears is "10 minutes", and time slots at which "Peter" appears are "from xx seconds to xxx seconds" and the like.
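The report contents described above might be rendered, for instance, as follows; the layout and field choices are illustrative, not the patent's fixed format, and section times are given in seconds:

```python
def format_report(title, person, sections):
    """Render the appearance sections into a human-readable report body."""
    total = sum(end - start for start, end in sections)  # seconds
    lines = [f'"{person}" appears at {len(sections)} scene(s) of "{title}".',
             f"Total appearance time: {total // 60} minutes {total % 60} seconds."]
    lines += [f"  from {start} seconds to {end} seconds"
              for start, end in sections]
    return "\n".join(lines)

# three 200-second scenes, totalling the 10 minutes of the "Peter" example
report = format_report("A", "Peter", [(30, 230), (300, 500), (600, 800)])
print(report)
```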
[47] By referring to the copyright report, a copyright holder can determine whether to take action against the copyright infringement.

[48] Fig. 4 offers a flowchart illustrating a method for searching the specific person in the moving picture in accordance with an example embodiment of the present invention.
[49] First, voices which have high probabilities of being determined as the specific person's voice are retrieved from the moving picture (S410).
[50] A determination is then made on whether the specific person's voice is included in the moving picture (S420).
[51] In case the specific person's voice is considered to be included in the moving picture, first temporal sections including the specific person's voice (e.g., scenes at which the specific person's voice is inserted) are determined as the face_search_candidate_sections (S430).
[52] After the face_search_candidate_sections are determined, second temporal sections including persons who have high probabilities of being determined as the specific person are retrieved from the face_search_candidate_sections (S440).
[53] Fig. 5 illustrates a method for searching the specific person in the moving picture in accordance with another example embodiment of the present invention.
[54] First, persons' voices are searched for in the moving picture (S510). Unlike the embodiment of Fig. 4, all persons' voices are searched for instead of only the specific person's voice, because some temporal sections including the specific person are likely to be skipped due to the inaccuracy of techniques for recognizing a certain voice.
[55] A determination is then made on whether the person's voices are included in the moving picture (S520).
[56] In case the person's voices are considered to be included in the moving picture, first temporal sections including the person's voices (e.g., scenes where the person's voices are inserted) are determined as the face_search_candidate_sections (S530).
[57] After the face_search_candidate_sections are determined, second temporal sections including persons who have high probabilities of being determined as the specific person are retrieved from the face_search_candidate_sections (S540).
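One way to realize step S530 — turning per-second "voice present" decisions into merged candidate sections — is sketched below; the one-second frame rate and the `max_gap` bridging parameter are illustrative assumptions:

```python
def voiced_flags_to_sections(flags, max_gap=1):
    """Merge per-second voiced flags into (start, end) candidate sections.

    Silent gaps of up to `max_gap` seconds are bridged so short pauses do
    not split one utterance into many tiny sections; `end` is exclusive.
    """
    sections, start, gap = [], None, 0
    for t, voiced in enumerate(flags):
        if voiced:
            if start is None:
                start = t               # open a new section
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:           # gap too long: close the section
                sections.append((start, t - gap + 1))
                start, gap = None, 0
    if start is not None:               # close a section still open at the end
        sections.append((start, len(flags) - gap))
    return sections

flags = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]  # voice present in seconds 1-3 and 7-8
print(voiced_flags_to_sections(flags))  # -> [(1, 4), (7, 9)]
```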
[58] Fig. 6 illustrates a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.
[59] Referring to Fig. 6, character strings associated with the specific person are first retrieved from the moving picture (S610), unlike the embodiments of Figs. 4 and 5. In case the moving picture is obtained from a digital broadcast, the character strings may include data, e.g., a caption inserted into the moving picture, as previously mentioned.
[60] A determination is then made on whether the character strings associated with the specific person are included in the moving picture (S620). In case the character strings associated with the specific person are not included in the moving picture, it is determined that the specific person does not appear in the moving picture.

[61] However, in case the character strings associated with the specific person are included in the moving picture, voices which have high probabilities of being determined as the specific person's voice are retrieved from the moving picture (S630).
[62] A determination is then made on whether the specific person's voice is included in the moving picture (S640).
[63] In case the specific person's voice is considered to be included in the moving picture, first temporal sections including the specific person's voice (e.g., scenes at which the specific person's voice is inserted) are determined as the face_search_candidate_sections (S650).
[64] Then, second temporal sections including persons who have high probabilities of being determined as the specific person are retrieved from the face_search_candidate_sections (S660).
[65] Fig. 7 shows a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.
[66] Referring to Fig. 7, character strings associated with the specific person are first retrieved from the moving picture (S710), as in the embodiment of Fig. 6 (and unlike the embodiments of Figs. 4 and 5). Since examples of the character strings were previously mentioned, a detailed description thereof will be omitted.
[67] A determination is then made on whether the character strings associated with the specific person are included in the moving picture (S720). In case the character strings associated with the specific person are not included in the moving picture, it is determined that the specific person does not appear in the moving picture.
[68] However, in case the character strings associated with the specific person are considered to be included in the moving picture, persons' voices in general are retrieved from the moving picture (S730). Unlike the embodiment of Fig. 6, the reason why any person's voice is retrieved instead of only the specific person's voice is that a part of the temporal sections including the specific person is likely to be skipped due to the inaccuracy of the technique for recognizing a certain voice. Since this was mentioned above, a detailed description thereof will be omitted.
[69] A determination is then made on whether the person's voices are included in the moving picture (S740).
[70] In case the person's voices are included in the moving picture, first temporal sections including the person's voices (e.g., scenes in which the person's voices are inserted) are determined as the face_search_candidate_sections (S750).
[71] Thereafter, second temporal sections including the specific person's facial image are retrieved from the face_search_candidate_sections (S760).
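The Fig. 7 variant deliberately detects *any* speech rather than the specific speaker, so that sections are not dropped when speaker identification misses. A generic voice-activity step of that kind might look like the following sketch; the frame-energy detector and its threshold are illustrative assumptions, not the patent's stated method.

```python
# Hedged sketch: candidate sections come from generic voice activity
# (any speech), so a miss by the speaker recognizer cannot discard a
# section in which the specific person actually speaks.

def speech_sections(frames, energy_threshold=0.5):
    """Group consecutive frames whose speech energy exceeds a threshold
    into (start, end) temporal sections."""
    sections, start = [], None
    for i, energy in enumerate(frames):
        if energy >= energy_threshold and start is None:
            start = i                      # speech begins
        elif energy < energy_threshold and start is not None:
            sections.append((start, i))    # speech ends
            start = None
    if start is not None:                  # speech runs to the last frame
        sections.append((start, len(frames)))
    return sections

frames = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6, 0.1]
print(speech_sections(frames))  # [(1, 3), (4, 6)]
```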
[72] Fig. 8 provides a method for searching the specific person in the moving picture in accordance with yet another example embodiment of the present invention.
[73] First, character strings associated with the specific person are retrieved from the moving picture (S810), as shown in the embodiments of Figs. 6 and 7. Referring to Fig. 8, however, the retrieved character strings are applied to determine the face_search_candidate_sections, unlike the embodiments of Figs. 6 and 7.
[74] Thereafter, voices which have high probabilities of being determined as the specific person's voice are searched in the moving picture (S820). Herein, it should be noted that the steps S810 and S820 can be performed in reverse order or at the same time.
[75] After the retrieval of the character strings and the voices is completed, a determination is made on whether the character strings associated with the specific person or the specific person's voice are included in the moving picture (S830).
[76] When the character strings associated with the specific person or the specific person's voice are included in the moving picture, first temporal sections including the character strings or the specific person's voice are determined as the face_search_candidate_sections (S840). This is because the first temporal sections including the character strings associated with the specific person or the specific person's voice are considered as time slots in which the specific person is highly likely to appear.
[77] After the face_search_candidate_sections are determined, second temporal sections including the specific person's facial image are retrieved from the face_search_candidate_sections (S850).
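In the Fig. 8 variant, sections containing either a matching caption string or the specific person's voice together form the face_search_candidate_sections. One plausible way to combine the two section lists is a sorted interval union, as sketched below; the overlap-merging behaviour is an assumption about how adjoining sections would be handled, not something the description specifies.

```python
# Sketch: union the caption-derived and voice-derived temporal sections
# into one candidate list, merging overlapping intervals.

def merge_sections(a, b):
    """Union two lists of (start, end) sections, merging overlaps."""
    merged = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous section: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

caption_sections = [(0, 5), (12, 18)]
voice_sections = [(4, 9), (20, 25)]
print(merge_sections(caption_sections, voice_sections))
# [(0, 9), (12, 18), (20, 25)]
```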
[78] As described above, the embodiments of the present invention described with reference to Figs. 4 to 8 may be embodied by using metadata such as an Electronic Program Guide (EPG). For example, the name of the specific person may first be retrieved from the EPG, which may include information on a plurality of performers; in case the specific person is included in the EPG, the attempt to retrieve the specific person from the corresponding moving picture may be made efficiently, resulting in a highly accurate retrieval.
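The EPG pre-check amounts to a cheap metadata lookup before the expensive per-frame analysis. A minimal sketch, with a hypothetical dictionary standing in for real EPG metadata:

```python
# Illustrative EPG pre-filter: run the expensive per-frame search with
# priority when the person is named in the programme guide. Note that an
# absent name does not prove absence (the EPG usually lists only key
# performers), so a negative result should not skip the content check.

def named_in_epg(epg_performers, person_name):
    """Cheap metadata check before scanning the moving picture itself."""
    return person_name in epg_performers

epg = {"News at Nine": ["Kim", "Lee"], "Drama X": ["Park"]}
print(named_in_epg(epg["News at Nine"], "Kim"))  # True
print(named_in_epg(epg["Drama X"], "Kim"))       # False
```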
[79] However, since the EPG generally lists only key performers, it is undesirable in terms of accuracy to skip checking the content of the moving picture for the simple reason that the name of the specific person has not been found in the EPG.
[80] Meanwhile, the EPG is available for moving pictures provided by broadcasting stations such as KBS, MBC, and the like. However, the EPG may be unavailable for illegally distributed moving pictures, which naturally have no corresponding EPG.
[81] While the present invention has been shown and described with respect to the preferred embodiments, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and the scope of the present invention as defined in the following claims.

Claims
[1] A method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method comprising the steps of:
(a) extracting the audio components from the moving picture and determining first sections as face_search_candidate_sections, the first sections including temporal sections, in which person's voices among the extracted audio components are included, by using voice recognition technique; and
(b) comparing faces of one or more persons appearing in a part of the video components included in the face_search_candidate_sections with the specific person's face by using face recognition technique, and determining second sections as results, the second sections including temporal sections, in which a certain individual among the persons appearing in the part of the video components is included, there being a degree of similarity over a predetermined threshold value between face of the certain individual and the specific person's face.
[2] The method of claim 1, wherein, at the step (a), the first sections are determined by detecting temporal sections of the moving picture in which the specific person's voice among the extracted audio components is included.
[3] The method of claim 2, wherein a shape of a unique spectral envelope of the specific person's voice is compared with that of the extracted audio components in order to determine whether the specific person's voice is included in the extracted audio components.
[4] The method of claim 1, wherein the step (a) includes the steps of: extracting the video components and the audio components from the moving picture; extracting character strings from the video components; and determining the first sections as the face_search_candidate_sections in case one or more character strings associated with the specific person are retrieved from the extracted character strings.
[5] The method of claim 4, wherein the first sections are determined by detecting temporal sections of the moving picture in which the specific person's voice among the extracted audio components is included.
[6] The method of claim 1, wherein the step (a) includes the steps of: extracting the video components and the audio components from the moving picture; determining third sections including temporal sections of the moving picture in which character strings associated with the specific person are included in the extracted video components; determining the first sections including temporal sections of the moving picture in which the specific person's voice among the extracted audio components is included; and determining the third sections and the first sections as the face_search_candidate_sections.
[7] The method of any one of claims 4 to 6, wherein the character strings include captions of the moving picture.
[8] The method of claim 1, wherein the step (a) includes the step of determining whether a name of the specific person is included in an Electronic Program Guide (EPG) associated with the moving picture; and determining the first sections as the face_search_candidate_sections in case the name of the specific person is included in the EPG.
[9] A method for searching temporal sections, in which a specific person appears, of a moving picture, the moving picture including audio components and video components, the method comprising the steps of:
(a) extracting the video components from the moving picture and determining first sections as voice_search_candidate_sections, the first sections including temporal sections, in which person's faces are included among the extracted video components, by face detection technique; and
(b) determining second sections as results, the second sections including temporal sections, in which the specific person's voice is included among the audio components in the voice_search_candidate_sections, by voice recognition technique.
[10] The method of claim 9, wherein the step (a) includes the steps of: extracting the video components from the moving picture; and comparing faces of one or more persons appearing in the video components with the specific person's face by using face recognition technique, and determining the first sections so as to include temporal sections, in which a certain individual among the persons appearing in the video components is included, there being a degree of similarity over a predetermined threshold value between face of the certain individual and the specific person's face.
[11] The method of claim 9, wherein the step (a) includes the steps of: extracting the video components and the audio components from the moving picture; extracting character strings from the video components; and determining the first sections as the voice_search_candidate_sections in case one or more character strings associated with the specific person are retrieved from the extracted character strings.
[12] The method of claim 9, wherein the step (a) includes the step of determining whether a name of the specific person is included in an Electronic Program Guide associated with the moving picture; and determining the first sections as the voice_search_candidate_sections in case the name of the specific person is included in the EPG.
[13] The method of any one of claims 1 to 6 and 8 to 12, further comprising the step of: automatically producing a copyright report including information on time slots corresponding to the second sections and the name of the specific person appearing in the second sections, in case the second sections exist.
[14] A medium recording a computer readable program to execute the method of any one of claims 1 to 6 and 8 to 12.
[15] An apparatus for producing a copyright report for a specific person appearing in a moving picture, the apparatus comprising: a moving picture acquiring unit for acquiring the moving picture; a component extracting unit for extracting video components and audio components from the acquired moving picture; a character string search unit for determining first sections so as to include temporal sections in which one or more character strings associated with the specific person are included, in case the character strings are retrieved from the extracted video components by character recognition technique; a voice search unit for determining second sections so as to include temporal sections in which the specific person's voice is included, in case the specific person's voice is retrieved from the extracted audio components by voice recognition technique; an image search unit for comparing faces of one or more persons appearing in the extracted video components with the specific person's face by using face recognition technique, and determining third sections so as to include temporal sections in which a certain individual among the persons appearing in the extracted video components is included, there being a degree of similarity over a predetermined threshold value between face of the certain individual and the specific person's face; and a copyright report producing unit for automatically producing a copyright report including information on time slots corresponding to at least one of the first sections, the second sections, and the third sections and the name of the specific person appearing in at least one of the first sections, the second sections, and the third sections.
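The cooperation among the units of the claimed apparatus — three search units feeding a report producing unit — can be sketched informally as follows. This is a toy illustration outside the claims themselves; every name is hypothetical, and the unit implementations are stubbed as precomputed section lists.

```python
# Minimal sketch of the apparatus data flow: the character string search,
# voice search, and image search units each yield temporal sections, and
# the copyright report producing unit combines them with the person's name.

from dataclasses import dataclass

@dataclass
class CopyrightReport:
    person_name: str
    time_slots: list   # (start, end) sections in which the person appears

def produce_report(person_name, char_sections, voice_sections, face_sections):
    """Combine the three units' outputs into a single copyright report."""
    slots = sorted(set(char_sections + voice_sections + face_sections))
    return CopyrightReport(person_name, slots)

report = produce_report("Kim", [(0, 5)], [(10, 15)], [(0, 5), (30, 40)])
print(report.time_slots)  # [(0, 5), (10, 15), (30, 40)]
```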
PCT/KR2008/000757 2007-02-08 2008-02-05 Method for searching specific person included in digital data, and method and apparatus for producing copyright report for the specific person WO2008097051A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2007-0013040 2007-02-08
KR1020070013040A KR100865973B1 (en) 2007-02-08 2007-02-08 Method for searching certain person and method and system for generating copyright report for the certain person

Publications (1)

Publication Number Publication Date
WO2008097051A1 true WO2008097051A1 (en) 2008-08-14

Family

ID=39681894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2008/000757 WO2008097051A1 (en) 2007-02-08 2008-02-05 Method for searching specific person included in digital data, and method and apparatus for producing copyright report for the specific person

Country Status (2)

Country Link
KR (1) KR100865973B1 (en)
WO (1) WO2008097051A1 (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011001002A1 (en) * 2009-06-30 2011-01-06 Nokia Corporation A method, devices and a service for searching
WO2011017557A1 (en) * 2009-08-07 2011-02-10 Google Inc. Architecture for responding to a visual query
US7925676B2 (en) 2006-01-27 2011-04-12 Google Inc. Data object visualization using maps
US7953720B1 (en) 2005-03-31 2011-05-31 Google Inc. Selecting the best answer to a fact query from among a set of potential answers
US8055674B2 (en) 2006-02-17 2011-11-08 Google Inc. Annotation framework
US8065290B2 (en) 2005-03-31 2011-11-22 Google Inc. User interface for facts query engine with snippets from information sources that include query terms and answer terms
US8670597B2 (en) 2009-08-07 2014-03-11 Google Inc. Facial recognition with social network aiding
US8805079B2 (en) 2009-12-02 2014-08-12 Google Inc. Identifying matching canonical documents in response to a visual query and in accordance with geographic information
US8811742B2 (en) 2009-12-02 2014-08-19 Google Inc. Identifying matching canonical documents consistent with visual query structural information
US8935246B2 (en) 2012-08-08 2015-01-13 Google Inc. Identifying textual terms in response to a visual query
US8954426B2 (en) 2006-02-17 2015-02-10 Google Inc. Query language
US8977639B2 (en) 2009-12-02 2015-03-10 Google Inc. Actionable search results for visual queries
US9087059B2 (en) 2009-08-07 2015-07-21 Google Inc. User interface for presenting search results for multiple regions of a visual query
US9183224B2 (en) 2009-12-02 2015-11-10 Google Inc. Identifying matching canonical documents in response to a visual query
US9405772B2 (en) 2009-12-02 2016-08-02 Google Inc. Actionable search results for street view visual queries
US9530229B2 (en) 2006-01-27 2016-12-27 Google Inc. Data object visualization using graphs
US9852156B2 (en) 2009-12-03 2017-12-26 Google Inc. Hybrid use of location sensor data and visual query to return local listings for visual query
US9892132B2 (en) 2007-03-14 2018-02-13 Google Llc Determining geographic locations for place names in a fact repository
WO2019095221A1 (en) * 2017-11-16 2019-05-23 深圳前海达闼云端智能科技有限公司 Method for searching for person, apparatus, terminal and cloud server
WO2019240434A1 (en) * 2018-06-15 2019-12-19 Samsung Electronics Co., Ltd. Electronic device and method of controlling thereof

Families Citing this family (6)

Publication number Priority date Publication date Assignee Title
KR101079180B1 (en) * 2010-01-22 2011-11-02 주식회사 상상커뮤니케이션 Query Based Image Search System for Searching Specific Person
CN106569946B (en) * 2016-10-31 2021-04-13 惠州Tcl移动通信有限公司 Mobile terminal performance test method and system
KR101684273B1 (en) * 2016-11-17 2016-12-08 주식회사 엘지유플러스 Movie managing server, movie playing apparatus, and method for providing charater information using thereof
KR101689195B1 (en) * 2016-11-17 2016-12-23 주식회사 엘지유플러스 Movie managing server, movie playing apparatus, and method for providing charater information using thereof
KR101686425B1 (en) * 2016-11-17 2016-12-14 주식회사 엘지유플러스 Movie managing server, movie playing apparatus, and method for providing charater information using thereof
KR102433393B1 (en) 2017-12-12 2022-08-17 한국전자통신연구원 Apparatus and method for recognizing character in video contents

Citations (3)

Publication number Priority date Publication date Assignee Title
US6546185B1 (en) * 1998-07-28 2003-04-08 Lg Electronics Inc. System for searching a particular character in a motion picture
KR20040071369A (en) * 2003-02-05 2004-08-12 (주)에어스파이더 Digital Image Data Search System
KR20050051857A (en) * 2003-11-28 2005-06-02 삼성전자주식회사 Device and method for searching for image by using audio data

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
KR100474848B1 (en) * 2002-07-19 2005-03-10 삼성전자주식회사 System and method for detecting and tracking a plurality of faces in real-time by integrating the visual ques


Cited By (31)

Publication number Priority date Publication date Assignee Title
US8065290B2 (en) 2005-03-31 2011-11-22 Google Inc. User interface for facts query engine with snippets from information sources that include query terms and answer terms
US8650175B2 (en) 2005-03-31 2014-02-11 Google Inc. User interface for facts query engine with snippets from information sources that include query terms and answer terms
US8224802B2 (en) 2005-03-31 2012-07-17 Google Inc. User interface for facts query engine with snippets from information sources that include query terms and answer terms
US7953720B1 (en) 2005-03-31 2011-05-31 Google Inc. Selecting the best answer to a fact query from among a set of potential answers
US9530229B2 (en) 2006-01-27 2016-12-27 Google Inc. Data object visualization using graphs
US7925676B2 (en) 2006-01-27 2011-04-12 Google Inc. Data object visualization using maps
US8954426B2 (en) 2006-02-17 2015-02-10 Google Inc. Query language
US8055674B2 (en) 2006-02-17 2011-11-08 Google Inc. Annotation framework
US9892132B2 (en) 2007-03-14 2018-02-13 Google Llc Determining geographic locations for place names in a fact repository
WO2011001002A1 (en) * 2009-06-30 2011-01-06 Nokia Corporation A method, devices and a service for searching
US10031927B2 (en) 2009-08-07 2018-07-24 Google Llc Facial recognition with social network aiding
US9208177B2 (en) 2009-08-07 2015-12-08 Google Inc. Facial recognition with social network aiding
US10515114B2 (en) 2009-08-07 2019-12-24 Google Llc Facial recognition with social network aiding
US10534808B2 (en) 2009-08-07 2020-01-14 Google Llc Architecture for responding to visual query
US9087059B2 (en) 2009-08-07 2015-07-21 Google Inc. User interface for presenting search results for multiple regions of a visual query
US8670597B2 (en) 2009-08-07 2014-03-11 Google Inc. Facial recognition with social network aiding
US9135277B2 (en) 2009-08-07 2015-09-15 Google Inc. Architecture for responding to a visual query
WO2011017557A1 (en) * 2009-08-07 2011-02-10 Google Inc. Architecture for responding to a visual query
US9183224B2 (en) 2009-12-02 2015-11-10 Google Inc. Identifying matching canonical documents in response to a visual query
US9405772B2 (en) 2009-12-02 2016-08-02 Google Inc. Actionable search results for street view visual queries
US9087235B2 (en) 2009-12-02 2015-07-21 Google Inc. Identifying matching canonical documents consistent with visual query structural information
US8977639B2 (en) 2009-12-02 2015-03-10 Google Inc. Actionable search results for visual queries
US8811742B2 (en) 2009-12-02 2014-08-19 Google Inc. Identifying matching canonical documents consistent with visual query structural information
US8805079B2 (en) 2009-12-02 2014-08-12 Google Inc. Identifying matching canonical documents in response to a visual query and in accordance with geographic information
US9852156B2 (en) 2009-12-03 2017-12-26 Google Inc. Hybrid use of location sensor data and visual query to return local listings for visual query
US10346463B2 (en) 2009-12-03 2019-07-09 Google Llc Hybrid use of location sensor data and visual query to return local listings for visual query
US9372920B2 (en) 2012-08-08 2016-06-21 Google Inc. Identifying textual terms in response to a visual query
US8935246B2 (en) 2012-08-08 2015-01-13 Google Inc. Identifying textual terms in response to a visual query
WO2019095221A1 (en) * 2017-11-16 2019-05-23 深圳前海达闼云端智能科技有限公司 Method for searching for person, apparatus, terminal and cloud server
WO2019240434A1 (en) * 2018-06-15 2019-12-19 Samsung Electronics Co., Ltd. Electronic device and method of controlling thereof
US11561760B2 (en) 2018-06-15 2023-01-24 Samsung Electronics Co., Ltd. Electronic device and method of controlling thereof

Also Published As

Publication number Publication date
KR20080074266A (en) 2008-08-13
KR100865973B1 (en) 2008-10-30

Similar Documents

Publication Publication Date Title
WO2008097051A1 (en) Method for searching specific person included in digital data, and method and apparatus for producing copyright report for the specific person
CN1774717B (en) Method and apparatus for summarizing a music video using content analysis
KR100915847B1 (en) Streaming video bookmarks
US7949207B2 (en) Video structuring device and method
JP5029030B2 (en) Information grant program, information grant device, and information grant method
Huang et al. Automated generation of news content hierarchy by integrating audio, video, and text information
US20030123712A1 (en) Method and system for name-face/voice-role association
US20140245463A1 (en) System and method for accessing multimedia content
US20050228665A1 (en) Metadata preparing device, preparing method therefor and retrieving device
US20030131362A1 (en) Method and apparatus for multimodal story segmentation for linking multimedia content
JP5218766B2 (en) Rights information extraction device, rights information extraction method and program
JP2004526373A (en) Parental control system for video programs based on multimedia content information
US8453179B2 (en) Linking real time media context to related applications and services
WO2007004110A2 (en) System and method for the alignment of intrinsic and extrinsic audio-visual information
CN101137986A (en) Summarization of audio and/or visual data
JP2005512233A (en) System and method for retrieving information about a person in a video program
RU2413990C2 (en) Method and apparatus for detecting content item boundaries
JP2009027428A (en) Recording/reproduction system and recording/reproduction method
JP4192703B2 (en) Content processing apparatus, content processing method, and program
JP2004520756A (en) Method for segmenting and indexing TV programs using multimedia cues
US7349477B2 (en) Audio-assisted video segmentation and summarization
JP2002354391A (en) Method for recording program signal, and method for transmitting record program control signal
JP2004514350A (en) Program summarization and indexing
JP2007060606A (en) Computer program comprised of automatic video structure extraction/provision scheme
EP2811416A1 (en) An identification method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08712408

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08712408

Country of ref document: EP

Kind code of ref document: A1