US20150019974A1 - Information processing device, information processing method, and program

Information processing device, information processing method, and program

Info

Publication number
US20150019974A1
Authority
US
United States
Prior art keywords
region
display
image
layer
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/273,735
Inventor
Shouichi Doi
Yoshiki Takeoka
Masayuki Takada
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp
Assigned to SONY CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DOI, SHOUICHI; TAKADA, MASAYUKI; TAKEOKA, YOSHIKI
Publication of US20150019974A1
Priority to US16/049,228 (published as US10725734B2)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04847Interaction techniques to control parameter settings, e.g. interaction with sliders or dials

Definitions

  • the present disclosure relates to an information processing device, an information processing method, and a program.
  • JP 2012-103840A discloses a technology for combining and using an input manipulation by a voice and an input manipulation by a gesture.
  • an information processing device including a processor configured to realize an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display, a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region, a voice input acquisition function of acquiring a voice input for the image, and a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.
  • an information processing method including, by a processor defining an address term for at least a partial region of an image to be displayed on a display, displaying the image on the display and temporarily displaying the address term on the display in association with the region, acquiring a voice input for the image, and issuing a command relevant to the region when the address term is included in the voice input.
  • a program causing a computer to realize an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display, a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region, a voice input acquisition function of acquiring a voice input for the image, and a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.
  • FIG. 1 is a block diagram illustrating an overall configuration of a display device according to an embodiment of the present disclosure.
  • FIG. 2 is a diagram illustrating an overall functional configuration realized in the display device according to an embodiment of the present disclosure.
  • FIG. 3 is a diagram for describing a layered structure of regions in an image defined in a first embodiment of the present disclosure.
  • FIG. 4 is a diagram for describing the layered structure of the regions in the image defined in the first embodiment of the present disclosure.
  • FIG. 5 is a diagram illustrating a display example of address terms in the first embodiment of the present disclosure.
  • FIG. 6 is a diagram illustrating a display example of the address terms in the first embodiment of the present disclosure.
  • FIG. 7 is a diagram for describing a layered structure of regions in an image defined in a second embodiment of the present disclosure.
  • FIGS. 8A to 8C are diagrams illustrating a display example of address terms in the second embodiment of the present disclosure.
  • FIGS. 9A to 9C are diagrams illustrating a display example of the address terms in the second embodiment of the present disclosure.
  • FIGS. 10A to 10C are diagrams illustrating a first modification example of the second embodiment of the present disclosure.
  • FIG. 11 is a diagram illustrating a second modification example of the second embodiment of the present disclosure.
  • FIG. 12 is a diagram illustrating the second modification example of the second embodiment of the present disclosure.
  • FIG. 13 is a diagram illustrating a display example of an address term in a third embodiment of the present disclosure.
  • FIG. 1 is a block diagram illustrating an overall configuration of a display device according to an embodiment of the present disclosure.
  • a display device 100 includes a processor 102 , a memory 104 , a storage 106 , a communication module 108 , a display 110 , a speaker 112 , a microphone 114 , an input device 116 , a camera module 118 , and a connection port 120 .
  • the display device 100 may be any of the various devices that display an image on a display 110 according to a user's desire.
  • the display device 100 may be a television, a personal computer (PC), a tablet terminal, a smartphone, a portable media player, or a portable game device including the display 110 .
  • the display device 100 may be a PC, a set-top box, a recorder, or a game device connected to the separately configured display 110 and controlling the display 110.
  • the constituent elements of the display device 100 will be further described.
  • the processor 102 is realized by, for example, a central processing unit (CPU), a digital signal processor (DSP), or an application specific integrated circuit (ASIC) and operates according to programs stored in the memory 104 to realize various functions.
  • the processor 102 acquires various inputs by controlling each unit of the display device 100 and provides various outputs. The detailed functions realized by the processor 102 will be described below.
  • the memory 104 is realized by, for example, a semiconductor memory used as a random access memory (RAM) or a read-only memory (ROM).
  • the memory 104 stores, for example, programs causing the processor 102 to operate.
  • the programs may be read from the storage 106 and may be temporarily loaded on the memory 104 or the programs may be permanently stored in the memory 104 .
  • the programs may be received by the communication module 108 and may be loaded temporarily on the memory 104 .
  • the memory 104 temporarily or permanently stores various kinds of data generated through processes of the processor 102 .
  • the storage 106 is realized by, for example, a storage device such as a magnetic disk (for example, a hard disk drive (HDD)), an optical disc, a magneto-optical disc, or a flash memory.
  • the storage 106 permanently stores, for example, programs causing the processor 102 to operate or various kinds of data generated through processes of the processor 102 .
  • the storage 106 may be configured to include a removable medium or may be included in the display device 100 .
  • the communication module 108 is realized by any of the various communication circuits performing wired or wireless network communication under the control of the processor 102 .
  • the communication module 108 may include an antenna.
  • the communication module 108 performs network communication in conformity with a communication standard of the Internet, a local area network (LAN), Bluetooth (registered trademark), or the like.
  • the display device 100 includes the display 110 and the speaker 112 as output units.
  • the display 110 is realized by, for example, a liquid crystal display (LCD) or an organic electro-luminescence (EL) display. As described above, the display 110 may be integrated with the display device 100 or may be a separate display.
  • the display 110 displays various kinds of information as images under the control of the processor 102 . An example of an image displayed on the display 110 will be described below.
  • the speaker 112 outputs various kinds of information as voices under the control of the processor 102 .
  • the microphone 114 acquires diverse kinds of voices produced in the vicinity of the display device 100, such as voices spoken by a user, and supplies them as voice data to the processor 102.
  • the microphone 114 is used as a voice input unit on the NUI. That is, the voice data provided by the microphone 114 is analyzed by the processor 102 and various commands are executed based on the voices or the like spoken by the user and extracted from the voice data.
  • the input device 116 is another input unit used in the display device 100 .
  • the input device 116 may include, for example, a keyboard, a button, or a mouse.
  • the input device 116 may include a touch sensor arranged at a position corresponding to the display 110 so that a touch panel is configured by the display 110 and the touch sensor.
  • When the display device 100 can be sufficiently manipulated by a voice input using the microphone 114, the separate input device 116 may not be installed.
  • the camera module 118 is realized by, for example, an image sensor such as a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS), an optical system such as a lens controlling formation of a subject image in the image sensor, and a driving circuit driving the image sensor and the optical system.
  • the camera module 118 supplies a still image or a moving image generated by capturing the subject image by the image sensor as image data to the processor 102 .
  • the still image or the moving image generated by the camera module 118 may be displayed as, for example, a through image or a recorded image on the display 110 .
  • the connection port 120 is a port directly connecting an external device to the display device 100 and is realized by, for example, a Universal Serial Bus (USB) port, an IEEE 1394 port, or a High-Definition Multimedia Interface (HDMI) (registered trademark) port.
  • the storage 106 , the display 110 , the speaker 112 , the microphone 114 , and the input device 116 are connected to the processor 102 internally (for example, by a bus or the like), but such constituent elements may be separate from the display device 100 .
  • For example, a display device (an external display or the like), an input device (a keyboard, a mouse, or the like), or a storage device (an external HDD or the like) may be connected to the connection port 120.
  • Devices connected to the connection port 120 are not limited to these examples, but various devices other than the above-described devices may be connected.
  • FIG. 2 is a diagram illustrating an overall functional configuration realized in the display device according to an embodiment of the present disclosure.
  • In the display device 100, an image generation function 151, an address term definition function 153, a display control function 155, a voice input acquisition function 157, and a command issuing function 159 can be realized.
  • Such functions are realized, for example, when the processor 102 of the display device 100 operates according to programs stored in the memory 104 . Any of the foregoing functions refers to a command DB 161 .
  • the command DB 161 may be stored in the storage 106 of the display device 100 and a part or the entirety of the command DB 161 may be read to the memory 104 , as necessary.
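  • The wiring among these functions can be pictured as a small pipeline in which address term definition and command issuing both consult the shared command DB. The Python skeleton below is only an illustrative sketch of that structure; the class, method, and key names are assumptions, not identifiers from this disclosure, and the numbers in the comments refer to the functions of FIG. 2.

```python
class DisplayDeviceFunctions:
    """Minimal, assumption-based wiring of the functions realized by the processor 102."""

    def __init__(self) -> None:
        # Stand-in for the command DB 161: defined terms plus known command names.
        self.command_db = {"terms": {}, "commands": {"zoom in", "zoom out", "select"}}

    def generate_image(self) -> dict:
        # Image generation function 151: here an "image" is reduced to its named regions.
        return {"regions": {"web browser": (0, 0, 960, 540), "text editor": (960, 0, 1920, 540)}}

    def define_address_terms(self, image: dict) -> None:
        # Address term definition function 153: record an address term for each region.
        self.command_db["terms"] = dict(image["regions"])

    def display(self, image: dict) -> None:
        # Display control function 155: show the image and temporarily overlay the terms.
        print("image shown; temporary labels:", ", ".join(self.command_db["terms"]))

    def acquire_voice_input(self) -> str:
        # Voice input acquisition function 157: on the real device this comes via the microphone 114.
        return "zoom in web browser"

    def issue_command(self, text: str) -> None:
        # Command issuing function 159: matching against the command DB is sketched
        # in more detail further below; here the recognized text is only echoed.
        print("recognized voice input:", text)


device = DisplayDeviceFunctions()
image = device.generate_image()
device.define_address_terms(image)
device.display(image)
device.issue_command(device.acquire_voice_input())
```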
  • the image generation function 151 generates an image to be displayed on the display 110 of the display device 100 .
  • the image may include a content image such as a photo or a video or an image of a document (including, for example, web pages described by the hypertext markup language (HTML)) described in various formats.
  • the image may include a graphical user interface (GUI) image used to manipulate the display device 100 .
  • data for displaying such an image may be read from the storage 106 or may be acquired from a server or the like on a network via the communication module 108 .
  • the image generated by the image generation function 151 is displayed on the display 110 by the display control function 155 .
  • the image generation function 151 can generate an image including a plurality of sub-images.
  • the sub-images may be, for example, content images, document images, or GUI images and may include images displayed on the display 110 by arranging such images in predetermined regions.
  • the image generation function 151 can expand and display any of the sub-images on the entire region (full screen) of the display 110 , for example, in response to a command issued by the command issuing function 159 .
  • the image generation function 151 can generate an image in which a subject region is defined.
  • the image may be, for example, a content image such as a photo or a video and the subject region in the image is recognized through an image recognition process or a setting manipulation of the user.
  • the image generation function 151 can expand and display an image using any of the subject regions as a criterion, for example, in response to a command issued by the command issuing function 159 .
  • such an image may be displayed in the entire region (full screen) of the display 110 or may be one of the foregoing sub-images.
  • the address term definition function 153 defines an address term regarding at least a partial region of an image generated by the image generation function 151 and displayed on the display 110 by the display control function 155 .
  • an image displayed on the display 110 can include a region defined as, for example, a sub-image or a subject region in the image.
  • the address term definition function 153 defines an address term for each of the regions so that display of an image on the display 110 can be easily manipulated by a voice input, as will be described below.
  • the address term definition function 153 supplies the defined address term to the display control function 155 to display the address term on the display 110 and stores information regarding the address term in the command DB 161 .
  • the address term definition function 153 defines an address term for each of the display regions of the plurality of sub-images included in the image displayed on the display 110 .
  • the address term definition function 153 may define an address term based on set information of an application function providing each sub-image (for example, address terms such as “web browser” or “media player” can be defined).
  • the address term definition function 153 may define an address term based on a title, text, or the like included in each sub-image (for example, address terms such as “news,” “memo,” and “movie” can be defined).
  • the address term definition function 153 may uniquely define the address term by using, for example, a sequence number (for example, address terms such as “web browser 1” and “web browser 2” can be defined). Further, for example, the address term definition function 153 may define an address term based on the location of each sub-image in an image (for example, address terms such as “top left” and “bottom right” can be defined).
  • the address term definition function 153 may define an address term for the region of a GUI component (for example, which can be a button, a tab, an icon, or the like) included in an application image.
  • the address term definition function 153 may define an address term based on information regarding the GUI component defined in a program providing an application image (for example, an address term such as “header,” “tab,” or “address bar” can be defined).
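  • As a concrete picture of these naming strategies for sub-image display regions, the following sketch derives a term from an application name or title, falls back to a coarse position label, and appends sequence numbers only when a term would otherwise repeat. It is a hedged illustration: the field names ("app," "title," "bounds") and the screen size are assumptions, not part of the disclosure.

```python
from collections import Counter
from typing import Dict, List, Tuple


def position_label(bounds: Tuple[int, int, int, int], screen=(1920, 1080)) -> str:
    """Fallback address term derived from where the region sits in the image."""
    x, y, w, h = bounds
    horizontal = "left" if x + w / 2 < screen[0] / 2 else "right"
    vertical = "top" if y + h / 2 < screen[1] / 2 else "bottom"
    return f"{vertical} {horizontal}"


def define_sub_image_terms(sub_images: List[dict]) -> Dict[str, int]:
    """Map an address term to the index of each sub-image display region."""
    # Base term: application name, else title text, else position in the image.
    bases = [s.get("app") or s.get("title") or position_label(s["bounds"]) for s in sub_images]
    totals = Counter(bases)
    seen: Counter = Counter()
    terms: Dict[str, int] = {}
    for index, base in enumerate(bases):
        if totals[base] > 1:                         # e.g. two browser windows
            seen[base] += 1
            terms[f"{base} {seen[base]}"] = index    # "web browser 1", "web browser 2"
        else:
            terms[base] = index
    return terms


print(define_sub_image_terms([
    {"app": "web browser", "bounds": (0, 0, 960, 540)},
    {"app": "web browser", "bounds": (960, 0, 960, 540)},
    {"title": "memo", "bounds": (0, 540, 960, 540)},
    {"bounds": (960, 540, 960, 540)},
]))
# -> {'web browser 1': 0, 'web browser 2': 1, 'memo': 2, 'bottom right': 3}
```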
  • the address term definition function 153 defines an address term for each of a plurality of subject regions included in an image displayed on the display 110 .
  • the address term definition function 153 may define an address term based on a result of newly performed image analysis or information added as metadata to an image.
  • the address term definition function 153 may define an address term of a subject region based on the name of the subject (for example, an address term such as "parents," "children," "hands," or "faces" can be defined).
  • the address term definition function 153 may define an address term based on the position of each subject region in an image (for example, an address term such as “top left” or “bottom right” can be defined).
  • the address term definition function 153 defines an address term for a manipulation on an image displayed on the display 110 .
  • the "manipulation on an image" mentioned here can be distinguished from the manipulations of the other examples in that the manipulation is performed without designation of a specific region in the image.
  • the address term defined here can be an address term for the entire region of the image displayed on the display 110 .
  • the address term definition function 153 may define an address term based on the name of a manipulation command stored in advance in the command DB 161 (for example, an address term such as “zoom out” or “scroll down” can be defined).
  • the display control function 155 displays an image generated by the image generation function 151 on the display 110 and temporarily displays address terms defined by the address term definition function 153 on the display 110 in association with regions corresponding to the address terms in the image. For example, the display control function 155 displays the address term as text at locations somewhere in the corresponding regions.
  • the display control function 155 may display frame borders or the like indicating the regions corresponding to the address terms and may display the address terms and the frame borders or the like in association therewith.
  • the address terms (and the frame borders or the like) are displayed only temporarily by the display control function 155.
  • the display control function 155 may start the display of an image newly generated by the image generation function 151 on the display 110 and then display the address terms (and the frame borders or the like) on the display 110 only for a predetermined time. After the predetermined time passes, only the image generated by the image generation function 151 can be displayed on the display 110 .
  • Since the address terms defined by the address term definition function 153 are temporarily displayed on the display 110 in association with the regions corresponding to these address terms, the user can easily recognize by which address term the user may specify a manipulation target when manipulating the display device 100 using a voice input, as will be described below. Also, since the address terms (and the frame borders or the like) are displayed only temporarily (for example, hidden after a predetermined time passes), it is possible to ensure visibility of the image displayed on the display 110.
  • the display control function 155 can resume the display of the address terms (and the frame borders or the like) according to a command issued by the command issuing function 159 .
  • the display control function 155 may continue to display the address terms (and the frame borders or the like), for example, until a separate command is issued from the command issuing function 159 .
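  • The temporary nature of this display can be sketched as a timed overlay: show the labels and frame borders, hide them after a predetermined time, and allow them to be shown again (optionally pinned until a separate command) when redisplay is requested. The class below is a minimal stand-in under those assumptions, not the disclosed implementation.

```python
import threading
from typing import Iterable, Optional


class AddressTermOverlay:
    """Temporarily overlays address terms (and frame borders) over the displayed image."""

    def __init__(self, duration: float = 5.0) -> None:
        self.duration = duration        # predetermined display time, in seconds (assumed value)
        self.visible = False
        self._timer: Optional[threading.Timer] = None

    def show(self, terms: Iterable[str], pin: bool = False) -> None:
        self.visible = True
        print("overlay:", ", ".join(terms))          # stand-in for drawing labels and borders
        if self._timer is not None:
            self._timer.cancel()
        if not pin:                                   # a pinned overlay waits for an explicit hide command
            self._timer = threading.Timer(self.duration, self.hide)
            self._timer.start()

    def hide(self) -> None:
        self.visible = False
        print("overlay hidden; only the image remains on the display")


overlay = AddressTermOverlay(duration=5.0)
overlay.show(["Web browser", "Text editor", "Task bar"])   # hidden again after 5 seconds
```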
  • the command DB 161 stores the information regarding the address terms defined by the address term definition function 153 , as described above.
  • the information regarding the address term can include, for example, the address term itself and information for specifying a region specified by the address term in an image.
  • For the display region of a sub-image, identification information of an application function providing the sub-image (for example, a process ID, a window ID, or the like) can be stored along with the address term.
  • For the region of a GUI component, identification information of the GUI component (for example, an ID given to a button, a tab, or the like) can be stored along with the address term.
  • For a subject region, identification information of the subject region (for example, coordinate information in the image) can be stored along with the address term.
  • the command DB 161 stores the names of manipulation commands defined in advance.
  • the names of the manipulation commands are referred to when the address term definition function 153 sets the address term for a manipulation on an image displayed on the display 110 and are also referred to when the command issuing function 159 to be described below analyzes a voice input from the user.
  • the manipulation commands include manipulation commands performed without designating specific regions in an image, as described above, and also include manipulation commands performed by designating specific regions in an image (for example, which can be manipulation commands such as “zoom in” and “select”).
  • the command issuing function 159 specifies a kind of command instructed by a voice input by referring to the names of the manipulation commands stored in the command DB 161 and also specifies a region of a target by referring to the information regarding the address terms stored in the command DB 161.
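  • One possible in-memory shape for this information is sketched below: each address term is stored with enough identification information to locate its region, and the names of the manipulation commands are kept alongside. The entry fields mirror the kinds of identification information listed above, but the structure itself is an assumption, not the disclosed layout of the command DB 161.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class AddressTermEntry:
    """One defined address term plus information identifying its region."""
    term: str
    region_kind: str                   # "sub_image", "gui_component", or "subject_region"
    identifier: object                 # e.g. a process/window ID, a widget ID, or image coordinates
    parent_term: Optional[str] = None  # superordinate-layer region, if any (cf. FIG. 4)


@dataclass
class CommandDatabase:
    """Illustrative stand-in for the command DB 161."""
    address_terms: Dict[str, AddressTermEntry] = field(default_factory=dict)
    command_names: Set[str] = field(
        default_factory=lambda: {"zoom in", "zoom out", "select", "show", "scroll down"}
    )

    def register(self, entry: AddressTermEntry) -> None:
        self.address_terms[entry.term.lower()] = entry

    def find_term(self, recognized_text: str) -> Optional[AddressTermEntry]:
        # Return the first registered address term that appears in the recognized text.
        text = recognized_text.lower()
        for term, entry in self.address_terms.items():
            if term in text:
                return entry
        return None
```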
  • the voice input acquisition function 157 acquires a voice input for an image displayed on the display 110 . More specifically, when an image is displayed on the display 110 by the display control function 155 , the voice input acquisition function 157 acquires the voice input of the user acquired via the microphone 114 . As described above, since the microphone 114 acquires diverse kinds of voices produced near the display device 100 , the voice input acquisition function 157 may perform a process of extracting a predetermined voice such as a speech of the user from acquired voice data. Alternatively, apart from the voice input acquisition function 157 , a preprocessing unit (not illustrated) performing the foregoing process may be provided.
  • When an address term is included in the acquired voice input, the command issuing function 159 issues a command relevant to the region corresponding to the address term. More specifically, based on text extracted through voice recognition from the acquired voice input, the command issuing function 159 retrieves the address term and the name of the manipulation command included in the text with reference to the command DB 161. When the corresponding address term and the name of the manipulation command are found, the command issuing function 159 issues the manipulation command corresponding to the name included in the text with respect to the region corresponding to the address term included in the text. The command can be issued to, for example, the image generation function 151 or the display control function 155.
  • the command issuing function 159 issues a command for expanding and displaying any of the sub-images on the entire region (full screen) of the display 110 to the image generation function 151 .
  • which sub-image is expanded and displayed can be determined based on the address term included in the voice input with reference to the command DB 161 .
  • the command issuing function 159 issues a command for expanding and displaying the image using any of the subject regions as a criterion to the image generation function 151 .
  • which subject region is used as the criterion to expand and display the image can be determined based on the address term included in the voice input with reference to the command DB 161 .
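  • Building on the CommandDatabase sketch above (itself an assumption rather than the disclosed structure), command issuing then reduces to finding a manipulation command name and an address term in the recognized text and handing the pair to the image generation function 151 or the display control function 155:

```python
from typing import Optional, Tuple


def issue_command(recognized_text: str, db: "CommandDatabase") -> Optional[Tuple[str, str]]:
    """Return (command_name, address_term) extracted from the recognized text, if both are present."""
    text = recognized_text.lower()
    command = next((name for name in db.command_names if name in text), None)
    entry = db.find_term(text)
    if command is None or entry is None:
        return None                     # nothing to issue for this utterance
    return command, entry.term


# For example, "Zoom in 'Web browser'" would resolve to ("zoom in", "web browser"),
# which the device would then act on by expanding that sub-image to the full screen.
```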
  • FIGS. 3 and 4 are diagrams for describing a layered structure of regions in an image defined in a first embodiment of the present disclosure.
  • regions in an image for which address terms are defined by the address term definition function 153 can be set according to a layered structure to be described below.
  • a region of Layer 1 is a superordinate-layer region of a region of Layer 2
  • a region of Layer 2 is a superordinate-layer region of a region of Layer 3
  • a region of Layer 3 is a superordinate-layer region of a region of Layer 4 .
  • a subordinate layer region is included in a superordinate layer region.
  • In the illustrated example, all of the subordinate layer regions are entirely included in a superordinate layer region. In another example, however, only a part of a subordinate layer region may be included in a superordinate layer region.
  • such a layered structure is used to select a region in an image for which an address term is displayed.
  • Rather than displaying address terms for all of the regions from Layer 1 to Layer 4 on the display 110 along with an image at once, the display control function 155 can display on the display 110 the address terms of the subordinate layer regions included in the currently displayed superordinate layer region.
  • the display control function 155 displays the address terms of regions (subordinate layer regions) of Layer 2 included in the region of Layer 1 on the display 110 .
  • the display control function 155 displays the address terms of regions (subordinate layer regions) of Layer 3 included in the displayed region of Layer 2 on the display 110 .
  • the layered structure described above may be defined as a parent-child relation or a link relation as in the example illustrated in FIG. 4 .
  • the illustrated example shows a case in which two regions (Layer 2-1 and Layer 2-2) of Layer 2 are included in a region (only one region is set and corresponds to the entire image, for example) of Layer 1, three regions (Layer 3-1, Layer 3-2, and Layer 3-3) of Layer 3 are included in Layer 2-1, and so on.
  • Such a relation is specified when the address term definition function 153 defines an address term for each region. Thereafter, the address term can be used by the display control function 155 or the command issuing function 159 .
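  • A parent-child relation like the one in FIG. 4 can be represented as a small tree. The sketch below (with purely illustrative region and term names) also shows the selection rule described above, in which only the subordinate regions of the currently displayed region are labeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """A node in the layered structure of regions (cf. FIG. 4)."""
    name: str
    layer: int
    term: str = ""                        # address term defined for this region, if any
    children: List["Region"] = field(default_factory=list)

    def add(self, child: "Region") -> "Region":
        self.children.append(child)
        return child


# The relation of FIG. 4: one Layer 1 region containing two Layer 2 regions,
# with three Layer 3 regions under Layer 2-1 (remaining branches omitted here).
layer1 = Region("Layer 1", 1)
layer2_1 = layer1.add(Region("Layer 2-1", 2, term="Web browser"))
layer2_2 = layer1.add(Region("Layer 2-2", 2, term="Text editor"))
for i in range(1, 4):
    layer2_1.add(Region(f"Layer 3-{i}", 3, term=f"Component {i}"))


def terms_to_display(current: Region) -> List[str]:
    """Only the subordinate regions of the currently displayed region are labeled."""
    return [child.term or child.name for child in current.children]


print(terms_to_display(layer1))     # ['Web browser', 'Text editor'] while Layer 1 is shown
print(terms_to_display(layer2_1))   # ['Component 1', 'Component 2', 'Component 3'] after zooming to Layer 2-1
```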
  • a specific use example of the layered structure of the regions will be described below.
  • FIGS. 5 and 6 are diagrams illustrating display examples of address terms according to the first embodiment of the present disclosure.
  • an address term is defined for each of the display regions of a plurality of sub-images included in an image displayed on the display 110 .
  • a region of Layer 1 is set to correspond to the entire image displayed on the display 110
  • regions of Layer 2 are defined for display regions of sub-images included in the image.
  • the address term definition function 153 can define an address term to correspond to each of the display regions of a web browser (Layer 2 - 1 ), a development tool (Layer 2 - 2 ), a text editor (Layer 2 - 3 ), and a task bar (Layer 2 - 4 ).
  • the display device 100 can be, for example, a PC.
  • the display control function 155 displays the image (the region of Layer 1) on the display 110 and temporarily displays an address term defined for a region corresponding to each sub-image and a frame border indicating the region corresponding to the address term on the display 110.
  • In the illustrated example, an address term AT 2-1 corresponding to the region of Layer 2-1 (web browser), an address term AT 2-2 corresponding to the region of Layer 2-2 (development tool), an address term AT 2-3 corresponding to the region of Layer 2-3 (text editor), and an address term AT 2-4 corresponding to the region of Layer 2-4 (task bar) are temporarily displayed on the display 110 by the display control function 155.
  • the display control function 155 ends the display of the address terms AT 2-1 to AT 2-4 and the frame borders, and the display returns to display of only the image (the region of Layer 1).
  • the user can perform a manipulation on the regions corresponding to the address terms AT 2 - 1 to AT 2 - 4 during or after the display of the address terms AT 2 - 1 to AT 2 - 4 by giving a voice input including the address terms AT 2 - 1 to AT 2 - 4 to the display device 100 .
  • For example, when the user gives a voice input "Zoom in 'Web browser'" to the display device 100, a zoom-in manipulation can be performed on the display region of the web browser corresponding to the region of Layer 2-1, as will be described below, by the command issuing function 159 to which this voice input is supplied by the voice input acquisition function 157.
  • Similarly, for example, display of the task bar corresponding to the region of Layer 2-4 can be hidden in response to a corresponding voice input.
  • the command issuing function 159 finds the name of a manipulation command, "Zoom in," and the address term "Web browser" (corresponding to the address term AT 2-1 displayed in the example of FIG. 5) based on text extracted through voice recognition from the voice input of the foregoing speech with reference to the command DB 161.
  • the command issuing function 159 issues a command to give a request for expanding and displaying the sub-image (corresponding to the region of Layer 2 - 1 ) of the web browser in the entire region of the image displayed on the display 110 to the image generation function 151 .
  • the image generation function 151 changes the image displayed on the display 110 from the image for displaying the region of Layer 1 to an image for expanding and displaying the entire region of Layer 2 - 1 .
  • the display control function 155 displays a new image (the region of Layer 2 - 1 ) on the display 110 and temporarily displays the address term defined for the region of the GUI component included in the image and the frame border indicating the region corresponding to this address term on the display 110 .
  • an address term AT 3 - 1 corresponding to the region of Layer 3 - 1 (tab), an address term AT 3 - 2 corresponding to the region of Layer 3 - 2 (address bar), an address term AT 3 - 3 corresponding to the region of Layer 3 - 3 (header), an address term AT 3 - 4 corresponding to the region of Layer 3 - 4 (body), and an address term AT 3 - 5 corresponding to the region of Layer 3 - 5 (options) are temporarily displayed on the display 110 by the display control function 155 .
  • the address terms AT 3 - 1 to AT 3 - 5 temporarily displayed when the new image (the region of Layer 2 - 1 ) is displayed on the display 110 may be newly defined by the address term definition function 153 .
  • Alternatively, the address terms AT 3-1 to AT 3-5 may be defined in advance by the address term definition function 153 and stored along with data of the layered structure illustrated in FIG. 4 in the command DB 161 or the like.
  • the display control function 155 temporarily displays address terms C 1 (corresponding to a manipulation of zoom out) and C 2 (corresponding to a manipulation of scroll down) in the manipulation on the image displayed on the display 110 along with the foregoing address terms AT 3 - 1 to AT 3 - 5 on the display 110 . Since the image of the web browser is displayed in the entirety of the new image (the region of Layer 2 - 1 ) displayed on the display 110 , for example, a manipulation such as zoom out or scroll down can be performed without designating a specific region in the image.
  • the address term definition function 153 extracts the names of manipulation commands executable for the image of the web browser from among the names of the manipulation commands stored in advance in the command DB 161 and defines address terms corresponding to the extracted names of the manipulation commands.
  • the display control function 155 temporarily displays the address terms C 1 and C 2 on the display 110 . For example, positions at which the address terms C 1 and C 2 are displayed can be near ends or near corners of the display 110 so that view of displayed images is not obstructed.
  • the display control function 155 ends the display of the address terms AT 3-1 to AT 3-5 and the frame borders and the display of the address terms C 1 and C 2, and the display returns to display of only the image (the region of Layer 2-1).
  • the user can perform a manipulation on the regions corresponding to the address terms AT 3 - 1 to AT 3 - 5 or the address terms C 1 and C 2 by giving a voice input including the address terms AT 3 - 1 to AT 3 - 5 or the address terms C 1 and C 2 to the display device 100 .
  • the command issuing function 159 issues a command to give a request for displaying the address terms AT 3 - 1 to AT 3 - 5 and the frame borders and displaying the address terms C 1 and C 2 again on the display 110 to the display control function 155 .
  • the display control function 155 displays the foregoing address terms and the frame borders again on the display 110 .
  • the address terms and the frame borders displayed at this time may disappear at a predetermined time after start of the display of the address terms and the frame borders, as in the initial display of the address terms and the frame borders, or may be set not to disappear automatically in consideration of intentional calling of the user.
  • the display control function 155 can end the display of the address terms and the frame borders.
  • FIG. 7 is a diagram for describing a layered structure of regions in an image defined in the second embodiment of the present disclosure.
  • the entire image of a photo displayed on the display 110 is a region of Layer 1 .
  • Two regions (Layer 2 - 1 and Layer 2 - 2 ) of Layer 2 included in the region of Layer 1 are defined.
  • the two regions (Layer 2 - 1 and Layer 2 - 2 ) of Layer 2 include two regions (Layer 3 - 1 and Layer 3 - 2 ) and two regions (Layer 3 - 3 and Layer 3 - 4 ) of Layer 3 , respectively.
  • the two regions (Layer 3 - 3 and Layer 3 - 4 ) further include two regions (Layer 4 - 1 and Layer 4 - 2 ) and two regions (Layer 4 - 3 and Layer 4 - 4 ) of Layer 4 , respectively.
  • the region of each layer equivalent to or subordinate to Layer 2 can include, for example, a subject region specified based on a result of image analysis. In this case, not all of the regions are necessarily defined based on the result of the image analysis.
  • For example, only a region of a superordinate layer (for example, a region of Layer 2) may be defined based on the result of the image analysis.
  • the subject region may be specified based on a setting manipulation of the user in addition to or instead of the image analysis.
  • the subject region may include a region determined when the user approves or corrects candidate regions suggested based on the result of the image analysis.
  • a region may be defined in advance by metadata incidental to image data of a photo or may be newly defined based on a result obtained by performing image analysis at the time of display of an image.
  • FIGS. 8A to 9C are diagrams illustrating display examples of address terms according to the second embodiment of the present disclosure.
  • an address term is defined for each of a plurality of subject regions included in an image displayed on the display 110 .
  • the region of Layer 1 is set to correspond to the entire image of a photo displayed on the display 110 (or the image of the photo may not necessarily be displayed on the entire display 110 , that is, the image of the photo may be displayed as one of the sub-images in the foregoing first embodiment) and regions equivalent to or subordinate to Layer 2 are defined for the subject regions included in the image.
  • the display control function 155 temporarily displays an address term defined for the region corresponding to each of the regions of Layer 2 and a frame border indicating the region corresponding to the address term on the display 110 .
  • an address term AT 2 - 1 corresponding to the region of Layer 2 - 1 and an address term AT 2 - 2 corresponding to the region of Layer 2 - 2 are temporarily displayed on the display 110 by the display control function 155 .
  • the address term definition function 153 defines the address terms displayed as the address terms AT 2-1 and AT 2-2, for example, based on the names of subjects specified by image analysis or an input from the user. For example, when metadata incidental to the image data of the photo records the fact that subjects included in the region of Layer 2-1 are parents and subjects included in the region of Layer 2-2 are children, the address term definition function 153 can define address terms for the region of Layer 2-1 and the region of Layer 2-2 as "parents" and "children," respectively.
  • the display control function 155 ends the display of the address terms AT 2-1 and AT 2-2 and the frame borders, and the display returns to the display of the simple image (the region of Layer 1).
  • the user can perform a manipulation on the regions corresponding to the address terms AT 2 - 1 and AT 2 - 2 during or after the display of the address terms AT 2 - 1 and AT 2 - 2 by giving a voice input including the address terms AT 2 - 1 and AT 2 - 2 to the display device 100 .
  • Before the display of the address terms and the frame borders ends, the user gives a voice input "Select 'Children'" to the display device 100.
  • the command issuing function 159 finds the name of a manipulation command, “Select,” and an address term (corresponding to the address term AT 2 - 2 displayed in the example of FIG. 8A ) of the region, “Children,” based on text extracted through voice recognition from the voice input with reference to the command DB 161 .
  • the command issuing function 159 issues a command to give a request for allowing the region of Layer 2-2 to enter a selection state to the image generation function 151.
  • the image generation function 151 generates an image for displaying the region of Layer 2-2 in the selection state in the image displayed on the display 110, and the display control function 155 displays this image on the display 110, as illustrated in FIG. 8B.
  • the region of Layer 2 - 2 can be expressed in the selection state, for example, by darkly displaying a region other than the region of Layer 2 - 2 or displaying the region of Layer 2 - 2 in connection with the frame border (the expressions are not necessarily shown in FIG. 8B ).
  • the display control function 155 displays a new image (an image in which the region of Layer 2 - 2 is in the selection state) on the display 110 and temporarily displays address terms defined for more subordinate layer regions included in the region of Layer 2 - 2 in the selection state, that is, the regions of Layer 3 - 3 and Layer 3 - 4 , and frame borders indicating the regions corresponding to the address terms along with the address term and the frame border defined for each of the regions of Layer 2 on the display 110 .
  • address terms AT 3 - 3 and AT 3 - 4 are further temporarily displayed on the display 110 in addition to the address terms AT 2 - 1 and AT 2 - 2 by the display control function 155 .
  • the display control function 155 ends the display of the foregoing address terms and frame borders and the display returns to the display of the simple image (the image in which the region of Layer 2 - 2 is in the selection state).
  • the user can perform a manipulation on the regions corresponding to the address terms during and after the display of the address terms and the frame borders by giving a voice input including the address terms to the display device 100 .
  • the user gives a voice input “Zoom in ‘Boy’” to the display device 100 .
  • the command issuing function 159 finds the name of a manipulation command, “Zoom in,” and an address term of the region, “Boy,” based on text extracted through voice recognition from the voice input with reference to the command DB 161 .
  • the command issuing function 159 issues a command to give a request for expanding and displaying the image using the region of Layer 3 - 3 as a criterion to the image generation function 151 .
  • the image generation function 151 generates an image expanded using the region of Layer 3 - 3 as the criterion and the display control function 155 displays this image on the display 110 , as illustrated in FIG. 8C .
  • the expansion and display of the image using the region of Layer 3 - 3 as the criterion can be realized right away (the regions of Layer 2 are skipped) from the state in which the image of the region of Layer 1 is displayed.
  • the display control function 155 displays a new image (the image expanded using the region of Layer 3 - 3 as the criterion) on the display 110 and temporarily displays address terms defined for more subordinate layer regions included in the region of Layer 3 - 3 used as the criterion of the expansion, that is, the regions of Layer 4 - 1 and Layer 4 - 2 , and frame borders indicating the regions corresponding to the address terms along with the address term and the frame border defined for the region of Layer 3 - 3 on the display 110 .
  • Since the image is expanded and displayed using the region of Layer 3-3 as the criterion and the region of Layer 3-3 is not necessarily displayed on the entire display 110 (in many cases, an aspect ratio of the subject region does not match an aspect ratio of the display 110), it can be useful to display the address terms and the regions defined for the region of Layer 3-3 even after the image is expanded and displayed.
  • an address term and a region defined for the region of Layer 3 - 4 may also be temporarily displayed on the display 110 by the display control function 155 .
  • the display control function 155 temporarily displays an address term C 1 (corresponding to a manipulation of zoom out) for a manipulation on the image displayed on the display 110 along with the foregoing address term on the display 110 . Since the original image is expanded and displayed in the new image (the image expanded using the region of Layer 3 - 3 as the criterion) displayed on the display 110 , for example, the manipulation of zoom out can be performed without designating a specific region in the image.
  • the address term definition function 153 extracts the names of manipulation commands executable on the image among the names of the manipulation commands stored in advance in the command DB 161 and defines address terms corresponding to the extracted names of the manipulation commands. As a result, the display control function 155 temporarily displays the address term C 1 on the display 110 .
  • the display control function 155 ends the display of the foregoing address terms and frame borders and the display returns to the display of the simple image (the image expanded using the region of Layer 3 - 3 as the criterion).
  • the user can perform a manipulation on the regions corresponding to the address terms during or after the display of the address terms by giving a voice input including the address terms to the display device 100 .
  • the user gives a voice input “Hand” to the display device 100 before the display of the address terms and the frame borders ends.
  • the command issuing function 159 finds the name of the region “Hand” based on text extracted through voice recognition from the voice input with reference to the command DB 161 .
  • the command issuing function 159 estimates a manipulation command for a region (region of Layer 4 - 1 ) specified as a manipulation target with reference to the command DB 161 .
  • the command issuing function 159 can recognize that selection of the region of Layer 4 - 1 is meaningless due to the fact that Layer 4 is the lowest layer and there are no more subordinate layer regions and can estimate that the manipulation command is “zoom in” based on this recognition.
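  • This estimation step can be pictured, reusing the Region sketch from earlier in this section, as a small default rule; it is an illustrative reading of the behavior described above, not a stated algorithm. A lowest-layer region has no subordinate regions left to reveal, so zooming in is assumed; otherwise the region is simply selected.

```python
def estimate_command(region: "Region") -> str:
    """Default manipulation when a voice input names a region but no command."""
    # Selecting a leaf region would be meaningless, so assume the user wants to zoom in.
    return "zoom in" if not region.children else "select"
```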
  • the command issuing function 159 issues a command to give a request for expanding and displaying an image using the region of Layer 4 - 1 as a criterion to the image generation function 151 .
  • the image generation function 151 displays an image expanded using the region of Layer 4 - 1 as the criterion on the display 110 , as illustrated in FIG. 9B .
  • the display control function 155 displays no address term of a new region on the display 110 since Layer 4 is the lowest layer, as described above.
  • the display control function 155 temporarily displays an address term C 2 corresponding to the zoom out manipulation on the display 110 .
  • the display position (the top right corner of the image) of the address term C 2 may be set to a position different from the display position (the bottom left corner of the image) of the address term C 1 up to FIG. 9A .
  • the display control function 155 ends the display of the address term AT 4-1 and the corresponding frame border and the display returns to the display of the simple image (the image expanded using the region of Layer 4-1 as the criterion).
  • the user can perform a manipulation on the regions corresponding to the address terms during or after the display of the address terms by giving a voice input including the address terms to the display device 100 .
  • the user gives a voice input “Show ‘Girl's Face’” to the display device 100 before the display of the address terms and the frame borders ends.
  • the command issuing function 159 extracts the name of the manipulation command, “Show,” and the address term of the region, “Girl's Face,” based on text extracted through voice recognition from the voice input with reference to the command DB 161 .
  • the command issuing function 159 retrieves information regarding the address term stored in the command DB 161 in association with the layered structure of the regions.
  • the command issuing function 159 analyzes the text “Girl's Face,” and first retrieves “Girl” and subsequently retrieves “Face.” This is because there is a probability of an address term for a more subordinate layer region, for example, “Face,” being redundantly defined in other subordinate layer regions included in mutually different superordinate layer regions.
  • the command issuing function 159 first finds the address term “Girl” defined for the region of Layer 3 - 4 as the result obtained by retrieving the address term “Girl” from the command DB 161 . Then, the command issuing function 159 finds the address term “Face” defined for the region of Layer 4 - 4 as the result obtained by retrieving the address term “Face” in subordinate layer regions included in the region of Layer 3 - 4 from the command DB 161 . Based on the above retrieval results, the command issuing function 159 issues a command to give a request for expanding and displaying the image using the region of Layer 4 - 4 as a criterion to the image generation function 151 .
  • the image generation function 151 expands and displays the image expanded using the region of Layer 4 - 4 as the criterion, as illustrated in FIG. 9C .
  • a manipulation intended by the user can be realized even for the voice input including the address term which is not displayed at that time on the display 110 but which the user remembers as being previously displayed, since the address terms of the regions previously defined and displayed are stored in the command DB 161 .
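  • The retrieval order used for "Girl's Face" can be sketched against the Region tree introduced earlier: resolve the superordinate word first, then search the subordinate word only among that region's descendants, so that a term such as "Face," redundantly defined under other superordinate regions, is not matched by mistake. The possessive-phrase parsing below is a simplifying assumption.

```python
from typing import Optional


def find_descendant(node: "Region", word: str) -> Optional["Region"]:
    """Depth-first search for an address term, limited to the subtree of `node`."""
    for child in node.children:
        if child.term.lower() == word.lower():
            return child
        hit = find_descendant(child, word)
        if hit is not None:
            return hit
    return None


def resolve_compound_term(root: "Region", phrase: str) -> Optional["Region"]:
    """Resolve a compound address such as "Girl's Face" against the layered structure."""
    words = phrase.replace("'s", "").split()   # "Girl's Face" -> ["Girl", "Face"]
    current = root
    for word in words:
        current = find_descendant(current, word)
        if current is None:
            return None                        # some word is not a known address term in this subtree
    return current
```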
  • FIGS. 10A to 10C are diagrams illustrating a first modification example of the second embodiment of the present disclosure.
  • In FIGS. 10A to 10C, when the speech "Zoom in 'Boy'" of the user described above with reference to FIGS. 8B and 8C is acquired as a voice input, a change in display of the display 110 at the time of the expansion of display of an image using the region of Layer 3-3 as a criterion is shown together with the display during transition (FIG. 10B).
  • During the transition of the image displayed on the display 110 from the image of the region of Layer 1 to the image expanded and displayed using the region of Layer 3-3 as the criterion, text T including the address term included in the voice input and recognized by the display device 100 (the command issuing function 159) is displayed by the image generation function 151.
  • the text T may be text indicating a command issued by the command issuing function 159 of the display device 100 .
  • the user can recognize that the display device 100 operates according to an intention of the user. For example, when the display device 100 does not correctly recognize a user's voice input and a command unintended by the user is performed, the user can understand what has happened based on such a display.
  • FIGS. 11 and 12 are diagrams illustrating a second modification example of the second embodiment of the present disclosure.
  • the layered structure of the regions including the subject regions has been set for a photo displayed on the display 110 .
  • In the example of FIG. 7, all of the subordinate layer regions are included in the superordinate layer regions, but that is not necessarily the case in this modification example.
  • only a part of a subordinate layer region may be included in a superordinate layer region in some cases.
  • only a part of the region of Layer 4 - 1 is included in the region of Layer 3 - 3 which is a superordinate layer region and the other part thereof is outside of the region of Layer 3 - 3 .
  • only a part of the region of Layer 4 - 3 is included in the region of Layer 3 - 4 which is a superordinate layer region and the other part thereof is outside of the region of Layer 3 - 4 .
  • a subordinate layer region which is entirely included in a superordinate layer region can be present.
  • FIG. 12 illustrates an example of the expansion and display of an image when a region is set as in the example of FIG. 11 .
  • a relation recognized between the superordinate layer region and the subordinate layer region can be relatively slight. Accordingly, for example, even when an image is expanded and displayed using the region of Layer 3 - 3 as a criterion, not only address terms (AT 4 - 1 and AT 4 - 2 ) of the subordinate layer regions (Layer 4 - 1 and Layer 4 - 2 ) included in Layer 3 - 3 but also an address term (AT 4 - 3 ) of the subordinate layer region (Layer 4 - 3 ) included in Layer 3 - 4 can be displayed.
  • When address terms would overlap (in the illustrated example, both the region of Layer 4-1 and the region of Layer 4-3 correspond to a hand), the address term definition function 153 adds sequence numbers to distinguish the address terms from each other, so that the address term AT 4-1 becomes "Hand1" and the address term AT 4-3 becomes "Hand2."
  • When the address terms do not overlap, the address term definition function 153 may omit the addition of the sequence numbers.
  • regions in an image can be configured as in the example illustrated in FIG. 7 and can also be configured as in the example illustrated in FIG. 11 .
  • For example, when the image is expanded and displayed using the region of Layer 3-3 as a criterion, the address terms of Layer 4-1 and Layer 4-2, which are subordinate layer regions of the region of Layer 3-3, are temporarily displayed.
  • In addition, the address term of the region of Layer 3-4 included in the display range of the image and the address terms of Layer 4-3 and Layer 4-4, which are subordinate layer regions of the region of Layer 3-4, can also be temporarily displayed.
  • When the address term of the region of Layer 4-1 or Layer 4-2 and the address term of the region of Layer 4-3 or Layer 4-4 overlap, the address terms can be distinguished from each other by adding sequence numbers, as described above, or the overlapping can be resolved by not displaying the overlapping address term of the region of Layer 4-3 or Layer 4-4.
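  • The sequence-numbering rule can be sketched as a small post-processing pass over the address terms about to be displayed (a minimal illustration; the function name is an assumption): colliding terms receive numeric suffixes, and unique terms are left as they are so the number can be omitted for them.

```python
from collections import Counter
from typing import List


def disambiguate(terms: List[str]) -> List[str]:
    """Append sequence numbers only where the same address term would appear more than once."""
    totals = Counter(terms)
    seen: Counter = Counter()
    result: List[str] = []
    for term in terms:
        if totals[term] > 1:
            seen[term] += 1
            result.append(f"{term}{seen[term]}")   # "Hand" -> "Hand1", "Hand2"
        else:
            result.append(term)                    # unique terms keep their original form
    return result


print(disambiguate(["Hand", "Face", "Hand"]))      # ['Hand1', 'Face', 'Hand2']
```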
  • FIG. 13 is a diagram illustrating a display example of address terms in the third embodiment of the present disclosure.
  • an address term is defined for each of the display regions of a plurality of sub-images included in an image displayed on the display 110 .
  • the region of Layer 1 is defined in the entire image displayed on the display 110 and regions of Layer 2 are defined for display regions of the sub-images included in the image.
  • the image displayed on the display 110 is an image displayed in a presentation by a presenter and includes a currently displayed slide (page) and sub-images indicating previously displayed graphs, pages, or the like.
  • the address term definition function 153 can define address terms to correspond to the display regions of the main text (Layer 2-1) and the right graph (Layer 2-2) under the title "Current Slide," and of the last graph (Layer 2-3), the last page (Layer 2-4), and the next page (Layer 2-5) displayed separately from "Current Slide."
  • the display control function 155 displays the image (the region of Layer 1 ) on the display 110 and temporarily displays an address term defined for a region corresponding to each sub-image and a frame border indicating a region corresponding to the address term on the display 110 .
  • an address term AT 2 - 1 corresponding to the region of Layer 2 - 1 (the main text), an address term AT 2 - 2 corresponding to the region of Layer 2 - 2 (the right graph), an address term AT 2 - 3 corresponding to the region of Layer 2 - 3 (the last graph), an address term AT 2 - 4 corresponding to the region of Layer 2 - 4 (the last page), and an address term AT 2 - 5 corresponding to the region of Layer 2 - 5 (the next page) are temporarily displayed on the display 110 by the display control function 155 .
  • a voice input of the user given to the display device 100 can be, for example, “Show ‘Last graph’” or “Go to ‘Next page.’”
  • the command issuing function 159 issues a command to give a request for enlarging and displaying the graph referred to previously again, a command to give a request for proceeding to a next slide (page), or the like to the image generation function 151 .
  • When the image generation function 151 newly displays the slide (page) displayed as "Next page" in the drawing in the region of "Current Slide" based on the command to give a request for proceeding to the next slide (page), the slide (page) displayed as "Current Slide" up to that time is displayed as "Last page."
  • the address term definition function 153 may define address terms based on a chronological change in display forms of the regions. For example, when the slide (page) displayed as “Current Slide” at that time is displayed to be smaller at a location separate from “Current slide,” the address term definition function 153 may define an address term with the same meaning as that of “Last page” for the region corresponding to this slide (page).
  • Embodiments of the present disclosure can include, for example, the information processing device described above (described as the display device), a system, an information processing method executed in the information processing device or the system, a program causing the information processing device to function, and a non-transitory computer-readable storage medium having a program stored therein.
  • The subordinate layer region may correspond to a display region of each of a plurality of sub-images included in the image.
  • The superordinate layer region may correspond to an application image displayed in an entire region of the image, and the subordinate layer region may correspond to a region of a GUI component included in the application image.

Abstract

There is provided an information processing device including a processor configured to realize an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display, a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region, a voice input acquisition function of acquiring a voice input for the image, and a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of Japanese Priority Patent Application JP 2013-144449 filed Jul. 10, 2013, the entire contents of which are incorporated herein by reference.
  • BACKGROUND
  • The present disclosure relates to an information processing device, an information processing method, and a program.
  • Of user interfaces of computers, natural user interfaces (NUIs) realizing manipulations in more natural and more intuitive operations for users have recently been popularized widely. Well-known natural user interfaces are NUIs in which voices spoken by users, gestures made by users, or the like are used as input manipulations. Such input manipulations are individually used in many cases. However, for example, JP 2012-103840A discloses a technology for combining and using an input manipulation by a voice and an input manipulation by a gesture.
  • SUMMARY
  • However, for example, when a plurality of UI components are intended to be selectively manipulated by an NUI, it is difficult for a user to understand which symbol (for example, an address term by a voice) is set in order to specify a manipulation target UI component in some cases. Although the technology disclosed in JP 2012-103840A described above contributes to an improvement in a user experience on an NUI, the technology may not necessarily be said to sufficiently deal with the above-mentioned point.
  • Accordingly, it is desirable to provide a novel and improved information processing device, a novel and improved information processing method, and a novel and improved program capable of notifying a user of a symbol for specifying a manipulation target on an NUI so that the user can easily understand the symbol.
  • According to an embodiment of the present disclosure, there is provided an information processing device including a processor configured to realize an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display, a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region, a voice input acquisition function of acquiring a voice input for the image, and a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.
  • According to another embodiment of the present disclosure, there is provided an information processing method including, by a processor defining an address term for at least a partial region of an image to be displayed on a display, displaying the image on the display and temporarily displaying the address term on the display in association with the region, acquiring a voice input for the image, and issuing a command relevant to the region when the address term is included in the voice input.
  • According to still another embodiment of the present disclosure, there is provided a program causing a computer to realize an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display, a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region, a voice input acquisition function of acquiring a voice input for the image, and a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.
  • As described above, according to an embodiment of the present disclosure, it is possible to notify a user of a symbol for specifying a manipulation target on an NUI so that the user can easily understand the symbol.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram illustrating an overall configuration of a display device according to an embodiment of the present disclosure;
  • FIG. 2 is a diagram illustrating an overall functional configuration realized in the display device according to an embodiment of the present disclosure;
  • FIG. 3 is a diagram for describing a layered structure of regions in an image defined in a first embodiment of the present disclosure;
  • FIG. 4 is a diagram for describing the layered structure of the regions in the image defined in the first embodiment of the present disclosure;
  • FIG. 5 is a diagram illustrating a display example of address terms in the first embodiment of the present disclosure;
  • FIG. 6 is a diagram illustrating a display example of the address terms in the first embodiment of the present disclosure;
  • FIG. 7 is a diagram for describing a layered structure of regions in an image defined in a second embodiment of the present disclosure;
  • FIGS. 8A to 8C are diagrams illustrating a display example of address terms in the second embodiment of the present disclosure;
  • FIGS. 9A to 9C are diagrams illustrating a display example of the address terms in the second embodiment of the present disclosure;
  • FIGS. 10A to 10C are diagrams illustrating a first modification example of the second embodiment of the present disclosure;
  • FIG. 11 is a diagram illustrating a second modification example of the second embodiment of the present disclosure;
  • FIG. 12 is a diagram illustrating the second modification example of the second embodiment of the present disclosure; and
  • FIG. 13 is a diagram illustrating a display example of an address term in a third embodiment of the present disclosure.
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the appended drawings. Note that, in this specification and the appended drawings, structural elements that have substantially the same function and structure are denoted with the same reference numerals, and repeated explanation of these structural elements is omitted.
  • The description will be made in the following order.
  • 1. System Configuration
  • 1-1. Hardware Configuration
  • 1-2. Functional Configuration
  • 1-3. Layered Structure of Regions
  • 1-4. Display Example of Address Term
  • 2. Second Embodiment
  • 2-1. Layered Structure of Regions
  • 2-2. Display Example of Address Term
  • 2-3. Modification Examples
  • 3. Third Embodiment
  • 4. Supplement
  • 1. System Configuration
  • (1-1. Hardware Configuration)
  • FIG. 1 is a block diagram illustrating an overall configuration of a display device according to an embodiment of the present disclosure. Referring to FIG. 1, a display device 100 includes a processor 102, a memory 104, a storage 106, a communication module 108, a display 110, a speaker 112, a microphone 114, an input device 116, a camera module 118, and a connection port 120.
  • The display device 100 may be any of various devices that display an image on the display 110 according to a user's desire. For example, the display device 100 may be a television, a personal computer (PC), a tablet terminal, a smartphone, a portable media player, or a portable game device including the display 110. Alternatively, the display device 100 may be a PC, a set-top box, a recorder, or a game device connected to the separately configured display 110 and controlling the display 110. Hereinafter, the constituent elements of the display device 100 will be further described.
  • The processor 102 is realized by, for example, a central processing unit (CPU), a digital signal processor (DSP), or an application specific integrated circuit (ASIC) and operates according to programs stored in the memory 104 to realize various functions. The processor 102 acquires various inputs by controlling each unit of the display device 100 and provides various outputs. The detailed functions realized by the processor 102 will be described below.
  • The memory 104 is realized by, for example, a semiconductor memory used as a random access memory (RAM) or a read-only memory (ROM). The memory 104 stores, for example, programs causing the processor 102 to operate. For example, the programs may be read from the storage 106 and temporarily loaded into the memory 104, or the programs may be permanently stored in the memory 104. Alternatively, the programs may be received by the communication module 108 and temporarily loaded into the memory 104. Also, the memory 104 temporarily or permanently stores various kinds of data generated through processes of the processor 102.
  • The storage 106 is realized by, for example, a storage device such as a magnetic disk (for example, a hard disk drive (HDD)), an optical disc, a magneto-optical disc, or a flash memory. The storage 106 permanently stores, for example, programs causing the processor 102 to operate or various kinds of data generated through processes of the processor 102. The storage 106 may be configured to include a removable medium or may be included in the display device 100.
  • The communication module 108 is realized by any of the various communication circuits performing wired or wireless network communication under the control of the processor 102. When wireless communication is performed, the communication module 108 may include an antenna. For example, the communication module 108 performs network communication in conformity with a communication standard of the Internet, a local area network (LAN), Bluetooth (registered trademark), or the like.
  • The display device 100 includes the display 110 and the speaker 112 as output units. The display 110 is realized by, for example, a liquid crystal display (LCD) or an organic electro-luminescence (EL) display. As described above, the display 110 may be integrated with the display device 100 or may be a separate display. The display 110 displays various kinds of information as images under the control of the processor 102. An example of an image displayed on the display 110 will be described below. The speaker 112 outputs various kinds of information as voices under the control of the processor 102.
  • For example, the microphone 114 acquires diverse kinds of voices, such as voices spoken by a user, produced in the vicinity of the display device 100, and supplies the voices as voice data to the processor 102. Here, in the embodiment, the microphone 114 is used as a voice input unit on the NUI. That is, the voice data provided by the microphone 114 is analyzed by the processor 102 and various commands are executed based on the voices or the like spoken by the user and extracted from the voice data.
  • The input device 116 is another input unit used in the display device 100. The input device 116 may include, for example, a keyboard, a button, or a mouse. The input device 116 may include a touch sensor arranged at a position corresponding to the display 110 so that a touch panel is configured by the display 110 and the touch sensor. When the display device 100 can be sufficiently manipulated by a voice input using the microphone 114, the separate input device 116 may not be installed.
  • The camera module 118 is realized by, for example, an image sensor such as a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS), an optical system such as a lens controlling formation of a subject image in the image sensor, and a driving circuit driving the image sensor and the optical system. The camera module 118 supplies a still image or a moving image generated by capturing the subject image by the image sensor as image data to the processor 102. The still image or the moving image generated by the camera module 118 may be displayed as, for example, a through image or a recorded image on the display 110.
  • The connection port 120 is a port directly connecting an external device to the display device 100 and is realized by, for example, a Universal Serial Bus (USB) port, an IEEE 1394 port, or a High-Definition Multimedia Interface (HDMI) (registered trademark) port. In the illustrated example, the storage 106, the display 110, the speaker 112, the microphone 114, and the input device 116 are connected to the processor 102 internally (for example, by a bus or the like), but such constituent elements may be separate from the display device 100. In this case, for example, a display device (an external display or the like), an input device (a keyboard, a mouse, or the like), or a storage device (an external HDD or the like) can be connected to the connection port 120. Devices connected to the connection port 120 are not limited to these examples, but various devices other than the above-described devices may be connected.
  • (1-2. Functional Configuration)
  • FIG. 2 is a diagram illustrating an overall functional configuration realized in the display device according to an embodiment of the present disclosure. Referring to FIG. 2, in the display device 100, an image generation function 151, an address term definition function 153, a display control function 155, a voice input acquisition function 157, and a command issuing function 159 can be realized.
  • Such functions are realized, for example, when the processor 102 of the display device 100 operates according to programs stored in the memory 104. Any of the foregoing functions refers to a command DB 161. The command DB 161 may be stored in the storage 106 of the display device 100 and a part or the entirety of the command DB 161 may be read to the memory 104, as necessary.
  • (Image Generation Function)
  • The image generation function 151 generates an image to be displayed on the display 110 of the display device 100. For example, the image may include a content image such as a photo or a video, or an image of a document described in various formats (including, for example, web pages described in the hypertext markup language (HTML)). The image may include a graphical user interface (GUI) image used to manipulate the display device 100. For example, data for displaying such an image may be read from the storage 106 or may be acquired from a server or the like on a network via the communication module 108. The image generated by the image generation function 151 is displayed on the display 110 by the display control function 155.
  • For example, the image generation function 151 can generate an image including a plurality of sub-images. The sub-images may be, for example, content images, document images, or GUI images and may include images displayed on the display 110 by arranging such images in predetermined regions. In this case, the image generation function 151 can expand and display any of the sub-images on the entire region (full screen) of the display 110, for example, in response to a command issued by the command issuing function 159.
  • For example, the image generation function 151 can generate an image in which a subject image is defined. The image may be, for example, a content image such as a photo or a video and the subject region in the image is recognized through an image recognition process or a setting manipulation of the user. In this case, the image generation function 151 can expand and display an image using any of the subject regions as a criterion, for example, in response to a command issued by the command issuing function 159. For example, such an image may be displayed in the entire region (full screen) of the display 110 or may be one of the foregoing sub-images.
  • (Address Term Definition Function)
  • The address term definition function 153 defines an address term regarding at least a partial region of an image generated by the image generation function 151 and displayed on the display 110 by the display control function 155. As described above, in the embodiment, an image displayed on the display 110 can include a region defined as, for example, a sub-image or a subject region in the image. The address term definition function 153 defines an address term for each of the regions so that display of an image on the display 110 can be easily manipulated by a voice input, as will be described below. The address term definition function 153 supplies the defined address term to the display control function 155 to display the address term on the display 110 and stores information regarding the address term in the command DB 161.
  • For example, the address term definition function 153 defines an address term for each of the display regions of the plurality of sub-images included in the image displayed on the display 110. In this case, for example, the address term definition function 153 may define an address term based on set information of an application function providing each sub-image (for example, address terms such as “web browser” or “media player” can be defined). For example, the address term definition function 153 may define an address term based on a title, text, or the like included in each sub-image (for example, address terms such as “news,” “memo,” and “movie” can be defined). Here, for example, when an overlapping address term is defined in such an example, the address term definition function 153 may uniquely define the address term by using, for example, a sequence number (for example, address terms such as “web browser 1” and “web browser 2” can be defined). Further, for example, the address term definition function 153 may define an address term based on the location of each sub-image in an image (for example, address terms such as “top left” and “bottom right” can be defined).
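  • As a non-limiting sketch of the selection logic described above (application name, then title, then screen position, with sequence numbers appended to resolve overlapping terms), the following Python code shows one possible implementation; the SubImage fields and the function name are assumptions introduced only for this example.

      from collections import Counter
      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class SubImage:
          app_name: Optional[str]  # e.g. "web browser", taken from the providing application's settings
          title: Optional[str]     # e.g. "news", taken from text in the sub-image
          position: str            # e.g. "top left", derived from the layout

      def define_address_terms(sub_images):
          """Return one address term per sub-image display region."""
          # First pass: pick the most specific label available for each region.
          raw_terms = [s.app_name or s.title or s.position for s in sub_images]
          # Second pass: make overlapping terms unique with sequence numbers
          # ("web browser 1", "web browser 2", ...).
          counts = Counter(raw_terms)
          seen = Counter()
          terms = []
          for term in raw_terms:
              if counts[term] > 1:
                  seen[term] += 1
                  term = f"{term} {seen[term]}"
              terms.append(term)
          return terms

      print(define_address_terms([
          SubImage("web browser", None, "top left"),
          SubImage("web browser", None, "top right"),
          SubImage(None, "memo", "bottom right"),
      ]))  # ['web browser 1', 'web browser 2', 'memo']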
  • Here, for example, when a content image, a document image, or the like (hereinafter also referred to as an application image) which is the foregoing sub-image is expanded and displayed on the entire region of the display 110 in response to a command issued by the command issuing function 159, for example, the address term definition function 153 may define an address term for the region of a GUI component (for example, which can be a button, a tab, an icon, or the like) included in an application image. In this case, for example, the address term definition function 153 may define an address term based on information regarding the GUI component defined in a program providing an application image (for example, an address term such as “header,” “tab,” or “address bar” can be defined).
  • For example, the address term definition function 153 defines an address term for each of a plurality of subject regions included in an image displayed on the display 110. In this case, for example, the address term definition function 153 may define an address term based on a result of newly performed image analysis or information added as metadata to an image. Here, for example, when a subject can be identified by the result of the image analysis or the name of a subject is recorded as metadata, the address term definition function 153 may define an address term of a subject region based on the name of the subject (for example, an address term such as “parents,” “children,” “hands,” or “faces” can be defined). Alternatively, for example, the address term definition function 153 may define an address term based on the position of each subject region in an image (for example, an address term such as “top left” or “bottom right” can be defined).
  • For example, the address term definition function 153 defines an address term for a manipulation on an image displayed on the display 110. The “manipulation on an image” mentioned here can be distinguished from the manipulations of the other examples in that it is performed without designation of a specific region in the image. The address term defined here can be an address term for the entire region of the image displayed on the display 110. In this case, for example, the address term definition function 153 may define an address term based on the name of a manipulation command stored in advance in the command DB 161 (for example, an address term such as “zoom out” or “scroll down” can be defined).
  • (Display Control Function)
  • The display control function 155 displays an image generated by the image generation function 151 on the display 110 and temporarily displays address terms defined by the address term definition function 153 on the display 110 in association with regions corresponding to the address terms in the image. For example, the display control function 155 displays each address term as text at a location within the corresponding region. The display control function 155 may also display frame borders or the like indicating the regions corresponding to the address terms and display the address terms in association with the frame borders or the like.
  • As described above, the address terms (and the frame borders or the like) by the display control function 155 are temporarily displayed. For example, the display control function 155 may start the display of an image newly generated by the image generation function 151 on the display 110 and then display the address terms (and the frame borders or the like) on the display 110 only for a predetermined time. After the predetermined time passes, only the image generated by the image generation function 151 can be displayed on the display 110.
  • Thus, when the address terms defined by the address term definition function 153 are temporarily displayed on the display 110 in association with the regions corresponding to these address terms, the user can easily recognize which address term to use to specify a manipulation target when manipulating the display device 100 using a voice input, as will be described below. Also, because the address terms (and the frame borders or the like) are displayed only temporarily, for example by hiding them after a predetermined time passes, the visibility of the image displayed on the display 110 can be ensured.
  • For example, the display control function 155 can resume the display of the address terms (and the frame borders or the like) according to a command issued by the command issuing function 159. In this case, the display control function 155 may continue to display the address terms (and the frame borders or the like), for example, until a separate command is issued from the command issuing function 159.
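  • The timing behavior described above (labels appear when a new image is displayed, disappear after a predetermined time, and can be called up again by a command) could be modeled as in the following sketch; the class name, the five-second default, and the pinning behavior for labels that the user intentionally called up are illustrative assumptions.

      import time

      class AddressTermOverlay:
          """Toy model of the temporary label display handled by the display control function."""

          def __init__(self, display_seconds=5.0):
              self.display_seconds = display_seconds
              self.shown_at = None
              self.pinned = False

          def on_new_image(self):
              # Labels (and frame borders) appear whenever a new image is displayed.
              self.shown_at = time.monotonic()
              self.pinned = False

          def on_show_commands(self, pin=True):
              # A "Show Commands" input brings the labels back; optionally keep them until "Hide Commands".
              self.shown_at = time.monotonic()
              self.pinned = pin

          def on_hide_commands(self):
              self.shown_at = None
              self.pinned = False

          def labels_visible(self):
              if self.shown_at is None:
                  return False
              if self.pinned:
                  return True
              return time.monotonic() - self.shown_at < self.display_seconds

      overlay = AddressTermOverlay()
      overlay.on_new_image()
      print(overlay.labels_visible())  # True immediately after a new image is displayed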
  • (Command DB)
  • The command DB 161 stores the information regarding the address terms defined by the address term definition function 153, as described above. The information regarding the address term can include, for example, the address term itself and information for specifying a region specified by the address term in an image. For example, when an address term is set to correspond to a display region of a sub-image, identification information (for example, which can be a process ID, a window ID, or the like) of an application function providing the sub-image can be stored along with the address term. For example, when an address term is set to correspond to the region of a GUI component included in an application image, identification information (for example, which can be an ID given to a button, a tab, or the like) of the GUI component can be stored along with the address term. For example, when an address term is set to correspond to a subject region, identification information (for example, which can be coordinate information in an image) of the subject region can be stored along with the address term.
  • The command DB 161 stores the names of manipulation commands defined in advance. For example, as described above, the names of the manipulation commands are referred to when the address term definition function 153 sets the address term for a manipulation on an image displayed on the display 110 and are also referred to when the command issuing function 159 to be described below analyzes a voice input from the user. The manipulation commands include manipulation commands performed without designating specific regions in an image, as described above, and also include manipulation commands performed by designating specific regions in an image (for example, which can be manipulation commands such as “zoom in” and “select”). The command issuing function 159 specifies the kind of command instructed by a voice input by referring to the names of the manipulation commands stored in the command DB 161 and also specifies the target region by referring to the information regarding the address terms stored in the command DB 161.
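  • The disclosure does not fix a schema for the command DB 161, but a minimal sketch might keep address terms keyed to region identifiers (a window ID, a GUI component ID, or coordinate information) together with the predefined manipulation command names and a flag indicating whether each command needs a target region; every field name and value below is an assumption made for illustration.

      from dataclasses import dataclass, field
      from typing import Dict, Tuple, Union

      # A region identifier: a process/window ID, a GUI component ID, or image coordinates.
      RegionId = Union[int, str, Tuple[int, int, int, int]]

      @dataclass
      class CommandDB:
          # address term -> identifier of the region it refers to
          address_terms: Dict[str, RegionId] = field(default_factory=dict)
          # manipulation command name -> True if it needs a target region ("zoom in", "select"),
          # False if it acts on the whole image ("zoom out", "scroll down")
          commands: Dict[str, bool] = field(default_factory=dict)

      db = CommandDB()
      db.commands.update({"zoom in": True, "select": True, "zoom out": False, "scroll down": False})
      db.address_terms["web browser"] = 4012             # e.g. a window ID
      db.address_terms["address bar"] = "component:42"   # e.g. a GUI component ID
      db.address_terms["boy"] = (320, 180, 520, 460)     # e.g. a subject region in image coordinates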
  • (Voice Input Acquisition Function)
  • The voice input acquisition function 157 acquires a voice input for an image displayed on the display 110. More specifically, when an image is displayed on the display 110 by the display control function 155, the voice input acquisition function 157 acquires the voice input of the user acquired via the microphone 114. As described above, since the microphone 114 acquires diverse kinds of voices produced near the display device 100, the voice input acquisition function 157 may perform a process of extracting a predetermined voice such as a speech of the user from acquired voice data. Alternatively, apart from the voice input acquisition function 157, a preprocessing unit (not illustrated) performing the foregoing process may be provided.
  • (Command Issuing Function)
  • When the voice input acquired by the voice input acquisition function 157 includes an address term defined by the address term definition function 153, the command issuing function 159 issues a command relevant to the region corresponding to the address term. More specifically, based on text extracted through voice recognition from the acquired voice input, the command issuing function 159 retrieves the address term and the name of the manipulation command included in the text with reference to the command DB 161. When the corresponding address term and the name of the manipulation command are found, the command issuing function 159 issues the manipulation command corresponding to the name included in the text with respect to the region corresponding to the address term included in the text. The command can be issued to, for example, the image generation function 151 or the display control function 155.
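  • The matching step could look like the following sketch, in which the text obtained through voice recognition is scanned for a known manipulation command name and a known address term so that the corresponding command can be issued for the corresponding region; the plain substring matching and the sample vocabularies are simplifying assumptions.

      def parse_voice_text(text, command_names, address_terms):
          """Return (command_name, address_term) found in the recognized text; either may be None."""
          lowered = text.lower()
          command = next((c for c in command_names if c in lowered), None)
          # Prefer longer address terms so that "web browser 2" wins over "web browser".
          term = next((t for t in sorted(address_terms, key=len, reverse=True) if t in lowered), None)
          return command, term

      commands = ["zoom in", "zoom out", "select", "show", "hide"]
      terms = ["web browser", "development tool", "text editor", "task bar"]
      print(parse_voice_text("Zoom in 'web browser'", commands, terms))  # ('zoom in', 'web browser')
      print(parse_voice_text("Hide Task Bar", commands, terms))          # ('hide', 'task bar')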
  • For example, when an image displayed on the display 110 includes a plurality of sub-images, the command issuing function 159 issues a command for expanding and displaying any of the sub-images on the entire region (full screen) of the display 110 to the image generation function 151. At this time, which sub-image is expanded and displayed can be determined based on the address term included in the voice input with reference to the command DB 161.
  • For example, when an image displayed on the display 110 includes a plurality of subject regions, the command issuing function 159 issues a command for expanding and displaying the image using any of the subject regions as a criterion to the image generation function 151. At this time, which subject region is used as the criterion to expand and display the image can be determined based on the address term included in the voice input with reference to the command DB 161.
  • (1-3. Layered Structure of Regions)
  • FIGS. 3 and 4 are diagrams for describing a layered structure of regions in an image defined in a first embodiment of the present disclosure. In the embodiment, regions in an image for which address terms are defined by the address term definition function 153 can be set according to a layered structure to be described below.
  • In an example illustrated in FIG. 3, four layers, Layer 1 to Layer 4, have superordinate and subordinate relations. That is, a region of Layer 1 is a superordinate-layer region of a region of Layer 2, a region of Layer 2 is a superordinate-layer region of a region of Layer 3, and a region of Layer 3 is a superordinate-layer region of a region of Layer 4. In the embodiment, a subordinate layer region is included in a superordinate layer region. In the illustrated example, all of the subordinate layer regions are included in a superordinate layer region. In another example, however, at least some of the subordinate layer regions may be included in a superordinate layer region.
  • In the embodiment, such a layered structure is used to select the regions in an image for which address terms are displayed. In the case of the example illustrated in FIG. 3, for example, when address terms are displayed for all of the regions from Layer 1 to Layer 4 on the display 110 along with an image, the visibility of the image may be impaired even if the display is temporary. Accordingly, when a superordinate layer region is displayed on the display 110, the display control function 155 can display address terms of subordinate layer regions included in the displayed superordinate layer region on the display 110.
  • That is, for example, when a region (superordinate layer region) of Layer 1 is displayed on the display 110, the display control function 155 displays the address terms of regions (subordinate layer regions) of Layer 2 included in the region of Layer 1 on the display 110. For example, when any region (superordinate layer region) of the two regions of Layer 2 is displayed on the display 110 in the illustrated example, the display control function 155 displays the address terms of regions (subordinate layer regions) of Layer 3 included in the displayed region of Layer 2 on the display 110.
  • For example, the layered structure described above may be defined as a parent-child relation or a link relation as in the example illustrated in FIG. 4. The illustrated example shows a case in which two regions (Layer 2-1 and Layer 2-2) of Layer 2 are included in a region (only one region is set and corresponds to the entire image, for example) of Layer 1, three regions (Layer 3-1, Layer 3-2, and Layer 3-3) of Layer 3 are included in Layer 2-1, and so on. Such a relation is specified when the address term definition function 153 defines an address term for each region. Thereafter, the address term can be used by the display control function 155 or the command issuing function 159. A specific use example of the layered structure of the regions will be described below.
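  • To make the parent-child (link) relation of FIGS. 3 and 4 concrete, the regions could be kept in a small tree as in the following sketch, together with a helper that returns the address terms to display while a given superordinate layer region occupies the screen, namely those of its direct subordinate layer regions; the class is illustrative, and the layer labels stand in for actual address terms.

      from dataclasses import dataclass, field
      from typing import List

      @dataclass
      class Region:
          address_term: str
          children: List["Region"] = field(default_factory=list)

      def terms_to_display(displayed: Region) -> List[str]:
          """Address terms shown while `displayed` occupies the screen: its direct subordinate regions."""
          return [child.address_term for child in displayed.children]

      # Parent-child relation corresponding to the example of FIG. 4.
      layer1 = Region("Layer 1", [
          Region("Layer 2-1", [Region("Layer 3-1"), Region("Layer 3-2"), Region("Layer 3-3")]),
          Region("Layer 2-2"),
      ])

      print(terms_to_display(layer1))              # ['Layer 2-1', 'Layer 2-2']
      print(terms_to_display(layer1.children[0]))  # ['Layer 3-1', 'Layer 3-2', 'Layer 3-3']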
  • (1-4. Display Example of Address Term)
  • FIGS. 5 and 6 are diagrams illustrating display examples of address terms according to the first embodiment of the present disclosure. In the embodiment, for example, as illustrated in the example of FIG. 5, an address term is defined for each of the display regions of a plurality of sub-images included in an image displayed on the display 110. In the illustrated example, a region of Layer 1 is set to correspond to the entire image displayed on the display 110, and regions of Layer 2 are defined for display regions of sub-images included in the image. The address term definition function 153 can define an address term to correspond to each of the display regions of a web browser (Layer 2-1), a development tool (Layer 2-2), a text editor (Layer 2-3), and a task bar (Layer 2-4). In this example, the display device 100 can be, for example, a PC.
  • At this time, the display control function 155 displays the image (the region of Layer 1) on the display 110 and temporarily displays an address term defined for the region corresponding to each sub-image and a frame border indicating the region corresponding to the address term on the display 110. In the drawing, the address term AT2-1 corresponding to the region of Layer 2-1 (web browser), the address term AT2-2 corresponding to the region of Layer 2-2 (development tool), the address term AT2-3 corresponding to the region of Layer 2-3 (text editor), and the address term AT2-4 corresponding to the region of Layer 2-4 (task bar) are temporarily displayed on the display 110 by the display control function 155.
  • Thereafter, when display of the image (the region of Layer 1) starts and a predetermined time passes, the display control function 155 ends the display of the address terms AT2-1 to AT2-4 and the frame borders, and the display returns to display of only the image (the region of Layer 1). The user can perform a manipulation on the regions corresponding to the address terms AT2-1 to AT2-4 during or after the display of the address terms AT2-1 to AT2-4 by giving a voice input including the address terms AT2-1 to AT2-4 to the display device 100. For example, when the user says “Zoom in ‘web browser,’” a zoom-in manipulation can be performed on the display region of the web browser corresponding to the region of Layer 2-1, as will be described below, by the command issuing function 159 to which this voice input is supplied by the voice input acquisition function 157. For example, when the user says “Hide Task Bar,” the task bar corresponding to the region of Layer 2-4 can be hidden.
  • In the example illustrated in FIG. 6, a change of the display is shown when the user says “Zoom in ‘web browser’” in the first example illustrated in FIG. 5. In this case, the command issuing function 159 finds the name of a manipulation command, “Zoom in,” and the address term “web browser” (corresponding to the address term AT2-1 displayed in the example of FIG. 5) based on text extracted through voice recognition from the voice input of the foregoing speech with reference to the command DB 161. Thus, the command issuing function 159 issues a command to give a request for expanding and displaying the sub-image (corresponding to the region of Layer 2-1) of the web browser in the entire region of the image displayed on the display 110 to the image generation function 151. In response to this command, the image generation function 151 changes the image displayed on the display 110 from the image for displaying the region of Layer 1 to an image for expanding and displaying the entire region of Layer 2-1.
  • At this time, the display control function 155 displays a new image (the region of Layer 2-1) on the display 110 and temporarily displays the address term defined for the region of the GUI component included in the image and the frame border indicating the region corresponding to this address term on the display 110. In the drawing, an address term AT3-1 corresponding to the region of Layer 3-1 (tab), an address term AT3-2 corresponding to the region of Layer 3-2 (address bar), an address term AT3-3 corresponding to the region of Layer 3-3 (header), an address term AT3-4 corresponding to the region of Layer 3-4 (body), and an address term AT3-5 corresponding to the region of Layer 3-5 (options) are temporarily displayed on the display 110 by the display control function 155.
  • In the illustrated example, for example, the address terms AT3-1 to AT3-5 temporarily displayed when the new image (the region of Layer 2-1) is displayed on the display 110 may be newly defined by the address term definition function 153. Alternatively, for example, when the region of Layer 2-1 is displayed as the sub-image in the image of the region of Layer 1, the address terms AT3-1 to AT3-5 may be defined by the address term definition function 153 and may be stored along with data of the layered structure illustrated in FIG. 4 in the command DB 161 or the like.
  • The display control function 155 also temporarily displays address terms C1 (corresponding to a zoom-out manipulation) and C2 (corresponding to a scroll-down manipulation) for manipulations on the image displayed on the display 110 along with the foregoing address terms AT3-1 to AT3-5 on the display 110. Since the image of the web browser is displayed in the entirety of the new image (the region of Layer 2-1) displayed on the display 110, for example, a manipulation such as zoom out or scroll down can be performed without designating a specific region in the image. Thus, the address term definition function 153 extracts the names of manipulation commands executable for the image of the web browser from among the names of the manipulation commands stored in advance in the command DB 161 and defines address terms corresponding to the extracted names. As a result, the display control function 155 temporarily displays the address terms C1 and C2 on the display 110. For example, the positions at which the address terms C1 and C2 are displayed can be near the edges or corners of the display 110 so that the view of the displayed image is not obstructed.
  • Thereafter, when the display of the image (the region of Layer 2-1) starts and a predetermined time passes, the display control function 155 ends the display of the address terms AT3-1 to AT3-5 and the frame borders as well as the display of the address terms C1 and C2, and the display returns to display of only the image (the region of Layer 2-1). Even after the display of the address terms AT3-1 to AT3-5 and the address terms C1 and C2 ends, the user can perform the manipulations corresponding to the address terms AT3-1 to AT3-5 or C1 and C2 by giving a voice input including these address terms to the display device 100.
  • When the user gives a predetermined voice input (for example, “Show Commands”) to the display device 100, the command issuing function 159 issues, to the display control function 155, a command to give a request for displaying the address terms AT3-1 to AT3-5 and the frame borders and the address terms C1 and C2 again on the display 110. In response to this command, the display control function 155 displays the foregoing address terms and frame borders again on the display 110. The address terms and the frame borders displayed at this time may disappear a predetermined time after the start of the display, as in the initial display, or may be set not to disappear automatically in consideration of the fact that the user has intentionally called them up. In the latter case, when the user gives another predetermined voice input (for example, “Hide Commands”) to the display device 100, the display control function 155 can end the display of the address terms and the frame borders.
  • 2. Second Embodiment
  • Next, a second embodiment of the present disclosure will be described.
  • Since a configuration in the second embodiment is almost the same as that in the foregoing first embodiment except for configuration examples of regions and display examples of address terms to be described below, detailed description other than that of the configuration examples and the display examples will be omitted.
  • (2-1. Layered Structure of Regions)
  • FIG. 7 is a diagram for describing a layered structure of regions in an image defined in the second embodiment of the present disclosure. In the embodiment, the entire image of a photo displayed on the display 110 is a region of Layer 1. Two regions (Layer 2-1 and Layer 2-2) of Layer 2 included in the region of Layer 1 are defined. The two regions (Layer 2-1 and Layer 2-2) of Layer 2 include two regions (Layer 3-1 and Layer 3-2) and two regions (Layer 3-3 and Layer 3-4) of Layer 3, respectively. Of the four regions (Layer 3-1 to Layer 3-4) of Layer 3, the two regions (Layer 3-3 and Layer 3-4) further include two regions (Layer 4-1 and Layer 4-2) and two regions (Layer 4-3 and Layer 4-4) of Layer 4, respectively.
  • The region of each layer equivalent to or subordinate to Layer 2 can include, for example, a subject region specified based on a result of image analysis. In this case, not all of the regions are necessarily defined based on the result of the image analysis. For example, a region of a superordinate layer (for example, a region of Layer 2) may be a region set later as a region for grouping regions (for example, the regions of Layer 3 and Layer 4) of subordinate layers corresponding to a subject region specified based on the result of the image analysis. Alternatively, the subject region may be specified based on a setting manipulation of the user in addition to or instead of the image analysis. For example, the subject region may include a region determined when the user approves or corrects candidate regions suggested based on the result of the image analysis. For example, such a region may be defined in advance by metadata incidental to the image data of a photo or may be newly defined based on a result obtained by performing image analysis at the time of display of an image.
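  • As one hedged illustration of how a superordinate grouping region could be derived from subject regions obtained through image analysis or metadata, the sketch below simply takes the bounding box enclosing the subordinate subject regions; the field names and coordinate values are assumptions for the example only.

      from dataclasses import dataclass
      from typing import List, Tuple

      Box = Tuple[int, int, int, int]  # left, top, right, bottom in image coordinates

      @dataclass
      class SubjectRegion:
          name: str  # e.g. from metadata or image analysis ("Boy", "Girl", ...)
          box: Box

      def group_box(members: List[SubjectRegion]) -> Box:
          """Superordinate (grouping) region as the bounding box enclosing its subordinate subject regions."""
          lefts, tops, rights, bottoms = zip(*(m.box for m in members))
          return (min(lefts), min(tops), max(rights), max(bottoms))

      members = [SubjectRegion("Boy", (300, 200, 450, 520)), SubjectRegion("Girl", (430, 220, 570, 530))]
      print(group_box(members))  # (300, 200, 570, 530), e.g. a Layer 2 region such as "children"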
  • (2-2. Display Example of Address Term)
  • FIGS. 8A to 9C are diagrams illustrating display examples of address terms according to the second embodiment of the present disclosure. In the embodiment, for example, as illustrated in the example of FIG. 7, an address term is defined for each of a plurality of subject regions included in an image displayed on the display 110. In the illustrated example, the region of Layer 1 is set to correspond to the entire image of a photo displayed on the display 110 (or the image of the photo may not necessarily be displayed on the entire display 110, that is, the image of the photo may be displayed as one of the sub-images as in the foregoing first embodiment) and regions equivalent to or subordinate to Layer 2 are defined for the subject regions included in the image.
  • First, as illustrated in FIG. 8A, when an image (the region of Layer 1) is displayed on the display 110, the display control function 155 temporarily displays an address term defined for the region corresponding to each of the regions of Layer 2 and a frame border indicating the region corresponding to the address term on the display 110. In the drawing, an address term AT2-1 corresponding to the region of Layer 2-1 and an address term AT2-2 corresponding to the region of Layer 2-2 are temporarily displayed on the display 110 by the display control function 155.
  • Here, the address term definition function 153 defines the address terms displayed as the address terms AT2-1 and AT2-2, for example, based on the names of subjects specified by image analysis or an input from the user. For example, when metadata incidental to the image data of the photo records the fact that the subjects included in the region of Layer 2-1 are parents and the subjects included in the region of Layer 2-2 are children, the address term definition function 153 can define the address terms for the region of Layer 2-1 and the region of Layer 2-2 as “parents” and “children,” respectively.
  • Thereafter, when the display of the image (the region of Layer 1) starts and a predetermined time passes, the display control function 155 ends the display of the address terms AT2-1 and AT2-2 and the frame borders, and the display returns to display of only the image (the region of Layer 1). The user can perform a manipulation on the regions corresponding to the address terms AT2-1 and AT2-2 during or after the display of the address terms AT2-1 and AT2-2 by giving a voice input including the address terms AT2-1 and AT2-2 to the display device 100. In the illustrated example, before the display of the address terms and the frame borders ends, the user gives a voice input “Select ‘Children’” to the display device 100.
  • At this time, the command issuing function 159 finds the name of a manipulation command, “Select,” and an address term (corresponding to the address term AT2-2 displayed in the example of FIG. 8A) of the region, “Children,” based on text extracted through voice recognition from the voice input with reference to the command DB 161. Thus, the command issuing function 159 issues a command to give a request for allowing the region of Layer 2-2 to enter a selection state to the image generation function 151. In response to this command, the image generation function 151 generates an image for displaying the region of Layer 2-2 in the selection state in the image displayed on the display 110 and the display control function 155 displays this image on the display 110, as illustrated in FIG. 8B. The region of Layer 2-2 can be expressed in the selection state, for example, by darkly displaying the region other than the region of Layer 2-2 or by displaying the region of Layer 2-2 together with its frame border (these expressions are not necessarily shown in FIG. 8B).
  • Here, as illustrated in FIG. 8B, the display control function 155 displays a new image (an image in which the region of Layer 2-2 is in the selection state) on the display 110 and temporarily displays address terms defined for more subordinate layer regions included in the region of Layer 2-2 in the selection state, that is, the regions of Layer 3-3 and Layer 3-4, and frame borders indicating the regions corresponding to the address terms along with the address term and the frame border defined for each of the regions of Layer 2 on the display 110. In the drawing, address terms AT3-3 and AT3-4 are further temporarily displayed on the display 110 in addition to the address terms AT2-1 and AT2-2 by the display control function 155.
  • Thereafter, when the display of the image (the image in which the region of Layer 2-2 is in the selection state) starts and a predetermined time passes, the display control function 155 ends the display of the foregoing address terms and frame borders and the display returns to display of only the image (the image in which the region of Layer 2-2 is in the selection state). The user can perform a manipulation on the regions corresponding to the address terms during or after the display of the address terms and the frame borders by giving a voice input including the address terms to the display device 100. In the illustrated example, before the display of the address terms and the frame borders ends, the user gives a voice input “Zoom in ‘Boy’” to the display device 100.
  • At this time, the command issuing function 159 finds the name of a manipulation command, “Zoom in,” and an address term of the region, “Boy,” based on text extracted through voice recognition from the voice input with reference to the command DB 161. Thus, the command issuing function 159 issues a command to give a request for expanding and displaying the image using the region of Layer 3-3 as a criterion to the image generation function 151. In response to this command, the image generation function 151 generates an image expanded using the region of Layer 3-3 as the criterion and the display control function 155 displays this image on the display 110, as illustrated in FIG. 8C. In the foregoing state of FIG. 8B, by temporarily displaying the address terms (AT3-3 and AT3-4) of some of the regions of Layer 3 on the display 110 in addition to the address terms (AT2-1 and AT2-2) of the regions of Layer 2, the expansion and display of the image using the region of Layer 3-3 as the criterion can be realized right away (the regions of Layer 2 are skipped) from the state in which the image of the region of Layer 1 is displayed.
  • Here, as illustrated in FIG. 8C, the display control function 155 displays a new image (the image expanded using the region of Layer 3-3 as the criterion) on the display 110 and temporarily displays address terms defined for more subordinate layer regions included in the region of Layer 3-3 used as the criterion of the expansion, that is, the regions of Layer 4-1 and Layer 4-2, and frame borders indicating the regions corresponding to the address terms along with the address term and the frame border defined for the region of Layer 3-3 on the display 110.
  • In the illustrated example, since the image is expanded and displayed using the region of Layer 3-3 as the criterion and the region of Layer 3-3 is not necessarily displayed on the entire display 110 (in many cases, the aspect ratio of the subject region does not match that of the display 110), it can be useful to keep displaying the address term and the frame border defined for the region of Layer 3-3 even after the image is expanded and displayed.
  • As illustrated in FIG. 8C, when a region of Layer 3 (the layer to which the region serving as the criterion of the expansion belongs) other than the criterion region, here the region of Layer 3-4, is included in the display range of the new image (the image expanded using the region of Layer 3-3 as the criterion), an address term and a frame border defined for the region of Layer 3-4 may also be temporarily displayed on the display 110 by the display control function 155.
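  • Which address terms to show after such an expansion could be decided as in the following sketch: the criterion region itself, its subordinate layer regions, and any other region that still falls within the visible range; the overlap test, the coordinates, and the class fields are assumptions, and a duplicated term in the result could then be distinguished with sequence numbers as in the second modification example described below.

      from dataclasses import dataclass
      from typing import List, Optional, Tuple

      Box = Tuple[int, int, int, int]  # left, top, right, bottom

      @dataclass
      class Region:
          term: str
          box: Box
          parent: Optional["Region"] = None

      def overlaps(a: Box, b: Box) -> bool:
          return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

      def terms_after_zoom(criterion: Region, all_regions: List[Region], view: Box) -> List[str]:
          """Terms to show after zooming in on `criterion`: the criterion itself, its subordinate
          regions, and any other region still included in the visible range."""
          shown = [criterion.term]
          for r in all_regions:
              if r is criterion:
                  continue
              if r.parent is criterion or overlaps(r.box, view):
                  shown.append(r.term)
          return shown

      boy = Region("Boy", (300, 200, 450, 520))
      hand = Region("Hand", (290, 400, 360, 470), parent=boy)
      face = Region("Face", (330, 210, 420, 300), parent=boy)
      girl_face = Region("Face", (460, 230, 540, 320))  # belongs to another group but is partly in view
      print(terms_after_zoom(boy, [hand, face, girl_face], view=(280, 180, 470, 540)))
      # ['Boy', 'Hand', 'Face', 'Face'] -- the duplicated terms would then get sequence numbers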
  • In the step shown in FIG. 8C, the display control function 155 temporarily displays an address term C1 (corresponding to a manipulation of zoom out) for a manipulation on the image displayed on the display 110 along with the foregoing address term on the display 110. Since the original image is expanded and displayed in the new image (the image expanded using the region of Layer 3-3 as the criterion) displayed on the display 110, for example, the manipulation of zoom out can be performed without designating a specific region in the image. Thus, the address term definition function 153 extracts the names of manipulation commands executable on the image among the names of the manipulation commands stored in advance in the command DB 161 and defines address terms corresponding to the extracted names of the manipulation commands. As a result, the display control function 155 temporarily displays the address term C1 on the display 110.
  • Thereafter, when the display of the image (the image expanded using the region of Layer 3-3 as the criterion) starts and a predetermined time passes, the display control function 155 ends the display of the foregoing address terms and frame borders and the display returns to display of only the image (the image expanded using the region of Layer 3-3 as the criterion). The user can perform a manipulation on the regions corresponding to the address terms during or after the display of the address terms by giving a voice input including the address terms to the display device 100. In the illustrated example, as illustrated in FIG. 9A, the user gives a voice input “Hand” to the display device 100 before the display of the address terms and the frame borders ends.
  • At this time, the command issuing function 159 finds the name of the region “Hand” based on text extracted through voice recognition from the voice input with reference to the command DB 161. The command issuing function 159 estimates a manipulation command for a region (region of Layer 4-1) specified as a manipulation target with reference to the command DB 161. For example, when “zoom in” and “selection” are defined as manipulation commands executable for general regions in the command DB 161, the command issuing function 159 can recognize that selection of the region of Layer 4-1 is meaningless due to the fact that Layer 4 is the lowest layer and there are no more subordinate layer regions and can estimate that the manipulation command is “zoom in” based on this recognition. Thus, the command issuing function 159 issues a command to give a request for expanding and displaying an image using the region of Layer 4-1 as a criterion to the image generation function 151. In response to this command, the image generation function 151 displays an image expanded using the region of Layer 4-1 as the criterion on the display 110, as illustrated in FIG. 9B.
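  • The estimation described above could be reduced to a rule of roughly the following kind, given purely as a sketch: when a voice input names only a region, a region that still has subordinate layer regions is selected, while a lowest-layer region, for which selection would be meaningless, is zoomed in on; the function name and the rule itself are assumptions, not a statement of the disclosed algorithm.

      def estimate_command(region_has_children: bool) -> str:
          """Guess the manipulation command when a voice input contains only an address term."""
          # Selecting a lowest-layer region (no subordinate regions) would be meaningless,
          # so fall back to "zoom in"; otherwise put the region into the selection state.
          return "select" if region_has_children else "zoom in"

      print(estimate_command(region_has_children=False))  # 'zoom in', e.g. for the "Hand" region of Layer 4
      print(estimate_command(region_has_children=True))   # 'select'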
  • Here, in the step shown in FIG. 9B, the display control function 155 displays no address term of a new region on the display 110 since Layer 4 is the lowest layer, as described above. On the other hand, as in the image displayed up to FIG. 9A, a zoom out manipulation can be performed even in the image displayed in FIG. 9B. Therefore, the display control function 155 temporarily displays an address term C2 corresponding to the zoom out manipulation on the display 110. For example, according to a relation with a subject region of an image being displayed, the display position (the top right corner of the image) of the address term C2 may be set to a position different from the display position (the bottom left corner of the image) of the address term C1 up to FIG. 9A.
  • Thereafter, when the display of the image (the image expanded using the region of Layer 4-1 as the criterion) starts and a predetermined time passes, the display control function 155 ends the display of the address term AT4-1 and the corresponding frame border and the display returns to display of only the image (the image expanded using the region of Layer 4-1 as the criterion). The user can perform a manipulation on the regions corresponding to the address terms during or after the display of the address terms by giving a voice input including the address terms to the display device 100. In the illustrated example, as illustrated in FIG. 9B, the user gives a voice input “Show ‘Girl's Face’” to the display device 100 before the display of the address terms and the frame borders ends.
  • At this time, the command issuing function 159 extracts the name of the manipulation command, “Show,” and the address term of the region, “Girl's Face,” based on text extracted through voice recognition from the voice input with reference to the command DB 161. However, in the image displayed at the time of FIG. 9B, there is no region with the address term “Girl's Face.” Thus, the command issuing function 159 retrieves information regarding the address term stored in the command DB 161 in association with the layered structure of the regions. For example, the command issuing function 159 analyzes the text “Girl's Face,” and first retrieves “Girl” and subsequently retrieves “Face.” This is because an address term for a more subordinate layer region, for example “Face,” may be redundantly defined in other subordinate layer regions included in mutually different superordinate layer regions.
  • In the illustrated example, as described above, the command issuing function 159 first finds the address term “Girl” defined for the region of Layer 3-4 as the result obtained by retrieving the address term “Girl” from the command DB 161. Then, the command issuing function 159 finds the address term “Face” defined for the region of Layer 4-4 as the result obtained by retrieving the address term “Face” in subordinate layer regions included in the region of Layer 3-4 from the command DB 161. Based on the above retrieval results, the command issuing function 159 issues a command to give a request for expanding and displaying the image using the region of Layer 4-4 as a criterion to the image generation function 151. In response to this command, the image generation function 151 expands and displays the image expanded using the region of Layer 4-4 as the criterion, as illustrated in FIG. 9C. Throughout the foregoing steps of FIGS. 8A to 9B, a manipulation intended by the user can be realized even for the voice input including the address term which is not displayed at that time on the display 110 but which the user remembers as being previously displayed, since the address terms of the regions previously defined and displayed are stored in the command DB 161.
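  • One way to resolve a compound address term such as “Girl's Face” against the stored layered structure, as described above, is sketched below: the first term is searched anywhere in the region tree and each following term only among the subordinate regions of the previous hit, so that a redundantly defined term such as “Face” under a different superordinate region is not matched by mistake; the data structure and function names are assumptions.

      from dataclasses import dataclass, field
      from typing import List, Optional

      @dataclass
      class Region:
          term: str
          children: List["Region"] = field(default_factory=list)

      def find(term: str, regions: List[Region]) -> Optional[Region]:
          """Depth-first search for a region whose address term matches `term`."""
          for r in regions:
              if r.term.lower() == term.lower():
                  return r
              hit = find(term, r.children)
              if hit is not None:
                  return hit
          return None

      def resolve_compound(terms: List[str], roots: List[Region]) -> Optional[Region]:
          """Resolve e.g. ["Girl", "Face"]: find "Girl" anywhere, then "Face" among its subordinates."""
          scope, region = roots, None
          for term in terms:
              region = find(term, scope)
              if region is None:
                  return None
              scope = region.children
          return region

      girl = Region("Girl", [Region("Face"), Region("Hand")])
      boy = Region("Boy", [Region("Face"), Region("Hand")])
      children = Region("Children", [boy, girl])
      print(resolve_compound(["Girl", "Face"], [children]) is girl.children[0])  # True: the "Face" under "Girl"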
  • (2-3. Modification Examples)
  • First Modification Example
  • FIGS. 10A to 10C are diagrams illustrating a first modification example of the second embodiment of the present disclosure. In FIGS. 10A to 10C, when the speech “Zoom in ‘Boy’” of the user described above with reference to FIGS. 8B and 8C is acquired as a voice input, a change in display of the display 110 at the time of the expansion of display of an image using the region of Layer 3-3 as a criterion is shown together with display during transition (FIG. 10B).
  • In FIG. 10B, while the image displayed on the display 110 transitions from the image of the region of Layer 1 to the image expanded and displayed using the region of Layer 3-3 as the criterion, the image generation function 151 displays text T including the address term that was included in the voice input and recognized by the display device 100 (the command issuing function 159). Alternatively, the text T may be text indicating a command issued by the command issuing function 159 of the display device 100. When such a display is made on the display 110, the user can recognize that the display device 100 is operating according to the user's intention. For example, when the display device 100 does not correctly recognize a user's voice input and a command unintended by the user is performed, the user can understand what has happened based on such a display.
  • Second Modification Example
  • FIGS. 11 and 12 are diagrams illustrating a second modification example of the second embodiment of the present disclosure. In the embodiment, as described with reference to FIG. 7, a layered structure of regions including the subject regions is set for a photo displayed on the display 110. In the example illustrated in FIG. 7, every subordinate layer region is entirely included in its superordinate layer region, but this is not necessarily the case in this modification example.
  • For example, as illustrated in FIG. 11, only a part of a subordinate layer region may be included in a superordinate layer region. In the example illustrated in FIG. 11, only a part of the region of Layer 4-1 is included in the region of Layer 3-3, which is its superordinate layer region, and the remainder lies outside the region of Layer 3-3. Likewise, only a part of the region of Layer 4-3 is included in the region of Layer 3-4, which is its superordinate layer region, and the remainder lies outside the region of Layer 3-4. As with the other regions of Layer 3 and Layer 4, subordinate layer regions that are entirely included in a superordinate layer region may also be present.
  • FIG. 12 illustrates an example of the expansion and display of an image when the regions are set as in the example of FIG. 11. In this modification example, since a subordinate layer region is not necessarily included entirely in its superordinate layer region, the relation recognized between the superordinate layer region and the subordinate layer region can be relatively loose. Accordingly, even when the image is expanded and displayed using the region of Layer 3-3 as a criterion, not only the address terms (AT4-1 and AT4-2) of the subordinate layer regions (Layer 4-1 and Layer 4-2) included in Layer 3-3 but also the address term (AT4-3) of the subordinate layer region (Layer 4-3) included in Layer 3-4 can be displayed.
  • Here, since the address terms AT4-1 and AT4-3 both originally overlap as “Hand,” the address term definition function 153 adds sequence numbers to distinguish them from each other, so that the address term AT4-1 becomes “Hand1” and the address term AT4-3 becomes “Hand2.” When only one of the address terms AT4-1 and AT4-3 is displayed on the display 110, the address term definition function 153 may omit adding the sequence numbers.
  • The display described above is possible regardless of whether only a part of a subordinate layer region or the entire subordinate layer region is included in a superordinate layer region. That is, the regions in an image can be configured as in the example illustrated in FIG. 7 or as in the example illustrated in FIG. 11. For example, when the image is expanded and displayed using the region of Layer 3-3, which is a subordinate layer region of the region of Layer 2-2, as the criterion, the address terms of the regions of Layer 4-1 and Layer 4-2, which are subordinate layer regions of the region of Layer 3-3, are temporarily displayed. In addition, the region of Layer 3-4 included in the display range of the image, and the address terms of the regions of Layer 4-3 and Layer 4-4, which are its subordinate layer regions, can also be temporarily displayed. When the address term of the region of Layer 4-1 or Layer 4-2 overlaps with the address term of the region of Layer 4-3 or Layer 4-4, however, the address terms are distinguished from each other by adding sequence numbers as described above, or the overlap can be resolved by not displaying the overlapping address term of the region of Layer 4-3 or Layer 4-4.
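  • The sequence-number disambiguation just described can be sketched as follows; the mapping format is an assumption made for illustration. Address terms that would appear more than once among the regions to be displayed are suffixed with 1, 2, and so on, while unique address terms are left unchanged.

      from collections import Counter

      def disambiguate(terms_by_region):
          """Suffix duplicated address terms with sequence numbers."""
          counts = Counter(terms_by_region.values())
          seen = Counter()
          result = {}
          for region, term in terms_by_region.items():
              if counts[term] > 1:
                  seen[term] += 1
                  result[region] = f"{term}{seen[term]}"
              else:
                  result[region] = term
          return result

      # disambiguate({"Layer 4-1": "Hand", "Layer 4-3": "Hand"})
      # returns {"Layer 4-1": "Hand1", "Layer 4-3": "Hand2"}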
  • 3. Third Embodiment
  • Next, a third embodiment of the present disclosure will be described. Since the configuration of the third embodiment is almost the same as that of the foregoing first embodiment except for the display examples of address terms described below, detailed description other than that of the display examples of the address terms will be omitted.
  • FIG. 13 is a diagram illustrating a display example of address terms in the third embodiment of the present disclosure. In this embodiment, as illustrated in the example of FIG. 13, an address term is defined for each of the display regions of a plurality of sub-images included in an image displayed on the display 110. In the illustrated example, the region of Layer 1 is defined for the entire image displayed on the display 110, and regions of Layer 2 are defined for the display regions of the sub-images included in the image. The image displayed on the display 110 is an image displayed during a presentation by a presenter and includes the currently displayed slide (page) and sub-images indicating previously displayed graphs, pages, and the like. The address term definition function 153 can define address terms corresponding to the display regions of the main text (Layer 2-1) and the right graph (Layer 2-2) under the title “Current Slide,” and of the last graph (Layer 2-3), the last page (Layer 2-4), and the next page (Layer 2-5) displayed separately from “Current Slide.”
  • At this time, the display control function 155 displays the image (the region of Layer 1) on the display 110 and temporarily displays, on the display 110, the address term defined for the region corresponding to each sub-image together with a frame border indicating the region corresponding to that address term. In the drawing, an address term AT2-1 corresponding to the region of Layer 2-1 (the main text), an address term AT2-2 corresponding to the region of Layer 2-2 (the right graph), an address term AT2-3 corresponding to the region of Layer 2-3 (the last graph), an address term AT2-4 corresponding to the region of Layer 2-4 (the last page), and an address term AT2-5 corresponding to the region of Layer 2-5 (the next page) are temporarily displayed on the display 110 by the display control function 155.
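  • The temporary display of these labels can be sketched as follows, under the assumption that each Layer 2 region carries a bounding box and a defined address term; the rendering callbacks, coordinates, and display duration are placeholders, not values from the specification.

      import time

      REGIONS = {
          "Layer 2-1": {"term": "Main text",   "box": (40, 120, 620, 560)},
          "Layer 2-2": {"term": "Right graph", "box": (660, 120, 1240, 560)},
          "Layer 2-3": {"term": "Last graph",  "box": (40, 600, 440, 700)},
          "Layer 2-4": {"term": "Last page",   "box": (460, 600, 860, 700)},
          "Layer 2-5": {"term": "Next page",   "box": (880, 600, 1280, 700)},
      }

      def show_address_terms(regions, draw_frame_border, draw_label, hide_overlays,
                             duration_seconds=3.0):
          """Temporarily draw each region's frame border and address term."""
          for info in regions.values():
              draw_frame_border(info["box"])
              draw_label(info["term"], anchor=info["box"][:2])
          time.sleep(duration_seconds)  # a real UI would use an asynchronous timer
          hide_overlays()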
  • In this embodiment, a voice input given to the display device 100 by the user can be, for example, “Show ‘Last graph’” or “Go to ‘Next page.’” In response to such a voice input, the command issuing function 159 issues, to the image generation function 151, a command requesting that the previously referenced graph be enlarged and displayed again, a command requesting that the presentation proceed to the next slide (page), or the like. For example, when the image generation function 151 newly displays the slide (page) shown as “Next page” in the drawing in the region of “Current Slide” based on the command requesting that the presentation proceed to the next slide (page), the slide (page) that had been displayed as “Current Slide” up to that time is then displayed as “Last page.”
  • In this embodiment, the address term definition function 153 may define address terms based on a chronological change in the display forms of the regions. For example, when the slide (page) that was displayed as “Current Slide” is subsequently displayed at a smaller size at a location separate from “Current Slide,” the address term definition function 153 may define, for the region corresponding to this slide (page), an address term with the same meaning as “Last page.”
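  • Relabeling based on such a chronological change can be sketched as follows; the region representation (previous and current display forms) is an assumption made for the example, not a structure defined in the specification.

      def relabel_after_advance(regions):
          """Redefine the address term of a slide that shrank from the "Current
          Slide" area to a thumbnail after the presentation advanced a page."""
          for info in regions.values():
              if info["previous_form"] == "current" and info["current_form"] == "thumbnail":
                  info["address_term"] = "Last page"
          return regions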
  • 4. Supplement
  • Embodiments of the present disclosure can include, for example, the information processing device described above (described as the display device), a system, an information processing method executed in the information processing device or the system, a program causing the information processing device to function, and a non-transitory computer-readable storage medium having a program stored therein.
  • The preferred embodiments of the present disclosure have been described above in detail with reference to the appended drawings, but the technical scope of the present disclosure is not limited to these examples. It should be apparent to those skilled in the art that various modifications and corrections may occur within the technical scope described in the claims, and these are, of course, understood to pertain to the technical scope of the present disclosure. Additionally, the present technology may also be configured as below.
    • (1) An information processing device including:
  • a processor configured to realize
      • an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display,
      • a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region,
      • a voice input acquisition function of acquiring a voice input for the image, and
      • a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.
    • (2) The information processing device according to (1),
      • wherein the region includes a superordinate layer region and a subordinate layer region, and
      • wherein, when the superordinate layer region is displayed on the display, the display control function displays an address term of the subordinate layer region at least a part of which is included in the displayed superordinate layer region.
    • (3) The information processing device according to (2), wherein the command issuing function issues a command to expand and display the subordinate layer region.
    • (4) The information processing device according to (3),
  • wherein the region includes a more subordinate layer region of the subordinate layer region, and
  • wherein, when the subordinate layer region is expanded and displayed, the display control function displays an address term of the more subordinate layer region of the subordinate layer region at least a part of which is included in the subordinate layer region.
    • (5) The information processing device according to any one of (2) to (4), wherein the command issuing function issues a command to allow the subordinate layer region to enter a selection state.
    • (6) The information processing device according to (5),
  • wherein the region includes a more subordinate layer region of the subordinate layer region, and
  • wherein, when the subordinate layer region enters the selection state, the display control function displays an address term of the more subordinate layer region of the subordinate layer region at least a part of which is included in the subordinate layer region.
    • (7) The information processing device according to (6), wherein the command issuing function issues a command to expand and display the more subordinate layer region of the subordinate layer region.
    • (8) The information processing device according to any one of (2) to (7), wherein the superordinate layer region corresponds to an entire region of the image, and
  • wherein the subordinate layer region corresponds to a display region of each of a plurality of sub-images included in the image.
    • (9) The information processing device according to (8), wherein the command issuing function issues a command to expand and display, in the entire region of the image, the sub-image corresponding to an address term included in the voice input among the plurality of sub-images.
    • (10) The information processing device according to any one of (2) to (7),
  • wherein the superordinate layer region corresponds to an application image displayed in an entire region of the image, and
  • wherein the subordinate layer region corresponds to a region of a GUI component included in the application image.
    • (11) The information processing device according to (2),
  • wherein the superordinate layer region corresponds to an entire region of the image, and
  • wherein the subordinate layer region corresponds to each of a plurality of subject regions recognized in the image.
    • (12) The information processing device according to (11), wherein the command issuing function issues a command to expand and display the image using, as a criterion, the subject region corresponding to an address term included in the voice input among the plurality of subject regions.
    • (13) The information processing device according to any one of (2) to (12), wherein the subordinate layer region is entirely included in the superordinate layer region.
    • (14) The information processing device according to any one of (1) to (13), wherein the address term definition function defines the address term based on set information regarding the region.
    • (15) The information processing device according to any one of (1) to (14), wherein the address term definition function defines the address term based on a location of the region in the image.
    • (16) The information processing device according to any one of (1) to (15), wherein the address term definition function defines the address term based on a chronological change in a display form of the region.
    • (17) The information processing device according to any one of (1) to (16), wherein the display control function displays, on the display, an address term included in the voice input.
    • (18) The information processing device according to any one of (1) to (17), wherein the display control function displays the issued command on the display.
    • (19) An information processing method including, by a processor:
  • defining an address term for at least a partial region of an image to be displayed on a display;
  • displaying the image on the display and temporarily displaying the address term on the display in association with the region;
  • acquiring a voice input for the image; and
  • issuing a command relevant to the region when the address term is included in the voice input.
    • (20) A program causing a computer to realize:
  • an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display;
  • a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region;
  • a voice input acquisition function of acquiring a voice input for the image; and
  • a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.

Claims (20)

What is claimed is:
1. An information processing device comprising:
a processor configured to realize
an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display,
a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region,
a voice input acquisition function of acquiring a voice input for the image, and
a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.
2. The information processing device according to claim 1,
wherein the region includes a superordinate layer region and a subordinate layer region, and
wherein, when the superordinate layer region is displayed on the display, the display control function displays an address term of the subordinate layer region at least a part of which is included in the displayed superordinate layer region.
3. The information processing device according to claim 2, wherein the command issuing function issues a command to expand and display the subordinate layer region.
4. The information processing device according to claim 3,
wherein the region includes a more subordinate layer region of the subordinate layer region, and
wherein, when the subordinate layer region is expanded and displayed, the display control function displays an address term of the more subordinate layer region of the subordinate layer region at least a part of which is included in the subordinate layer region.
5. The information processing device according to claim 2, wherein the command issuing function issues a command to allow the subordinate layer region to enter a selection state.
6. The information processing device according to claim 5,
wherein the region includes a more subordinate layer region of the subordinate layer region, and
wherein, when the subordinate layer region enters the selection state, the display control function displays an address term of the more subordinate layer region of the subordinate layer region at least a part of which is included in the subordinate layer region.
7. The information processing device according to claim 6, wherein the command issuing function issues a command to expand and display the more subordinate layer region of the subordinate layer region.
8. The information processing device according to claim 2,
wherein the superordinate layer region corresponds to an entire region of the image, and
wherein the subordinate layer region corresponds to a display region of each of a plurality of sub-images included in the image.
9. The information processing device according to claim 8, wherein the command issuing function issues a command to expand and display, in the entire region of the image, the sub-image corresponding to an address term included in the voice input among the plurality of sub-images.
10. The information processing device according to claim 2,
wherein the superordinate layer region corresponds to an application image displayed in an entire region of the image, and
wherein the subordinate layer region corresponds to a region of a GUI component included in the application image.
11. The information processing device according to claim 2,
wherein the superordinate layer region corresponds to an entire region of the image, and
wherein the subordinate layer region corresponds to each of a plurality of subject regions recognized in the image.
12. The information processing device according to claim 11, wherein the command issuing function issues a command to expand and display the image using, as a criterion, the subject region corresponding to an address term included in the voice input among the plurality of subject regions.
13. The information processing device according to claim 2, wherein the subordinate layer region is entirely included in the superordinate layer region.
14. The information processing device according to claim 1, wherein the address term definition function defines the address term based on set information regarding the region.
15. The information processing device according to claim 1, wherein the address term definition function defines the address term based on a location of the region in the image.
16. The information processing device according to claim 1, wherein the address term definition function defines the address term based on a chronological change in a display form of the region.
17. The information processing device according to claim 1, wherein the display control function displays, on the display, an address term included in the voice input.
18. The information processing device according to claim 1, wherein the display control function displays the issued command on the display.
19. An information processing method comprising, by a processor:
defining an address term for at least a partial region of an image to be displayed on a display;
displaying the image on the display and temporarily displaying the address term on the display in association with the region;
acquiring a voice input for the image; and
issuing a command relevant to the region when the address term is included in the voice input.
20. A program causing a computer to realize:
an address term definition function of defining an address term for at least a partial region of an image to be displayed on a display;
a display control function of displaying the image on the display and temporarily displaying the address term on the display in association with the region;
a voice input acquisition function of acquiring a voice input for the image; and
a command issuing function of issuing a command relevant to the region when the address term is included in the voice input.
US14/273,735 2013-07-10 2014-05-09 Information processing device, information processing method, and program Abandoned US20150019974A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/049,228 US10725734B2 (en) 2013-07-10 2018-07-30 Voice input apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2013-144449 2013-07-10
JP2013144449A JP6102588B2 (en) 2013-07-10 2013-07-10 Information processing apparatus, information processing method, and program

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/049,228 Continuation US10725734B2 (en) 2013-07-10 2018-07-30 Voice input apparatus

Publications (1)

Publication Number Publication Date
US20150019974A1 true US20150019974A1 (en) 2015-01-15

Family

ID=51033001

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/273,735 Abandoned US20150019974A1 (en) 2013-07-10 2014-05-09 Information processing device, information processing method, and program
US16/049,228 Active US10725734B2 (en) 2013-07-10 2018-07-30 Voice input apparatus

Family Applications After (1)

Application Number Title Priority Date Filing Date
US16/049,228 Active US10725734B2 (en) 2013-07-10 2018-07-30 Voice input apparatus

Country Status (4)

Country Link
US (2) US20150019974A1 (en)
EP (1) EP2824564B1 (en)
JP (1) JP6102588B2 (en)
CN (1) CN104281259B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102017219596A1 (en) * 2016-12-22 2018-06-28 Volkswagen Aktiengesellschaft Speech output voice of a voice control system
JP2017102939A (en) * 2016-12-26 2017-06-08 株式会社プロフィールド Authoring device, authoring method, and program
US11029834B2 (en) * 2017-12-20 2021-06-08 International Business Machines Corporation Utilizing biometric feedback to allow users to scroll content into a viewable display area
CN108538291A (en) * 2018-04-11 2018-09-14 百度在线网络技术(北京)有限公司 Sound control method, terminal device, cloud server and system
CN110989893A (en) * 2019-12-06 2020-04-10 北京金山安全软件有限公司 Click operation response method and device and electronic equipment
WO2021220769A1 (en) * 2020-04-27 2021-11-04 ソニーグループ株式会社 Information processing device and information processing method

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4989253A (en) * 1988-04-15 1991-01-29 The Montefiore Hospital Association Of Western Pennsylvania Voice activated microscope
US5970457A (en) * 1995-10-25 1999-10-19 Johns Hopkins University Voice command and control medical care system
US6847336B1 (en) * 1996-10-02 2005-01-25 Jerome H. Lemelson Selectively controllable heads-up display system
US6154723A (en) * 1996-12-06 2000-11-28 The Board Of Trustees Of The University Of Illinois Virtual reality 3D interface system for data creation, viewing and editing
US6224542B1 (en) * 1999-01-04 2001-05-01 Stryker Corporation Endoscopic camera system with non-mechanical zoom
US6514201B1 (en) * 1999-01-29 2003-02-04 Acuson Corporation Voice-enhanced diagnostic medical ultrasound system and review station
JP2002041023A (en) * 2000-07-06 2002-02-08 Internatl Business Mach Corp <Ibm> Computer system, display control device, display device, display control method, recording medium and program transmission device
JP2004029933A (en) * 2002-06-21 2004-01-29 Mitsubishi Heavy Ind Ltd Display controller and display control method
US7792678B2 (en) * 2006-02-13 2010-09-07 Hon Hai Precision Industry Co., Ltd. Method and device for enhancing accuracy of voice control with image characteristic
US8207936B2 (en) * 2006-06-30 2012-06-26 Sony Ericsson Mobile Communications Ab Voice remote control
JP2008277903A (en) * 2007-04-25 2008-11-13 Sony Corp Imaging apparatus and object to be focused determination method
JP2009109587A (en) * 2007-10-26 2009-05-21 Panasonic Electric Works Co Ltd Voice recognition control device
US8155479B2 (en) * 2008-03-28 2012-04-10 Intuitive Surgical Operations Inc. Automated panning and digital zooming for robotic surgical systems
JP5151644B2 (en) * 2008-04-16 2013-02-27 ソニー株式会社 Remote control system and remote control signal processing method
US20100275122A1 (en) * 2009-04-27 2010-10-28 Microsoft Corporation Click-through controller for mobile interaction
JP2011035771A (en) * 2009-08-04 2011-02-17 Olympus Corp Image capturing apparatus, editing device, and image capturing system
JP5340895B2 (en) * 2009-11-24 2013-11-13 株式会社ソニー・コンピュータエンタテインメント Image data creation support apparatus and image data creation support method
US8928579B2 (en) * 2010-02-22 2015-01-06 Andrew David Wilson Interacting with an omni-directionally projected display
CN101886833B (en) * 2010-07-22 2012-06-27 慈溪拓赢电器有限公司 Small all-in-one cold and hot air conditioner adopting oblique drawing rod
CN101950244A (en) * 2010-09-20 2011-01-19 宇龙计算机通信科技(深圳)有限公司 Method and device for giving prompt for content information on user interface
US9316827B2 (en) * 2010-09-20 2016-04-19 Kopin Corporation LifeBoard—series of home pages for head mounted displays (HMD) that respond to head tracking
US20120110456A1 (en) * 2010-11-01 2012-05-03 Microsoft Corporation Integrated voice command modal user interface
JP5636888B2 (en) * 2010-11-09 2014-12-10 ソニー株式会社 Information processing apparatus, program, and command generation method
GB2501471A (en) * 2012-04-18 2013-10-30 Barco Nv Electronic conference arrangement
IL221863A (en) * 2012-09-10 2014-01-30 Elbit Systems Ltd Digital system for surgical video capturing and display
US9020825B1 (en) * 2012-09-25 2015-04-28 Rawles Llc Voice gestures
US9495266B2 (en) * 2013-05-16 2016-11-15 Advantest Corporation Voice recognition virtual test engineering assistant

Patent Citations (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020052746A1 (en) * 1996-12-31 2002-05-02 News Datacom Limited Corporation Voice activated communication system and program guide
US6545669B1 (en) * 1999-03-26 2003-04-08 Husam Kinawi Object-drag continuity between discontinuous touch-screens
US6385582B1 (en) * 1999-05-03 2002-05-07 Pioneer Corporation Man-machine system equipped with speech recognition device
US6718308B1 (en) * 2000-02-22 2004-04-06 Daniel L. Nolting Media presentation system controlled by voice to text commands
US6428449B1 (en) * 2000-05-17 2002-08-06 Stanford Apseloff Interactive video system responsive to motion and voice command
US7426467B2 (en) * 2000-07-24 2008-09-16 Sony Corporation System and method for supporting interactive user interface operations and storage medium
US7136814B1 (en) * 2000-11-03 2006-11-14 The Procter & Gamble Company Syntax-driven, operator assisted voice recognition system and methods
US6508706B2 (en) * 2001-06-21 2003-01-21 David Howard Sitrick Electronic interactive gaming apparatus, system and methodology
US20030005461A1 (en) * 2001-07-02 2003-01-02 Sony Corporation System and method for linking closed captioning to web site
US6882974B2 (en) * 2002-02-15 2005-04-19 Sap Aktiengesellschaft Voice-control for a user interface
US20030235276A1 (en) * 2002-06-25 2003-12-25 Masahiko Tateishi Voice control system notifying execution result including uttered speech content
US7213206B2 (en) * 2003-09-09 2007-05-01 Fogg Brian J Relationship user interface
US7468742B2 (en) * 2004-01-14 2008-12-23 Korea Institute Of Science And Technology Interactive presentation system
US20070174043A1 (en) * 2004-02-25 2007-07-26 Mikko Makela Method and an apparatus for requesting a service in a network
US20050195221A1 (en) * 2004-03-04 2005-09-08 Adam Berger System and method for facilitating the presentation of content via device displays
US20050210416A1 (en) * 2004-03-16 2005-09-22 Maclaurin Matthew B Interactive preview of group contents via axial controller
US20100031150A1 (en) * 2005-10-17 2010-02-04 Microsoft Corporation Raising the visibility of a voice-activated user interface
US20070118382A1 (en) * 2005-11-18 2007-05-24 Canon Kabushiki Kaisha Information processing apparatus and information processing method
US8060227B2 (en) * 2007-09-10 2011-11-15 Palo Alto Research Center Incorporated Digital media player and method for facilitating social music discovery through sampling, identification, and logging
US20110178990A1 (en) * 2010-01-20 2011-07-21 International Business Machines Corporation Information processor, information processing system, data archiving method, and data deletion method
US9111255B2 (en) * 2010-08-31 2015-08-18 Nokia Technologies Oy Methods, apparatuses and computer program products for determining shared friends of individuals
US20120236025A1 (en) * 2010-09-20 2012-09-20 Kopin Corporation Advanced remote control of host application using motion and voice commands
US20120098946A1 (en) * 2010-10-26 2012-04-26 Samsung Electronics Co., Ltd. Image processing apparatus and methods of associating audio data with image data therein
US20120257121A1 (en) * 2011-04-07 2012-10-11 Sony Corporation Next generation user interface for audio video display device such as tv with multiple user input modes and hierarchy thereof
US8499320B2 (en) * 2011-04-07 2013-07-30 Sony Corporation Next generation user interface for audio video display device such as TV with multiple user input modes and hierarchy thereof
US20130033643A1 (en) * 2011-08-05 2013-02-07 Samsung Electronics Co., Ltd. Method for controlling electronic apparatus based on voice recognition and motion recognition, and electronic apparatus applying the same
US8903728B2 (en) * 2011-10-25 2014-12-02 Olympus Medical Systems Corp. System for endoscopic surgery having a function of controlling through voice recognition
US20130117658A1 (en) * 2011-11-08 2013-05-09 Research In Motion Limited Block zoom on a mobile electronic device
US20130326380A1 (en) * 2012-06-05 2013-12-05 Apple Inc. Triage Tool for Problem Reporting in Maps
US20130326425A1 (en) * 2012-06-05 2013-12-05 Apple Inc. Mapping application with 3d presentation
US20140196087A1 (en) * 2013-01-07 2014-07-10 Samsung Electronics Co., Ltd. Electronic apparatus controlled by a user's voice and control method thereof

Cited By (114)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11928604B2 (en) 2005-09-08 2024-03-12 Apple Inc. Method and apparatus for building an intelligent automated assistant
US11671920B2 (en) 2007-04-03 2023-06-06 Apple Inc. Method and system for operating a multifunction portable electronic device using voice-activation
US11348582B2 (en) 2008-10-02 2022-05-31 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11900936B2 (en) 2008-10-02 2024-02-13 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10741185B2 (en) 2010-01-18 2020-08-11 Apple Inc. Intelligent automated assistant
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US11269678B2 (en) 2012-05-15 2022-03-08 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11321116B2 (en) 2012-05-15 2022-05-03 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11636869B2 (en) 2013-02-07 2023-04-25 Apple Inc. Voice trigger for a digital assistant
US11862186B2 (en) 2013-02-07 2024-01-02 Apple Inc. Voice trigger for a digital assistant
US11557310B2 (en) 2013-02-07 2023-01-17 Apple Inc. Voice trigger for a digital assistant
US10714117B2 (en) 2013-02-07 2020-07-14 Apple Inc. Voice trigger for a digital assistant
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US11727219B2 (en) 2013-06-09 2023-08-15 Apple Inc. System and method for inferring user intent from speech inputs
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US11670289B2 (en) 2014-05-30 2023-06-06 Apple Inc. Multi-command single utterance input method
US10878809B2 (en) 2014-05-30 2020-12-29 Apple Inc. Multi-command single utterance input method
US11699448B2 (en) 2014-05-30 2023-07-11 Apple Inc. Intelligent assistant for home automation
US11810562B2 (en) 2014-05-30 2023-11-07 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US11516537B2 (en) 2014-06-30 2022-11-29 Apple Inc. Intelligent automated assistant for TV user interactions
US11838579B2 (en) 2014-06-30 2023-12-05 Apple Inc. Intelligent automated assistant for TV user interactions
US10528659B2 (en) 2015-02-18 2020-01-07 Sony Corporation Information processing device and information processing method
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US11842734B2 (en) 2015-03-08 2023-12-12 Apple Inc. Virtual assistant activation
US10930282B2 (en) 2015-03-08 2021-02-23 Apple Inc. Competing devices responding to voice triggers
US11468282B2 (en) 2015-05-15 2022-10-11 Apple Inc. Virtual assistant in a communication session
US11070949B2 (en) 2015-05-27 2021-07-20 Apple Inc. Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display
US10681212B2 (en) 2015-06-05 2020-06-09 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US11010127B2 (en) 2015-06-29 2021-05-18 Apple Inc. Virtual assistant for media playback
US11947873B2 (en) 2015-06-29 2024-04-02 Apple Inc. Virtual assistant for media playback
US11853536B2 (en) 2015-09-08 2023-12-26 Apple Inc. Intelligent automated assistant in a media environment
US11550542B2 (en) 2015-09-08 2023-01-10 Apple Inc. Zero latency digital assistant
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US11126400B2 (en) 2015-09-08 2021-09-21 Apple Inc. Zero latency digital assistant
US11809483B2 (en) 2015-09-08 2023-11-07 Apple Inc. Intelligent automated assistant for media search and playback
US10283109B2 (en) 2015-09-09 2019-05-07 Samsung Electronics Co., Ltd. Nickname management method and apparatus
US11809886B2 (en) 2015-11-06 2023-11-07 Apple Inc. Intelligent automated assistant in a messaging environment
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US11886805B2 (en) 2015-11-09 2024-01-30 Apple Inc. Unconventional virtual assistant interactions
US11853647B2 (en) 2015-12-23 2023-12-26 Apple Inc. Proactive assistance based on dialog communication between devices
US10942703B2 (en) 2015-12-23 2021-03-09 Apple Inc. Proactive assistance based on dialog communication between devices
US11227589B2 (en) 2016-06-06 2022-01-18 Apple Inc. Intelligent list reading
US11657820B2 (en) 2016-06-10 2023-05-23 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11749275B2 (en) 2016-06-11 2023-09-05 Apple Inc. Application integration with a digital assistant
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US11809783B2 (en) 2016-06-11 2023-11-07 Apple Inc. Intelligent device arbitration and control
US11656884B2 (en) 2017-01-09 2023-05-23 Apple Inc. Application integration with a digital assistant
US11210061B2 (en) 2017-03-31 2021-12-28 Brother Kogyo Kabushiki Kaisha Non-transitory computer-readable recording medium storing computer-readable instructions for causing information processing device to execute communication processing with image processing program and voice-recognition program, information processing device, and method of controlling information processing device
US10741181B2 (en) 2017-05-09 2020-08-11 Apple Inc. User interface for correcting recognition errors
US11599331B2 (en) 2017-05-11 2023-03-07 Apple Inc. Maintaining privacy of personal information
US11467802B2 (en) 2017-05-11 2022-10-11 Apple Inc. Maintaining privacy of personal information
US11538469B2 (en) 2017-05-12 2022-12-27 Apple Inc. Low-latency intelligent automated assistant
US11862151B2 (en) 2017-05-12 2024-01-02 Apple Inc. Low-latency intelligent automated assistant
US11837237B2 (en) 2017-05-12 2023-12-05 Apple Inc. User-specific acoustic models
US11580990B2 (en) 2017-05-12 2023-02-14 Apple Inc. User-specific acoustic models
US11380310B2 (en) 2017-05-12 2022-07-05 Apple Inc. Low-latency intelligent automated assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US11532306B2 (en) 2017-05-16 2022-12-20 Apple Inc. Detecting a trigger of a digital assistant
US10909171B2 (en) 2017-05-16 2021-02-02 Apple Inc. Intelligent automated assistant for media exploration
US10748546B2 (en) 2017-05-16 2020-08-18 Apple Inc. Digital assistant services based on device capabilities
US11675829B2 (en) 2017-05-16 2023-06-13 Apple Inc. Intelligent automated assistant for media exploration
US11710482B2 (en) 2018-03-26 2023-07-25 Apple Inc. Natural assistant interaction
US11854539B2 (en) 2018-05-07 2023-12-26 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11169616B2 (en) 2018-05-07 2021-11-09 Apple Inc. Raise to speak
US11487364B2 (en) 2018-05-07 2022-11-01 Apple Inc. Raise to speak
US11900923B2 (en) 2018-05-07 2024-02-13 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11907436B2 (en) 2018-05-07 2024-02-20 Apple Inc. Raise to speak
US11630525B2 (en) 2018-06-01 2023-04-18 Apple Inc. Attention aware virtual assistant dismissal
US11009970B2 (en) 2018-06-01 2021-05-18 Apple Inc. Attention aware virtual assistant dismissal
US11431642B2 (en) 2018-06-01 2022-08-30 Apple Inc. Variable latency device coordination
US10720160B2 (en) 2018-06-01 2020-07-21 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11360577B2 (en) 2018-06-01 2022-06-14 Apple Inc. Attention aware virtual assistant dismissal
US10984798B2 (en) 2018-06-01 2021-04-20 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
US11893992B2 (en) 2018-09-28 2024-02-06 Apple Inc. Multi-modal inputs for voice commands
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11783815B2 (en) 2019-03-18 2023-10-10 Apple Inc. Multimodality in digital assistant systems
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11675491B2 (en) 2019-05-06 2023-06-13 Apple Inc. User configurable task triggers
US11217251B2 (en) 2019-05-06 2022-01-04 Apple Inc. Spoken notifications
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11705130B2 (en) 2019-05-06 2023-07-18 Apple Inc. Spoken notifications
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11888791B2 (en) 2019-05-21 2024-01-30 Apple Inc. Providing message response suggestions
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
US11360739B2 (en) 2019-05-31 2022-06-14 Apple Inc. User activity shortcut suggestions
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11790914B2 (en) 2019-06-01 2023-10-17 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
USD1014542S1 (en) * 2019-09-11 2024-02-13 Ford Global Technologies, Llc Display screen with graphical user interface
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
US11924254B2 (en) 2020-05-11 2024-03-05 Apple Inc. Digital assistant hardware abstraction
US11914848B2 (en) 2020-05-11 2024-02-27 Apple Inc. Providing relevant data items based on context
US11765209B2 (en) 2020-05-11 2023-09-19 Apple Inc. Digital assistant hardware abstraction
US11755276B2 (en) 2020-05-12 2023-09-12 Apple Inc. Reducing description length based on confidence
US11838734B2 (en) 2020-07-20 2023-12-05 Apple Inc. Multi-device audio adjustment coordination
US11696060B2 (en) 2020-07-21 2023-07-04 Apple Inc. User identification using headphones
US11750962B2 (en) 2020-07-21 2023-09-05 Apple Inc. User identification using headphones
US11954405B2 (en) 2022-11-07 2024-04-09 Apple Inc. Zero latency digital assistant

Also Published As

Publication number Publication date
US10725734B2 (en) 2020-07-28
CN104281259B (en) 2018-12-25
JP2015018365A (en) 2015-01-29
US20190012140A1 (en) 2019-01-10
CN104281259A (en) 2015-01-14
EP2824564B1 (en) 2020-02-05
EP2824564A1 (en) 2015-01-14
JP6102588B2 (en) 2017-03-29

Similar Documents

Publication Publication Date Title
US10725734B2 (en) Voice input apparatus
US10747417B2 (en) Information processing apparatus, information processing method and information processing program for using a cursor
US9733895B2 (en) Method for controlling electronic apparatus based on voice recognition and motion recognition, and electronic apparatus applying the same
AU2012293060B2 (en) Electronic apparatus and method for providing user interface thereof
EP3093755B1 (en) Mobile terminal and control method thereof
US20130035942A1 (en) Electronic apparatus and method for providing user interface thereof
US20140100850A1 (en) Method and apparatus for performing preset operation mode using voice recognition
US10452777B2 (en) Display apparatus and character correcting method thereof
KR20090129628A (en) Control device and controlling method thereof
CN104756484A (en) Information processing device, reproduction state control method, and program
US20150163536A1 (en) Display apparatus and method for controlling the same
US10939171B2 (en) Method, apparatus, and computer readable recording medium for automatic grouping and management of content in real-time
US20190012129A1 (en) Display apparatus and method for controlling display apparatus
KR20190021016A (en) Electronic device and control method thereof
WO2018112924A1 (en) Information display method, device and terminal device
US20130111327A1 (en) Electronic apparatus and display control method
US20160132478A1 (en) Method of displaying memo and device therefor
KR20190055489A (en) Electronic device and control method thereof
JP5752759B2 (en) Electronic device, method, and program
US10795537B2 (en) Display device and method therefor
US9817633B2 (en) Information processing apparatus and information processing method
JP6223007B2 (en) Document display apparatus and method, program and data structure thereof
KR101131215B1 (en) Method of processing hyperlink, mobile communication terminal for implementing the method and computer-readable store media
WO2015159498A1 (en) Method and apparatus for displaying additional objects on a graphical user interface based on pinch gesture
JP2006092499A (en) System and computer program for browsing contents

Legal Events

Date Code Title Description
AS Assignment

Owner name: SONY CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DOI, SHOUICHI;TAKEOKA, YOSHIKI;TAKADA, MASAYUKI;REEL/FRAME:032863/0149

Effective date: 20140501

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION