US20120314899A1 - Natural user interfaces for mobile image viewing - Google Patents

Natural user interfaces for mobile image viewing

Info

Publication number
US20120314899A1
Authority
US
United States
Prior art keywords
mobile device
face
imagery
user
screen
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/159,010
Inventor
Michael F. Cohen
Neel Suresh Joshi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to US13/159,010
Assigned to MICROSOFT CORPORATION (assignment of assignors interest). Assignors: COHEN, MICHAEL F.; JOSHI, NEEL SURESH
Publication of US20120314899A1
Priority to US14/487,240 (US10275020B2)
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC (assignment of assignors interest). Assignor: MICROSOFT CORPORATION
Legal status: Abandoned

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/012: Head tracking input arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481: Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F 3/04815: Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2200/00: Indexing scheme relating to G06F1/04 - G06F1/32
    • G06F 2200/16: Indexing scheme relating to G06F1/16 - G06F1/18
    • G06F 2200/163: Indexing scheme relating to constructional details of the computer
    • G06F 2200/1637: Sensing arrangement for detection of housing movement or orientation, e.g. for controlling scrolling or cursor movement on the display of an handheld computer

Abstract

The mobile image viewing technique described herein provides a hands-free interface for viewing large imagery (e.g., 360 degree panoramas, parallax image sequences, and long multi-perspective panoramas) on mobile devices. The technique controls the imagery displayed on a display of a mobile device by movement of the mobile device. The technique uses sensors to track the mobile device's orientation and position, and a front facing camera to track the user's viewing distance and viewing angle. The technique adjusts the view of the rendered imagery on the mobile device's display according to the tracked data. In one embodiment the technique can employ a sensor fusion methodology that combines viewer tracking using a front facing camera with gyroscope data from the mobile device to produce a robust signal that defines the viewer's 3D position relative to the display.

Description

    BACKGROUND
  • Most viewing of photographs now takes place on an electronic display rather than in print form. Yet, almost all interfaces for viewing photos on an electronic display still try to mimic a static piece of paper by “pasting the photo on the back of the glass”, in other words, simply scaling the image to fit the display. This approach ignores the inherent flexibility of displays while also living with the constraints of limited pixel resolution.
  • In addition, the resolution and types of imagery available continue to expand beyond traditional flat images, e.g., high resolution, multi-perspective, and panoramic imagery. Paradoxically, as the size and dimensionality of available imagery has increased, the typical viewing size has decreased, as an increasingly significant fraction of photo viewing takes place on mobile devices with limited screen size and resolution. As a result, the mismatch between imagery and display has become even more obvious. While there are obvious limitations due to screen size on mobile devices, one significant benefit is that they are outfitted with numerous sensors, including accelerometers, gyros, and cameras. These sensors are currently ignored in the image viewing process.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
  • The mobile image viewing technique described herein provides a hands-free interface for viewing large imagery (e.g., 360° panoramas, parallax image sequences, and long multi-perspective panoramas) on mobile devices. The technique controls a display on a mobile device, such as, for example, a mobile phone, by movement of the mobile device. The technique uses sensors to track the mobile device's orientation and position, and a front facing camera to track the user's viewing distance and viewing angle. The technique adjusts the view of a rendered image on the mobile device's display according to the tracked data.
  • More particularly, in one embodiment, the technique employs a sensor fusion methodology that combines viewer tracking using a front facing camera with gyroscope data from the mobile device to produce a robust signal that defines the viewer's 3D position relative to the display. For example, viewer tracking can be achieved by face tracking, color-blob/skin tracking, tracking feature points of the face, or other types of ego-motion and optical flow tracking. The gyroscopic data both provides low latency feedback and allows extrapolation of the face position beyond the field-of-view of the front facing camera. The technique employs a hybrid position and rate control that uses the viewer's 3D position to drive viewing and exploration of very large image spaces on the mobile device.
  • DESCRIPTION OF THE DRAWINGS
  • The specific features, aspects, and advantages of the disclosure will become better understood with regard to the following description, appended claims, and accompanying drawings where:
  • FIG. 1 depicts a flow diagram of an exemplary process for practicing one embodiment of the mobile image viewing technique described herein.
  • FIG. 2 depicts another flow diagram of another exemplary process for practicing the mobile image viewing technique described herein.
  • FIG. 3 is an exemplary architecture for practicing one exemplary embodiment of the mobile image viewing technique described herein.
  • FIG. 4 shows that a gyroscope alone cannot distinguish between the situations in Case B and Case C. The drift signal, θ^D, disambiguates these and brings the control in line with θ^F.
  • FIG. 5 depicts the face offset angle and distance that is computed from a face tracked in a camera situated to the side of the display of a mobile device.
  • FIG. 6 is a schematic of an exemplary computing environment which can be used to practice the mobile image viewing technique.
  • DETAILED DESCRIPTION
  • In the following description of the mobile image viewing technique, reference is made to the accompanying drawings, which form a part thereof, and which show by way of illustration examples by which the mobile image viewing technique described herein may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the claimed subject matter.
  • 1.0 Mobile Image Viewing Technique
  • The following sections provide an overview of the mobile image viewing technique, exemplary processes and an exemplary architecture for practicing the technique, as well as details of the mathematical computations employed in some embodiments of the technique.
  • 1.1 Overview of the Technique
  • The mobile image viewing technique described herein allows a user to perform image viewing on mobile devices, leveraging the many sensors on typical mobile devices, such as, for example, cell phones or smart phones. In particular, in one embodiment, the technique uses low latency gyros on a mobile device to sense changes in direction of the device, as well as the front-facing camera to detect and track the position of a user/viewer relative to a display on the mobile device, albeit with higher noise and latency. Fusion of these two sensor streams provides the functionality to create compelling interfaces to view a range of imagery. The technique provides natural user interfaces for viewing many forms of complex imagery, including multiple images stitched to create a single viewpoint 360° panorama, multi-viewpoint image sets depicting parallax in a scene, and street side interfaces integrating both multi-perspective panoramas and single viewpoint 360° panoramas.
  • One aspect of large format and/or very wide angle imagery is that there is a natural tension between a desire for direct positional control, i.e., a direct mapping of sensor output to position in the image, versus rate control, which maps sensor position to a velocity of motion across the image. In one embodiment, the technique employs a hybrid rate/position control through a single relationship between the sensors and the output. Some technical contributions of the technique thus include the sensor fusion between the gyro and viewer tracking from the front facing camera, as well as novel functional relationships between this sensing and the control of image viewing across numerous modalities.
  • The following sections provide exemplary processes for practicing the technique, an exemplary architecture for practicing the technique, and details of various embodiments of the technique. The computations underlying those processes and the exemplary architecture are described in Section 2.
  • 1.2 Exemplary Processes for Practicing the Technique
  • FIG. 1 provides an exemplary process for practicing one embodiment of the mobile image viewing technique. As shown in FIG. 1, block 102, a mobile device's (for example, a mobile phone's) orientation and position are tracked using instrumentation on the device. For example, this mobile device could be a smart phone, Personal Data Assistant (PDA), or other cellular phone with a screen for viewing imagery. Tracking could be performed, for example, using a gyroscope on the mobile device, a digital compass, an accelerometer, or some other type of instrumentation that can determine the orientation and position of the mobile device. A camera and viewer tracker on the mobile device are also simultaneously used to track a user's face looking at a screen on the mobile device, as shown in block 104. For example, the camera could be a front facing camera facing the user/viewer, disposed on the same side of the mobile device as the screen of the mobile device. The viewer tracker could be a face tracker, a color-blob/skin tracker, a tracker for feature points of the face, or another type of ego-motion or optical flow tracker.
  • A viewing angle and a viewing distance between the user and the screen on the mobile device are computed using the tracked orientation and position of the mobile device and the tracked position of the user's face relative to the screen of the mobile device, as shown in block 106. The details of computing this viewing angle and viewing distance are provided in Section 2.
  • Image transformations of imagery to be rendered on the screen of the mobile device are then computed using the computed viewing angle and viewing distance to allow the user to control viewing of the rendered imagery, as shown in block 108. For example, the imagery can include various types of images, including single viewpoint panoramas, multi-viewpoint image sets depicting parallax in a scene, multi-perspective panoramas, or a combination of these. The user can change the view of the imagery by merely moving the mobile device relative to his or her face.
  • FIG. 2 provides another exemplary process for practicing another embodiment of the mobile image viewing technique. As shown in FIG. 2, block 202, a mobile device's (for example, a mobile phone's) orientation and position are tracked using a gyroscope (although other similar instrumentation could be used). A camera and viewer tracker on the mobile device are also used to track a user's face looking at a screen on the mobile device, as shown in block 204.
  • The mobile device's orientation and position from the gyroscope and the position of the user's face obtained by the viewer tracker are used to determine a combined position and rate control for viewing imagery on the screen of the mobile device, as shown in block 206. The details of the computation for determining this combined position and rate control are provided in Section 2.
  • Image transformations of imagery to be rendered on the screen of the mobile device are then computed using the computed combined position and rate control to allow the user to display different portions of the rendered imagery, as shown in block 208. In general, the combined position and rate control values are mapped to coordinates in the imagery in order to determine which portion of the imagery to render. When the user moves the mobile device relative to his or her face, the imagery on the device changes based on the distance and the angle at which the user holds the device.
  • 1.3 Exemplary Architecture
  • FIG. 3 shows an exemplary architecture 300 for practicing one embodiment of the mobile image viewing technique. As shown in FIG. 3, a mobile imagery computing module 302 is located on a computing device 600, which will be described in greater detail with respect to FIG. 6. This computing device 600 is preferably mobile, such as, for example, a mobile phone or smart phone. The mobile computing device 600 includes a camera 304 that can be used to capture the face of a user 306 of the mobile computing device 600. The mobile computing device 600 also includes instrumentation, such as, for example, a gyroscope 308, that is used to track the mobile computing device's orientation and position. It should be noted, however, that other instrumentation capable of determining the mobile device's orientation and position could equally well be used.
  • The mobile computing device 600 includes a viewer tracker 310 (e.g., a face tracker, optical flow on the camera, a point tracker) that is used to track the user's face, looking at a screen 312 on the mobile device, as captured by the camera 304. The mobile device's tracked orientation and position, and the position of the user's face obtained by the viewer tracker, are used to determine a viewing angle from the mobile computing device 600 to the user 306 in a viewing angle computation module 312. In addition, the distance between the mobile computing device and the user is determined in a distance computation module 314. A combined position and rate control for viewing imagery 318 on the screen 312 of the mobile device is then computed in a combined position and rate control computation module 316. The output of the combined position and rate control module 316 is used to compute image transformations of imagery to be rendered in an image transformation module 320. The computed image transformations are used to create transformed imagery 322 to be rendered on the screen 312 of the mobile device 600. Using the transformed imagery 322, the user can display different views of the rendered imagery on the screen simply by moving the camera relative to his or her face.
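  • The following is a minimal, self-contained sketch of how the dataflow of FIG. 3 might be wired together in code. The class and function names are hypothetical (the patent describes modules 302 through 322 in prose only), and the simple linear mappings below stand in for the computations detailed in Sections 2.2 through 2.5.

```python
# Hypothetical sketch of the FIG. 3 dataflow; the constants below are
# illustrative assumptions, not values from the patent.

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FaceObservation:
    angle_deg: float   # face's angular offset from the display normal
    width_px: float    # face width in the camera image (a proxy for distance)


class MobileImageViewer:
    def __init__(self) -> None:
        self.theta_deg = 0.0   # fused angular offset of the face (Section 2.2.2)
        self.zoom = 1.0        # zoom level driven by face distance (Sections 2.2.3, 2.4)

    def tick(self, gyro_delta_deg: float, face: Optional[FaceObservation]) -> Tuple[float, float]:
        """One update step: fuse the sensors, then map the result to a view."""
        # Low-latency path: integrate the gyro reading (roughly 50 Hz).
        self.theta_deg += gyro_delta_deg
        # Slow, noisier path: when a face observation arrives (1-10 Hz), pull
        # the fused angle toward the observed face angle and update the zoom proxy.
        if face is not None:
            self.theta_deg = 0.9 * self.theta_deg + 0.1 * face.angle_deg
            self.zoom = 0.9 * self.zoom + 0.1 * (300.0 / face.width_px)  # 300 px: assumed reference width
        # Simple position control: map the fused angle linearly to a pan offset.
        pan_deg = 2.0 * self.theta_deg
        return pan_deg, self.zoom


if __name__ == "__main__":
    viewer = MobileImageViewer()
    for step in range(10):
        # Simulate a slow device rotation with an occasional face observation.
        face = FaceObservation(angle_deg=5.0, width_px=150.0) if step % 5 == 0 else None
        print(viewer.tick(gyro_delta_deg=1.0, face=face))
```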
  • 2.0 Exemplary Computations for Embodiments of the Technique
  • Exemplary processes and an exemplary architecture having been described, the following sections provide details and exemplary calculations for implementing various embodiments of the technique.
  • 2.1 Mapping Sensors to Image Transformations
  • Despite the lack of many traditional affordances found in a desktop setting (large display, keyboard, mouse, etc.), mobile devices offer a wide variety of sensors (touch, gyroscopes, accelerometers, compass, and cameras) that can help overcome the lack of traditional navigation controls and provide a richer and more natural interface to image viewing. The mobile image viewing technique described herein has been used with various applications that cover a variety of image (scene) viewing scenarios in which the imagery covers a large field of view, a wide strip multi-perspective panorama, multiple views, or a combination of these. In particular, interfaces have been demonstrated for 360° panoramas, for multi-view strips exhibiting parallax, and for Microsoft® Corporation's Bing™ for iOS StreetSide™ interface, which combines very long multi-perspective strip panoramas with single viewpoint 360° views. A common aspect of all of these is that the imagery requires exploration to view the full breadth of the data. Details of these exemplary applications are described in Section 3.
  • The most obvious way to explore imagery that cannot fit in the display is to use touch sensing to mimic a traditional interface. Users have become accustomed to sliding a finger to pan and performing a two fingered pinch for zooming. These affordances have four main drawbacks, however. First, a user's fingers and hand obscure a significant portion of the display. Second, it becomes difficult to disambiguate touches designed for purposes other than navigation, for example, a touch designed to select a link embedded with the imagery. Third, using the touch screen generally requires two hands. Finally, combined motions require sequential gestures, e.g., a “pan and zoom” action requires first a swipe and then a pinch. The mobile image viewing technique described herein instead uses more natural interfaces involving one-handed motion of the device itself for image navigation.
  • 2.2 Hybrid Gyro Plus Viewer Tracking
  • In the real world, a person moves his or her gaze relative to a scene, or moves an object relative to his or her gaze, to fully explore a scene (or object). In both cases, the head is moving relative to the scene. If one considers an image as a representation of a scene on a device, tracking the head relative to the device as an affordance for navigation seems like a natural fit.
  • Viewer tracking, such as, for example, face tracking, can in theory provide a complete 3D input affordance: (x,y) position based on face location, and (z) depth based on face size. However, viewer tracking alone exhibits a few robustness problems. Viewer tracking, such as face tracking, is computationally costly and thus incurs some latency. In addition, the vision algorithms for tracking face position and size are inherently noisy, as small changes in face shape and illumination can produce unexpected signals. This can be overcome somewhat through filtering, albeit at the price of more latency. Finally, viewer tracking is lost once the offset angle exceeds the field of view of the front facing camera (it has been experimentally found that this limit is about ±15 degrees). Nonetheless, viewer tracking is unique in its ability to deliver a 3D signal that is directly relevant to image viewing applications.
  • Gyroscopes provide a more robust and lower latency alternative for the 2D (x,y) angular position. For relative orientation, the gyros provide a superior signal; however, they do drift considerably. It is common to see 5 degree drifts during a 360° rotation over 15 seconds. In addition, gyros alone cannot disambiguate between the cases shown in FIG. 4 Case B and FIG. 4 Case C. In the first case, the user 402 has rotated the device 404. In the second case, the user 402 has rotated himself or herself, carrying that same rotation to the device 404. To achieve robustness, liveliness, and reduced ambiguity, the technique creates a sensor fusion that is a hybrid of the gyro plus viewer tracking using the front facing camera.
  • In one embodiment of the technique, accelerometers were not used for position tracking, based on empirical experience showing that, aside from sensing the direction of gravity and fairly sudden moves, the noise from the accelerometers overwhelms subtle motions. However, it should be noted that accelerometers, compasses, and other tracking devices could feasibly be used to track the mobile device.
  • 2.2.1 Viewer Tracker
  • In one embodiment of the technique, a face is first located in the front facing camera via a face finder. Various conventional face finders can be used for this purpose. In one embodiment, the technique finds the user's face using a conventional face finder, which returns a rectangle for the size and location of the face. A face template is recorded from this rectangle along with the position and size. This template is then matched at varying (x,y) positions and scales around the current (position, scale) at each subsequent frame. The (position, scale) with the highest correlation to the original template in the new frame is considered the current location of the face. In one embodiment, the technique searches over a rectangle 3× the size of the previous face in x and y and over 3 scales within ±5% of the previous scale. If the face is lost, the slower full-frame face finder is re-run until the face is found. Given the field of view of the front facing camera, position is trivially transformed to horizontal and vertical angular offsets, θ_x^F′ and θ_y^F′. From here on, only the more important horizontal offset, θ_x^F′, will be referred to, and the x subscript will be dropped. As previously mentioned, however, other methods of tracking a viewer can be used.
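  • A sketch of this template-tracking step is shown below, assuming OpenCV's normalized cross-correlation for the matching. The 3× search window and the three scales within ±5% follow the text, while the helper names, the grayscale input, and the pixel-to-angle conversion at the end are illustrative assumptions.

```python
# Sketch of the face-template tracker described above. Assumes OpenCV;
# helper names and constants not given in the text are assumptions.
import cv2
import numpy as np


def track_face(frame_gray: np.ndarray, template: np.ndarray, prev_rect):
    """Match the stored face template around its previous (position, scale).

    frame_gray: current camera frame (grayscale, uint8)
    template:   face patch recorded when the face was first found (grayscale)
    prev_rect:  (x, y, w, h) of the face in the previous frame
    Returns the best (x, y, w, h) in the new frame, or None if the face is lost.
    """
    x, y, w, h = prev_rect
    # Search window: a rectangle 3x the size of the previous face in x and y.
    sx0, sy0 = max(0, x - w), max(0, y - h)
    sx1 = min(frame_gray.shape[1], x + 2 * w)
    sy1 = min(frame_gray.shape[0], y + 2 * h)
    window = frame_gray[sy0:sy1, sx0:sx1]

    best = None
    # Three scales within +/-5% of the previous scale.
    for scale in (0.95, 1.0, 1.05):
        tw, th = int(round(w * scale)), int(round(h * scale))
        if tw < 8 or th < 8 or tw >= window.shape[1] or th >= window.shape[0]:
            continue
        resized = cv2.resize(template, (tw, th))
        scores = cv2.matchTemplate(window, resized, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(scores)
        if best is None or max_val > best[0]:
            best = (max_val, sx0 + max_loc[0], sy0 + max_loc[1], tw, th)

    if best is None:
        return None  # caller re-runs the slower full-frame face finder
    return best[1:]  # (x, y, w, h) with the highest correlation


def face_angle_deg(face_center_x: float, frame_width: float, hfov_deg: float = 60.0) -> float:
    """Transform the face's horizontal image position to the angular offset
    theta_x^F' given the camera's field of view (the 60 degree value is assumed)."""
    half = frame_width / 2.0
    return ((face_center_x - half) / half) * (hfov_deg / 2.0)
```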
  • 2.2.2 Horizontal Angle
  • Referring to FIG. 5, there are two direct signals the technique tracks: θ^F′, 502, the angular offset of the face from the normal to the display (from the front-facing camera), and Δθ^G, 504, the change in rotation about the vertical axis tangent to the display (from the gyros). The technique estimates the distance d 506 from the camera 508 from the face width. Given the fixed offset of the camera 508 from the center of the display 512 and Δθ^G, 504, the technique derives θ^F, 510, the face's angular offset from the display center. It is thus possible to compute the value, Θ, which is mapped to the position and rate control for the user interface.

  • Θ_t = α·Θ_{t−1} + (1−α)·(θ_t^G + θ_t^D)   (1)
  • Θ_t represents the value at time t that the technique will map to its control functions. The variable α serves to provide a small amount of hysteresis to smooth this signal. It was found that a value of α = 0.1 provides slight smoothing without adding noticeable lag. θ_t^G is the time-integrated gyro signal, i.e., the total rotation of the device including any potential drift:

  • θ_t^G = θ_{t−1}^G + Δθ_t^G   (2)
  • where Δθ_t^G represents the direct readings from the gyro. θ_t^D represents a smoothed signal of the difference between the face position, θ^F, and the integrated gyro angle, θ^G. This quantity encompasses any drift incurred by the gyro as well as any rotation of the user himself (see FIG. 4 Case C). Since the viewer tracker runs more slowly than the gyro readings (in one embodiment, 1 to 10 Hz for the viewer tracker and 50 Hz for the gyro), the technique records both the face position and gyro values each time a face position is received. θ^D is thus defined by

  • θ_t^D = β·θ_{t−1}^D + (1−β)·(θ_*^F − θ_*^G)   (3)
  • where “*” represents the time of the most recent face track, and β serves to smooth the face signal and add hysteresis. In one embodiment, the technique uses a much higher value of β = 0.9 in this case. This produces some lag, which actually adds a side benefit discussed in the context of the control mapping.
  • To summarize, Θ_t represents a best guess of the face position relative to the device, even when the face is beyond the field of view of the device. Although viewer tracking, such as, for example, face tracking, is inherently slow and noisy, the gyro signal serves as a lively proxy with good accuracy over short time intervals. The viewer tracker is used to continuously correct the gyro input to bring it back in line with where the face is seen from the front-facing camera.
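  • A compact implementation of equations (1) through (3) might look like the following sketch. The α = 0.1 and β = 0.9 values and the 50 Hz gyro / 1 to 10 Hz face-tracker rates come from the text; the small-angle correction for the camera's offset from the display center (FIG. 5) is an assumption, since the exact formula is not spelled out here.

```python
import math


class HorizontalAngleFusion:
    """Fuses the integrated gyro rotation with the slower face-tracker angle,
    following equations (1)-(3). Angle units are degrees throughout."""

    def __init__(self, alpha: float = 0.1, beta: float = 0.9,
                 camera_offset_cm: float = 1.0):
        self.alpha = alpha                        # hysteresis on the fused value, Eq. (1)
        self.beta = beta                          # heavier smoothing of the drift term, Eq. (3)
        self.camera_offset_cm = camera_offset_cm  # camera-to-display-center offset (assumed)
        self.big_theta = 0.0                      # fused value Theta_t
        self.theta_g = 0.0                        # integrated gyro angle theta_t^G, Eq. (2)
        self.theta_d = 0.0                        # smoothed drift term theta_t^D, Eq. (3)

    def on_gyro(self, delta_theta_g: float) -> float:
        """Called at the gyro rate (about 50 Hz) with the raw rotation increment."""
        self.theta_g += delta_theta_g                                          # Eq. (2)
        self.big_theta = (self.alpha * self.big_theta +
                          (1.0 - self.alpha) * (self.theta_g + self.theta_d))  # Eq. (1)
        return self.big_theta

    def on_face(self, theta_f_prime: float, distance_cm: float) -> None:
        """Called whenever the viewer tracker reports a face (about 1 to 10 Hz).

        theta_f_prime: face offset from the camera axis, theta^F'
        distance_cm:   estimated face distance d (from the face width)
        """
        # Correct for the camera sitting off the display center (FIG. 5);
        # this particular correction is an assumed approximation.
        correction = math.degrees(math.atan2(self.camera_offset_cm, distance_cm))
        theta_f = theta_f_prime - correction
        # Drift term: smoothed difference between the face angle and the gyro
        # angle recorded when the face position was received, Eq. (3).
        self.theta_d = (self.beta * self.theta_d +
                        (1.0 - self.beta) * (theta_f - self.theta_g))
```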
  • 2.2.3 Distance
  • In one embodiment, the technique uses the face width in the camera's view as a proxy for the face's distance from the device. The technique uses a time-smoothed face size for this signal.

  • Z_t = γ·Z_{t−1} + (1−γ)·(1/FaceSize)   (4)
  • where γ = 0.9 smooths over noisy readings, albeit at some cost in latency.
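  • In code, equation (4) is a one-line exponential smoother; the function name below is illustrative.

```python
def update_distance_proxy(z_prev: float, face_size_px: float, gamma: float = 0.9) -> float:
    """Equation (4): time-smoothed inverse face size used as the distance proxy Z_t."""
    return gamma * z_prev + (1.0 - gamma) * (1.0 / face_size_px)
```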
  • 2.3 Hybrid Position and Rate Control
  • Given the angular offset, Θ_t, one is now left with the mapping between this value and the controls for viewing the imagery. The simplest and most intuitive mapping is a position control, in which Θ_t is mapped through some linear function to a position in the imagery (i.e., an angle in a panorama, a position on a large flat image, or a viewing position in a multi-view parallax image set). Position mapping can provide fine control over short distances and is almost always the control of choice for displaying imagery when applicable.
  • Unfortunately, such a simple mapping has severe limitations for viewing large imagery. The useful domain of Θ_t is between ±40°, since beyond this angle the display of a typical mobile device/phone becomes severely foreshortened and unviewable. For 360° panoramas or very long multi-perspective images this range is very limited. The alternatives are to provide clutching or to create a rate control in which Θ_t is mapped to a velocity across the imagery. Although rate controls provide an infinite range, as the integrated position continues to increase over time, they have been shown to lack fine precision positioning and to suffer from a tendency to overshoot.
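  • The section heading names a hybrid of the two controls, but the exact blending function is not spelled out in this excerpt. The sketch below therefore shows one plausible hybrid, offered as an assumption: a direct position mapping everywhere, plus a rate term that engages only once the fused angle leaves a central zone, so that 360° panoramas and very long imagery remain reachable. The zone size and gains are illustrative.

```python
import math


class HybridPanControl:
    """One plausible hybrid of position and rate control driven by the fused
    angle Theta_t. The numeric values are illustrative assumptions."""

    def __init__(self, position_gain: float = 4.0,
                 rate_zone_deg: float = 25.0,
                 rate_gain_per_s: float = 60.0):
        self.position_gain = position_gain    # direct Theta_t -> position mapping
        self.rate_zone_deg = rate_zone_deg    # pure position control inside this zone
        self.rate_gain_per_s = rate_gain_per_s
        self.rate_offset_deg = 0.0            # accumulates while in the rate region

    def update(self, theta_deg: float, dt: float) -> float:
        excess = abs(theta_deg) - self.rate_zone_deg
        if excess > 0.0:
            # Rate control: the view keeps scrolling, faster the further the
            # device is tilted past the comfortable viewing zone.
            self.rate_offset_deg += math.copysign(
                self.rate_gain_per_s * (excess / self.rate_zone_deg) * dt, theta_deg)
        # Position control: fine, direct mapping of Theta_t added on top.
        return self.rate_offset_deg + self.position_gain * theta_deg
```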
  • 2.4 Zoom Control
  • In panorama and street side applications, Z_t is linearly mapped to the zoom level. The technique caps the minimum zoom level at a bit less than arm's length. The street side application has a fixed zoom level at which a mode change takes place between the multi-perspective panoramas and the cylindrical panoramas. To avoid rapid mode changes near this transition point, the technique eases in a small offset to the zoom level after the mode switch and then eases out the offset after the mode switches back.
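  • A sketch of this zoom behavior follows. The linear mapping of Z_t to zoom, the arm's-length cap, and the eased-in offset around the mode switch follow the text; the specific threshold, gain, and easing values are assumptions.

```python
class ZoomControl:
    """Maps the distance proxy Z_t linearly to a zoom level and eases a small
    offset in after the street-side mode switch (and back out afterwards) so
    that jitter near the threshold cannot flip the mode. Numeric values are
    illustrative assumptions."""

    def __init__(self, zoom_gain: float = 2.0, min_zoom: float = 0.8,
                 mode_switch_zoom: float = 3.0, hysteresis: float = 0.3):
        self.zoom_gain = zoom_gain
        self.min_zoom = min_zoom              # capped at a bit less than arm's length
        self.mode_switch_zoom = mode_switch_zoom
        self.hysteresis = hysteresis
        self.offset = 0.0
        self.in_panorama_mode = False

    def update(self, z_t: float, dt: float):
        zoom = max(self.min_zoom, self.zoom_gain * z_t) + self.offset
        if not self.in_panorama_mode and zoom >= self.mode_switch_zoom:
            self.in_panorama_mode = True      # switch to the 360-degree panorama
        elif self.in_panorama_mode and zoom < self.mode_switch_zoom:
            self.in_panorama_mode = False     # switch back to the strip panorama
        # Ease the offset toward its target rather than jumping.
        target = self.hysteresis if self.in_panorama_mode else 0.0
        self.offset += (target - self.offset) * min(1.0, 4.0 * dt)
        return zoom, self.in_panorama_mode
```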
  • 2.5 Mapping Controls to Imagery
  • Once the values of the controls are obtained, they are mapped to the imagery to be rendered on the screen. For example, the output of the position and velocity control can be mapped to the viewing angle in a 360° panorama or to viewpoint selection in a multi-point panorama. The zoom control can be used to scale the field of view, i.e., to literally zoom in/out on an image, or to switch between modes as described in Section 2.4.
  • 3.0 Exemplary Applications
  • The interaction paradigm of the technique described above has been applied to a number of image viewing applications. These include wide angle imagery such as 360° panoramas and parallax photos consisting of a series of side-by-side images. Also, the technique has been applied to very long multi-perspective images and 360° panoramas.
  • 3.1 Panoramas
  • Wide angle and 360° panoramas have become a popular form of imagery, especially as new technologies arrive that make their construction easier. Sites that host high resolution panoramas, and the bubbles of street side imagery, are two examples.
  • By interpreting ΔX_t at each frame time as a change in orientation, and Z_t as the zoom factor, the technique provides an interface to such imagery that does not require two-handed input or standing and physically turning in place.
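  • A minimal sketch of this panorama interface follows; the field-of-view limits shown are assumed values, not values from the patent.

```python
class PanoramaView:
    """Interprets Delta-X_t as a change in viewing orientation and Z_t as the
    zoom factor for a single-viewpoint 360-degree panorama."""

    def __init__(self):
        self.yaw_deg = 0.0    # current viewing direction
        self.fov_deg = 60.0   # rendered field of view

    def update(self, delta_x_deg: float, z_t: float):
        # Orientation change with wraparound: a 360-degree panorama has no edges.
        self.yaw_deg = (self.yaw_deg + delta_x_deg) % 360.0
        # Zoom scales the field of view; the 20-90 degree clamp is assumed.
        self.fov_deg = min(90.0, max(20.0, 60.0 / max(z_t, 1e-6)))
        return self.yaw_deg, self.fov_deg
```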
  • 3.2 Parallax Images
  • By sliding a camera sideways and capturing a series of images one can create a virtual environment by simply flipping between the images. Automated and less constrained versions for capture and display of parallax photos also exist.
  • In one embodiment, ΔX_t at each frame time represents a relative offset of the virtual camera. One embodiment of the technique provides an interface to such imagery that creates a feeling of peering into a virtual environment. In this case, the position control, and thus the gyro input, dominates. The viewer tracker's role is primarily to counteract gyro drift.
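  • A sketch of how such a parallax strip might be driven is shown below, assuming a fixed spacing between captured viewpoints; the spacing constant and function name are not from the patent.

```python
def parallax_frame_index(offset: float, delta_x: float, num_images: int,
                         spacing: float = 1.0):
    """Accumulate Delta-X_t as a relative virtual-camera offset and select the
    captured image whose viewpoint is nearest to it."""
    offset += delta_x
    index = int(round(offset / spacing))
    index = max(0, min(num_images - 1, index))   # clamp to the captured strip
    return offset, index
```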
  • 3.3 Street Imagery
  • A new interface for viewing street side imagery was demonstrated in Microsoft® Corporation's StreetSlide™ application. The original imagery consists of a series of 360° panoramas set at approximately 2 meter intervals along a street. The StreetSlide™ paradigm was subsequently adapted to create long multi-perspective strip panoramas constructed by clipping out and stitching parts of the series of panoramas. The StreetSlide™ application automatically flips between the long strip panoramas and the 360° panoramas depending on zoom level. Other similar applications use traditional finger swipes and pinch operations.
  • The present mobile image viewing technique was applied as a new user interface on top of the StreetSlide™ application. It could equally well be applied to similar applications. Since there are two modes, the meaning of ΔX_t switches. In slide mode, ΔX_t moves the view left and right along the street side, and Z_t zooms the strip panorama in and out. At a given zoom level, the mode switches automatically to the corresponding 360° panorama at that location on the street. At this point, the technique reverts to the panorama control described above. Zooming out once more returns to the slide mode. Navigation now requires only one hand, leaving the other hand free for unambiguous access to other navigation aids and information overlaid on the location imagery. A sketch of this two-mode navigation appears below.
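  • The two-mode behavior might be organized as in the following sketch. The 2 meter panorama spacing comes from the text; the switching threshold, the ease-out margin, and all names are assumptions.

```python
class StreetSideNavigator:
    """In slide mode, Delta-X_t pans along the multi-perspective strip and Z_t
    zooms it; past a zoom threshold the view switches to the 360-degree
    panorama nearest the current street position, and zooming back out
    returns to slide mode."""

    SWITCH_ZOOM = 3.0           # assumed zoom level at which the mode changes
    PANORAMA_SPACING_M = 2.0    # panoramas captured roughly every 2 meters

    def __init__(self):
        self.mode = "slide"
        self.street_pos_m = 0.0
        self.yaw_deg = 0.0
        self.pano_index = 0

    def update(self, delta_x: float, z_t: float):
        if self.mode == "slide":
            self.street_pos_m += delta_x                  # pan along the street side
            if z_t >= self.SWITCH_ZOOM:
                # Drop into the 360-degree panorama nearest the current position.
                self.pano_index = int(round(self.street_pos_m / self.PANORAMA_SPACING_M))
                self.mode = "panorama"
            return self.mode, self.street_pos_m, z_t
        # Panorama mode: revert to the panorama control described above.
        self.yaw_deg = (self.yaw_deg + delta_x) % 360.0
        if z_t < self.SWITCH_ZOOM * 0.8:                  # assumed ease-out margin
            self.mode = "slide"
        return self.mode, self.yaw_deg, z_t
```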
  • 3.4 Alternate Embodiments
  • Many other types of media could be viewed using the mobile image viewing technique. For example, the technique can be applied to an interface to mapping applications. Being able to zoom out from a street in San Francisco, pan across the country, and zoom back in to a New York street, for example, would be achievable by simply moving the device away, tilting it “east”, and pulling the device back towards the viewer, as in the speculative sketch below.
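  • A purely speculative sketch of such a mapping interface follows (Python); the zoom-level formula, gains, and starting coordinates are assumptions used only to illustrate the idea.
    def update_map_view(view, tilt_delta_deg, z_t):
        # Distance control -> map zoom level: moving the device away zooms the map out,
        # pulling it back toward the viewer zooms back in.
        view["zoom_level"] = max(1, min(20, round(20 - 6 * z_t)))
        # Tilt about the vertical axis -> pan east/west; the gain shrinks as the map zooms in.
        degrees_per_tilt_unit = 10.0 / (2 ** (view["zoom_level"] - 1))
        view["longitude_deg"] += degrees_per_tilt_unit * tilt_delta_deg
        return view

    view = {"zoom_level": 12, "longitude_deg": -122.42}  # starting near San Francisco
    view = update_map_view(view, tilt_delta_deg=2.0, z_t=2.5)  # push away, tilt "east"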
  • 4.0 Exemplary Operating Environments
  • The mobile image viewing technique described herein is operational within numerous types of general purpose or special purpose computing system environments or configurations. FIG. 6 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the mobile image viewing technique, as described herein, may be implemented. It should be noted that any boxes that are represented by broken or dashed lines in FIG. 6 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • For example, FIG. 6 shows a general system diagram showing a simplified computing device 600. Such computing devices can be typically found in devices having at least some minimum computational capability, including, but not limited to, personal computers, server computers, hand-held computing devices, laptop or mobile computers, communications devices such as cell phones and PDA's, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, audio or video media players, etc.
  • To allow a device to implement the mobile image viewing technique, the device should have a sufficient computational capability and system memory to enable basic computational operations. In particular, as illustrated by FIG. 6, the computational capability is generally illustrated by one or more processing unit(s) 610, and may also include one or more GPUs 615, either or both in communication with system memory 620. Note that the processing unit(s) 610 of the general computing device may be specialized microprocessors, such as a DSP, a VLIW, or other micro-controller, or can be conventional CPUs having one or more processing cores, including specialized GPU-based cores in a multi-core CPU.
  • In addition, the simplified computing device of FIG. 6 may also include other components, such as, for example, a communications interface 630. The simplified computing device of FIG. 6 may also include one or more conventional computer input devices 640 (e.g., pointing devices, keyboards, audio input devices, video input devices, haptic input devices, devices for receiving wired or wireless data transmissions, etc.). The simplified computing device of FIG. 6 may also include other optional components, such as, for example, one or more conventional computer output devices 650 (e.g., display device(s) 655, audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, etc.). Note that typical communications interfaces 630, input devices 640, output devices 650, and storage devices 660 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
  • The simplified computing device of FIG. 6 may also include a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 600 via storage devices 660 and includes both volatile and nonvolatile media that is either removable 670 and/or non-removable 680, for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as DVD's, CD's, floppy disks, tape drives, hard drives, optical drives, solid state memory devices, RAM, ROM, EEPROM, flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
  • Storage of information such as computer-readable or computer-executable instructions, data structures, program modules, etc., can also be accomplished by using any of a variety of the aforementioned communication media to encode one or more modulated data signals or carrier waves, or other transport mechanisms or communications protocols, and includes any wired or wireless information delivery mechanism. Note that the terms “modulated data signal” or “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, RF, infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves. Combinations of any of the above should also be included within the scope of communication media.
  • Further, software, programs, and/or computer program products embodying some or all of the various embodiments of the mobile image viewing technique described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
  • Finally, the mobile image viewing technique described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The embodiments described herein may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks. In a distributed computing environment, program modules may be located in both local and remote computer storage media including media storage devices. Still further, the aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.
  • It should also be noted that any or all of the aforementioned alternate embodiments described herein may be used in any combination desired to form additional hybrid embodiments. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. The specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (20)

1. A computer-implemented process for viewing large scale imagery on a mobile device, comprising:
tracking a mobile device's orientation and position;
using a camera and viewer tracker on the mobile device to track a user's face looking at a screen on the mobile device;
computing a viewing angle and a viewing distance between the user and the screen on the mobile device by using the tracked orientation and position of the mobile device, and the tracked position of the user's face relative to the screen of the mobile device; and
computing image transformations of an imagery rendered on the screen of the mobile device using the computed viewing angle and viewing distance to allow the user to control viewing of the rendered imagery.
2. The computer-implemented process of claim 1, further comprising the user changing the viewpoint of imagery rendered on the screen by moving the mobile device relative to the user's face.
3. The computer-implemented process of claim 1, further comprising zooming in or out of the imagery rendered on the screen by changing the distance of the mobile device relative to the user's face.
4. The computer-implemented process of claim 3 wherein the distance of the mobile device relative to the user's face is approximated using face width.
5. The computer-implemented process of claim 1, further comprising panning around the imagery rendered on the screen by changing the position of the mobile device laterally in relation to the user's face.
6. The computer-implemented process of claim 1, further comprising mapping the angular offset of the user's face from the normal to the screen of the mobile device and the change in rotation about the vertical axis tangent to the screen to a position in the imagery rendered in computing the image transformations.
7. The computer-implemented process of claim 6, further comprising fusing the tracked mobile device's orientation and position and tracked user's face to map the angular offset of the user's face from the normal to the display of the mobile device and the change in rotation about the vertical axis tangent to the display of the mobile device to a position in the rendered imagery.
8. The computer-implemented process of claim 1 wherein the mobile device's orientation and position is determined by a gyroscope on the mobile device.
9. The computer-implemented process of claim 8 wherein the viewer tracker is used to correct for drift of the gyroscope.
10. A computer-implemented process for viewing large scale imagery on a mobile device, comprising:
tracking a mobile device's orientation and position with a gyroscope on the mobile device;
using a front-facing camera and viewer tracker on the mobile device to track a user's face looking at a screen on the mobile device;
using the mobile device's orientation and position from the gyroscope and the position of the user's face obtained by the viewer tracker to determine a combined position and rate control for viewing imagery on the screen of the mobile device; and
using the combined position and rate control to compute image transformations of the imagery rendered on the screen of the mobile device to allow the user to display different viewpoints of the rendered imagery.
11. The computer-implemented process of claim 10, wherein the imagery is a 360 degree panorama and wherein the user can pan to the left and to the right in the rendered imagery by changing the viewing angle between the user and the screen of the mobile device, and can zoom into the imagery by changing the distance between the user and the screen of the mobile device.
12. The computer-implemented process of claim 10, wherein the imagery is a set of parallax images and wherein the combined position and rate control is used to determine a relative offset of a virtual camera.
13. The computer-implemented process of claim 10, wherein the imagery comprises a series of 360 degree panoramas of the same scene taken at fixed intervals, and a set of long perspective strip panoramas created by clipping out and stitching parts of the series of 360 degree panoramas.
14. The computer-implemented process of claim 13, wherein the user can view left and right in a 360 degree panorama of the series by changing the viewing angle between the user's face and the screen of the mobile device and can zoom into a different 360 degree panorama of the series by changing the viewing distance between the user's face and the screen of the mobile device.
15. A system for viewing large scale imagery, comprising:
a general purpose computing device;
a computer program comprising program modules executable by the general purpose computing device, wherein the computing device is directed by the program modules of the computer program to,
track a mobile device's orientation and position;
use a camera and viewer tracker on the mobile device to track a user's face looking at a screen on the mobile device;
use the mobile device's tracked orientation and position, and the position of the user's face obtained by the viewer tracker, to determine a combined position and rate control for viewing imagery on the screen of the mobile device,
and use the combined position and rate control to compute image transformations of the imagery rendered on the screen of the mobile device to allow the user to display different viewpoints of the rendered imagery.
16. The system of claim 15, wherein the module to determine the combined position and rate control for viewing imagery on the screen of the mobile device, further comprises a sub-module to:
compute a viewing angle and a viewing distance between the user and the screen on the mobile device by using the tracked orientation and position of the mobile device, and the tracked position of the user's face relative to the screen of the mobile device.
17. The system of claim 16 wherein the user can change the viewpoint of the imagery rendered on the screen of the mobile device by changing the viewing angle of the mobile device relative to the user's face.
18. The system of claim 16, wherein the user's face can be outside of the field of view of the camera and wherein a gyroscope on the mobile device can be used to estimate the location of the face.
19. The system of claim 15, wherein the viewer tracker tracks the viewer's face by:
locating the viewer's face relative to the camera using a face finder which returns a rectangle for the size and location of the face;
recording a face template from the rectangle along with position and size;
matching the face template at varying positions and scales around the current position and scale at each subsequent frame recorded by the camera to find the face in subsequent frames;
if the face is lost, reacquiring the face with the face finder.
20. The system of claim 15, wherein the large scale imagery is one of a group comprising:
high resolution imagery;
wide field of view imagery; and
a multi-perspective panorama.
US13/159,010 2011-06-13 2011-06-13 Natural user interfaces for mobile image viewing Abandoned US20120314899A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US13/159,010 US20120314899A1 (en) 2011-06-13 2011-06-13 Natural user interfaces for mobile image viewing
US14/487,240 US10275020B2 (en) 2011-06-13 2014-09-16 Natural user interfaces for mobile image viewing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/159,010 US20120314899A1 (en) 2011-06-13 2011-06-13 Natural user interfaces for mobile image viewing

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/487,240 Division US10275020B2 (en) 2011-06-13 2014-09-16 Natural user interfaces for mobile image viewing

Publications (1)

Publication Number Publication Date
US20120314899A1 true US20120314899A1 (en) 2012-12-13

Family

ID=47293234

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/159,010 Abandoned US20120314899A1 (en) 2011-06-13 2011-06-13 Natural user interfaces for mobile image viewing
US14/487,240 Active 2033-07-27 US10275020B2 (en) 2011-06-13 2014-09-16 Natural user interfaces for mobile image viewing

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/487,240 Active 2033-07-27 US10275020B2 (en) 2011-06-13 2014-09-16 Natural user interfaces for mobile image viewing

Country Status (1)

Country Link
US (2) US20120314899A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2988140A1 (en) * 2015-06-17 2016-12-22 Wal-Mart Stores, Inc. Systems and methods for selecting media for meal plans
US10126813B2 (en) 2015-09-21 2018-11-13 Microsoft Technology Licensing, Llc Omni-directional camera
CN105872390A (en) * 2016-06-15 2016-08-17 南京快脚兽软件科技有限公司 Equipment for efficiently checking peripheral environment images
US10811136B2 (en) 2017-06-27 2020-10-20 Stryker Corporation Access systems for use with patient support apparatuses
US11382812B2 (en) 2017-06-27 2022-07-12 Stryker Corporation Patient support systems and methods for assisting caregivers with patient care
US11484451B1 (en) 2017-06-27 2022-11-01 Stryker Corporation Patient support apparatus user interfaces
US11096850B2 (en) 2017-06-27 2021-08-24 Stryker Corporation Patient support apparatus control systems
US11810667B2 (en) 2017-06-27 2023-11-07 Stryker Corporation Patient support systems and methods for assisting caregivers with patient care
US11202729B2 (en) 2017-06-27 2021-12-21 Stryker Corporation Patient support apparatus user interfaces
US11337872B2 (en) 2017-06-27 2022-05-24 Stryker Corporation Patient support systems and methods for assisting caregivers with patient care
CN109842793A (en) * 2017-09-22 2019-06-04 深圳超多维科技有限公司 A kind of naked eye 3D display method, apparatus and terminal
US11094095B2 (en) * 2017-11-07 2021-08-17 Disney Enterprises, Inc. Focal length compensated augmented reality
KR20200050042A (en) * 2018-10-31 2020-05-11 엔에이치엔 주식회사 A method for daptively magnifying graphic user interfaces and a mobile device for performing the same
WO2020164480A1 (en) * 2019-02-11 2020-08-20 Beijing Bytedance Network Technology Co., Ltd. Condition dependent video block partition
US11869213B2 (en) * 2020-01-17 2024-01-09 Samsung Electronics Co., Ltd. Electronic device for analyzing skin image and method for controlling the same

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6029833A (en) 1983-07-28 1985-02-15 Canon Inc Image display device
US6466198B1 (en) * 1999-11-05 2002-10-15 Innoventions, Inc. View navigation and magnification of a hand-held device with a display
US7903084B2 (en) 2004-03-23 2011-03-08 Fujitsu Limited Selective engagement of motion input modes
US7301528B2 (en) 2004-03-23 2007-11-27 Fujitsu Limited Distinguishing tilt and translation motion components in handheld devices
KR100641182B1 (en) 2004-12-30 2006-11-02 엘지전자 주식회사 Apparatus and method for moving virtual screen in a mobile terminal
US8384718B2 (en) * 2008-01-10 2013-02-26 Sony Corporation System and method for navigating a 3D graphical user interface
JP2010086336A (en) * 2008-09-30 2010-04-15 Fujitsu Ltd Image control apparatus, image control program, and image control method
US8866809B2 (en) * 2008-09-30 2014-10-21 Apple Inc. System and method for rendering dynamic three-dimensional appearing imagery on a two-dimensional user interface
KR101699922B1 (en) * 2010-08-12 2017-01-25 삼성전자주식회사 Display system and method using hybrid user tracking sensor
US9274597B1 (en) * 2011-12-20 2016-03-01 Amazon Technologies, Inc. Tracking head position for rendering content

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060061551A1 (en) * 1999-02-12 2006-03-23 Vega Vista, Inc. Motion detection and tracking system to control navigation and display of portable displays including on-chip gesture detection
US20050059488A1 (en) * 2003-09-15 2005-03-17 Sony Computer Entertainment Inc. Method and apparatus for adjusting a view of a scene being displayed according to tracked head motion
US20100079371A1 (en) * 2008-05-12 2010-04-01 Takashi Kawakami Terminal apparatus, display control method, and display control program
US20090325607A1 (en) * 2008-05-28 2009-12-31 Conway David P Motion-controlled views on mobile computing devices
US20100064259A1 (en) * 2008-09-11 2010-03-11 Lg Electronics Inc. Controlling method of three-dimensional user interface switchover and mobile terminal using the same
US20100144436A1 (en) * 2008-12-05 2010-06-10 Sony Computer Entertainment Inc. Control Device for Communicating Visual Information
US20110102637A1 (en) * 2009-11-03 2011-05-05 Sony Ericsson Mobile Communications Ab Travel videos
US20110115883A1 (en) * 2009-11-16 2011-05-19 Marcus Kellerman Method And System For Adaptive Viewport For A Mobile Device Based On Viewing Angle
US20110216060A1 (en) * 2010-03-05 2011-09-08 Sony Computer Entertainment America Llc Maintaining Multiple Views on a Shared Stable Virtual Space
US20110248987A1 (en) * 2010-04-08 2011-10-13 Disney Enterprises, Inc. Interactive three dimensional displays on handheld devices
US8581905B2 (en) * 2010-04-08 2013-11-12 Disney Enterprises, Inc. Interactive three dimensional displays on handheld devices
US20130091462A1 (en) * 2011-10-06 2013-04-11 Amazon Technologies, Inc. Multi-dimensional interface
US20130191787A1 (en) * 2012-01-06 2013-07-25 Tourwrist, Inc. Systems and Methods for Acceleration-Based Motion Control of Virtual Tour Applications
US20130197681A1 (en) * 2012-01-26 2013-08-01 Motorola Mobility, Inc. Portable electronic device and method for controlling operation thereof taking into account which limb possesses the electronic device
US8630458B2 (en) * 2012-03-21 2014-01-14 Google Inc. Using camera input to determine axis of rotation and navigation

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Agarwala et al., "Photographing long scenes with multi-viewpoint panoramas," July 2006, ACM Transactions on Graphics Vol. 25 Issue 3, Pages 853-861 *
Bleser, Gabriele, and Didier Stricker. "Advanced tracking through efficient image processing and visual–inertial sensor fusion." Computers & Graphics 33.1 (2009): 59-72. *
You, S.; Neumann, U., "Fusion of vision and gyro tracking for robust augmented reality registration," Virtual Reality, 2001. Proceedings. IEEE, pp.71,78, 17-17 March 2001. *
Yu-Jin Hong; Jae-In Hwang; Sun-Bum Youn; Sang Chul Ahn; Hyung-Gon Kim; Heedong Ko, "Interactive Panorama Video Viewer with Head Tracking Algorithms," 2010 3rd International Conference on Human-Centric Computing (HumanCom), pp.1,4, 11-13 Aug. 2010. *

Cited By (88)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8836777B2 (en) 2011-02-25 2014-09-16 DigitalOptics Corporation Europe Limited Automatic detection of vertical gaze using an embedded imaging device
US20130057573A1 (en) * 2011-09-02 2013-03-07 DigitalOptics Corporation Europe Limited Smart Display with Dynamic Face-Based User Preference Settings
US20130093667A1 (en) * 2011-10-12 2013-04-18 Research In Motion Limited Methods and devices for managing views displayed on an electronic device
US20130110482A1 (en) * 2011-11-02 2013-05-02 X-Rite Europe Gmbh Apparatus, Systems and Methods for Simulating A Material
US8780108B2 (en) * 2011-11-02 2014-07-15 X-Rite Switzerland GmbH Apparatus, systems and methods for simulating a material
US20130154971A1 (en) * 2011-12-15 2013-06-20 Samsung Electronics Co., Ltd. Display apparatus and method of changing screen mode using the same
US20130155096A1 (en) * 2011-12-15 2013-06-20 Christopher J. Legair-Bradley Monitor orientation awareness
US9274597B1 (en) * 2011-12-20 2016-03-01 Amazon Technologies, Inc. Tracking head position for rendering content
EP2829150A4 (en) * 2012-03-21 2016-01-13 Google Inc Using camera input to determine axis of rotation and navigation
US9367951B1 (en) * 2012-03-21 2016-06-14 Amazon Technologies, Inc. Creating realistic three-dimensional effects
US11924364B2 (en) 2012-06-15 2024-03-05 Muzik Inc. Interactive networked apparatus
US10567564B2 (en) 2012-06-15 2020-02-18 Muzik, Inc. Interactive networked apparatus
US9400575B1 (en) 2012-06-20 2016-07-26 Amazon Technologies, Inc. Finger detection for element selection
US9213436B2 (en) * 2012-06-20 2015-12-15 Amazon Technologies, Inc. Fingertip location for gesture input
US20130342459A1 (en) * 2012-06-20 2013-12-26 Amazon Technologies, Inc. Fingertip location for gesture input
US9690334B2 (en) * 2012-08-22 2017-06-27 Intel Corporation Adaptive visual output based on change in distance of a mobile device to a user
US20140057675A1 (en) * 2012-08-22 2014-02-27 Don G. Meyers Adaptive visual output based on change in distance of a mobile device to a user
US20140152534A1 (en) * 2012-12-03 2014-06-05 Facebook, Inc. Selectively Steer Light from Display
US10235592B1 (en) 2013-03-15 2019-03-19 Jeffrey M. Sieracki Method and system for parallactically synced acquisition of images about common target
US10853684B1 (en) 2013-03-15 2020-12-01 Jeffrey M. Sieracki Method and system for parallactically synced acquisition of images about common target
US9830525B1 (en) 2013-03-15 2017-11-28 Jeffrey M. Sieracki Method and system for parallactically synced acquisition of images about common target
US20140298246A1 (en) * 2013-03-29 2014-10-02 Lenovo (Singapore) Pte, Ltd. Automatic display partitioning based on user number and orientation
US9658688B2 (en) 2013-10-15 2017-05-23 Microsoft Technology Licensing, Llc Automatic view adjustment
US9865033B1 (en) * 2014-01-17 2018-01-09 Amazon Technologies, Inc. Motion-based image views
US9294670B2 (en) * 2014-01-24 2016-03-22 Amazon Technologies, Inc. Lenticular image capture
US20170132795A1 (en) * 2014-02-26 2017-05-11 Apeiros, Llc Mobile, wearable, automated target tracking system
US10592078B2 (en) * 2014-03-14 2020-03-17 Volkswagen Ag Method and device for a graphical user interface in a vehicle with a display that adapts to the relative position and operating intention of the user
US20170083216A1 (en) * 2014-03-14 2017-03-23 Volkswagen Aktiengesellschaft Method and a device for providing a graphical user interface in a vehicle
US9581431B1 (en) 2014-03-18 2017-02-28 Jeffrey M. Sieracki Method and system for parallactically synced acquisition of images about common target
EP2933605A1 (en) * 2014-04-17 2015-10-21 Nokia Technologies OY A device orientation correction method for panorama images
WO2015175006A1 (en) * 2014-05-13 2015-11-19 Citrix Systems, Inc. Navigation of virtual desktop content on client devices based on movement of these client devices
US9486699B2 (en) * 2014-06-06 2016-11-08 Nintendo Co., Ltd. Information processing system, non-transitory computer-readable storage medium having stored therein information processing program, information processing apparatus, and information processing method
WO2015195445A1 (en) * 2014-06-17 2015-12-23 Amazon Technologies, Inc. Motion control for managing content
US9910505B2 (en) 2014-06-17 2018-03-06 Amazon Technologies, Inc. Motion control for managing content
US10191629B2 (en) * 2014-07-25 2019-01-29 Andrew W Donoho Systems and methods for processing of visual content using affordances
US10540773B2 (en) 2014-10-31 2020-01-21 Fyusion, Inc. System and method for infinite smoothing of image sequences
US10430995B2 (en) 2014-10-31 2019-10-01 Fyusion, Inc. System and method for infinite synthetic image generation from multi-directional structured image array
US10846913B2 (en) 2014-10-31 2020-11-24 Fyusion, Inc. System and method for infinite synthetic image generation from multi-directional structured image array
EP3035159A1 (en) * 2014-11-20 2016-06-22 Thomson Licensing Device and method for processing visual data, and related computer program product
EP3023863A1 (en) * 2014-11-20 2016-05-25 Thomson Licensing Device and method for processing visual data, and related computer program product
US20160148434A1 (en) * 2014-11-20 2016-05-26 Thomson Licensing Device and method for processing visual data, and related computer program product
US10099134B1 (en) 2014-12-16 2018-10-16 Kabam, Inc. System and method to better engage passive users of a virtual space by providing panoramic point of views in real time
US9911395B1 (en) * 2014-12-23 2018-03-06 Amazon Technologies, Inc. Glare correction via pixel processing
US20170160875A1 (en) * 2014-12-26 2017-06-08 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US11928286B2 (en) 2014-12-26 2024-03-12 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US9864511B2 (en) * 2014-12-26 2018-01-09 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US11182021B2 (en) 2014-12-26 2021-11-23 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US20160187989A1 (en) * 2014-12-26 2016-06-30 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US10423284B2 (en) * 2014-12-26 2019-09-24 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US20180239494A1 (en) * 2014-12-26 2018-08-23 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US20190354236A1 (en) * 2014-12-26 2019-11-21 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US10845922B2 (en) * 2014-12-26 2020-11-24 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US11675457B2 (en) 2014-12-26 2023-06-13 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US20160378332A1 (en) * 2014-12-26 2016-12-29 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US9454235B2 (en) * 2014-12-26 2016-09-27 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US10013115B2 (en) * 2014-12-26 2018-07-03 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US10620825B2 (en) * 2015-06-25 2020-04-14 Xiaomi Inc. Method and apparatus for controlling display and mobile terminal
US11226736B2 (en) 2015-06-25 2022-01-18 Xiaomi Inc. Method and apparatus for controlling display and mobile terminal
US10852902B2 (en) 2015-07-15 2020-12-01 Fyusion, Inc. Automatic tagging of objects on a multi-view interactive digital media representation of a dynamic entity
US11435869B2 (en) 2015-07-15 2022-09-06 Fyusion, Inc. Virtual reality environment based manipulation of multi-layered multi-view interactive digital media representations
US11956412B2 (en) 2015-07-15 2024-04-09 Fyusion, Inc. Drone based capture of multi-view interactive digital media
US11636637B2 (en) 2015-07-15 2023-04-25 Fyusion, Inc. Artificially rendering images using viewpoint interpolation and extrapolation
US10750161B2 (en) * 2015-07-15 2020-08-18 Fyusion, Inc. Multi-view interactive digital media representation lock screen
US20170359570A1 (en) * 2015-07-15 2017-12-14 Fyusion, Inc. Multi-View Interactive Digital Media Representation Lock Screen
US11632533B2 (en) 2015-07-15 2023-04-18 Fyusion, Inc. System and method for generating combined embedded multi-view interactive digital media representations
US11195314B2 (en) 2015-07-15 2021-12-07 Fyusion, Inc. Artificially rendering images using viewpoint interpolation and extrapolation
US11776199B2 (en) 2015-07-15 2023-10-03 Fyusion, Inc. Virtual reality environment based manipulation of multi-layered multi-view interactive digital media representations
US11783864B2 (en) 2015-09-22 2023-10-10 Fyusion, Inc. Integration of audio into a multi-view interactive digital media representation
US10116874B2 (en) 2016-06-30 2018-10-30 Microsoft Technology Licensing, Llc Adaptive camera field-of-view
KR20190045943A (en) * 2016-09-30 2019-05-03 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Method and apparatus for creating navigation route
KR102161390B1 (en) 2016-09-30 2020-09-29 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Navigation route creation method and device
US11202017B2 (en) 2016-10-06 2021-12-14 Fyusion, Inc. Live style transfer on a mobile device
US11366318B2 (en) 2016-11-16 2022-06-21 Samsung Electronics Co., Ltd. Electronic device and control method thereof
WO2018093075A1 (en) * 2016-11-16 2018-05-24 삼성전자 주식회사 Electronic device and control method thereof
US11960533B2 (en) 2017-01-18 2024-04-16 Fyusion, Inc. Visual search using multi-view interactive digital media representations
US11409784B2 (en) 2017-04-26 2022-08-09 The Nielsen Company (Us), Llc Methods and apparatus to detect unconfined view media
US11714847B2 (en) 2017-04-26 2023-08-01 The Nielsen Company (Us), Llc Methods and apparatus to detect unconfined view media
US10691741B2 (en) 2017-04-26 2020-06-23 The Nielsen Company (Us), Llc Methods and apparatus to detect unconfined view media
US11876948B2 (en) 2017-05-22 2024-01-16 Fyusion, Inc. Snapshots at predefined intervals or angles
US11776229B2 (en) 2017-06-26 2023-10-03 Fyusion, Inc. Modification of multi-view interactive digital media representation
US10540809B2 (en) 2017-06-30 2020-01-21 Bobby Gene Burrough Methods and apparatus for tracking a light source in an environment surrounding a device
US11112604B2 (en) * 2017-09-25 2021-09-07 Continental Automotive Gmbh Head-up display
ES2711858A1 (en) * 2017-11-03 2019-05-07 The Mad Pixel Factory S L Photo camera support device for digitizing works of art (Machine-translation by Google Translate, not legally binding)
US10841156B2 (en) 2017-12-11 2020-11-17 Ati Technologies Ulc Mobile application for monitoring and configuring second device
US11488380B2 (en) 2018-04-26 2022-11-01 Fyusion, Inc. Method and apparatus for 3-D auto tagging
US11967162B2 (en) 2018-04-26 2024-04-23 Fyusion, Inc. Method and apparatus for 3-D auto tagging
US20210405383A1 (en) * 2019-03-19 2021-12-30 Canon Kabushiki Kaisha Electronic device, method for controlling electronic device, and non-transitory computer readable storage medium
US11740477B2 (en) * 2019-03-19 2023-08-29 Canon Kabushiki Kaisha Electronic device, method for controlling electronic device, and non-transitory computer readable storage medium

Also Published As

Publication number Publication date
US20150002393A1 (en) 2015-01-01
US10275020B2 (en) 2019-04-30

Similar Documents

Publication Publication Date Title
US10275020B2 (en) Natural user interfaces for mobile image viewing
US11054964B2 (en) Panning in a three dimensional environment on a mobile device
AU2012232976B2 (en) 3D Position tracking for panoramic imagery navigation
US8610741B2 (en) Rendering aligned perspective images
US8310537B2 (en) Detecting ego-motion on a mobile device displaying three-dimensional content
Kopf et al. Street slide: browsing street level imagery
CN108283018B (en) Electronic device and method for gesture recognition of electronic device
US20160358383A1 (en) Systems and methods for augmented reality-based remote collaboration
US20110221664A1 (en) View navigation on mobile device
Reitmayr et al. Simultaneous localization and mapping for augmented reality
US20140248950A1 (en) System and method of interaction for mobile devices
US20100171691A1 (en) Viewing images with tilt control on a hand-held device
US20150040073A1 (en) Zoom, Rotate, and Translate or Pan In A Single Gesture
CN103959231A (en) Multi-dimensional interface
Joshi et al. Looking at you: fused gyro and face tracking for viewing large imagery on mobile devices
WO2022028129A1 (en) Pose determination method and apparatus, and electronic device and storage medium
US10572127B2 (en) Display control of an image on a display screen
US20140267600A1 (en) Synth packet for interactive view navigation of a scene
Kim et al. Oddeyecam: A sensing technique for body-centric peephole interaction using wfov rgb and nfov depth cameras
WO2023140990A1 (en) Visual inertial odometry with machine learning depth
Mulloni et al. Enhancing handheld navigation systems with augmented reality
Kaur et al. Computer vision and sensor fusion for efficient hybrid tracking in augmented reality systems
Xu et al. Visual registration for unprepared augmented reality environments
Karlekar et al. Mixed reality on mobile devices
US20220335638A1 (en) Depth estimation using a neural network

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:COHEN, MICHAEL F.;JOSHI, NEEL SURESH;REEL/FRAME:026455/0811

Effective date: 20110610

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0001

Effective date: 20141014

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION