Methods and apparatus for streaming content corresponding to a 360 degree field of view are described. The methods and apparatus of the present invention are well suited for use with 3D immersion systems and/or head-mounted displays which allow a user to turn his or her head and see a corresponding scene portion. The methods and apparatus can support real or near real time streaming of 3D image content corresponding to a 360 degree field of view.
H04N 13/189 - Recording image signals; Reproducing recorded image signals
G11B 27/30 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/366 - Image reproducers using viewer tracking
H04N 21/218 - Source of audio or video content, e.g. local disk arrays
H04N 21/6587 - Control parameters, e.g. trick play commands or viewpoint selection
H04N 21/2662 - Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
A background image is also generated, e.g., by filling portions of a captured image where a foreground object was extracted and communicated to the playback device. Foreground objects are identified and point cloud representations of the foreground objects are generated and communicated to a playback device so that they can be used in generating images including the background which is communicated separately. In the case of a point cloud representation a number of points in an environment, e.g., 3D space, are communicated to the playback device along with color information. Thus in some embodiments a foreground object is represented as a set of points with corresponding color information on a per point basis. Foreground object information is communicated and processed in some embodiments at a different rate, e.g., a faster rate, than the background textures. The playback device renders images which are sent to the display by first rendering a background layer using the communicated background information, e.g., background texture(s), UV map and environmental geometry, e.g., mesh, to which the background textures are applied.
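The layered rendering the abstract describes — background first, then per-point-colored foreground points — can be sketched as follows. This is a minimal illustration, not the patent's method: the `Point` structure, the assumption that points are already projected to pixel coordinates, and the nearest-point splatting rule are all simplifying assumptions.

```python
# Minimal sketch of layered rendering with a point-cloud foreground.
# Assumes points are already projected to pixel coordinates; all names
# (Point, render_frame) are illustrative, not taken from the patent.
from dataclasses import dataclass

@dataclass
class Point:
    x: int          # pixel column after projection
    y: int          # pixel row after projection
    depth: float    # distance from the viewer
    color: str      # per-point color, as the abstract describes

def render_frame(background, points):
    """Render the background layer first, then splat foreground points.

    `background` is a row-major grid of color values (the decoded
    background texture already mapped through the UV map / mesh).
    When several points project to the same pixel, the nearer one wins.
    """
    frame = [row[:] for row in background]   # background layer first
    nearest = {}                             # (x, y) -> closest depth so far
    for p in points:
        key = (p.x, p.y)
        if key not in nearest or p.depth < nearest[key]:
            nearest[key] = p.depth
            frame[p.y][p.x] = p.color        # foreground drawn over background
    return frame

bg = [["sky"] * 4 for _ in range(3)]
fg = [Point(1, 1, 2.0, "red"), Point(1, 1, 1.0, "blue"), Point(3, 2, 5.0, "green")]
frame = render_frame(bg, fg)
```

Because the foreground is just a point set with per-point colors, it can be updated at a faster rate than the background texture, as the abstract notes.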
Methods and apparatus for capturing, communicating and using image data to support virtual reality experiences are described. Images, e.g., frames, are captured at a high resolution but a lower frame rate than is used for playback. Interpolation is applied to captured frames to generate interpolated frames. Captured frames, along with interpolated frame information, are communicated to the playback device. The combination of captured and interpolated frames corresponds to a second frame playback rate which is higher than the image capture rate. Cameras operate at a high image resolution but at a slower frame rate than the same cameras could achieve at a lower resolution. Interpolation is performed prior to delivery to the user device, with segments to be interpolated being selected based on motion and/or lens FOV information. A relatively small amount of interpolated frame data is communicated compared to captured frame data for efficient bandwidth use.
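The capture-then-interpolate idea can be sketched as follows. The motion measure, the midpoint blend, and the "repeat when static" rule are stand-ins for the patent's unspecified selection and interpolation methods; frames are modeled as flat lists of pixel intensities for brevity.

```python
# Hedged sketch: frames captured at a low rate are doubled to a higher
# playback rate by inserting blended frames between motion-heavy pairs.
def motion(a, b):
    # Mean absolute pixel difference as a crude motion measure (assumption).
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def interpolate_stream(captured, motion_threshold=10.0):
    """Return a playback sequence at twice the capture frame rate.

    Between each captured pair, insert either a blended (interpolated)
    frame when motion is significant, or a repeat of the earlier frame
    when the scene is static -- so little interpolated data is needed.
    """
    out = []
    for a, b in zip(captured, captured[1:]):
        out.append(a)
        if motion(a, b) > motion_threshold:
            out.append([(x + y) // 2 for x, y in zip(a, b)])  # midpoint blend
        else:
            out.append(a[:])                                  # static: repeat
    out.append(captured[-1])
    return out

captured = [[0, 0, 0], [40, 40, 40], [42, 42, 42]]
playback = interpolate_stream(captured)   # 3 captured frames -> 5 playback frames
```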
H04N 23/951 - Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
H04N 19/587 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
G06T 19/00 - Manipulating 3D models or images for computer graphics
4.
Methods and apparatus for processing content based on viewing information and/or communicating content
Methods and apparatus for collecting user feedback information from viewers of content are described. Feedback information is received from viewers of content. The feedback indicates, based on head tracking information in some embodiments, where users are looking in a simulated environment during different times of a content presentation, e.g., different frame times. The feedback information is used to prioritize different portions of an environment represented by the captured image content. Resolution allocation is performed based on the feedback information and the content is re-encoded based on the resolution allocation. The resolution allocation may, and normally does, change as the priorities of different portions of the environment change.
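The feedback-driven prioritization step can be sketched as a ranking of environment regions by gaze frequency. The region names and the two-tier high/low allocation are illustrative assumptions; a real system would use finer tiers and frame-time-dependent priorities.

```python
# Sketch: rank environment regions by how often head tracking says viewers
# looked at them, then allocate resolution accordingly (assumed scheme).
from collections import Counter

def allocate_resolution(gaze_samples, high_budget=1):
    """Give the `high_budget` most-viewed regions full resolution.

    `gaze_samples` is a list of region ids derived from viewers'
    head-tracking feedback during playback.
    """
    priority = Counter(gaze_samples)
    ranked = [region for region, _ in priority.most_common()]
    return {region: ("high" if i < high_budget else "low")
            for i, region in enumerate(ranked)}

samples = ["front", "front", "left", "front", "rear"]
allocation = allocate_resolution(samples)
```

As priorities shift over the presentation, the allocation is recomputed and the content re-encoded, matching the abstract's description.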
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed or the storage space available from the internal hard disk
H04N 21/4728 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content for selecting a ROI [Region Of Interest], e.g. for requesting a higher resolution version of a selected region
H04N 21/6587 - Control parameters, e.g. trick play commands or viewpoint selection
H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
G06T 3/40 - Scaling of a whole image or part thereof
H04N 19/37 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 23/698 - Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
5.
Methods and apparatus rendering images using point clouds representing one or more objects
A background image is also generated, e.g., by filling portions of a captured image where a foreground object was extracted and communicated to the playback device. Foreground objects are identified and point cloud representations of the foreground objects are generated and communicated to a playback device so that they can be used in generating images including the background which is communicated separately. In the case of a point cloud representation a number of points in an environment, e.g., 3D space, are communicated to the playback device along with color information. Thus in some embodiments a foreground object is represented as a set of points with corresponding color information on a per point basis. Foreground object information is communicated and processed in some embodiments at a different rate, e.g., a faster rate, than the background textures. The playback device renders images which are sent to the display by first rendering a background layer using the communicated background information, e.g., background texture(s), UV map and environmental geometry, e.g., mesh, to which the background textures are applied.
Content delivery and playback methods and apparatus are described. The methods and apparatus are well suited for delivery and playback of content corresponding to a 360 degree environment and can be used to support streaming and/or real time delivery of 3D content corresponding to an event, e.g., while the event is ongoing or after the event is over. Portions of the environment are captured by cameras located at different positions. The content captured from different locations is encoded and made available for delivery. A playback device selects the content to be received based on a user's head position.
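Head-position-based stream selection can be sketched as picking the camera whose viewing direction is angularly closest to the user's head yaw. The stream names, yaw angles, and nearest-camera rule are illustrative assumptions.

```python
# Sketch: choose the content stream whose camera faces closest to the
# user's head orientation, measured on a 360-degree circle (assumed rule).
def select_stream(head_yaw_deg, camera_yaws):
    """Return the stream whose camera yaw is nearest the head yaw."""
    def angular_distance(a, b):
        d = abs(a - b) % 360
        return min(d, 360 - d)   # wrap around the circle
    return min(camera_yaws,
               key=lambda s: angular_distance(head_yaw_deg, camera_yaws[s]))

cams = {"front": 0, "right": 90, "rear": 180, "left": 270}
choice = select_stream(350, cams)   # 10 degrees from "front", closest match
```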
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
H04N 21/482 - End-user interface for program selection
H04N 13/117 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/371 - Image reproducers using viewer tracking for tracking rotational head movements around the vertical axis
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/134 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
H04N 19/37 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
H04N 19/146 - Data rate or code amount at the encoder output
H04N 19/39 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability involving multiple description coding [MDC], i.e. with separate layers being structured as independently decodable descriptions of input picture data
H04L 65/61 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
H04N 19/587 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
7.
Methods and apparatus for receiving and/or using reduced resolution images
Methods and apparatus for using selective resolution reduction on images to be transmitted and/or used by a playback device are described. Prior to transmission one or more images of an environment are captured. Based on image content, motion detection and/or user input a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating a UV map corresponding to the selected resolution allocation that should be used by the playback device for rendering the communicated image. By changing the resolution allocation used and which UV map is used by the playback device different resolution allocations can be made with respect to different portions of the environment while allowing the number of pixels in transmitted images to remain constant. The playback device renders the individual images with the UV map corresponding to the resolution allocation used to generate the individual images.
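The key mechanism — a constant transmitted pixel count with a per-allocation UV map telling the playback device how to unpack it — can be sketched in one dimension. The "UV maps" here are simply lists of kept sample indices, and both the allocations and the nearest-sample reconstruction are illustrative assumptions, not the patent's format.

```python
# 1-D stand-in for UV maps: each "map" lists which source samples survive
# in the packed image. Both allocations keep exactly 4 pixels (assumption).
UV_MAPS = {
    "uniform":       [0, 2, 4, 6],   # even samples across the whole row
    "left_priority": [0, 1, 2, 6],   # dense on the left, sparse on the right
}

def pack_row(row, allocation):
    """Server side: keep only the samples the chosen UV map references.
    Transmitted pixel count stays constant regardless of allocation."""
    return [row[i] for i in UV_MAPS[allocation]]

def render_row(packed, allocation, width):
    """Playback side: rebuild the row using the matching UV map, taking
    each output sample from the nearest kept source index."""
    kept = UV_MAPS[allocation]
    out = []
    for x in range(width):
        j = min(range(len(kept)), key=lambda k: abs(kept[k] - x))
        out.append(packed[j])
    return out

row = [10, 11, 12, 13, 14, 15, 16, 17]
packed = pack_row(row, "left_priority")          # 4 pixels, same as "uniform"
restored = render_row(packed, "left_priority", 8)
```

The playback device must use the UV map matching the allocation that produced each image, which is why the abstract has the server signal which map applies.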
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
Methods and apparatus for supporting the capture of images of surfaces of an environment visible from a default viewing position and capturing images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are packed into one or more frames and communicated to a playback device for use as textures which can be applied to a model of the environment where the images were captured. An environmental model includes a model of surfaces which are occluded from view from a default viewing position but which may be viewed if the user shifts the user's viewing location. Occluded image content can be incorporated directly into a frame that also includes non-occluded image data or sent in frames of a separate, e.g., auxiliary, content stream that is multiplexed with the main content stream which communicates image data corresponding to non-occluded environmental portions.
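Packing occluded and non-occluded portions into a single frame can be sketched as a simple atlas: the main view first, occluded patches appended below, with a manifest telling the playback device which rows hold which texture. The atlas layout and manifest fields are illustrative assumptions.

```python
# Sketch of an atlas frame combining non-occluded image data with
# occluded-surface patches (layout and field names are assumptions).
def pack_frame(main_image, occluded_patches):
    """Stack the main view and occluded patches into one frame and
    return (frame, manifest), where the manifest maps each texture
    name to its (start_row, end_row) range in the packed frame."""
    frame = [row[:] for row in main_image]
    manifest = {"main": (0, len(main_image))}
    for name, patch in occluded_patches.items():
        start = len(frame)
        frame.extend(row[:] for row in patch)
        manifest[name] = (start, len(frame))
    return frame, manifest

main = [["a", "a"], ["a", "a"]]
patches = {"behind_pillar": [["p", "p"]]}
frame, manifest = pack_frame(main, patches)
```

On playback, the device uses the manifest to cut out each patch and apply it to the matching occluded surface of the environmental model.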
H04N 13/204 - Image signal generators using stereoscopic image cameras
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06F 3/04815 - Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
9.
Method and apparatus for supporting augmented and/or virtual reality playback using tracked objects
Methods for capturing and generating information about objects in a 3D environment that can be used to support augmented reality or virtual reality playback operations in a data efficient manner are described. In various embodiments one or more frames including foreground objects are generated and transmitted with corresponding information that can be used to determine the location where the foreground objects are to be positioned relative to a background for one or more frame times. Data efficiency is achieved by specifying different locations for a foreground object for different frame times, avoiding in some embodiments the need to transmit an image and depth information defining the shape of the foreground object for each frame time. The frames can be encoded using a video encoder even though some of the information communicated is not pixel values but alpha blending values, object position information, mesh distortion information, etc.
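The data-efficiency argument can be made concrete with a back-of-the-envelope comparison: send an object's image and depth once, then only a small position record per frame time. The sizes, the 12-byte position record, and all names below are illustrative assumptions for the comparison only.

```python
# Sketch: compare re-sending image+depth every frame vs. sending the
# object assets once plus a tiny per-frame position record (assumed sizes).
OBJECT_ASSETS = {"player": {"image_px": 10_000, "depth_px": 10_000}}  # sent once

POSITIONS = {  # per frame time: small position records instead of re-sent images
    0: {"player": (1.0, 0.0, 5.0)},
    1: {"player": (1.2, 0.0, 5.0)},
    2: {"player": (1.4, 0.0, 5.0)},
}

def bytes_for_frames(num_frames, resend_assets):
    """Rough transmitted-data totals; 12 bytes per position record
    (three floats) is an assumed size for illustration."""
    asset = sum(OBJECT_ASSETS["player"].values())
    if resend_assets:
        return asset * num_frames          # shape re-sent every frame time
    return asset + 12 * num_frames         # shape once + positions per frame

naive = bytes_for_frames(3, resend_assets=True)
efficient = bytes_for_frames(3, resend_assets=False)
```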
A background image is also generated, e.g., by filling portions of a captured image where a foreground object was extracted and communicated to the playback device. Foreground objects are identified and point cloud representations of the foreground objects are generated and communicated to a playback device so that they can be used in generating images including the background which is communicated separately. In the case of a point cloud representation a number of points in an environment, e.g., 3D space, are communicated to the playback device along with color information. Thus in some embodiments a foreground object is represented as a set of points with corresponding color information on a per point basis. Foreground object information is communicated and processed in some embodiments at a different rate, e.g., a faster rate, than the background textures. The playback device renders images which are sent to the display by first rendering a background layer using the communicated background information, e.g., background texture(s), UV map and environmental geometry, e.g., mesh, to which the background textures are applied.
Methods and apparatus for streaming or playing back stereoscopic content are described. Camera dependent correction information is communicated to a playback device and applied in the playback device to compensate for distortions introduced by the lenses of individual cameras. By performing lens dependent distortion compensation in the playback device, edges which might be lost if correction were performed prior to encoding are preserved. Distortion correction information may be in the form of UV map correction information. The correction information may indicate changes to be made to information in a UV map, e.g., at rendering time, to compensate for distortions specific to an individual camera. Different sets of correction information may be communicated and used for different cameras of a stereoscopic pair which provide images that are rendered using the same UV map. The communicated correction information is sometimes called a correction mesh since it is used to correct mesh related information.
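Applying a per-camera correction mesh to a shared UV map at render time can be sketched as a per-vertex (du, dv) adjustment. The flat coordinate lists and the additive correction form are illustrative assumptions about how such a correction mesh might be represented.

```python
# Sketch: one shared UV map, two camera-specific correction meshes.
# Each correction entry is a (du, dv) adjustment for one mesh vertex,
# compensating for that camera's lens distortion (assumed representation).
def corrected_uv(shared_uv, correction_mesh):
    """Return the shared UV coordinates adjusted by the correction mesh."""
    return [(u + du, v + dv)
            for (u, v), (du, dv) in zip(shared_uv, correction_mesh)]

shared = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
left_correction = [(0.01, 0.0), (0.0, -0.02), (0.0, 0.0)]
right_correction = [(-0.01, 0.0), (0.0, 0.02), (0.0, 0.0)]

left_uv = corrected_uv(shared, left_correction)    # left camera's lens
right_uv = corrected_uv(shared, right_correction)  # right camera's lens
```

Both eyes render with the same base UV map; only the small correction meshes differ per camera, which keeps the communicated correction data compact.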
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/117 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
H04N 13/366 - Image reproducers using viewer tracking
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/398 - Synchronisation thereof; Control thereof
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
A camera apparatus, e.g., a stereoscopic camera apparatus, includes a dual element mounting plate, a pair of individual lens mounts, and a pair of sensor holders, each sensor holder holding an image sensor. The lens mounts and sensor holders are secured to the dual element mounting plate. Multiple dual element mounting plates corresponding to camera pairs may be secured to a base plate. Various features facilitate achieving and maintaining alignment between a lens and a corresponding image sensor, as well as between camera pairs and between sets of camera pairs.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G03B 17/12 - Bodies with means for supporting objectives, supplementary lenses, filters, masks, or turrets
G03B 11/00 - Filters or other obturators specially adapted for photographic purposes
G03B 35/08 - Stereoscopic photography by simultaneous recording
A method and system for encoding a stereoscopic image pair are disclosed. Groups of pixels are analyzed to determine the depth of each pixel group. The number of bits per pixel used to encode each pixel group is selected based on the depth of that pixel group. Therefore, images of objects closer to the camera pair, which appear closer to the viewer, are encoded with a larger number of bits per pixel than objects perceived to be farther from the viewer. The number of bits per pixel may also be increased based on a number of objects depicted or motion detected. The size of prediction blocks used to encode image portions may also be determined based on an angular distance of an image portion relative to the center of the frame. Therefore, smaller prediction blocks may be used to encode image portions closer to the center of the frame.
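The depth-driven bit allocation can be sketched as a simple policy: nearer pixel groups get more bits per pixel, with extra budget for object count and motion. The depth thresholds and bit values are illustrative assumptions, not figures from the patent.

```python
# Sketch of depth-based bit budgeting for a pixel block (assumed thresholds).
def bits_for_block(mean_depth, num_objects=1, has_motion=False):
    """Return an encoding budget (bits per pixel) for one pixel group.

    Nearer blocks -- perceived as closer by the viewer -- get more bits;
    multiple objects or detected motion add to the budget.
    """
    if mean_depth < 2.0:
        bits = 8          # close to the camera pair: highest fidelity
    elif mean_depth < 10.0:
        bits = 5
    else:
        bits = 3          # distant background: cheapest encoding
    if num_objects > 1:
        bits += 1         # busy blocks get extra budget
    if has_motion:
        bits += 1         # moving content too
    return bits

near = bits_for_block(1.0)
far = bits_for_block(50.0)
busy = bits_for_block(50.0, num_objects=3, has_motion=True)
```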
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/115 - Selection of the code volume for a coding unit prior to coding
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
14.
Methods and apparatus for encoding frames captured using fish-eye lenses
A method and system for encoding a stereoscopic image pair are disclosed. Groups of pixels are analyzed to determine the depth of each pixel group. The number of bits per pixel used to encode each pixel group is selected based on the depth of that pixel group. Therefore, images of objects closer to the camera pair, which appear closer to the viewer, are encoded with a larger number of bits per pixel than objects perceived to be farther from the viewer. The number of bits per pixel may also be increased based on a number of objects depicted or motion detected. The size of prediction blocks used to encode image portions may also be determined based on an angular distance of an image portion relative to the center of the frame. Therefore, smaller prediction blocks may be used to encode image portions closer to the center of the frame.
Methods and apparatus for collecting user feedback information from viewers of content are described. Feedback information is received from viewers of content. The feedback indicates, based on head tracking information in some embodiments, where users are looking in a simulated environment during different times of a content presentation, e.g., different frame times. The feedback information is used to prioritize different portions of an environment represented by the captured image content. Resolution allocation is performed based on the feedback information and the content is re-encoded based on the resolution allocation. The resolution allocation may, and normally does, change as the priorities of different portions of the environment change.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed or the storage space available from the internal hard disk
H04N 21/4728 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content for selecting a ROI [Region Of Interest], e.g. for requesting a higher resolution version of a selected region
H04N 21/6587 - Control parameters, e.g. trick play commands or viewpoint selection
H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
G06T 3/40 - Scaling of a whole image or part thereof
H04N 19/37 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
16.
Methods and apparatus for capturing, processing and/or communicating images
An unobstructed image portion of a captured image from a first camera of a camera pair, e.g., a stereoscopic camera pair including fisheye lenses, is combined with a scaled extracted image portion generated from a captured image from a second camera in the camera pair. An unobstructed image portion of a captured image from the second camera of the camera pair is combined with a scaled extracted image portion generated from a captured image from the first camera in the camera pair. As part of the combining, obstructed image portions which were obstructed by part of the adjacent camera are replaced in some embodiments. In some embodiments, the obstructions are due to the adjacent fisheye lens. In various embodiments fisheye lenses which have been cut to be flat on one side are used for the left and right cameras, with the spacing between the optical axes approximating the spacing between the optical axes of a human's eyes.
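The replace-with-scaled-extract step can be sketched in one dimension: the tail of one camera's row, blocked by the adjacent lens, is filled from a half-size region of the other camera's row scaled up by nearest-neighbour repetition. The region choice and 2x scaling are illustrative assumptions; real embodiments would select and warp the extract geometrically.

```python
# Sketch: fill a lens-obstructed image region from the adjacent camera.
def fill_obstruction(own_row, other_row, obstructed_from):
    """Replace own_row[obstructed_from:] with a scaled extract taken
    from other_row (nearest-neighbour 2x scaling; assumed scheme)."""
    need = len(own_row) - obstructed_from
    # take a half-size source region from the other camera, then scale 2x
    src = other_row[obstructed_from: obstructed_from + (need + 1) // 2]
    scaled = [src[min(i // 2, len(src) - 1)] for i in range(need)]
    return own_row[:obstructed_from] + scaled

left = [1, 2, 3, 0, 0, 0]     # last three pixels obstructed by the adjacent lens
right = [9, 9, 7, 8, 6, 5]
combined = fill_obstruction(left, right, obstructed_from=3)
```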
H04N 13/122 - Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
17.
Methods and apparatus for encoding, communicating and/or using images
Methods and apparatus for capturing, communicating and using image data to support virtual reality experiences are described. Images, e.g., frames, are captured at a high resolution but a lower frame rate than is used for playback. Interpolation is applied to captured frames to generate interpolated frames. Captured frames, along with interpolated frame information, are communicated to the playback device. The combination of captured and interpolated frames corresponds to a second frame playback rate which is higher than the image capture rate. Cameras operate at a high image resolution but at a slower frame rate than the same cameras could achieve at a lower resolution. Interpolation is performed prior to delivery to the user device, with segments to be interpolated being selected based on motion and/or lens FOV information. A relatively small amount of interpolated frame data is communicated compared to captured frame data for efficient bandwidth use.
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 19/587 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
G06T 19/00 - Manipulating 3D models or images for computer graphics
18.
Stereoscopic video encoding and decoding methods and apparatus
Methods and apparatus for stereoscopic image encoding and decoding are described. Left and right eye images are encoded after an entropy reduction operation is applied to one of the eye images when there is a difference between the left and right images of an image pair. Information about regions of negative parallax within the entropy reduced image of an image pair is encoded along with the images. Upon decoding, a sharpening filter is applied to the image in an image pair which was subjected to the entropy reduction operation. In addition, edge enhancement filtering is performed on the regions of the recovered entropy reduced image which are identified in the encoded image data as regions of negative parallax. Interleaving of left and right eye images at the input of the encoder, combined with entropy reduction, allows for efficient encoding, storage and transmission of 3D images.
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
H04N 13/122 - Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
H04N 19/583 - Motion compensation with overlapping blocks
H04N 19/82 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
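The entropy-reduce-then-sharpen pairing can be modelled in one dimension. This is an assumption-laden sketch, not the patent's filters: entropy reduction is shown as a 3-tap moving average and post-decode sharpening as a simple unsharp mask over 1D pixel lists.

```python
# Toy 1D model of the pipeline: low-pass one eye image before encoding
# (reducing entropy, hence bitrate), then sharpen it after decoding.
# Both filters are illustrative stand-ins for the patent's operations.

def entropy_reduce(img):
    """3-tap moving average with edge clamping (the pre-encode low-pass)."""
    n = len(img)
    return [(img[max(i - 1, 0)] + img[i] + img[min(i + 1, n - 1)]) // 3
            for i in range(n)]

def sharpen(img, amount=1):
    """Unsharp mask: add back `amount` times the detail the blur removed."""
    blurred = entropy_reduce(img)
    return [p + amount * (p - b) for p, b in zip(img, blurred)]

smoothed = entropy_reduce([0, 30, 0])   # a single bright pixel spreads out
flat = sharpen([5, 5, 5])               # sharpening leaves flat areas alone
```

In the described system the sharpening is applied only to the eye image that was entropy reduced, with extra edge enhancement in the signalled negative-parallax regions.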
19.
Methods and apparatus for receiving and/or using reduced resolution images
Methods and apparatus for using selective resolution reduction on images to be transmitted to and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating the UV map, corresponding to the selected resolution allocation, that should be used by the playback device for rendering the communicated image. By changing the resolution allocation and the UV map used by the playback device, different resolution allocations can be made with respect to different portions of the environment while the number of pixels in transmitted images remains constant. The playback device renders each image with the UV map corresponding to the resolution allocation used to generate that image.
H04N 13/189 - Recording image signals; Reproducing recorded image signals
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
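The fixed-pixel-budget idea above is easy to show in one dimension. This sketch is illustrative only: an 8-sample "scene" is resampled into a fixed 4-texel texture, with the parameter `alloc` (a hypothetical name) deciding how many texels the priority half of the environment gets, and a matching UV-map table telling the renderer where each texel belongs.

```python
# Toy sketch: different resolution allocations over the environment while
# the transmitted texture keeps a constant pixel count. Names, sizes and
# the nearest-neighbour resampling are illustrative assumptions.

def reduce_resolution(scene, alloc):
    """Resample an 8-sample 1D scene into a fixed 4-texel texture.
    `alloc` = number of texels (1..3) given to the priority half."""
    prio, rest = scene[:4], scene[4:]
    return ([prio[i * 4 // alloc] for i in range(alloc)] +
            [rest[i * 4 // (4 - alloc)] for i in range(4 - alloc)])

def uv_map(alloc):
    """Per-texel (region, source-position) entries the renderer would use
    to place the texture on the environment model for this allocation."""
    return ([("prio", i * 4 // alloc) for i in range(alloc)] +
            [("rest", i * 4 // (4 - alloc)) for i in range(4 - alloc)])

scene = list(range(8))
tex_front_heavy = reduce_resolution(scene, 3)  # 3 of 4 texels on priority half
tex_uniform = reduce_resolution(scene, 2)      # even split
```

Changing `alloc` changes which UV map must be signalled alongside the frame, exactly because the texture layout itself carries no region boundaries.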
20.
Camera alignment and/or calibration methods and apparatus
Methods and apparatus for aligning components of camera assemblies of one or more camera pairs, e.g., stereoscopic camera pairs, are described. A camera calibration tool, referred to as a camera bra, is used. Each dome of the camera bra includes a test pattern, e.g., a grid of points, with the domes being aligned and spaced apart by a predetermined amount. The bra is placed over the cameras of a camera pair, and the grids are detected and displayed. The camera component positions are adjusted until the displayed images show the grids as properly aligned. Because the grids on the calibration tool are properly aligned as a result of the tool's manufacture, when the images are brought into alignment the cameras will be properly spaced and aligned, at which point the calibration tool can be removed and the stereoscopic camera pair used to capture images of a scene.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G02B 30/26 - Optical systems or apparatus for producing three-dimensional [3D] effects, e.g. stereoscopic images by providing first and second parallax images to an observer’s left and right eyes of the autostereoscopic type
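The "adjust until the grids align" loop implies a simple numeric check. A minimal sketch, assuming detected grid points arrive as (x, y) coordinate pairs and that mean displacement within a tolerance counts as aligned; the function names and the tolerance are hypothetical.

```python
# Toy alignment check for the camera-bra procedure: compare the grid point
# positions detected by the two cameras. All names are illustrative.

def grid_offset(grid_a, grid_b):
    """Mean per-point (dx, dy) displacement between two detected grids."""
    n = len(grid_a)
    dx = sum(a[0] - b[0] for a, b in zip(grid_a, grid_b)) / n
    dy = sum(a[1] - b[1] for a, b in zip(grid_a, grid_b)) / n
    return dx, dy

def is_aligned(grid_a, grid_b, tol=0.5):
    """True when the grids coincide to within `tol` pixels on both axes."""
    dx, dy = grid_offset(grid_a, grid_b)
    return abs(dx) <= tol and abs(dy) <= tol

near = is_aligned([(0, 0), (10, 0)], [(0.2, 0.1), (10.2, 0.1)])
far = is_aligned([(0, 0), (10, 0)], [(2.0, 0.0), (12.0, 0.0)])
```

In practice the operator watches the overlaid grids on a display, but a check like this could drive an on-screen aligned/not-aligned indicator.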
21.
Methods and apparatus for supporting content generation, transmission and/or playback
Methods and apparatus for supporting the capture of images of surfaces of an environment visible from a default viewing position, and for capturing images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are packed into one or more frames and communicated to a playback device for use as textures which can be applied to a model of the environment where the images were captured. An environmental model includes a model of surfaces which are occluded from view from the default viewing position but which may be viewed if the user shifts the user's viewing location. Occluded image content can be incorporated directly into a frame that also includes non-occluded image data, or sent in frames of a separate, e.g., auxiliary, content stream that is multiplexed with the main content stream which communicates image data corresponding to non-occluded environmental portions.
H04N 13/204 - Image signal generators using stereoscopic image cameras
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06F 3/04815 - Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
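Packing occluded and non-occluded content into one frame amounts to a layout plus a lookup. A minimal sketch under illustrative assumptions: images are lists of pixel rows, and the layout records the row span of each segment so the renderer can route each segment to visible or occluded mesh surfaces.

```python
# Toy sketch of the frame-packing described above: concatenate non-occluded
# and occluded rows into one frame, with a layout map for unpacking.
# The row-based layout and all names are illustrative assumptions.

def pack_frame(main_rows, occluded_rows):
    """Pack both segments into a single frame plus a layout descriptor."""
    frame = main_rows + occluded_rows
    layout = {
        "main": (0, len(main_rows)),                 # rows for visible surfaces
        "occluded": (len(main_rows), len(frame)),    # rows for occluded surfaces
    }
    return frame, layout

def unpack(frame, layout, part):
    """Playback side: recover one segment from the packed frame."""
    start, end = layout[part]
    return frame[start:end]

main = [[1, 1], [2, 2]]       # toy non-occluded texture rows
occ = [[7, 7]]                # toy occluded-surface texture rows
frame, layout = pack_frame(main, occ)
```

The auxiliary-stream alternative mentioned in the abstract would carry `occ` in separate frames instead, multiplexed with the main stream.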
22.
Methods and apparatus for capturing images of an environment
Custom wide angle lenses, and methods and apparatus for using such lenses in individual cameras as well as in pairs of cameras intended for stereoscopic image capture, are described. The lenses are used in combination with sensors to capture different portions of an environment at different resolutions. In some embodiments the ground is captured at a lower resolution than the sky, which is in turn captured at a lower resolution than a horizontal area of interest. Various asymmetries in lenses and/or lens and sensor placement are described which are particularly well suited for stereoscopic camera pairs, where the proximity of one camera to the adjacent camera may interfere with the field of view of the cameras.
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/296 - Synchronisation thereof; Control thereof
H04N 13/25 - Image signal generators using stereoscopic image cameras using image signals from one sensor to control the characteristics of another sensor
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
G02B 30/34 - Stereoscopes providing a stereoscopic pair of separated images corresponding to parallactically displaced views of the same object, e.g. 3D slide viewers
Methods and apparatus relating to encoding and decoding stereoscopic (3D) image data, e.g., left and right eye images, are described. Various pre-encoding and post-decoding operations are described in conjunction with difference based encoding and decoding techniques. In some embodiments left and right eye image data is subject to scaling, transform operation(s) and cropping prior to encoding. In addition, in some embodiments decoded left and right eye image data is subject to scaling, transform operation(s) and filling operations prior to being output to a display device. Transform information and/or scaling information may be included in a bitstream communicating encoded left and right eye images. The amount of scaling can be the same for an entire scene and/or program.
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/85 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
H04N 19/51 - Motion estimation or motion compensation
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
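The invertible scale/crop-then-fill round trip above can be sketched in one dimension. This is an illustrative model only: scaling is shown as plain subsampling, the transform step is omitted, and the metadata dict stands in for the scaling/cropping parameters the abstract says travel in the bitstream.

```python
# Toy 1D model: pre-encode scaling and cropping, with post-decode
# re-expansion and filling driven by signalled parameters.
# All names and the subsampling "scale" are illustrative assumptions.

def pre_encode(img, scale, crop):
    """Subsample by `scale`, then keep only the `crop` = (start, end) span.
    Returns the reduced data plus the metadata needed to invert it."""
    scaled = img[::scale]
    return scaled[crop[0]:crop[1]], {"scale": scale, "crop": crop,
                                     "n": len(img)}

def post_decode(data, meta, fill=0):
    """Re-position the cropped span, fill discarded samples, re-expand."""
    scaled_len = (meta["n"] + meta["scale"] - 1) // meta["scale"]
    scaled = [fill] * scaled_len
    scaled[meta["crop"][0]:meta["crop"][1]] = data
    out = []
    for v in scaled:
        out.extend([v] * meta["scale"])   # nearest-neighbour upscale
    return out[:meta["n"]]

data, meta = pre_encode([1, 2, 3, 4, 5, 6], scale=2, crop=(0, 2))
restored = post_decode(data, meta)
```

Keeping the scale constant per scene or program, as the abstract notes, means this metadata need only be signalled once per scene rather than per frame.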
Methods and apparatus for using a display in a manner which results in a user perceiving a higher resolution than would be perceived if the user viewed the display head on are described. In some embodiments one or more displays are mounted at an angle, e.g., in a range from above 0 degrees to 45 degrees, relative to a user's face and thus eyes. Due to the angle at which the display or displays are mounted, the user sees more pixels, e.g., dots corresponding to light emitting elements, per square inch of eye area than the user would see if viewing the display head on. The methods and display mounting arrangement are well suited for use in head mounted displays, e.g., Virtual Reality (VR) displays for stereoscopic (e.g., 3D) and/or non-stereoscopic viewing of displayed images.
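The density gain from tilting a panel follows from simple foreshortening: a panel tilted by angle θ projects to a width reduced by cos θ, so pixels per unit of projected width rise by 1/cos θ. A small sketch of that relation (the function name is illustrative; the geometry is a simplification that ignores per-pixel viewing-angle variation):

```python
import math

def apparent_density_gain(angle_deg):
    """Relative increase in pixels per unit of projected (foreshortened)
    panel width when the panel is tilted `angle_deg` from head-on.
    Simplified model: uniform tilt, distant viewer."""
    return 1.0 / math.cos(math.radians(angle_deg))

head_on = apparent_density_gain(0.0)     # no tilt, no gain
tilted = apparent_density_gain(45.0)     # ~1.41x at the 45-degree extreme
```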
Methods and apparatus for using selective resolution reduction on images to be transmitted to and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating the UV map, corresponding to the selected resolution allocation, that should be used by the playback device for rendering the communicated image. By changing the resolution allocation and the UV map used by the playback device, different resolution allocations can be made with respect to different portions of the environment while the number of pixels in transmitted images remains constant. The playback device renders each image with the UV map corresponding to the resolution allocation used to generate that image.
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 13/189 - Recording image signals; Reproducing recorded image signals
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
26.
Methods and apparatus for processing content based on viewing information and/or communicating content
Methods and apparatus for collecting user feedback information from viewers of content are described. Feedback information is received from viewers of content. The feedback indicates, based on head tracking information in some embodiments, where users are looking in a simulated environment at different times of a content presentation, e.g., different frame times. The feedback information is used to prioritize different portions of the environment represented by the captured image content. Resolution allocation is performed based on the feedback information, and the content is re-encoded based on the resolution allocation. The resolution allocation may, and normally does, change as the priority of different portions of the environment changes.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed or the storage space available from the internal hard disk
H04N 21/4728 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content for selecting a ROI [Region Of Interest], e.g. for requesting a higher resolution version of a selected region
H04N 21/6587 - Control parameters, e.g. trick play commands or viewpoint selection
H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
G06T 3/40 - Scaling of a whole image or part thereof
H04N 19/37 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
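The feedback-to-allocation step described above reduces to counting where viewers looked and splitting a pixel budget accordingly. A minimal sketch with illustrative assumptions: gaze samples are labels naming environment portions, and the budget is divided proportionally to view counts.

```python
# Toy sketch: prioritise environment portions from viewer gaze feedback,
# then allocate a fixed pixel budget proportionally. Names are illustrative.
from collections import Counter

def prioritize(gaze_samples):
    """Count gaze samples per environment portion; more looks = higher
    priority for the next re-encode."""
    return Counter(gaze_samples)

def allocate(priorities, budget):
    """Split `budget` pixels across portions proportionally to priority."""
    total = sum(priorities.values())
    return {portion: budget * count // total
            for portion, count in priorities.items()}

# Three of four sampled frame-times had viewers looking forward:
shares = allocate(prioritize(["front", "front", "front", "rear"]), 100)
```

Re-running this as fresh feedback arrives is what makes the resolution allocation change over time, as the abstract notes.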
27.
Methods and apparatus for controlling, implementing and supporting trick play in an augmented reality device
Methods and apparatus for controlling, implementing and supporting trick play in an augmented reality (AR) device are described. Changes in AR device orientation and/or AR device position are detected and used in controlling temporal playback operations.
G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
G06F 3/0346 - Pointing devices displaced or positioned by the user; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
G06T 19/00 - Manipulating 3D models or images for computer graphics
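Mapping a detected pose change to a temporal playback command might look like the following. This is purely illustrative: the yaw-to-scrub mapping, the dead-band threshold, and the `scrub_gain` parameter are all hypothetical choices, not the patent's control scheme.

```python
# Toy sketch: turn a change in AR device yaw into a trick-play command.
# Tilting right scrubs forward, left scrubs back; small changes are
# ignored (dead band) so normal head motion keeps playing.

def trick_play_command(prev_yaw_deg, yaw_deg, scrub_gain=2.0):
    """Return ("play", 0.0) inside the dead band, else a scrub command
    whose magnitude scales with the yaw change (seconds, hypothetically)."""
    dyaw = yaw_deg - prev_yaw_deg
    if abs(dyaw) < 1.0:          # dead band: ignore small head motion
        return ("play", 0.0)
    return ("scrub", dyaw * scrub_gain)

steady = trick_play_command(0.0, 0.5)    # within dead band: keep playing
forward = trick_play_command(0.0, 10.0)  # 10 degrees right: scrub ahead
```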
28.
Methods and apparatus for capturing, processing and/or communicating images
An unobstructed image portion of a captured image from a first camera of a camera pair, e.g., a stereoscopic camera pair including fisheye lenses, is combined with a scaled extracted image portion generated from a captured image from a second camera in the camera pair. An unobstructed image portion of a captured image from the second camera of the camera pair is combined with a scaled extracted image portion generated from a captured image from the first camera in the camera pair. As part of the combining, image portions which were obstructed by part of the adjacent camera are replaced in some embodiments. In some embodiments, the obstructions are due to the adjacent fisheye lens. In various embodiments fisheye lenses which have been cut to be flat on one side are used for the left and right cameras, with the spacing between the optical axes approximating the spacing between the optical axes of a person's eyes.
H04N 13/122 - Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
29.
Methods and apparatus for capturing images of an environment
Custom wide angle lenses, and methods and apparatus for using such lenses in individual cameras as well as in pairs of cameras intended for stereoscopic image capture, are described. The lenses are used in combination with sensors to capture different portions of an environment at different resolutions. In some embodiments the ground is captured at a lower resolution than the sky, which is in turn captured at a lower resolution than a horizontal area of interest. Various asymmetries in lenses and/or lens and sensor placement are described which are particularly well suited for stereoscopic camera pairs, where the proximity of one camera to the adjacent camera may interfere with the field of view of the cameras.
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/296 - Synchronisation thereof; Control thereof
H04N 13/25 - Image signal generators using stereoscopic image cameras using image signals from one sensor to control the characteristics of another sensor
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
G02B 30/34 - Stereoscopes providing a stereoscopic pair of separated images corresponding to parallactically displaced views of the same object, e.g. 3D slide viewers
Methods and apparatus for checking the alignment and/or spacing of cameras of a camera pair, e.g., a stereoscopic camera pair, are described. Images of a calibration target are captured by first and second cameras of the camera pair being calibrated. The captured first and second images, which may correspond to all or a portion of the calibration target, are combined in accordance with the invention to generate a calibration image which is then displayed to an operator of the calibration apparatus including the calibration target. Given the positions of at least some different-color grid elements in the calibration target, when proper camera spacing and optical axis alignment is achieved, the overlay of two different-color grid elements results in the generated calibration image including grid elements of a color different from that of either overlaid grid element.
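The color-mixing alignment cue can be modelled with additive RGB overlay. A hedged sketch: the patent does not specify the mixing operation, so clamped addition is used here as a stand-in, and the grid/pixel representation is illustrative.

```python
# Toy sketch: when the two cameras are aligned, each red grid element from
# one view lands exactly on a blue element from the other, and the overlay
# shows a third color (magenta). Additive mixing is an assumption.

def overlay(px_a, px_b):
    """Clamped additive mix of two RGB pixels."""
    return tuple(min(a + b, 255) for a, b in zip(px_a, px_b))

def aligned(grid_a, grid_b, mixed_color):
    """Aligned when every overlaid grid element shows the expected mixed
    color rather than either source color."""
    return all(overlay(a, b) == mixed_color
               for a, b in zip(grid_a, grid_b))

RED, BLUE, MAGENTA = (255, 0, 0), (0, 0, 255), (0, 0, 255)
RED, BLUE = (255, 0, 0), (0, 0, 255)
MAGENTA = (255, 0, 255)
good = aligned([RED, RED], [BLUE, BLUE], MAGENTA)   # overlays all magenta
bad = aligned([RED, BLUE], [BLUE, BLUE], MAGENTA)   # blue-on-blue: no mix
```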
Methods and apparatus for using selective resolution reduction on images to be transmitted to and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating the UV map, corresponding to the selected resolution allocation, that should be used by the playback device for rendering the communicated image. By changing the resolution allocation and the UV map used by the playback device, different resolution allocations can be made with respect to different portions of the environment while the number of pixels in transmitted images remains constant. The playback device renders each image with the UV map corresponding to the resolution allocation used to generate that image.
H04N 19/136 - Incoming video signal characteristics or properties
H04N 13/189 - Recording image signals; Reproducing recorded image signals
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
32.
Methods and apparatus including or for use with one or more cameras
Camera related methods and apparatus which are well suited for use in capturing stereoscopic image data, e.g., pairs of left and right eye images, are described. Various features relate to a camera rig which can be used to mount multiple cameras at the same time. In some embodiments the camera rig includes 3 mounting locations corresponding to 3 different directions 120 degrees apart. One or more of the mounting locations may be used at a given time. When a single camera pair is used, the rig can be rotated to capture images corresponding to the locations where a camera pair is not mounted. Static images from those locations can then be combined with images corresponding to the forward direction to generate a 360 degree view. Alternatively, camera pairs or individual cameras can be included in each of the mounting locations to capture video in multiple directions.
F16M 11/08 - Means for attachment of apparatus; Means allowing adjustment of the apparatus relatively to the stand allowing pivoting around a vertical axis
F16M 11/24 - Undercarriages with or without wheels changeable in height or length of legs, also for transport only
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects
G03B 35/08 - Stereoscopic photography by simultaneous recording
G03B 37/04 - Panoramic or wide-screen photography; Photographing extended surfaces, e.g. for surveying; Photographing internal surfaces, e.g. of pipe with cameras or projectors providing touching or overlapping fields of view
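The coverage equivalence between populating all three mounts and rotating a single pair can be checked with a few lines. Illustrative only: mounts and rig orientations are modelled as headings in degrees.

```python
# Toy sketch: the set of headings imaged by a rig with camera mounts at the
# given angular offsets, rotated through each listed rig orientation.
# Headings in degrees; all names are illustrative.

def capture_headings(mount_offsets, rig_rotations):
    """Every heading covered by some (rotation, mount) combination."""
    return sorted({(rot + off) % 360
                   for rot in rig_rotations
                   for off in mount_offsets})

# Three mounts used at once, rig fixed:
all_mounts = capture_headings([0, 120, 240], [0])
# One mount used, rig rotated to the other two positions:
rotated = capture_headings([0], [0, 120, 240])
```

Both strategies cover the same three 120-degree-spaced directions, which is why static captures from the rotated positions can fill in the directions where no camera pair is mounted.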
Methods and apparatus for using selective resolution reduction on images to be transmitted to and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating the UV map, corresponding to the selected resolution allocation, that should be used by the playback device for rendering the communicated image. By changing the resolution allocation and the UV map used by the playback device, different resolution allocations can be made with respect to different portions of the environment while the number of pixels in transmitted images remains constant. The playback device renders each image with the UV map corresponding to the resolution allocation used to generate that image.
H04N 13/189 - Recording image signals; Reproducing recorded image signals
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
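The allocation-switching idea above can be sketched as follows; the region names, downsampling factors, and the `select_allocation` heuristic are illustrative assumptions of this sketch, not details taken from the abstract:

```python
import numpy as np

# Hypothetical resolution allocations: each assigns a downsampling factor per
# environment region and would correspond to its own UV map on the playback side.
ALLOCATIONS = {
    "action_front": {"front": 1, "sides": 2, "rear": 4},
    "uniform":      {"front": 2, "sides": 2, "rear": 2},
}

def select_allocation(motion_in_front: bool) -> str:
    """Pick a resolution allocation based on detected motion (toy heuristic)."""
    return "action_front" if motion_in_front else "uniform"

def reduce_region(img: np.ndarray, factor: int) -> np.ndarray:
    """Downsample one region of a captured image by simple pixel averaging."""
    h, w = img.shape[:2]
    return img[:h - h % factor, :w - w % factor].reshape(
        h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))

# The identifier of the chosen allocation would be signaled to the playback
# device so it renders the frame with the matching UV map.
alloc_id = select_allocation(motion_in_front=True)
```

In this sketch the total pixel budget stays fixed; only how it is spread across regions (and hence which UV map applies) changes.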
34.
Methods and apparatus for requesting, receiving and/or playing back content corresponding to an environment
Methods and apparatus for receiving content including images of surfaces of an environment visible from a default viewing position and images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are received in content streams that can be in a variety of stream formats. In one stream format, non-occluded image content is packed into a frame together with occluded image content, the occluded image content normally occupying a small portion of the frame. In other embodiments occluded image portions are received in an auxiliary data stream which is multiplexed with a data stream providing frames of non-occluded image content. UV maps which are used to map received image content to segments of an environmental model are also supplied, with the UV maps corresponding to the format of the frames which are used to provide the images that serve as textures.
H04N 13/204 - Image signal generators using stereoscopic image cameras
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
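The first stream format described above, in which occluded content occupies a small portion of the same frame as non-occluded content, can be sketched like this; `pack_frame` and the returned `layout` dictionary (standing in for the UV map's role) are illustrative names invented for this sketch:

```python
import numpy as np

def pack_frame(main: np.ndarray, occluded: np.ndarray):
    """Pack non-occluded image content plus a small occluded-content strip into
    one frame; the layout records where each part landed so a renderer (via a
    matching UV map in the real system) can map it onto the environment model."""
    h_m, w = main.shape[:2]
    h_o, w_o = occluded.shape[:2]
    frame = np.zeros((h_m + h_o, w), dtype=main.dtype)
    frame[:h_m, :] = main                      # non-occluded content on top
    frame[h_m:h_m + h_o, :w_o] = occluded      # small occluded strip below
    layout = {"main": (0, 0, h_m, w), "occluded": (h_m, 0, h_o, w_o)}
    return frame, layout
```

The alternative format in the abstract, an auxiliary occluded-content stream multiplexed with the main stream, would instead keep the two regions in separate frame sequences.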
35.
Methods and apparatus for supporting content generation, transmission and/or playback
Methods and apparatus for supporting the capture of images of surfaces of an environment visible from a default viewing position and capturing images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are packed into one or more frames and communicated to a playback device for use as textures which can be applied to a model of the environment where the images were captured. An environmental model includes a model of surfaces which are occluded from view from a default viewing position but which may be viewed if the user shifts the user's viewing location. Occluded image content can be incorporated directly into a frame that also includes non-occluded image data or sent in frames of a separate, e.g., auxiliary, content stream that is multiplexed with the main content stream which communicates image data corresponding to non-occluded environmental portions.
H04N 13/204 - Image signal generators using stereoscopic image cameras
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
36.
Stereoscopic video encoding and decoding methods and apparatus
Methods and apparatus for stereoscopic image encoding and decoding are described. Left and right eye images are encoded after an entropy reduction operation is applied to one of the eye images when there is a difference between the left and right images of an image pair. Information about regions of negative parallax within the entropy reduced image of an image pair is encoded along with the images. Upon decoding, a sharpening filter is applied to the image in an image pair which was subjected to the entropy reduction operation. In addition, edge enhancement filtering is performed on the regions of the recovered entropy reduced image which are identified in the encoded image data as regions of negative parallax. Interleaving of left and right eye images at the input of the encoder combined with entropy reduction allows for efficient encoding, storage, and transmission of 3D images.
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
H04N 13/122 - Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
H04N 19/583 - Motion compensation with overlapping blocks
H04N 19/82 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
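The encode-side entropy reduction and decode-side sharpening described above can be sketched with a simple low-pass filter standing in for the entropy reduction operation and an unsharp mask standing in for the sharpening filter; both choices are illustrative stand-ins, not the patent's specific filters:

```python
import numpy as np

def box_blur(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Simple low-pass filter used here as a stand-in entropy reduction step."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def unsharp(img: np.ndarray, amount: float = 1.0) -> np.ndarray:
    """Sharpening applied after decode to the entropy-reduced eye image."""
    return np.clip(img + amount * (img - box_blur(img)), 0, 255)
```

In the abstract's scheme, additional edge enhancement would then be applied only to the signaled negative-parallax regions of the recovered image.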
37.
Stereoscopic image processing methods and apparatus
Stereoscopic image processing methods and apparatus are described. Left and right eye images of a stereoscopic frame are examined to determine if the luminance and/or chrominance differences between the left and right frames are within a range used to trigger one or more difference reduction operations. A difference reduction operation may involve assigning portions of the left and right frames to different depth regions and/or other region categories. A decision on whether or not to perform a difference reduction operation is then made on a per region basis, with the difference between the left and right eye portions of at least one region being reduced when a difference reduction operation is to be performed. The difference reduction process may be, and in some embodiments is, performed in a precoder which processes left and right eye images of stereoscopic frames prior to stereoscopic encoding.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
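The per-region decision described above can be sketched as follows; the horizontal-band region split, the trigger threshold, and the pull-toward-the-mean reduction are illustrative choices of this sketch rather than details from the abstract:

```python
import numpy as np

def reduce_differences(left, right, n_regions=4, threshold=5.0, strength=0.5):
    """Per-region difference reduction: where the mean left/right luminance
    difference exceeds the trigger threshold, pull both eye images toward
    their mean. Operates on horizontal bands as stand-in regions."""
    L, R = left.astype(float), right.astype(float)
    h = L.shape[0]
    step = h // n_regions
    for i in range(n_regions):
        sl = slice(i * step, (i + 1) * step if i < n_regions - 1 else h)
        if np.abs(L[sl] - R[sl]).mean() > threshold:   # per-region decision
            mid = (L[sl] + R[sl]) / 2
            L[sl] += strength * (mid - L[sl])
            R[sl] += strength * (mid - R[sl])
    return L, R
```

A precoder, as the abstract notes, could run this on each stereoscopic frame before the stereoscopic encoder sees it.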
Methods and apparatus for using a display in a manner which results in a user perceiving a higher resolution than would be perceived if a user viewed the display from a head on position are described. In some embodiments one or more displays are mounted at an angle, e.g., sometimes in a range from above 0 degrees to 45 degrees relative to a user's face and thus eyes. The user sees more pixels, e.g., dots corresponding to light emitting elements, per square inch of eye area than the user would see if the user were viewing the display head on, due to the angle at which the display or displays are mounted. The methods and display mounting arrangement are well suited for use in head mounted displays, e.g., Virtual Reality (VR) displays for stereoscopic viewing (e.g., 3D) and/or non-stereoscopic viewing of displayed images.
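The density gain from tilting a display can be illustrated with a simplified foreshortening model (a small-angle geometric approximation supplied here, not the patent's own analysis): a display tilted by an angle theta presents its pixels over a projected width scaled by cos(theta), so the viewer sees roughly 1/cos(theta) more pixels per unit of visual angle.

```python
import math

def apparent_density_gain(theta_deg: float) -> float:
    """Approximate factor by which apparent pixel density rises when a display
    is tilted by theta degrees relative to head-on viewing (foreshortening
    model: projected width shrinks by cos(theta))."""
    return 1.0 / math.cos(math.radians(theta_deg))

# At the 45 degree end of the range mentioned above, the apparent pixel
# density rises by a factor of about 1.41 under this model.
```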
Content delivery and playback methods and apparatus are described. The methods and apparatus are well suited for delivery and playback of content corresponding to a 360 degree environment and can be used to support streaming and/or real time delivery of 3D content corresponding to an event, e.g., while the event is ongoing or after the event is over. Portions of the environment are captured by cameras located at different positions. The content captured from different locations is encoded and made available for delivery. A playback device selects the content to be received based on a user's head position.
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
H04N 21/482 - End-user interface for program selection
H04N 13/117 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/371 - Image reproducers using viewer tracking for tracking rotational head movements around the vertical axis
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/134 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
H04N 19/37 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
H04N 19/146 - Data rate or code amount at the encoder output
H04N 19/39 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability involving multiple description coding [MDC], i.e. with separate layers being structured as independently decodable descriptions of input picture data
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
H04L 65/61 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
H04N 19/587 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
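Head-position-driven stream selection of the kind described above can be sketched as picking the encoded stream whose capture direction is angularly closest to the user's head yaw; the three camera directions and the nearest-direction rule are illustrative assumptions of this sketch:

```python
def select_stream(head_yaw_deg: float, camera_yaws=(0, 120, 240)) -> int:
    """Return the capture direction (in degrees) of the stream to request,
    chosen as the camera direction closest to the user's head yaw."""
    def angular_dist(a: float, b: float) -> float:
        # Shortest angular distance on a 360-degree circle.
        d = abs(a - b) % 360
        return min(d, 360 - d)
    return min(camera_yaws, key=lambda c: angular_dist(head_yaw_deg, c))
```

A playback device would re-evaluate this as head tracking updates arrive and request the corresponding encoded stream.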
40.
Methods and apparatus for communicating and/or using frames including a captured image and/or including additional image content
Methods and apparatus for packing images into a frame and/or including additional content and/or graphics are described. A composite image is generated including at least one image in addition to another image and/or additional image content. A playback device receives an encoded frame including a captured image of a portion of an environment and the additional image content. The additional image content is combined with or used to replace a portion of the image of the environment during rendering. Alpha value mask information is communicated to the playback device to provide alpha values for use in image combining. Alpha values are communicated as pixel values in the encoded frame or as additional information. One or more mesh models and/or information on how to map image content to the one or more mesh models is communicated to the playback device for use in rendering image content recovered from a frame.
H04N 13/293 - Generating mixed stereoscopic images; Generating mixed monoscopic and stereoscopic images, e.g. a stereoscopic image overlay window on a monoscopic image background
H04N 13/25 - Image signal generators using stereoscopic image cameras using image signals from one sensor to control the characteristics of another sensor
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/183 - On-screen display [OSD] information, e.g. subtitles or menus
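The alpha-mask combining step described above is standard per-pixel alpha blending; a minimal sketch, assuming grayscale images and an alpha mask with values in [0, 1]:

```python
import numpy as np

def composite(env: np.ndarray, overlay: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Blend additional image content over the environment image using a
    per-pixel alpha mask: alpha=0 keeps the environment, alpha=1 replaces it."""
    return (1.0 - alpha) * env + alpha * overlay
```

In the described system the alpha values could arrive either as pixel values inside the encoded frame or as side information; either way they feed a blend of this form during rendering.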
41.
Methods and apparatus for packing images into a frame and/or including additional content or graphics
Methods and apparatus for packing images into a frame and/or including additional content and/or graphics are described. A composite image is generated including at least one image in addition to another image and/or additional image content, e.g., a logo, texture, sign, or advertisement. In some embodiments, first and second pairs of stereoscopic images are combined, e.g., with additional image content to generate a composite image which is then encoded, e.g., using a UHD (Ultra High Definition) encoder. In some other embodiments, rather than two pairs of stereo images, a pair of stereo images is combined with images captured by two mono cameras. In various embodiments, the set of cameras which are sources of captured images for the composite image are dynamically selected, e.g., with different sets of cameras in a camera rig being selected at different times.
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
H04N 21/4728 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content for selecting a ROI [Region Of Interest], e.g. for requesting a higher resolution version of a selected region
42.
Methods and apparatus related to capturing and/or rendering images
Camera and/or lens calibration information is generated as part of a calibration process in video systems including 3-dimensional (3D) immersive content systems. The calibration information can be used to correct for distortions associated with the source camera and/or lens. A calibration profile can include information sufficient to allow the system to correct for camera and/or lens distortion/variation. This can be accomplished by capturing a calibration image of a physical 3D object corresponding to the simulated 3D environment, and creating the calibration profile by processing the calibration image. The calibration profile can then be used to project the source content directly into the 3D viewing space while also accounting for distortion/variation, and without first translating into an intermediate space (e.g., a rectilinear space) to account for lens distortion.
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
H04N 21/485 - End-user interface for client configuration
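A calibration profile of the kind described above captures lens distortion so that source content can be projected directly into the 3D viewing space. As a minimal sketch, a single-coefficient radial distortion model (an illustrative stand-in for a measured profile, with `k1` as the measured coefficient) maps ideal normalized coordinates to their distorted positions on the sensor:

```python
def distort_radial(x: float, y: float, k1: float):
    """Apply a one-coefficient radial distortion model to normalized image
    coordinates: points move outward (k1 > 0) or inward (k1 < 0) with r^2."""
    r2 = x * x + y * y
    s = 1.0 + k1 * r2
    return x * s, y * s
```

Baking such a mapping into the projection lets the renderer sample the distorted source image directly, without first rectifying it into an intermediate rectilinear space.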
In order to reduce or minimize the effect of camera switching performed outside of user control, in some embodiments an automated camera switching policy policing system is implemented. The policing system can make camera switch recommendations, automatically implement camera switches and/or take other actions. In some embodiments the automated policing system tracks the location of an area of interest in the environment, e.g., an area corresponding to the location of an object of interest, where images are being captured. Camera switches are recommended when the object of interest remains in the field of view of the current camera supplying content and in the field of view of the camera to which a switch is to be made. Recommendations are made against camera switches which would prevent a user from viewing the same object of interest before and after the switch.
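The policing rule above reduces to a both-fields-of-view check; a minimal sketch, with fields of view simplified to (min_yaw, max_yaw) intervals (an assumption of this sketch):

```python
def recommend_switch(obj_yaw: float, current_fov, candidate_fov) -> bool:
    """Recommend a camera switch only when the object of interest lies inside
    both the current camera's and the candidate camera's field of view."""
    def in_fov(yaw: float, fov) -> bool:
        lo, hi = fov
        return lo <= yaw <= hi
    return in_fov(obj_yaw, current_fov) and in_fov(obj_yaw, candidate_fov)
```

A switch that fails this check would be recommended against, since the user would lose sight of the tracked object across the cut.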
A head mounted virtual reality (VR) device including an inertial measurement unit (IMU) is located in a vehicle which may be, and sometimes is, moving. Detected motion attributable to vehicle motion is filtered out based on one or more or all of: vehicle type information, information derived from sensors located in the vehicle external to the head mounted VR device, and/or captured images including a reference point or reference object within the vehicle. An image portion of a simulated VR environment is selected and presented to the user of the head mounted VR device based on the filtered motion information. Thus, the image portion presented to the user of the head mounted VR device is substantially unaffected by vehicle motion and corresponds to user induced head motion.
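When an external vehicle sensor supplies the vehicle's own angular rate, the filtering described above amounts to subtracting it from the headset IMU reading; a deliberately minimal sketch of that subtraction (real systems would also handle sensor alignment and noise):

```python
def user_head_rate(imu_rate, vehicle_rate):
    """Remove vehicle-induced angular rate (per-axis, e.g. yaw/pitch/roll in
    deg/s) from the headset IMU reading, leaving user-induced head motion."""
    return tuple(i - v for i, v in zip(imu_rate, vehicle_rate))
```

The remaining rate then drives selection of the displayed portion of the simulated environment, so a turning vehicle does not by itself rotate the user's view.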
Methods and apparatus for determining the location of objects surrounding a user of a 3D rendering and display system and indicating the objects to the user while the user views a simulated environment, e.g., on a head-mounted display, are described. A sensor, e.g., a camera, captures images of or senses the physical environment where the user of the system is located. One or more objects in the physical environment are identified, e.g., by recognizing predetermined symbols on the objects and based on stored information indicating a mapping between different symbols and objects. The location of the objects relative to the user's location in the physical environment is determined. A simulated environment, including content corresponding to a scene and visual representations of the one or more objects, is displayed. In some embodiments visual representations are displayed in the simulated environment at locations determined based on the location of the objects relative to the user.
G06T 7/70 - Determining position or orientation of objects or cameras
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
A63F 13/213 - Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
A63F 13/65 - Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition
A63F 13/79 - Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
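The symbol-to-object mapping and relative-location steps above can be sketched as a lookup plus planar geometry; the symbol names, object names, and 2D floor-plane simplification are all illustrative assumptions of this sketch:

```python
import math

# Stored mapping between recognized symbols and objects (illustrative entries).
SYMBOL_TO_OBJECT = {"SYM_CUP": "coffee cup", "SYM_CTRL": "game controller"}

def locate_object(symbol: str, obj_xy, user_xy):
    """Identify an object from its recognized symbol and return its name plus
    distance and bearing (degrees) relative to the user, in the floor plane."""
    name = SYMBOL_TO_OBJECT[symbol]
    dx, dy = obj_xy[0] - user_xy[0], obj_xy[1] - user_xy[1]
    return name, math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))
```

The returned distance and bearing would then place the object's visual representation at the matching spot inside the simulated environment.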
46.
Methods and apparatus for controlling a viewing position
Methods and apparatus for allowing a user to switch viewing positions and/or perspective while viewing an environment, e.g., as part of a 3D playback/viewing experience, are described. In various embodiments images of the environment are captured using cameras placed at multiple camera positions. During viewing a user can select which camera position he/she would like to experience the environment from. While experiencing the environment from the perspective of a first camera position the user may switch from the first to a second camera position by looking at the second position. A visual indication is provided to the user to indicate that the user can select the other camera position as his/her viewing position. If a user input indicates a desired viewing position change, a switch to the alternate viewing position is made and the user is presented with images captured from the perspective of the user selected alternative viewing position.
H04N 13/117 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 9/80 - Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
H04N 5/77 - Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera
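The gaze-then-confirm switching flow described above can be captured in a small state update; the position identifiers and the explicit-confirmation input are illustrative simplifications of this sketch:

```python
def maybe_switch(current_pos: str, gaze_target_pos, user_confirms: bool) -> str:
    """Return the viewing position after one decision step: switch to the
    camera position the user is looking at only when the user confirms the
    change; otherwise keep the current position."""
    if gaze_target_pos is not None and gaze_target_pos != current_pos and user_confirms:
        return gaze_target_pos
    return current_pos
```

In the described system, detecting that the gaze target is a selectable camera position is also what triggers the visual indication offered to the user.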
Custom wide angle lenses and methods and apparatus for using such lenses in individual cameras, as well as in pairs of cameras intended for stereoscopic image capture, are described. The lenses are used in combination with sensors to capture different portions of an environment at different resolutions. In some embodiments the ground is captured at a lower resolution than the sky, which is captured at a lower resolution than a horizontal area of interest. Various asymmetries in lenses and/or lens and sensor placement are described which are particularly well suited for stereoscopic camera pairs where the proximity of one camera to the adjacent camera may interfere with the field of view of the cameras.
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
G02B 27/22 - Other optical systems; Other optical apparatus for producing stereoscopic or other three-dimensional effects
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
48.
Camera apparatus and methods which allow for filters to be used
A camera apparatus, e.g., a stereoscopic camera apparatus, includes a slideable filter plate inserted into a slot in a dual element mounting plate. The slideable filter plate includes a plurality of selectable pairs of filter mounting positions and changes between pairs of filter mounting positions may be performed without altering camera alignments. Each pair of filter mounting positions includes a right eye filter mounting position and a left eye filter mounting position. Each pair of filter mounting positions in the slideable filter plate includes a pair of filters or no filters.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G03B 11/00 - Filters or other obturators specially adapted for photographic purposes
G03B 35/08 - Stereoscopic photography by simultaneous recording
Methods and apparatus for aligning components of camera assemblies of one or more camera pairs, e.g., stereoscopic camera pairs, are described. A camera calibration tool referred to as a camera bra is used. Each dome of the camera bra includes a test pattern, e.g., a grid of points, with the domes being aligned and spaced apart by a predetermined amount. The bra is placed over the cameras of a camera pair, and the grids are detected and displayed. The camera component positions are adjusted until the displayed images show the grids as being properly aligned. Because the grids on the calibration tool are properly aligned as a result of the manufacturing of the calibration tool, when the images are brought into alignment the cameras will be properly spaced and aligned, at which point the calibration tool can be removed and the stereoscopic camera pair used to capture images of a scene.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G02B 30/26 - Optical systems or apparatus for producing three-dimensional [3D] effects, e.g. stereoscopic images by providing first and second parallax images to an observer’s left and right eyes of the autostereoscopic type
50.
Methods and apparatus for receiving and/or using reduced resolution images
Methods and apparatus for using selective resolution reduction on images to be transmitted and/or used by a playback device are described. Prior to transmission one or more images of an environment are captured. Based on image content, motion detection and/or user input a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating a UV map corresponding to the selected resolution allocation that should be used by the playback device for rendering the communicated image. By changing the resolution allocation used and which UV map is used by the playback device different resolution allocations can be made with respect to different portions of the environment while allowing the number of pixels in transmitted images to remain constant. The playback device renders the individual images with the UV map corresponding to the resolution allocation used to generate the individual images.
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
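The abstract above hinges on the playback device pairing each frame with the UV map matching the resolution allocation its encoder used. A minimal sketch of that selection step, with illustrative allocation identifiers and UV-map contents (none of which appear in the source):

```python
# Sketch: a playback device selecting the UV map that matches the
# resolution allocation used to encode each incoming frame.
# Allocation IDs and UV-map values are illustrative placeholders.

# One UV map per resolution allocation; each map lists (u, v) texture
# coordinates for the environment-mesh vertices.
UV_MAPS = {
    "uniform":        [(0.00, 0.0), (0.50, 0.0), (1.00, 0.0)],
    "front_weighted": [(0.00, 0.0), (0.70, 0.0), (1.00, 0.0)],  # more texels up front
}

def render_frame(frame, allocation_id):
    """Pair the frame (texture) with the UV map its encoder used.

    The transmitted pixel count stays constant; only the mapping of
    pixels onto the environment changes between allocations.
    """
    uv_map = UV_MAPS[allocation_id]
    return {"texture": frame, "uv": uv_map}

# A stream can switch allocations from frame to frame:
stream = [("frame0", "uniform"), ("frame1", "front_weighted")]
rendered = [render_frame(f, a) for f, a in stream]
```

The key property, per the abstract, is that switching allocations changes only which map is applied, not the size of the transmitted image.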
09 - Scientific and electric apparatus and instruments
Goods & Services
Digital virtual reality and stereoscopic cameras; digital virtual reality and stereoscopic camera systems comprised of, camera lenses, image processors, video, audio and data transmission and communication modules; digital virtual reality and stereoscopic camera systems, all for use in the creation, storage, delivery, manipulation, recording, playback or viewing of photographs, video, or images; camera lenses for digital virtual reality and stereoscopic cameras; computer programs for the capture, compression, decompression, editing and production of still and motion images for digital cameras; virtual reality computer hardware.
09 - Scientific and electric apparatus and instruments
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer software for the editing and manipulation of video images; Computer software for use with encoding of digital files, including audio, video, text, binary, still images, graphics and multimedia files; Computer software, namely, software for use in converting, encoding, processing and translating audio, video, text, binary, still images, graphics and multimedia files into 3D formats and panoramic 3D formats for films, videos, digital media, and multimedia entertainment content; Computer software for use in converting, encoding, processing and translating audio, video, text, binary, still images, graphics and multimedia files into panoramic 3D formats for films, videos, digital media, and multimedia entertainment content; Computer software programs for the integration of text, audio, graphics, still image and moving pictures into an interactive delivery for multimedia applications; Computer software, namely, software for use in converting, encoding, processing and translating 3D digital media content for use with virtual reality headsets, helmets, and viewing environments; Computer software, namely, virtual reality software for 3D films and panoramic 3D videos, digital media, and multimedia entertainment content; Downloadable mobile application software for indexing, sorting, reviewing, and selection of 3D and panoramic 3D films, videos, digital media, and multimedia entertainment content; Downloadable mobile application software for use in distribution of 3D and panoramic 3D films, videos, digital media, and multimedia entertainment content; Downloadable films and videos featuring 3D and 360 degree viewing in the field of general interest, entertainment, sports and original content. 
Film and video production; Multimedia entertainment services in the nature of development, recording, production, post-production and distribution of 2D, 3D, panoramic 2D and panoramic 3D film and video production in the field of general interest, entertainment, sports and original content; Providing on-line information relating to distribution of multimedia entertainment content and digital media in the nature of 2D, 3D, panoramic 2D and panoramic 3D films and videos; Entertainment services, namely, provision of an immersive 3D virtual reality experience in the nature of nondownloadable films and videos in the field of general interest, entertainment, sports and original content. Design and development of virtual reality software.
09 - Scientific and electric apparatus and instruments
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
Goods & Services
Digital virtual reality and stereoscopic cameras; digital virtual reality and stereoscopic camera systems comprised of camera lenses, image processors, video, audio and data transmission and communication modules; digital virtual reality and stereoscopic camera systems, all for use in the creation, storage, delivery, manipulation, recording, playback or viewing of photographs, video, or images; camera lenses for digital virtual reality and stereoscopic cameras; computer programs for the capture, compression, decompression, editing and production of still and motion images for digital cameras; Virtual reality computer hardware; Virtual reality software for live-action video delivery; Computer software for the editing and manipulation of video images; Computer software for use with encoding of digital files, including audio, video, text, binary, still images, graphics and multimedia files; Computer software, namely, software for use in converting, encoding, processing and translating audio, video, text, binary, still images, graphics and multimedia files into 3D formats and panoramic 3D formats for films, videos, digital media, and multimedia entertainment content; Computer software for use in converting, encoding, processing and translating audio, video, text, binary, still images, graphics and multimedia files into panoramic 3D formats for films, videos, digital media, and multimedia entertainment content; Computer software programs for the integration of text, audio, graphics, still image and moving pictures into an interactive delivery for multimedia applications; Computer software, namely, software for use in converting, encoding, processing and translating 3D digital media content for use with virtual reality headsets, helmets, and viewing environments; Computer software, namely, virtual reality software for 3D films and panoramic 3D videos, digital media, and multimedia entertainment content; Downloadable mobile application software for indexing, sorting, 
reviewing, and selection of 3D and panoramic 3D films, videos, digital media, and multimedia entertainment content; Downloadable mobile application software for use in distribution of 3D and panoramic 3D films, videos, digital media, and multimedia entertainment content; Downloadable films and videos featuring 3D and 360 degree viewing in the field of general interest, entertainment, sports and original content. Film and video production; Multimedia entertainment services in the nature of development, recording, production, post-production and distribution of 2D, 3D, panoramic 2D and panoramic 3D film and video production in the field of general interest, entertainment, sports and original content; Providing on-line information relating to distribution of multimedia entertainment content and digital media in the nature of 2D, 3D, panoramic 2D and panoramic 3D films and videos; Entertainment services, namely, provision of an immersive 3D virtual reality experience in the nature of nondownloadable films and videos in the field of general interest, entertainment, sports and original content. Design and development of virtual reality software.
09 - Scientific and electric apparatus and instruments
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
Goods & Services
Digital still and motion cameras; digital cinema camera systems and accessories, sold individually or as a unit, comprised of, cameras, camera lenses, optical digital image sensors and structural fittings therefor, flash memory cards, electronic memories, hard drives for video recorders, batteries, electronic input/output modules, namely, video, audio and data transmission and communication modules; digital cinema camera systems and accessories, sold individually or as a unit, comprised of, viewfinders, video monitors and flat panel display screens, all for use in the creation, storage, delivery, manipulation, recording, playback or viewing of photographs, video, or cinema images; cinematographic projectors; photographic projectors; optical and magneto-optical disc players and recorders for audio and video data; computer programs for the capture, compression, decompression, editing and production of still and motion images for digital cameras; Virtual reality glasses; Virtual reality headsets; Computer peripheral devices; Wearable peripherals for computers, tablet computers, mobile devices and mobile telephones, namely, configurable head-mounted displays; Virtual reality computer hardware; Virtual reality software for live-action video delivery; Computer software for the editing and manipulation of video images; Computer software for use with encoding of digital files, including audio, video, text, binary, still images, graphics and multimedia files; Computer software, namely, software for use in converting, encoding, processing and translating audio, video, text, binary, still images, graphics and multimedia files into 3D formats and panoramic 3D formats for films, videos, digital media, and multimedia entertainment content; Computer software for use in converting, encoding, processing and translating audio, video, text, binary, still images, graphics and multimedia files into panoramic 3D formats for films, videos, digital media, and multimedia entertainment 
content; Computer software programs for the integration of text, audio, graphics, still image and moving pictures into an interactive delivery for multimedia applications; Computer software, namely, software for use in converting, encoding, processing and translating 3D digital media content for use with virtual reality headsets, helmets, and viewing environments; Computer software, namely, virtual reality software for 3D films and panoramic 3D videos, digital media, and multimedia entertainment content; Downloadable mobile application software for indexing, sorting, reviewing, and selection of 3D and panoramic 3D films, videos, digital media, and multimedia entertainment content; Downloadable mobile application software for use in distribution of 3D and panoramic 3D films, videos, digital media, and multimedia entertainment content; Downloadable films and videos featuring 3D and 360 degree viewing in the field of general interest, entertainment, sports and original content. Film and video production; Multimedia entertainment services in the nature of development, recording, production, post-production and distribution of 2D, 3D, panoramic 2D and panoramic 3D film and video production in the field of general interest, entertainment, sports and original content; Providing on-line information relating to distribution of multimedia entertainment content and digital media in the nature of 2D, 3D, panoramic 2D and panoramic 3D films and videos; Entertainment services, namely, provision of an immersive 3D virtual reality experience in the nature of nondownloadable films and videos in the field of general interest, entertainment, sports and original content. Design and development of virtual reality software.
Methods and apparatus for capturing and displaying stereoscopic images in a manner that allows a user to obtain a 3D virtual reality experience, simulating that of being in a seat at a football game or other event, are described. Rear images are modified, e.g., in luminance intensity, to make them consistent with the luminance intensity of the forward images, to avoid or reduce edges or differences in luminance intensity as a user turns his or her head from viewing a main image area to a side or rear image area. A seamless 3D presentation is made possible through the use of fisheye lenses at capture time and the combining of images corresponding to forward and rear image areas as a user turns his or her head, requiring a change in the captured image area which is displayed to the user.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
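The luminance adjustment described above can be sketched as matching the rear image's mean luminance to the forward image's. A single global gain is a simplification; the abstract only says rear images are "modified, e.g., in luminance intensity":

```python
# Sketch: matching rear-image luminance to the forward image so the
# seam is less visible as the viewer turns their head. Pixel values
# are flat luma samples; a real pipeline would work per-channel.

def mean_luma(pixels):
    return sum(pixels) / len(pixels)

def match_luminance(rear, forward, max_value=255):
    """Scale rear-image luma so its mean equals the forward image's."""
    gain = mean_luma(forward) / mean_luma(rear)
    return [min(max_value, round(p * gain)) for p in rear]

forward = [100, 120, 140]   # forward-camera luma samples
rear = [50, 60, 70]         # darker rear-camera samples
matched = match_luminance(rear, forward)
```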
57.
3D video encoding and decoding methods and apparatus
Methods and apparatus relating to encoding and decoding stereoscopic (3D) image data, e.g., left and right eye images, are described. Various pre-encoding and post-decoding operations are described in conjunction with difference based encoding and decoding techniques. In some embodiments left and right eye image data is subject to scaling, transform operation(s) and cropping prior to encoding. In addition, in some embodiments decoded left and right eye image data is subject to scaling, transform operation(s) and filling operations prior to being output to a display device. Transform information and/or scaling information may be included in a bitstream communicating encoded left and right eye images. The amount of scaling can be the same for an entire scene and/or program.
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/85 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
H04N 19/51 - Motion estimation or motion compensation
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
58.
Methods and apparatus for detecting objects in proximity to a viewer and presenting visual representations of objects in a simulated environment
Methods and apparatus for determining the location of objects surrounding a user of a 3D rendering and display system and indicating the objects to the user while the user views a simulated environment, e.g., on a head-mounted display, are described. A sensor, e.g., a camera, captures images of or senses the physical environment where the user of the system is located. One or more objects in the physical environment are identified, e.g., by recognizing predetermined symbols on the objects and based on stored information indicating a mapping between different symbols and objects. The location of the objects relative to the user's location in the physical environment is determined. A simulated environment, including content corresponding to a scene and visual representations of the one or more objects, is displayed. In some embodiments visual representations are displayed in the simulated environment at locations determined based on the location of the objects relative to the user.
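The symbol-to-object mapping and relative-location steps above can be sketched as a lookup table plus a distance/bearing computation. The symbol names, object types and coordinates are hypothetical:

```python
# Sketch: mapping recognized symbols to object types and positioning
# their visual representations relative to the user. All identifiers
# and coordinates are illustrative placeholders.
import math

SYMBOL_TO_OBJECT = {
    "SYM_CUP": "beverage",
    "SYM_CTRL": "game controller",
}

def relative_position(user_xy, object_xy):
    """Distance (metres) and bearing (degrees) of an object from the user."""
    dx = object_xy[0] - user_xy[0]
    dy = object_xy[1] - user_xy[1]
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))

detected = [("SYM_CUP", (1.0, 0.0))]   # recognized symbol and its position
user = (0.0, 0.0)
overlays = [
    (SYMBOL_TO_OBJECT[sym], relative_position(user, pos))
    for sym, pos in detected
]
```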
A head mounted virtual reality (VR) device including an inertial measurement unit (IMU) is located in a vehicle which may be, and sometimes is, moving. Detected motion attributable to vehicle motion is filtered out based on one or more or all of: vehicle type information, information derived from sensors located in the vehicle external to the head mounted VR device, and/or captured images including a reference point or reference object within the vehicle. An image portion of a simulated VR environment is selected and presented to the user of the head mounted VR device based on the filtered motion information. Thus, the image portion presented to the user of the head mounted VR device is substantially unaffected by vehicle motion and corresponds to user induced head motion.
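The filtering described above can be sketched as subtracting the vehicle's rotation rate, obtained from an external in-vehicle sensor, from the headset IMU's rate so that only user-induced head motion remains. Axis handling and sensor fusion are heavily simplified:

```python
# Sketch: removing vehicle-induced rotation from headset IMU readings
# using an external in-vehicle sensor, so only user head motion drives
# view selection. Single-axis yaw rates in deg/s for illustration.

def filter_head_motion(headset_yaw_rate, vehicle_yaw_rate):
    """The headset IMU senses head + vehicle motion; subtract the vehicle's."""
    return headset_yaw_rate - vehicle_yaw_rate

# The vehicle turns while the head is still, then the user turns their head.
headset = [5.0, 5.0, 15.0]
vehicle = [5.0, 5.0, 5.0]
user_head = [filter_head_motion(h, v) for h, v in zip(headset, vehicle)]
```

With the vehicle's contribution removed, the displayed image portion tracks only the last sample's genuine head turn.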
Methods and apparatus for capturing and displaying stereoscopic images in a manner that allows a user to obtain a 3D virtual reality experience, simulating that of being in a seat at a football game or other event, are described. Rear images are modified, e.g., in luminance intensity, to make them consistent with the luminance intensity of the forward images, to avoid or reduce edges or differences in luminance intensity as a user turns his or her head from viewing a main image area to a side or rear image area. A seamless 3D presentation is made possible through the use of fisheye lenses at capture time and the combining of images corresponding to forward and rear image areas as a user turns his or her head, requiring a change in the captured image area which is displayed to the user.
Camera and/or lens calibration information is generated as part of a calibration process in video systems including 3-dimensional (3D) immersive content systems. The calibration information can be used to correct for distortions associated with the source camera and/or lens. A calibration profile can include information sufficient to allow the system to correct for camera and/or lens distortion/variation. This can be accomplished by capturing a calibration image of a physical 3D object corresponding to the simulated 3D environment, and creating the calibration profile by processing the calibration image. The calibration profile can then be used to project the source content directly into the 3D viewing space while also accounting for distortion/variation, and without first translating into an intermediate space (e.g., a rectilinear space) to account for lens distortion.
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
H04N 21/485 - End-user interface for client configuration
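The calibration profile described above can be illustrated with a one-coefficient radial distortion model: the profile, derived from processing a calibration image, supplies the parameters used to correct a pixel before it is projected into the 3D viewing space. The model and its coefficient are illustrative, not taken from the source:

```python
# Sketch: applying a calibration profile to undo simple radial lens
# distortion before projecting a point into the 3D viewing space.
# The single-coefficient model and its value are placeholders.

def undistort(point, profile):
    """Approximately invert r' = r * (1 + k1 * r^2) for small k1."""
    x, y = point
    k1 = profile["k1"]
    r2 = x * x + y * y
    scale = 1.0 / (1.0 + k1 * r2)
    return x * scale, y * scale

profile = {"k1": 0.1}              # from processing a calibration image
corrected = undistort((1.0, 0.0), profile)
```

Correcting in a single pass like this is consistent with the abstract's claim of projecting source content directly into the 3D space without an intermediate rectilinear representation.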
Methods and apparatus for collecting user feedback information from viewers of content are described. Feedback information is received from viewers of content. The feedback indicates, based on head tracking information in some embodiments, where users are looking in a simulated environment during different times of a content presentation, e.g., different frame times. The feedback information is used to prioritize different portions of an environment represented by the captured image content. Resolution allocation is performed based on the feedback information and the content is re-encoded based on the resolution allocation. The resolution allocation may, and normally does, change as the priority of different portions of the environment changes.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed or the storage space available from the internal hard disk
H04N 21/4728 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content for selecting a ROI [Region Of Interest], e.g. for requesting a higher resolution version of a selected region
H04N 21/6587 - Control parameters, e.g. trick play commands or viewpoint selection
H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
G06T 3/40 - Scaling of a whole image or part thereof
H04N 19/37 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
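The feedback-driven allocation above can be sketched as dividing a fixed pixel budget among environment regions in proportion to how often viewers looked at each region. Region names and the budget are placeholders:

```python
# Sketch: turning aggregated head-tracking feedback into a resolution
# allocation. Regions that drew more views get more of a fixed pixel
# budget; the allocation changes as viewing priorities change.

def allocate_resolution(view_counts, pixel_budget):
    """Give each region a budget share proportional to its view count."""
    total = sum(view_counts.values())
    return {
        region: round(pixel_budget * count / total)
        for region, count in view_counts.items()
    }

# Viewers mostly looked at the front of the environment this segment.
feedback = {"front": 800, "left": 100, "right": 100}
allocation = allocate_resolution(feedback, pixel_budget=1_000_000)
```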
63.
3D video encoding and decoding methods and apparatus
Methods and apparatus relating to encoding and decoding stereoscopic (3D) image data, e.g., left and right eye images, are described. Various pre-encoding and post-decoding operations are described in conjunction with difference based encoding and decoding techniques. In some embodiments left and right eye image data is subject to scaling, transform operation(s) and cropping prior to encoding. In addition, in some embodiments decoded left and right eye image data is subject to scaling, transform operation(s) and filling operations prior to being output to a display device. Transform information and/or scaling information may be included in a bitstream communicating encoded left and right eye images. The amount of scaling can be the same for an entire scene and/or program.
A camera rig including one or more stereoscopic camera pairs and/or one or more light field cameras is described. Images from the light field cameras and the stereoscopic camera pairs are captured at the same time. The light field images are used to generate an environmental depth map which accurately reflects the environment in which the stereoscopic images are captured at the time of image capture. In addition to providing depth information, images captured by the light field camera or cameras are combined with or used in place of stereoscopic image data to allow viewing and/or display of portions of a scene not captured by a stereoscopic camera pair.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
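The fallback described above, using light field imagery where the stereoscopic pair has no coverage, can be sketched with a per-pixel coverage mask. The flat pixel lists and mask are illustrative simplifications:

```python
# Sketch: combining stereoscopic and light-field captures. Stereo
# pixels are preferred; pixels the stereo pair did not capture are
# filled from the simultaneously captured light-field image.

def combine(stereo, light_field, covered):
    """Prefer stereo pixels; fall back to the light-field capture."""
    return [s if c else lf for s, lf, c in zip(stereo, light_field, covered)]

stereo      = [10, 20, 0, 0]        # last two pixels not captured
light_field = [11, 21, 31, 41]
covered     = [True, True, False, False]
merged = combine(stereo, light_field, covered)
```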
Camera and/or lens calibration information is generated as part of a calibration process in video systems including 3-dimensional (3D) immersive content systems. The calibration information can be used to correct for distortions associated with the source camera and/or lens. A calibration profile can include information sufficient to allow the system to correct for camera and/or lens distortion/variation. This can be accomplished by capturing a calibration image of a physical 3D object corresponding to the simulated 3D environment, and creating the calibration profile by processing the calibration image. The calibration profile can then be used to project the source content directly into the 3D viewing space while also accounting for distortion/variation, and without first translating into an intermediate space (e.g., a rectilinear space) to account for lens distortion.
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
H04N 21/485 - End-user interface for client configuration
Methods and apparatus for receiving content including images of surfaces of an environment visible from a default viewing position and images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are received in content streams that can be in a variety of stream formats. In one stream format, non-occluded image content is packed into a frame with occluded image content, with the occluded image content normally occupying a small portion of the frame. In other embodiments, occluded image portions are received in an auxiliary data stream which is multiplexed with a data stream providing frames of non-occluded image content. UV maps, which are used to map received image content to segments of an environmental model, are also supplied, with the UV maps corresponding to the format of the frames which are used to provide the images that serve as textures.
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
H04N 13/111 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/189 - Recording image signals; Reproducing recorded image signals
H04N 13/204 - Image signal generators using stereoscopic image cameras
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
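The packed-frame stream format described above can be sketched as stacking a small strip of occluded patches under the non-occluded image and generating a UV map that matches that layout. Row counts and contents are illustrative:

```python
# Sketch of the "packed frame" format: non-occluded content fills most
# of the frame, occluded patches occupy a small strip, and the UV map
# is written to match the layout so the playback device can separate
# the two regions. Frames are lists of pixel rows for illustration.

def pack_frame(non_occluded_rows, occluded_rows):
    """Stack occluded rows under the non-occluded image in one frame."""
    return non_occluded_rows + occluded_rows

def make_uv_map(n_main_rows, n_occ_rows):
    """Map each region's rows to normalized v coordinates in the frame."""
    total = n_main_rows + n_occ_rows
    return {
        "main": [r / total for r in range(n_main_rows)],
        "occluded": [(n_main_rows + r) / total for r in range(n_occ_rows)],
    }

frame = pack_frame([[1, 1], [2, 2], [3, 3]], [[9, 9]])  # 3 main rows + 1 occluded
uv = make_uv_map(3, 1)
```

Because the UV map is supplied alongside the frames, the same playback path handles both this packed layout and the separate auxiliary-stream format.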
67.
Methods and apparatus for supporting content generation, transmission and/or playback
Methods and apparatus for supporting the capture of images of surfaces of an environment visible from a default viewing position and the capture of images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are packed into one or more frames and communicated to a playback device for use as textures which can be applied to a model of the environment where the images were captured. An environmental model includes a model of surfaces which are occluded from view from the default viewing position but which may be viewed if the user shifts his or her viewing location. Occluded image content can be incorporated directly into a frame that also includes non-occluded image data or sent in frames of a separate, e.g., auxiliary, content stream that is multiplexed with the main content stream which communicates image data corresponding to non-occluded environmental portions.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
68.
Methods and apparatus for supporting content generation, transmission and/or playback
Methods and apparatus for supporting the capture of images of surfaces of an environment visible from a default viewing position and the capture of images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are packed into one or more frames and communicated to a playback device for use as textures which can be applied to a model of the environment where the images were captured. An environmental model includes a model of surfaces which are occluded from view from the default viewing position but which may be viewed if the user shifts his or her viewing location. Occluded image content can be incorporated directly into a frame that also includes non-occluded image data or sent in frames of a separate, e.g., auxiliary, content stream that is multiplexed with the main content stream which communicates image data corresponding to non-occluded environmental portions.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
69.
Methods and apparatus for requesting, receiving and/or playing back content corresponding to an environment
Methods and apparatus for receiving content including images of surfaces of an environment visible from a default viewing position and images of surfaces not visible from the default viewing position, e.g., occluded surfaces, are described. Occluded and non-occluded image portions are received in content streams that can be in a variety of stream formats. In one stream format non-occluded image content is packed into a frame with occluded image content with the occluded image content normally occupying a small portion of the frame. In other embodiments occluded image portions are received in an auxiliary data stream which is multiplexed with a data stream providing frames of non-occluded image content. UV maps which are used to map received image content to segments of an environmental model are also supplied with the UV maps corresponding to the format of the frames which are used to provide the images that serve as textures.
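The abstract above ties each UV map to the format of the frames it describes. A minimal sketch of that lookup, with segment names, formats and coordinates invented for illustration:

```python
# Illustrative sketch: a UV map matching the frame packing format tells the
# renderer which sub-rectangle of a received frame textures each model
# segment. All names and coordinates here are assumptions.

# Normalized (u0, v0, u1, v1) rectangles per model segment, one table per
# supported frame format.
UV_MAPS = {
    "packed":    {"wall": (0.0, 0.0, 1.0, 0.8), "occluded": (0.0, 0.8, 0.25, 1.0)},
    "auxiliary": {"wall": (0.0, 0.0, 1.0, 1.0)},  # occluded parts arrive separately
}

def texture_rect(frame_format, segment):
    """Look up the frame region serving as the texture for a model segment."""
    return UV_MAPS[frame_format][segment]

rect = texture_rect("packed", "occluded")
```

Supplying a UV map per format lets the same renderer consume either the single-frame packing or the auxiliary-stream variant.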
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
Methods and apparatus for using selective resolution reduction on images to be transmitted and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating a UV map corresponding to the selected resolution allocation that should be used by the playback device for rendering the communicated image. By changing the resolution allocation used and which UV map is used by the playback device, different resolution allocations can be made with respect to different portions of the environment while allowing the number of pixels in transmitted images to remain constant. The playback device renders the individual images with the UV map corresponding to the resolution allocation used to generate the individual images.
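The selection-and-signal flow above can be sketched as follows. The allocation names, the motion threshold, the downsample factors and the UV map identifiers are all invented for illustration; only the overall pattern (pick an allocation, reduce, signal the matching UV map) comes from the description.

```python
# Hedged sketch of selective resolution reduction: choose a resolution
# allocation from a motion measurement, reduce the static region, and signal
# the UV map ID the playback device should render with.

ALLOCATIONS = {
    # name: (downsample factor for static regions, UV map id to signal)
    "favor_action_area": (2, "uv_map_action"),
    "uniform":           (1, "uv_map_uniform"),
}

def select_allocation(motion_level):
    """High motion -> keep the action area sharp, reduce elsewhere.
    The 0.5 threshold is an arbitrary illustrative value."""
    return "favor_action_area" if motion_level > 0.5 else "uniform"

def reduce_rows(rows, factor):
    """Simple resolution reduction: keep every `factor`-th sample."""
    return [row[::factor] for row in rows[::factor]]

name = select_allocation(0.8)
factor, uv_map_id = ALLOCATIONS[name]
static_region = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
reduced = reduce_rows(static_region, factor)
# `reduced` is sent together with `uv_map_id` so the playback device renders
# it with the UV map matching this allocation.
```

Because the UV map travels (by reference) with the image, the allocation can change frame to frame while the transmitted pixel count stays constant.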
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
Methods and apparatus for making environmental measurements are described. In some embodiments different devices are used to capture environmental information at different times, rates and/or resolutions. Environmental information, e.g., depth information, from multiple sources captured using a variety of devices is processed and combined. Some environmental information is captured during an event. Such information is combined, in some embodiments, with environmental information that was captured prior to the event. An environmental depth model is generated in some embodiments by combining, e.g., reconciling, depth information from at least two different sources including: i) depth information obtained from a static map, ii) depth information obtained from images captured by light field cameras, and iii) depth information obtained from images captured by stereoscopic camera pairs. The reconciliation process may involve a variety of information weighting operations taking into consideration the advantages of different depth information sources and the availability of such information.
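One simple form of the weighted reconciliation described above is a per-point weighted average that drops unavailable sources. The weights below are assumed values chosen for illustration, not taken from the source.

```python
# Illustrative weighted reconciliation of depth estimates from the three
# sources named above: static map, light field cameras, stereo camera pairs.

def reconcile_depth(static_map_d, light_field_d, stereo_d,
                    weights=(0.2, 0.5, 0.3)):
    """Weighted average of per-point depth estimates. A source reported as
    None (unavailable) is dropped and the remaining weights renormalized."""
    pairs = [(d, w)
             for d, w in zip((static_map_d, light_field_d, stereo_d), weights)
             if d is not None]
    total_w = sum(w for _, w in pairs)
    return sum(d * w for d, w in pairs) / total_w

# All three sources available for this point:
d_all = reconcile_depth(10.0, 10.4, 10.2)
# Light field depth unavailable here; weights renormalize over the rest:
d_partial = reconcile_depth(10.0, None, 10.2)
```

Renormalizing on missing sources reflects the abstract's point that availability, not just accuracy, drives the weighting.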
Methods and apparatus for using selective resolution reduction on images to be transmitted and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating a UV map corresponding to the selected resolution allocation that should be used by the playback device for rendering the communicated image. By changing the resolution allocation used and which UV map is used by the playback device, different resolution allocations can be made with respect to different portions of the environment while allowing the number of pixels in transmitted images to remain constant. The playback device renders the individual images with the UV map corresponding to the resolution allocation used to generate the individual images.
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
73.
Methods and apparatus for receiving and/or using reduced resolution images
Methods and apparatus for using selective resolution reduction on images to be transmitted and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating a UV map corresponding to the selected resolution allocation that should be used by the playback device for rendering the communicated image. By changing the resolution allocation used and which UV map is used by the playback device, different resolution allocations can be made with respect to different portions of the environment while allowing the number of pixels in transmitted images to remain constant. The playback device renders the individual images with the UV map corresponding to the resolution allocation used to generate the individual images.
H04N 19/136 - Incoming video signal characteristics or properties
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
74.
Methods and apparatus for mapping at least one received image to a surface of a model in a manner that efficiently uses the image content as a texture
Methods and apparatus for using selective resolution reduction on images to be transmitted and/or used by a playback device are described. Prior to transmission, one or more images of an environment are captured. Based on image content, motion detection and/or user input, a resolution reduction operation is selected and performed. The reduced resolution image is communicated to a playback device along with information indicating a UV map corresponding to the selected resolution allocation that should be used by the playback device for rendering the communicated image. By changing the resolution allocation used and which UV map is used by the playback device, different resolution allocations can be made with respect to different portions of the environment while allowing the number of pixels in transmitted images to remain constant. The playback device renders the individual images with the UV map corresponding to the resolution allocation used to generate the individual images.
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/136 - Incoming video signal characteristics or properties
H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick-play
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
Methods and apparatus for allowing a user to switch viewing positions and/or perspective while viewing an environment, e.g., as part of a 3D playback/viewing experience, are described. In various embodiments images of the environment are captured using cameras placed at multiple camera positions. During viewing a user can select which camera position he/she would like to experience the environment from. While experiencing the environment from the perspective of a first camera position the user may switch from the first to a second camera position by looking at the second position. A visual indication is provided to the user to indicate that the user can select the other camera position as his/her viewing position. If a user input indicates a desired viewing position change, a switch to the alternate viewing position is made and the user is presented with images captured from the perspective of the user selected alternative viewing position.
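The gaze-driven position switch described above reduces to a small geometric test: does the user's view direction line up with another camera position, and did the user confirm the switch? The angular threshold and the coordinate setup below are assumptions for illustration.

```python
# Minimal sketch of switching viewing positions by looking at another camera
# position. Positions are (x, y) in an arbitrary ground plane; view_yaw is
# the user's head yaw in radians.

import math

def angle_to(current_pos, target_pos, view_yaw):
    """Angular difference between the view direction and the bearing toward
    another camera position, wrapped into [0, pi]."""
    dx = target_pos[0] - current_pos[0]
    dy = target_pos[1] - current_pos[1]
    bearing = math.atan2(dy, dx)
    return abs(math.atan2(math.sin(bearing - view_yaw),
                          math.cos(bearing - view_yaw)))

def maybe_switch(current, candidates, view_yaw, user_confirms,
                 threshold=math.radians(10)):
    """Return the name of the camera position to switch to, or None."""
    for name, pos in candidates.items():
        if angle_to(current, pos, view_yaw) < threshold:
            # Here the device would show the visual indication described
            # above; the switch happens only on confirming user input.
            return name if user_confirms else None
    return None

cams = {"midfield": (50.0, 0.0)}
new_pos = maybe_switch((0.0, 0.0), cams, view_yaw=0.0, user_confirms=True)
```

Looking away from every candidate, or declining to confirm, leaves the current viewing position unchanged.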
Methods and apparatus relating to encoding and decoding stereoscopic (3D) image data, e.g., left and right eye images, are described. Various pre-encoding and post-decoding operations are described in conjunction with difference based encoding and decoding techniques. In some embodiments left and right eye image data is subject to scaling, transform operation(s) and cropping prior to encoding. In addition, in some embodiments decoded left and right eye image data is subject to scaling, transform operation(s) and cropping prior to being output to a display device. Transform information, scaling information and/or cropping information may be included in a bitstream communicating encoded left and right eye images. The amount of scaling can be the same for an entire scene and/or program.
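The key point above is that the scaling/cropping parameters travel in the bitstream so the decoder can invert them. A sketch of that round trip, with the metadata field names and the nearest-neighbour scaling invented for illustration (no real bitstream syntax is implied):

```python
# Hypothetical sketch: apply scaling and cropping before encoding, record the
# parameters, and approximately invert the scaling after decoding.

def pre_encode(rows, scale_factor, crop_cols):
    """Downscale by sample dropping, then crop, remembering the parameters."""
    scaled = [row[::scale_factor] for row in rows[::scale_factor]]
    cropped = [row[:crop_cols] for row in scaled]
    metadata = {"scale": scale_factor, "crop_cols": crop_cols}
    return cropped, metadata

def post_decode(rows, metadata):
    """Playback-side inversion: repeat samples to undo the scaling before
    output to the display (nearest-neighbour upscale)."""
    s = metadata["scale"]
    return [[v for v in row for _ in range(s)]
            for row in rows for _ in range(s)]

image = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
payload, meta = pre_encode(image, scale_factor=2, crop_cols=2)
# (The crop here removes nothing, so this particular round trip is exact.)
restored = post_decode(payload, meta)
```

In a real system the payload would pass through the encoder and decoder between these two steps; the metadata rides alongside the encoded images.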
Methods and apparatus for stereoscopic image encoding and decoding are described. Left and right eye images are encoded following an entropy reduction operation being applied to one of the eye images when there is a difference between the left and right images of an image pair. Information about regions of negative parallax within the entropy reduced image of an image pair is encoded along with the images. Upon decoding a sharpening filter is applied to the image in an image pair which was subjected to the entropy reduction operation. In addition edge enhancement filtering is performed on the regions of the recovered entropy reduced image which are identified in the encoded image data as regions of negative parallax. Interleaving of left and right eye images at the input of the encoder combined with entropy reduction allows for efficient encoding, storage, and transmission of 3D images.
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
H04N 19/583 - Motion compensation with overlapping blocks
H04N 19/82 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Methods and apparatus for implementing user controlled zoom operations during a stereoscopic, e.g., 3D, presentation are described. While viewing a 3D presentation of a scene environment, a user may switch to a zoom mode allowing the user to zoom in on a particular portion of the environment being displayed. In order to maintain the effect of being physically present at the event, and also to reduce the risk of making the user sick from sudden non-real world like changes in views of the environment, the user in response to initiating a zoom mode of operation is presented with a view which is the same or similar to that which might be expected as the result of looking through a pair of binoculars. In some embodiments the restriction in view is achieved by applying masks to enlarged versions of the left and right eye views to be displayed.
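The mask-on-enlarged-view idea above can be sketched on a tiny grid. The zoom factor, mask radius and nearest-neighbour enlargement are illustrative choices, not the described implementation.

```python
# Illustrative sketch of the binocular-style zoom view: enlarge an eye image,
# then apply a circular mask so only the zoomed region remains visible.

def zoom(rows, factor):
    """Enlarge by sample repetition (nearest neighbour)."""
    return [[v for v in row for _ in range(factor)]
            for row in rows for _ in range(factor)]

def circular_mask(rows, radius):
    """Black out (zero) samples outside a circle centered on the image,
    mimicking the view through one barrel of a pair of binoculars."""
    h, w = len(rows), len(rows[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    return [[v if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2 else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(rows)]

left_eye = [[5, 5], [5, 5]]
masked = circular_mask(zoom(left_eye, 2), radius=1.5)
```

Applying the same mask to both enlarged eye views keeps the stereo pair consistent while restricting the field of view.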
A camera rig including one or more stereoscopic camera pairs and/or one or more light field cameras is described. Images captured by the light field cameras and images captured by the stereoscopic camera pairs are captured at the same time. The light field images are used to generate an environmental depth map which accurately reflects the environment in which the stereoscopic images are captured at the time of image capture. In addition to providing depth information, images captured by the light field camera or cameras are combined with or used in place of stereoscopic image data to allow viewing and/or display of portions of a scene not captured by a stereoscopic camera pair.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
80.
Stereoscopic video encoding and decoding methods and apparatus
Methods and apparatus for stereoscopic image encoding and decoding are described. Left and right eye images are encoded following an entropy reduction operation being applied to one of the eye images when there is a difference between the left and right images of an image pair. Information about regions of negative parallax within the entropy reduced image of an image pair is encoded along with the images. Upon decoding a sharpening filter is applied to the image in an image pair which was subjected to the entropy reduction operation. In addition edge enhancement filtering is performed on the regions of the recovered entropy reduced image which are identified in the encoded image data as regions of negative parallax. Interleaving of left and right eye images at the input of the encoder combined with entropy reduction allows for efficient encoding, storage, and transmission of 3D images.
Stereoscopic image processing methods and apparatus are described. Left and right eye images of a stereoscopic frame are examined to determine if the luminance and/or chrominance differences between the left and right frames are within a range used to trigger one or more difference reduction operations designed to reduce those differences. A difference reduction operation may involve assigning portions of the left and right frames to different depth regions and/or other region categories. A decision on whether or not to perform a difference reduction operation is then made on a per-region basis, with the difference between the left and right eye portions of at least one region being reduced when a difference reduction operation is to be performed. The difference reduction process may be, and in some embodiments is, performed in a precoder which processes left and right eye images of stereoscopic frames prior to stereoscopic encoding.
Methods and apparatus for streaming or playing back stereoscopic content are described. Camera dependent correction information is communicated to a playback device and applied in the playback device to compensate for distortions introduced by the lenses of individual cameras. By performing lens dependent distortion compensation in the playback device, edges which might be lost if correction were performed prior to encoding are preserved. Distortion correction information may be in the form of UV map correction information. The correction information may indicate changes to be made to information in a UV map, e.g., at rendering time, to compensate for distortions specific to an individual camera. Different sets of correction information may be communicated and used for different cameras of a stereoscopic pair which provide images that are rendered using the same UV map. The communicated correction information is sometimes called a correction mesh since it is used to correct mesh related information.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
Methods and apparatus for streaming or playing back stereoscopic content are described. Camera dependent correction information is communicated to a playback device and applied in the playback device to compensate for distortions introduced by the lenses of individual cameras. By performing lens dependent distortion compensation in the playback device, edges which might be lost if correction were performed prior to encoding are preserved. Distortion correction information may be in the form of UV map correction information. The correction information may indicate changes to be made to information in a UV map, e.g., at rendering time, to compensate for distortions specific to an individual camera. Different sets of correction information may be communicated and used for different cameras of a stereoscopic pair which provide images that are rendered using the same UV map. The communicated correction information is sometimes called a correction mesh since it is used to correct mesh related information.
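Applying a correction mesh at render time amounts to adding small per-vertex offsets to the shared UV map's coordinates, one offset set per camera. The coordinate values below are invented for illustration.

```python
# Illustrative sketch: per-camera UV correction information is applied to a
# UV map shared by both cameras of a stereoscopic pair, compensating for the
# distortion of each individual lens.

def apply_correction_mesh(uv_map, correction):
    """Add per-vertex (du, dv) corrections to a shared UV map."""
    return [(u + du, v + dv)
            for (u, v), (du, dv) in zip(uv_map, correction)]

shared_uv = [(0.00, 0.00), (0.50, 0.50), (1.00, 1.00)]
# Hypothetical per-camera corrections; the right camera would have its own.
left_cam_correction = [(0.01, 0.00), (0.00, -0.02), (-0.01, 0.00)]
left_uv = apply_correction_mesh(shared_uv, left_cam_correction)
```

Because the base UV map is shared and only the small corrections differ per camera, the correction data stays compact.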
H04N 7/12 - Systems in which the television signal is transmitted via one channel or a plurality of parallel channels, the bandwidth of each channel being less than the bandwidth of the television signal
H04N 11/02 - Colour television systems with bandwidth reduction
H04N 11/04 - Colour television systems using pulse code modulation
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
H04N 13/172 - Processing image signals image signals comprising non-image signal components, e.g. headers or format information
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/189 - Recording image signals; Reproducing recorded image signals
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
H04N 13/117 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
H04N 13/366 - Image reproducers using viewer tracking
H04N 13/139 - Format conversion, e.g. of frame-rate or size
H04N 13/398 - Synchronisation thereof; Control thereof
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
Content delivery and playback methods and apparatus are described. The methods and apparatus are well suited for delivery and playback of content corresponding to a 360 degree environment and can be used to support streaming and/or real time delivery of 3D content corresponding to an event, e.g., while the event is ongoing or after the event is over. Portions of the environment are captured by cameras located at different positions. The content captured from different locations is encoded and made available for delivery. A playback device selects the content to be received based on a user's head position.
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/134 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
H04N 19/37 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
H04N 19/146 - Data rate or code amount at the encoder output
H04N 19/39 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability involving multiple description coding [MDC], i.e. with separate layers being structured as independently decodable descriptions of input picture data
H04N 19/587 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
85.
METHODS AND APPARATUS FOR DELIVERING CONTENT AND/OR PLAYING BACK CONTENT
Content delivery and playback methods and apparatus are described. The methods and apparatus are well suited for delivery and playback of content corresponding to a 360 degree environment and can be used to support streaming and/or real time delivery of content, e.g., 3D content, corresponding to an event such as a sports game, e.g., while the event is ongoing or after the event is over. Portions of the environment are captured by cameras located at different positions. The content captured from different locations is encoded and made available for delivery. A playback device selects the content to be received based on a user's head position. Streams may be prioritized and selected for delivery based on the user's current field of view and/or direction of head rotation. Static images or synthesized images can be used and combined with content from one or more streams, e.g., for background, sky and/or ground portions.
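The prioritization described above (field of view first, head-rotation direction as a tiebreaker) can be sketched as a simple scoring sort. The portion names, yaw values and the exact scoring rule are assumptions for illustration.

```python
# Minimal sketch of prioritizing environment-portion streams by the user's
# current view center (degrees of yaw) and direction of head rotation
# (+1 = turning right, -1 = turning left).

def prioritize_streams(streams, view_center, rotation_direction):
    """Order streams by angular closeness to the view center; between
    streams, those in the direction the head is turning rank earlier."""
    def score(name):
        center = streams[name]
        distance = abs(center - view_center)
        turning_toward = (center - view_center) * rotation_direction > 0
        return (distance, 0 if turning_toward else 1)
    return sorted(streams, key=score)

# Stream name -> center of its portion of the environment, in degrees of yaw.
portions = {"front": 0, "right": 120, "left": -120}
order = prioritize_streams(portions, view_center=10, rotation_direction=+1)
```

The device would then request streams in this order, filling background, sky or ground portions from static or synthesized images as noted above.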
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 21/23 - Processing of content or additional data; Elementary server operations; Server middleware
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
H04N 21/647 - Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load or bridging bet
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
86.
Stereoscopic image processing methods and apparatus
Stereoscopic image processing methods and apparatus are described. Left and right eye images of a stereoscopic frame are examined to determine if the luminance and/or chrominance differences between the left and right frames are within a range used to trigger one or more difference reduction operations designed to reduce those differences. A difference reduction operation may involve assigning portions of the left and right frames to different depth regions and/or other region categories. A decision on whether or not to perform a difference reduction operation is then made on a per-region basis, with the difference between the left and right eye portions of at least one region being reduced when a difference reduction operation is to be performed. The difference reduction process may be, and in some embodiments is, performed in a precoder which processes left and right eye images of stereoscopic frames prior to stereoscopic encoding.
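A toy version of the per-region decision above: measure the luminance difference per region and reduce it only where the trigger condition holds. Treating each row as a region, and the threshold and averaging step, are illustrative assumptions.

```python
# Hedged sketch of per-region luminance difference reduction between left
# and right eye frames, as a precoder might apply before stereo encoding.

def reduce_region_difference(left, right, threshold=2):
    """For each region (here: each row), if the mean absolute luminance
    difference exceeds the trigger threshold, nudge both eyes to their
    mutual mean; otherwise leave the region untouched."""
    out_l, out_r = [], []
    for lrow, rrow in zip(left, right):
        diff = sum(abs(a - b) for a, b in zip(lrow, rrow)) / len(lrow)
        if diff > threshold:  # difference within the triggering range
            mean = [(a + b) // 2 for a, b in zip(lrow, rrow)]
            out_l.append(mean)
            out_r.append(mean)
        else:
            out_l.append(lrow[:])
            out_r.append(rrow[:])
    return out_l, out_r

L = [[10, 10], [20, 20]]
R = [[10, 10], [28, 28]]
new_l, new_r = reduce_region_difference(L, R)
```

Reducing inter-eye differences before encoding makes the difference-based encoding of the pair cheaper.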
Camera related methods and apparatus which are well suited for use in capturing stereoscopic image data, e.g., pairs of left and right eye images, are described. Various features relate to a camera rig which can be used to mount multiple cameras at the same time. In some embodiments the camera rig includes 3 mounting locations corresponding to 3 different directions 120 degrees apart. One or more of the mounting locations may be used at a given time. When a single camera pair is used the rig can be rotated to capture images corresponding to the locations where a camera pair is not mounted. Static images from those locations can then be combined with images corresponding to the forward direction to generate a 360 degree view. Alternatively camera pairs or individual cameras can be included in each of the mounting locations to capture video in multiple directions.
G03B 35/08 - Stereoscopic photography by simultaneous recording
G03B 37/04 - Panoramic or wide-screen photography; Photographing extended surfaces, e.g. for surveying; Photographing internal surfaces, e.g. of pipe with cameras or projectors providing touching or overlapping fields of view
F16M 11/08 - Means for attachment of apparatus; Means allowing adjustment of the apparatus relatively to the stand allowing pivoting around a vertical axis
F16M 11/24 - Undercarriages with or without wheels changeable in height or length of legs, also for transport only
H04N 5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects
Methods and apparatus for streaming content corresponding to a 360 degree field of view are described. The methods and apparatus of the present invention are well suited for use with 3D immersion systems and/or head mounted displays which allow a user to turn his or her head and see a corresponding scene portion. The methods and apparatus can support real or near real time streaming of 3D image content corresponding to a 360 degree field of view.
H04N 9/82 - Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only
H04N 13/00 - PICTORIAL COMMUNICATION, e.g. TELEVISION - Details thereof
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G11B 27/30 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording
Methods and apparatus for performing stereoscopic image encoding and decoding are described. Left and right eye images are generated. Image difference information is generated, e.g., a set of pixel values resulting from XORing the pixel values of the left and right eye images. One of the left and right eye images is compressed along with the difference map. The compressed image and compressed difference map are stored and/or transmitted. Stereoscopic images are generated by decompressing and using the received compressed image and compressed difference information. Prior to generation of the difference map, the left and right eye images may be subject to a transposition operation to minimize the differences between the images and thus the size of the difference map. When transposition is applied, transposition information is stored and communicated in addition to the compressed image data so that the transposition can be reversed during the stereoscopic image generation process.
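The XOR-based difference map described above is straightforward to sketch: because XOR is its own inverse, the second eye image is recovered exactly from the base image and the map. The helper names and the flat-list pixel representation are illustrative assumptions:

```python
def encode_stereo_pair(left, right):
    """Represent a stereo pair as one base image plus an XOR map.

    left / right: flat lists of 8-bit pixel values of equal length.
    Identical pixels XOR to 0, so the map is mostly zeros (and thus
    highly compressible) when the two eye views are similar.
    """
    diff = [l ^ r for l, r in zip(left, right)]
    return left, diff

def decode_stereo_pair(base, diff):
    # XOR is its own inverse: base ^ (base ^ other) == other.
    return [b ^ d for b, d in zip(base, diff)]

base, diff = encode_stereo_pair([10, 200, 35, 90], [12, 200, 35, 88])
recovered_right = decode_stereo_pair(base, diff)
```

A transposition (shift) applied before XORing, as the abstract describes, would further zero out the map for horizontally displaced stereo content, at the cost of signaling the shift so the decoder can reverse it.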
In some embodiments left and right eye images are encoded after an entropy reduction operation is applied to one of the eye images when there is a difference between the left and right images of an image pair. Information about regions of negative parallax within the entropy reduced image of an image pair is encoded along with the images. Upon decoding, a sharpening filter is applied to the image in an image pair which was subjected to the entropy reduction operation. In addition, edge enhancement filtering is performed on the regions of the recovered entropy reduced image which are identified in the encoded image data as regions of negative parallax. Interleaving of left and right eye images at the input of the encoder combined with entropy reduction allows for efficient encoding, storage, and transmission of 3D images.
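The decode-side filtering described above might look like the following sketch; the simple 1-D unsharp mask and the span-based region representation are placeholder assumptions standing in for the patent's actual sharpening and edge enhancement filters:

```python
def sharpen_1d(signal, amount=1.0):
    """Simple 1-D unsharp mask: boost each interior sample by its
    deviation from the local three-sample mean."""
    out = list(signal)
    for i in range(1, len(signal) - 1):
        local_mean = (signal[i - 1] + signal[i] + signal[i + 1]) / 3
        out[i] = signal[i] + amount * (signal[i] - local_mean)
    return out

def decode_entropy_reduced(row, negative_parallax_spans):
    """Sharpen a recovered (entropy-reduced) image row, then apply an
    extra enhancement pass over spans flagged in the encoded data as
    regions of negative parallax."""
    row = sharpen_1d(row)
    for start, end in negative_parallax_spans:
        row[start:end] = sharpen_1d(row[start:end])
    return row
```

The second pass over flagged spans mirrors the abstract's extra edge enhancement for negative-parallax regions, where softening is most visible to the viewer.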