A method for performing gain control on audio signals is provided. In some implementations, the method involves determining downmixed signals associated with one or more downmix channels associated with a current frame of an audio signal to be encoded. In some implementations, the method involves determining whether an overload condition exists for an encoder. In some implementations, the method involves determining a gain parameter. In some implementations, the method involves determining at least one gain transition function based on the gain parameter and a gain parameter associated with a preceding frame of the audio signal. In some implementations, the method involves applying the at least one gain transition function to one or more of the downmixed signals. In some implementations, the method involves encoding the downmixed signals in connection with information indicative of gain control applied to the current frame.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
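As an editorial illustration of the gain transition described in the entry above, the following is a minimal Python sketch. It assumes a linear ramp between the previous and current gain parameters; the transition shape, frame length, and the gain chosen on overload are hypothetical, not taken from the abstract.

```python
import numpy as np

def apply_gain_transition(frame: np.ndarray, prev_gain: float, cur_gain: float) -> np.ndarray:
    """Smoothly move from prev_gain to cur_gain across one frame of samples."""
    n = len(frame)
    # Linear transition function; a raised-cosine ramp would be equally plausible.
    ramp = np.linspace(prev_gain, cur_gain, n)
    return frame * ramp

# Example: attenuate a frame whose downmixed samples would overload the encoder.
frame = np.ones(960) * 1.4            # downmixed samples exceeding full scale
prev_gain, cur_gain = 1.0, 0.7        # hypothetical gain parameter avoiding overload
out = apply_gain_transition(frame, prev_gain, cur_gain)
assert out[-1] <= 1.0                 # end of frame is safely below full scale
```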
Disclosed is an audio signal encoding/decoding method that uses an encoding downmix strategy applied at an encoder that is different from a decoding re-mix/upmix strategy applied at a decoder. Based on the type of downmix coding scheme, the method comprises: computing input downmixing gains to be applied to the input audio signal to construct a primary downmix channel; determining downmix scaling gains to scale the primary downmix channel; generating prediction gains based on the input audio signal, the input downmixing gains and the downmix scaling gains; determining residual channel(s) from the side channels by using the primary downmix channel and the prediction gains to generate side channel predictions and subtracting the side channel predictions from the side channels; determining decorrelation gains based on energy in the residual channels; encoding the primary downmix channel, the residual channel(s), the prediction gains and the decorrelation gains; and sending the bitstream to a decoder.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 5/00 - Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
G10L 19/24 - Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
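The prediction-gain and residual steps in the entry above admit a compact least-squares illustration. A minimal sketch, assuming a two-channel case with 0.5/0.5 input downmixing gains and a projection-based prediction gain; the abstract does not specify how its gains are actually computed.

```python
import numpy as np

def predict_and_residual(primary: np.ndarray, side: np.ndarray):
    """Predict a side channel from the primary downmix; return (gain, residual)."""
    # Least-squares prediction gain: projection of the side channel onto the primary.
    g = np.dot(primary, side) / (np.dot(primary, primary) + 1e-12)
    residual = side - g * primary      # the part the decoder cannot predict
    return g, residual

left = np.random.randn(1024)
right = 0.8 * left + 0.1 * np.random.randn(1024)
primary = 0.5 * (left + right)         # input downmixing gains of 0.5 each
side = 0.5 * (left - right)
g, res = predict_and_residual(primary, side)
print(g, np.var(res) / np.var(side))   # residual energy is a small fraction
```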
A method comprising receiving a first input bit stream for a first parametrically coded input audio signal, the first input bit stream including data representing a first input core audio signal and a first set including at least one spatial parameter relating to the first parametrically coded input audio signal. A first covariance matrix of the first parametrically coded audio signal is determined based on the spatial parameter(s) of the first set. A modified set including at least one spatial parameter is determined based on the determined first covariance matrix, wherein the modified set is different from the first set. An output core audio signal is determined, which is based on, or constituted by, the first input core audio signal. An output bit stream for a parametrically coded output audio signal is generated, the output bit stream including data representing the output core audio signal and the modified set.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
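The covariance determination in the entry above can be sketched for the simplest parametric case. The sketch assumes a two-channel signal described by per-channel powers and an inter-channel correlation (ICC) parameter — a standard parametrization, though the abstract does not name its exact spatial parameters.

```python
import numpy as np

def covariance_from_params(power_1: float, power_2: float, icc: float) -> np.ndarray:
    """Reconstruct a 2x2 channel covariance from channel powers and ICC."""
    cross = icc * np.sqrt(power_1 * power_2)   # off-diagonal from normalized correlation
    return np.array([[power_1, cross],
                     [cross,   power_2]])

C = covariance_from_params(1.0, 0.5, icc=0.6)
# A modified parameter set could then be re-derived from C (e.g. after a
# downstream mixing operation) before generating the output bit stream.
print(C)
```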
A method for distributing High Dynamic Range (HDR) content to playback devices for displaying images, in which the HDR content is encoded to an HDR bitstream that is subsequently decoded by a playback device. The HDR bitstream contains auxiliary metadata packets that are based upon the processing capability of the playback device.
H04N 1/64 - Colour picture communication systems - Details therefor, e.g. coding or decoding means therefor
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/235 - Processing of additional data, e.g. scrambling of additional data or processing content descriptors
H04N 21/63 - Control signaling between client, server and network components; Network processes for video distribution between server and clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
Methods and systems for frame rate scalability are described. Support is provided for input and output video sequences with variable frame rate and variable shutter angle across scenes, or for input video sequences with fixed input frame rate and input shutter angle, but allowing a decoder to generate a video output at a different output frame rate and shutter angle than the corresponding input values. Techniques allowing a decoder to decode more computationally-efficiently a specific backward compatible target frame rate and shutter angle among those allowed are also presented.
Methods and systems for frame rate scalability are described. Support is provided for input and output video sequences with variable frame rate and variable shutter angle across scenes, or for input video sequences with fixed input frame rate and input shutter angle, but allowing a decoder to generate a video output at a different output frame rate and shutter angle than the corresponding input values. Techniques allowing a decoder to decode more computationally-efficiently a specific backward compatible target frame rate and shutter angle among those allowed are also presented.
H04N 19/31 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
H04N 19/187 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
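The frame-rate/shutter-angle relationship underlying the two entries above follows from simple exposure-time bookkeeping rather than from the abstracts themselves. A worked sketch; the function name and example values are illustrative:

```python
def output_shutter_angle(input_angle: float, input_fps: float,
                         frames_combined: int, output_fps: float) -> float:
    """Effective shutter angle when a decoder averages `frames_combined`
    consecutive input frames to synthesize each output frame.

    Each input frame is open for input_angle / (360 * input_fps) seconds, so
    the combined open time per output frame scales by frames_combined, and
    the angle is that open time re-expressed at the output frame rate.
    """
    return input_angle * frames_combined * output_fps / input_fps

# 120 fps captured at a 360-degree shutter:
print(output_shutter_angle(360, 120, frames_combined=2, output_fps=60))  # 360.0
print(output_shutter_angle(360, 120, frames_combined=1, output_fps=60))  # 180.0
```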
7.
METHODS AND DEVICES FOR ENCODING AND/OR DECODING SPATIAL BACKGROUND NOISE WITHIN A MULTI-CHANNEL INPUT SIGNAL
The present document describes a method (600) for encoding a multi-channel input signal (101) which comprises N different channels. The method (600) comprises, for a current frame of a sequence of frames, determining (601) whether the current frame is an active frame or an inactive frame using a signal and/or voice activity detector, and determining (602) a downmix signal (103) based on the multi-channel input signal (101), wherein the downmix signal (103) comprises N channels or less. In addition, the method (600) comprises determining (603) upmixing metadata (105) comprising a set of parameters for generating, based on the downmix signal (103), a reconstructed multi-channel signal (111) comprising N channels, wherein the upmixing metadata (105) is determined in dependence on whether the current frame is an active frame or an inactive frame. The method (600) further comprises encoding (604) the upmixing metadata (105) into a bitstream.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Systems, methods, and computer program products are disclosed for adaptive downmixing of audio signals with improved continuity. An audio encoding system receives an input multi-channel audio signal including a primary input audio channel and L non-primary input audio channels. The system determines a set of L input gains. For each of the channels and gains, the system forms a respective scaled non-primary input audio channel. The system forms a primary output audio channel from the sum of the primary input audio channel and the scaled non-primary input audio channels. The system determines a set of L prediction gains. The system forms a prediction channel from the primary output audio channel. The system forms L non-primary output audio channels. The system forms an output multi-channel audio signal from the primary output audio channel and the L non-primary output audio channels.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/04 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
9.
QUANTIZATION AND ENTROPY CODING OF PARAMETERS FOR A LOW LATENCY AUDIO CODEC
Described is a method of frame-wise encoding metadata for an input signal, the metadata comprising a plurality of at least partially interrelated parameters calculable from the input signal. The method comprises, for each frame: iteratively performing, by using a looping process, steps of: determining a processing strategy from a plurality of processing strategies for calculating and quantizing the parameters; calculating and quantizing the parameters based on the determined processing strategy to obtain quantized parameters; and encoding the quantized parameters. In particular, each of the plurality of processing strategies comprises a respective first indication indicative of an ordering related to the calculation and quantization of individual parameters; and the processing strategy is determined based on at least one bitrate threshold.
G10L 19/032 - Quantisation or dequantisation of spectral components
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
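The strategy-selection loop in the entry above can be illustrated as follows. The quantization steps, the bit-size proxy standing in for the entropy coder, and the fall-back rule are all hypothetical stand-ins for the codec's actual tools.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Strategy:
    step: float                      # quantization step size; coarser = fewer bits

    def quantize(self, params: np.ndarray) -> np.ndarray:
        return np.round(params / self.step).astype(int)

    def code_size_bits(self, q: np.ndarray) -> int:
        # Crude proxy for entropy coding: ~log2 of each magnitude plus a sign bit.
        return int(np.sum(np.log2(np.abs(q) + 1) + 1))

def encode_frame(params: np.ndarray, strategies, bit_budget: int):
    """Try strategies in order (fine to coarse) until the frame fits the budget."""
    for s in strategies:
        q = s.quantize(params)
        if s.code_size_bits(q) <= bit_budget:
            return s, q
    return strategies[-1], strategies[-1].quantize(params)  # fall back to coarsest

params = np.random.randn(40)
strategy, q = encode_frame(params, [Strategy(0.05), Strategy(0.2), Strategy(0.8)],
                           bit_budget=120)
print(strategy.step)
```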
10.
METHODS, APPARATUS, AND SYSTEMS FOR DETECTION AND EXTRACTION OF SPATIALLY-IDENTIFIABLE SUBBAND AUDIO SOURCES
In an embodiment, a method comprises: transforming one or more frames of a two-channel time domain audio signal into a time-frequency domain representation including a plurality of time-frequency tiles, wherein the frequency domain of the time-frequency domain representation includes a plurality of frequency bins grouped into subbands. For each time-frequency tile, the method comprises: calculating spatial parameters and a level for the time-frequency tile; modifying the spatial parameters using shift and squeeze parameters; obtaining a softmask value for each frequency bin using the modified spatial parameters, the level and subband information; and applying the softmask values to the time-frequency tile to generate a modified time-frequency tile of an estimated audio source. In an embodiment, a plurality of frames of the time-frequency tiles are assembled into a plurality of chunks, wherein each chunk includes a plurality of subbands, and the method described above is performed on each subband of each chunk.
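A minimal sketch of the softmask step described above, assuming panning-angle spatial parameters and a Gaussian mask around a target direction; the abstract's shift/squeeze remapping and subband bookkeeping are only noted in comments.

```python
import numpy as np

def softmask_for_tile(left: np.ndarray, right: np.ndarray,
                      target_pan: float, width: float) -> np.ndarray:
    """Per-bin softmask favouring bins whose panning matches the target source.

    `left`/`right` are complex STFT bins of one time-frequency tile.
    """
    # Spatial parameter per bin: a panning angle from the channel magnitudes.
    pan = np.arctan2(np.abs(right), np.abs(left))          # 0 .. pi/2
    # Gaussian softmask around the target direction; the shift/squeeze step of
    # the method above would remap `pan` before this point.
    return np.exp(-0.5 * ((pan - target_pan) / width) ** 2)

left = np.random.randn(64) + 1j * np.random.randn(64)
right = 0.3 * left                                          # source panned left
mask = softmask_for_tile(left, right, target_pan=np.arctan2(0.3, 1.0), width=0.1)
extracted = mask * left                                     # modified tile, one channel
```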
Embodiments are disclosed for bitrate distribution in immersive voice and audio services. In an embodiment, a method of encoding an IVAS bitstream comprises: receiving an input audio signal; downmixing the input audio signal into one or more downmix channels and spatial metadata; reading a set of one or more bitrates for the downmix channels and a set of quantization levels for the spatial metadata from a bitrate distribution control table; determining a combination of the one or more bitrates for the downmix channels; determining a metadata quantization level from the set of metadata quantization levels using a bitrate distribution process; quantizing and coding the spatial metadata using the metadata quantization level; generating, using the combination of one or more bitrates, a downmix bitstream for the one or more downmix channels; and combining the downmix bitstream, the quantized and coded spatial metadata and the set of quantization levels into the IVAS bitstream.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
12.
MULTICHANNEL AUDIO ENCODE AND DECODE USING DIRECTIONAL METADATA
The disclosure relates to methods of processing a spatial audio signal for generating a compressed representation of the spatial audio signal. The methods include analyzing the spatial audio signal to determine directions of arrival for one or more audio elements; for at least one frequency subband, determining respective indications of signal power associated with the directions of arrival; generating metadata including direction information that includes indications of the directions of arrival of the audio elements, and energy information that includes respective indications of signal power; generating a channel-based audio signal with a predefined number of channels based on the spatial audio signal; and outputting, as the compressed representation, the channel-based audio signal and the metadata. The disclosure further relates to methods of processing a compressed representation of a spatial audio signal for generating a reconstructed representation of the spatial audio signal, and to corresponding apparatus, programs, and storage media.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
13.
METHODS AND SYSTEM FOR WAVEFORM CODING OF AUDIO SIGNALS WITH A GENERATIVE MODEL
Described herein is a method of waveform decoding, the method including the steps of: (a) receiving, by a waveform decoder, a bitstream including a finite bitrate representation of a source signal; (b) waveform decoding the finite bitrate representation of the source signal to obtain a waveform approximation of the source signal; (c) providing the waveform approximation of the source signal to a generative model that implements a probability density function, to obtain a probability distribution for a reconstructed signal of the source signal; and (d) generating the reconstructed signal of the source signal based on the probability distribution. Described are further a method and system for waveform coding and a method of training a generative model.
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
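The generative decoding step above can be sketched as sampling from a per-sample distribution conditioned on the waveform approximation. The Gaussian parametrization and the toy stand-in model are assumptions for the sketch; a trained network would be used in practice.

```python
import numpy as np

def generative_enhance(waveform_approx: np.ndarray, model) -> np.ndarray:
    """Sample a reconstructed signal from p(x | waveform approximation)."""
    mean, log_std = model(waveform_approx)        # model predicts a distribution
    return mean + np.exp(log_std) * np.random.randn(*mean.shape)

# Placeholder "model": keeps the decoded waveform as the mean and adds a small
# fixed uncertainty. A trained generative model replaces this in practice.
toy_model = lambda x: (x, np.full_like(x, -4.0))
decoded = np.random.randn(480) * 0.5              # waveform-decoded approximation
reconstructed = generative_enhance(decoded, toy_model)
```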
A multi-input, multi-output audio process is implemented as a linear system for use in an audio filterbank to convert a set of frequency-domain input audio signals into a set of frequency-domain output audio signals. A transfer function from one input to one output is defined as a frequency-dependent gain function. In some implementations, the transfer function includes a direct component that is substantially defined as a frequency-dependent gain, and one or more decorrelated components that have a frequency-varying group phase response. The transfer function is formed from a set of sub-band functions, with each sub-band function being formed from a set of corresponding component transfer functions including a direct component and one or more decorrelated components.
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
H04S 3/02 - Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
In some implementations, a method of encoding a low-frequency effect (LFE) channel comprises: receiving a time-domain LFE channel signal; filtering, using a low-pass filter, the time-domain LFE channel signal; converting the filtered time-domain LFE channel signal into a frequency-domain representation of the LFE channel signal that includes a number of coefficients representing a frequency spectrum of the LFE channel signal; arranging the coefficients into a number of subband groups corresponding to different frequency bands of the LFE channel signal; quantizing coefficients in each subband group according to a frequency response curve of the low-pass filter; encoding the quantized coefficients in each subband group using an entropy coder tuned for the subband group; generating a bitstream including the encoded quantized coefficients; and storing the bitstream on a storage device or streaming the bitstream to a downstream device.
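A minimal sketch of the LFE encoding pipeline described above, using SciPy for the filter and transform. The 120 Hz band edge, the four subband groups, and the rule tying quantization steps to the filter's magnitude response are illustrative assumptions; the per-group entropy-coding stage is omitted.

```python
import numpy as np
from scipy.signal import butter, sosfilt, sosfreqz
from scipy.fft import dct

def encode_lfe(lfe: np.ndarray, fs: int = 48000, n_groups: int = 4):
    """Low-pass, transform, and quantize one LFE frame in subband groups."""
    sos = butter(4, 120, fs=fs, output="sos")      # LFE band edge (typical value)
    filtered = sosfilt(sos, lfe)
    coeffs = dct(filtered, norm="ortho")           # frequency-domain representation
    groups = np.array_split(coeffs, n_groups)      # subband groups, low to high
    # Quantize more coarsely where the low-pass response has already attenuated
    # the signal: reuse the filter's magnitude response to set the step sizes.
    _, h = sosfreqz(sos, worN=n_groups, fs=fs)
    steps = 0.01 / np.maximum(np.abs(h), 1e-3)
    return [np.round(g / s).astype(int) for g, s in zip(groups, steps)]

quantized_groups = encode_lfe(np.random.randn(960))
```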
An audio processing method may involve receiving output signals from each microphone of a plurality of microphones in an audio environment, the output signals corresponding to a current utterance of a person and determining, based on the output signals, one or more aspects of context information relating to the person, including an estimated current proximity of the person to one or more microphone locations. The method may involve selecting two or more loudspeaker-equipped audio devices based, at least in part, on the one or more aspects of the context information, determining one or more types of audio processing changes to apply to audio data being rendered to loudspeaker feed signals for the audio devices and causing one or more types of audio processing changes to be applied. In some examples, the audio processing changes have the effect of increasing a speech to echo ratio at one or more microphones.
Encoding/decoding an immersive voice and audio services (IVAS) bitstream comprises: encoding/decoding a coding mode indicator in a common header (CH) section of an IVAS bitstream, encoding/decoding a mode header or tool header in the tool header (TH) section of the bitstream, the TH section following the CH section, encoding/decoding a metadata payload in a metadata payload (MDP) section of the bitstream, the MDP section following the CH section, encoding/decoding an enhanced voice services (EVS) payload in an EVS payload (EP) section of the bitstream, the EP section following the CH section, and on the encoder side, storing or streaming the encoded bitstream, and on the decoder side, controlling an audio decoder based on the coding mode, the tool header, the EVS payload, and the metadata payload or storing a representation of same.
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Disclosed are methods and systems for improving signal processing by smoothing the covariance matrix of a multi-channel signal, with the forgetting factor set based on the bins of a band. A method and system for resetting the smoothing based on transient detection is also disclosed. A method and system for resampling for the smoothing during a banding transition is also disclosed.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
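A minimal sketch of the covariance smoothing described above: a per-bin forgetting factor applied recursively, with a transient flag that resets the smoothed history. The factor values and the reset rule are illustrative.

```python
import numpy as np

def smooth_covariance(frames: np.ndarray, alphas: np.ndarray,
                      transient: np.ndarray) -> np.ndarray:
    """Recursively smooth per-bin covariance matrices of a multi-channel STFT.

    frames:    (T, F, C) complex STFT — T frames, F bins, C channels
    alphas:    (F,) per-bin forgetting factors (e.g. set per banded bin count)
    transient: (T,) flags; True resets the smoothing at that frame
    """
    T, F, C = frames.shape
    cov = np.zeros((F, C, C), dtype=complex)
    for t in range(T):
        inst = np.einsum("fc,fd->fcd", frames[t], frames[t].conj())  # x x^H per bin
        if transient[t]:
            cov = inst                      # reset: forget the smoothed history
        else:
            a = alphas[:, None, None]
            cov = a * cov + (1 - a) * inst  # leaky average with forgetting factor
    return cov

X = np.random.randn(20, 64, 2) + 1j * np.random.randn(20, 64, 2)
C = smooth_covariance(X, alphas=np.full(64, 0.9), transient=np.zeros(20, bool))
```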
A filterbank, suitable for modifying audio signals with dynamic gains in each band, is constructed so that the perceived latency is small, while a larger group delay is applied at low frequencies to enable higher frequency resolution in the lower frequency bands. The higher group delay at low frequencies is achieved by inserting an all-pass filter into the reconstructed filter response.
A projection system and method therefor comprises a first light source configured to emit a first-eye light, wherein the first-eye light includes a first set of wavelengths; a second light source configured to emit a second-eye light, wherein the second-eye light includes a second set of wavelengths; a first projector including first projection optics configured to receive a first input light; and an optical switch configured to be switched between a first mode and a second mode, wherein the optical switch is configured to, in the first mode, combine the first-eye light and the second-eye light into a combined light and direct the combined light to the first projection optics as the first input light.
G02F 1/00 - Devices or arrangements for the control of the intensity, colour, phase, polarisation or direction of light arriving from an independent light source, e.g. switching, gating or modulating; Non-linear optics
The disclosure herein generally relates to capturing, acoustic pre-processing, encoding, decoding, and rendering of directional audio of an audio scene. In particular, it relates to a device adapted to modify a directional property of a captured directional audio in response to spatial data of a microphone system capturing the directional audio. The disclosure further relates to a rendering device configured to modify a directional property of a received directional audio in response to received spatial data.
The disclosed embodiments enable converting audio signals captured in various formats by various capture devices into a limited number of formats that can be processed by an audio codec (e.g., an Immersive Voice and Audio Services (IVAS) codec). In an embodiment, a simplification unit of the audio device receives an audio signal captured by one or more audio capture devices coupled to the audio device. The simplification unit determines whether the audio signal is in a format that is supported/not supported by an encoding unit of the audio device. Based on the determining, the simplification unit converts the audio signal into a format that is supported by the encoding unit. In an embodiment, if the simplification unit determines that the audio signal is in a spatial format, the simplification unit can convert the audio signal into a spatial "mezzanine" format supported by the encoding unit.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
23.
PROJECTOR LIGHT SOURCE DIMMING USING METADATA FROM FUTURE FRAMES
A projection display system comprises a light source configured to emit a light in response to a content data; an optical modulator configured to modulate the light; and a controller configured to adjust a light level of the projection display system based on the content data and a metadata relating to a future frame, thereby to reduce a perceptibility of a visual artifact.
The present document describes a method (700) for encoding a multi-channel input signal (201). The method (700) comprises determining (701) a plurality of downmix channel signals (203) from the multi-channel input signal (201) and performing (702) energy compaction of the plurality of downmix channel signals (203) to provide a plurality of compacted channel signals (404). Furthermore, the method (700) comprises determining (703) joint coding metadata (205) based on the plurality of compacted channel signals (404) and based on the multi-channel input signal (201), wherein the joint coding metadata (205) is such that it allows upmixing of the plurality of compacted channel signals (404) to an approximation of the multi-channel input signal (201). In addition, the method (700) comprises encoding (704) the plurality of compacted channel signals (404) and the joint coding metadata (205).
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 19/04 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
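The energy-compaction step in the entry above can be illustrated with a standard Karhunen-Loève (eigenvector) rotation; the patent's actual transform and its joint coding metadata format may differ.

```python
import numpy as np

def energy_compact(channels: np.ndarray):
    """Rotate downmix channels so energy concentrates in the leading channels.

    channels: (C, N) array of C downmix channel signals.
    Returns the compacted channels and the rotation (conveyable as metadata).
    """
    cov = channels @ channels.T / channels.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    rotation = eigvecs[:, ::-1].T                   # strongest component first
    return rotation @ channels, rotation

x = np.random.randn(4, 4800)
x[1] += 0.9 * x[0]                                  # correlated channels
compacted, R = energy_compact(x)
print(np.var(compacted, axis=1))                    # energy now front-loaded
```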
25.
METHODS AND DEVICES FOR GENERATING OR DECODING A BITSTREAM COMPRISING IMMERSIVE AUDIO SIGNALS
The present document describes a method (500) for generating a bitstream (101), wherein the bitstream (101) comprises a sequence of superframes (400) for a sequence of frames of an immersive audio signal (111). The method (500) comprises, repeatedly for the sequence of superframes (400), inserting (501) coded audio data (206) for one or more frames of one or more downmix channel signals (203) derived from the immersive audio signal (111), into data fields (411, 421, 412, 422) of a superframe (400); and inserting (502) metadata (202, 205) for reconstructing one or more frames of the immersive audio signal (111) from the coded audio data (206), into a metadata field (403) of the superframe (400).
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
A projector controller includes an object detector and control electronics, and is configured to protect audience members from intense light by imposing an exclusion zone in front of a projector. The object detector is configured to optically sense a presence of an object in a detection region beneath the exclusion zone and above the audience members. The control electronics is configured to control the projector when the object detector indicates the presence of the object in the detection region. A method for protecting audience members from intense light by imposing an exclusion zone in front of an output of a projector includes: (i) optically sensing a presence of an object in a detection region between the exclusion zone and the audience members, and (ii) controlling the projector when the presence of the object is sensed in the detection region.
An optical filter to increase contrast of an image generated with a spatial light modulator includes a lens for spatially Fourier transforming modulated light from the spatial light modulator, and an optical filter mask positioned at a Fourier plane of the lens to filter the modulated light. The modulated light has a plurality of diffraction orders, and the optical filter mask transmits at least one of the diffraction orders of the modulated light and blocks a remaining portion of the modulated light. A method that improves contrast of an image generated with a spatial light modulator includes spatially Fourier transforming modulated light from the spatial light modulator onto a Fourier plane, and filtering the modulated light by transmitting at least one diffraction order of the modulated light at the Fourier plane and blocking a remaining portion of the modulated light at the Fourier plane.
Given a sequence of images in a first codeword representation, methods, processes, and systems are presented for image reshaping using rate distortion optimization, wherein reshaping allows the images to be coded in a second codeword representation which allows more efficient compression than using the first codeword representation. Syntax methods for signaling reshaping parameters are also presented.
H04N 19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
H04N 19/107 - Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
H04N 19/117 - Filters, e.g. for pre-processing or post-processing
H04N 19/147 - Data rate or code amount at the encoder output according to rate distortion criteria
H04N 19/159 - Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/46 - Embedding additional information in the video signal during the compression process
H04N 19/82 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
H04N 19/85 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
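The reshaping idea in this entry (and the identically titled entry that follows) maps codewords of the first representation into a second, more compressible one. A minimal sketch with a fixed power-law curve standing in for the rate-distortion-optimized curve the abstracts describe; the gamma value and bit depth are illustrative.

```python
import numpy as np

def build_reshaper(gamma: float = 0.6, bits: int = 10):
    """Forward/inverse codeword reshaping LUTs (power-law placeholder curve).

    A real encoder would derive the curve by rate-distortion optimization and
    signal its parameters in the bitstream; here the curve is fixed for clarity.
    """
    n = 1 << bits
    x = np.arange(n) / (n - 1)
    forward = np.round((x ** gamma) * (n - 1)).astype(int)       # 1st -> 2nd representation
    inverse = np.round((x ** (1 / gamma)) * (n - 1)).astype(int) # 2nd -> 1st
    return forward, inverse

fwd, inv = build_reshaper()
pixels = np.array([0, 100, 512, 1023])
print(inv[fwd[pixels]])        # round-trips to (approximately) the input codewords
```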
29.
IMAGE RESHAPING IN VIDEO CODING USING RATE DISTORTION OPTIMIZATION
Given a sequence of images in a first codeword representation, methods, processes, and systems are presented for image reshaping using rate distortion optimization, wherein reshaping allows the images to be coded in a second codeword representation which allows more efficient compression than using the first codeword representation. Syntax methods for signaling reshaping parameters are also presented.
H04N 19/40 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/174 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
Methods are described to communicate source color volume information in a coded bitstream using SEI messaging. Such data include at least the minimum, maximum, and average luminance values in the source data plus optional data that may include the color volume x and y chromaticity coordinates for the input color primaries (e.g., red, green, and blue) of the source data, and the color x and y chromaticity coordinates for the color primaries corresponding to the minimum, average, and maximum luminance values in the source data. Messaging data signaling an active region in each picture may also be included.
Methods are described to communicate source color volume information in a coded bitstream using SEI messaging. Such data include at least the minimum, maximum, and average luminance values in the source data plus optional data that may include the color volume x and y chromaticity coordinates for the input color primaries (e.g., red, green, and blue) of the source data, and the color x and y chromaticity coordinates for the color primaries corresponding to the minimum, average, and maximum luminance values in the source data. Messaging data signaling an active region in each picture may also be included.
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
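The luminance statistics that the SEI message described above carries can be computed directly from the linear-light source. A minimal sketch using BT.2020 luminance weights; the choice of primaries is an example, and the optional chromaticity data the message may also carry are not shown.

```python
import numpy as np

def source_luminance_stats(rgb: np.ndarray):
    """Min/max/average luminance of linear-light RGB frames (BT.2020 weights)."""
    # Y = 0.2627 R + 0.6780 G + 0.0593 B  (BT.2020 luminance coefficients)
    y = rgb @ np.array([0.2627, 0.6780, 0.0593])
    return float(y.min()), float(y.max()), float(y.mean())

frames = np.random.rand(3, 270, 480, 3) * 1000.0   # nits, linear light
mn, mx, avg = source_luminance_stats(frames)
# These three values (plus the optional chromaticity coordinates) are what the
# SEI message described above would carry to the decoder.
```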
Methods are described to communicate source color volume information in a coded bitstream using SEI messaging. Such data include at least the minimum, maximum, and average luminance values in the source data plus optional data that may include the color volume x and y chromaticity coordinates for the input color primaries (e.g., red, green, and blue) of the source data, and the color x and y chromaticity coordinates for the color primaries corresponding to the minimum, average, and maximum luminance values in the source data. Messaging data signaling an active region in each picture may also be included.
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 21/235 - Processing of additional data, e.g. scrambling of additional data or processing content descriptors
H04N 21/84 - Generation or processing of descriptive data, e.g. content descriptors
H04N 19/46 - Embedding additional information in the video signal during the compression process
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Methods are described to communicate source color volume information in a coded bitstream using SEI messaging. Such data include at least the minimum, maximum, and average luminance values in the source data plus optional data that may include the color volume x and y chromaticity coordinates for the input color primaries (e.g., red, green, and blue) of the source data, and the color x and y chromaticity coordinates for the color primaries corresponding to the minimum, average, and maximum luminance values in the source data. Messaging data signaling an active region in each picture may also be included.
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
34.
HEADTRACKING FOR PARAMETRIC BINAURAL OUTPUT SYSTEM AND METHOD
A method of encoding channel or object based input audio for playback, the method including the steps of: (a) initially rendering the channel or object based input audio into an initial output presentation; (b) determining an estimate of the dominant audio component from the channel or object based input audio and determining a series of dominant audio component weighting factors for mapping the initial output presentation into the dominant audio component; (c) determining an estimate of the dominant audio component direction or position; and (d) encoding the initial output presentation, the dominant audio component weighting factors, the dominant audio component direction or position as the encoded signal for playback.
A method of encoding channel or object based input audio for playback, the method including the steps of: (a) initially rendering the channel or object based input audio into an initial output presentation; (b) determining an estimate of the dominant audio component from the channel or object based input audio and determining a series of dominant audio component weighting factors for mapping the initial output presentation into the dominant audio component; (c) determining an estimate of the dominant audio component direction or position; and (d) encoding the initial output presentation, the dominant audio component weighting factors, the dominant audio component direction or position as the encoded signal for playback.
H04S 3/02 - Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
A method for representing a second presentation of audio channels or objects as a data stream, the method comprising the steps of: (a) providing a set of base signals, the base signals representing a first presentation of the audio channels or objects; (b) providing a set of transformation parameters, the transformation parameters intended to transform the first presentation into the second presentation; the transformation parameters further being specified for at least two frequency bands and including a set of multi-tap convolution matrix parameters for at least one of the frequency bands.
A method for encoding an input audio stream including the steps of obtaining a first playback stream presentation of the input audio stream intended for reproduction on a first audio reproduction system, obtaining a second playback stream presentation of the input audio stream intended for reproduction on a second audio reproduction system, determining a set of transform parameters suitable for transforming an intermediate playback stream presentation to an approximation of the second playback stream presentation, wherein the transform parameters are determined by minimization of a measure of a difference between the approximation of the second playback stream presentation and the second playback stream presentation, and encoding the first playback stream presentation and the set of transform parameters for transmission to a decoder.
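The transform-parameter determination above is a least-squares fit. A minimal sketch using the normal equations, with one broadband matrix standing in for the per-band, per-frame parameters a codec would actually send:

```python
import numpy as np

def transform_parameters(intermediate: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Matrix W minimizing ||W @ intermediate - target||_F (normal equations)."""
    # W = T X^H (X X^H)^{-1}; a real codec solves this per time/frequency tile.
    X, T = intermediate, target
    return T @ X.conj().T @ np.linalg.inv(X @ X.conj().T + 1e-9 * np.eye(len(X)))

X = np.random.randn(2, 4096)          # e.g. stereo loudspeaker presentation
T = np.random.randn(2, 4096)          # e.g. binaural headphone presentation
W = transform_parameters(X, T)        # encoded alongside the first presentation
approx = W @ X                        # decoder-side approximation of T
```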
In a method to improve backwards compatibility when decoding high-dynamic range images coded in a wide color gamut (WCG) space which may not be compatible with legacy color spaces, hue and/or saturation values of images in an image database are computed for both a legacy color space (say, YCbCr-gamma) and a preferred WCG color space (say, IPT-PQ). Based on a cost function, a reshaped color space is computed so that the distance between the hue values in the legacy color space and rotated hue values in the preferred color space is minimized. HDR images are coded in the reshaped color space. Legacy devices can still decode standard dynamic range images assuming they are coded in the legacy color space, while updated devices can use color reshaping information to decode HDR images in the preferred color space at full dynamic range.
In a method to improve backwards compatibility when decoding high-dynamic range images coded in a wide color gamut (WCG) space which may not be compatible with legacy color spaces, hue and/or saturation values of images in an image database are computed for both a legacy color space (say, YCbCr-gamma) and a preferred WCG color space (say, IPT-PQ). Based on a cost function, a reshaped color space is computed so that the distance between the hue values in the legacy color space and rotated hue values in the preferred color space is minimized. HDR images are coded in the reshaped color space. Legacy devices can still decode standard dynamic range images assuming they are coded in the legacy color space, while updated devices can use color reshaping information to decode HDR images in the preferred color space at full dynamic range.
H04N 19/46 - Embedding additional information in the video signal during the compression process
H04N 19/117 - Filters, e.g. for pre-processing or post-processing
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 19/85 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
A display management processor receives an input image with enhanced dynamic range to be displayed on a target display which has a different dynamic range than a reference display. The input image is first transformed into a perceptually-quantized (PQ) color space, preferably the IPT-PQ color space. A color volume mapping function, which includes an adaptive tone-mapping function and an adaptive gamut mapping function, generates a mapped image. A detail-preservation step is applied to the intensity component of the mapped image to generate a final mapped image with a filtered tone-mapped intensity image. The final mapped image is then translated back to the display's preferred color space. Examples of the adaptive tone mapping and gamut mapping functions are provided.
In some embodiments, virtualization methods for generating a binaural signal in response to channels of a multi-channel audio signal apply a binaural room impulse response (BRIR) to each channel, including by using at least one feedback delay network (FDN) to apply a common late reverberation to a downmix of the channels. In some embodiments, input signal channels are processed in a first processing path to apply to each channel a direct response and early reflection portion of a single-channel BRIR for the channel, and the downmix of the channels is processed in a second processing path including at least one FDN which applies the common late reverberation. Typically, the common late reverberation emulates collective macro attributes of late reverberation portions of at least some of the single-channel BRIRs. Other aspects are headphone virtualizers configured to perform any embodiment of the method.
In some embodiments, virtualization methods for generating a binaural signal in response to channels of a multi-channel audio signal apply a binaural room impulse response (BRIR) to each channel, including by using at least one feedback delay network (FDN) to apply a common late reverberation to a downmix of the channels. In some embodiments, input signal channels are processed in a first processing path to apply to each channel a direct response and early reflection portion of a single-channel BRIR for the channel, and the downmix of the channels is processed in a second processing path including at least one FDN which applies the common late reverberation. Typically, the common late reverberation emulates collective macro attributes of late reverberation portions of at least some of the single-channel BRIRs. Other aspects are headphone virtualizers configured to perform any embodiment of the method.
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
In some embodiments, virtualization methods for generating a binaural signal in response to channels of a multi-channel audio signal apply a binaural room impulse response (BRIR) to each channel, including by using at least one feedback delay network (FDN) to apply a common late reverberation to a downmix of the channels. In some embodiments, input signal channels are processed in a first processing path to apply to each channel a direct response and early reflection portion of a single-channel BRIR for the channel, and the downmix of the channels is processed in a second processing path including at least one FDN which applies the common late reverberation. Typically, the common late reverberation emulates collective macro attributes of late reverberation portions of at least some of the single-channel BRIRs. Other aspects are headphone virtualizers configured to perform any embodiment of the method.
In some embodiments, virtualization methods for generating a binaural signal in response to channels of a multi-channel audio signal apply a binaural room impulse response (BRIR) to each channel, including by using at least one feedback delay network (FDN) to apply a common late reverberation to a downmix of the channels. In some embodiments, input signal channels are processed in a first processing path to apply to each channel a direct response and early reflection portion of a single-channel BRIR for the channel, and the downmix of the channels is processed in a second processing path including at least one FDN which applies the common late reverberation. Typically, the common late reverberation emulates collective macro attributes of late reverberation portions of at least some of the single-channel BRIRs. Other aspects are headphone virtualizers configured to perform any embodiment of the method.
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
H04S 5/00 - Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
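The common late reverberation in the entries above is applied by a feedback delay network. A minimal four-line FDN sketch; the delay lengths, Hadamard feedback matrix, and decay factor are illustrative, not taken from the abstracts.

```python
import numpy as np

def fdn_late_reverb(downmix: np.ndarray, delays=(149, 211, 263, 293),
                    decay: float = 0.85) -> np.ndarray:
    """Minimal 4-line feedback delay network for a common late reverberation."""
    n_lines = len(delays)
    # Orthogonal (scaled Hadamard) feedback matrix keeps the loop energy-stable.
    H = np.array([[1, 1, 1, 1], [1, -1, 1, -1],
                  [1, 1, -1, -1], [1, -1, -1, 1]]) / 2.0
    buffers = [np.zeros(d) for d in delays]
    idx = [0] * n_lines
    out = np.zeros_like(downmix)
    for n, x in enumerate(downmix):
        taps = np.array([buffers[i][idx[i]] for i in range(n_lines)])
        out[n] = taps.sum()
        fed_back = decay * (H @ taps)             # mix and attenuate the loop
        for i in range(n_lines):
            buffers[i][idx[i]] = x + fed_back[i]  # write input + feedback
            idx[i] = (idx[i] + 1) % delays[i]
    return out

wet = fdn_late_reverb(np.random.randn(4800))
```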
An audio processing system, such as an upmixer, may be capable of separating diffuse and non-diffuse portions of N input audio signals. The upmixer may be capable of detecting instances of transient audio signal conditions. During instances of transient audio signal conditions, the upmixer may be capable of adding a signal-adaptive control to a diffuse signal expansion process in which M audio signals are output. The upmixer may vary the diffuse signal expansion process over time such that during instances of transient audio signal conditions the diffuse portions of audio signals may be distributed substantially only to output channels spatially close to the input channels. During instances of non-transient audio signal conditions, the diffuse portions of audio signals may be distributed in a substantially uniform manner.
Described are methods which use interpolated primitive matrices to decode encoded audio to recover (losslessly) content of a multichannel audio program and/or to recover at least one downmix of such content, and encoding methods for generating such encoded audio. In some embodiments, a decoder performs interpolation on a set of seed primitive matrices to determine interpolated matrices for use in rendering channels of the program. Other aspects are a system or device configured to implement any embodiment of the method.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
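A minimal sketch of the matrix interpolation described above: seed primitive matrices interpolated linearly across a frame and applied per sample. That each primitive matrix differs from the identity in a single row is an assumption made for the sketch, as is the frame length.

```python
import numpy as np

def interpolated_matrix(seed_a: np.ndarray, seed_b: np.ndarray, t: int, T: int):
    """Linearly interpolate between two seed primitive matrices at sample t of T."""
    return seed_a + (t / T) * (seed_b - seed_a)

# Hypothetical primitive matrices: identity except for one row, so each one
# updates a single channel from a mix of the others.
A = np.eye(2); A[0] = [1.0, 0.25]
B = np.eye(2); B[0] = [1.0, 0.50]
frame = np.random.randn(2, 160)
out = np.stack([interpolated_matrix(A, B, t, frame.shape[1]) @ frame[:, t]
                for t in range(frame.shape[1])], axis=1)
```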
47.
AUDIO ENCODER AND DECODER WITH PROGRAM INFORMATION OR SUBSTREAM STRUCTURE METADATA
Apparatus and methods for generating an encoded audio bitstream by including substream structure metadata (SSM) and/or program information metadata (PIM) together with audio data in the bitstream. Other aspects are apparatus and methods for decoding such a bitstream, and an audio processing unit (e.g., an encoder, decoder, or post-processor) configured (e.g., programmed) to perform any embodiment of the method or which includes a buffer memory which stores at least one frame of an audio bitstream generated in accordance with any embodiment of the method.
H04H 20/95 - Arrangements characterised by special technical features of the broadcast information, e.g. signal form or information format characterised by a specific format, e.g. MP3 [MPEG-1 Audio Layer 3]
H04H 60/74 - Systems specially adapted for using specific information, e.g. geographical or meteorological information using meta-information using programme related information, e.g. title, composer or interpreter
48.
MULTI-HALF-TONE IMAGING AND DUAL MODULATION PROJECTION/DUAL MODULATION LASER PROJECTION
Smaller halftone tiles are implemented on a first modulator of a dual modulation projection system. This technique uses multiple halftones per frame in the pre-modulator synchronized with a modified bit sequence in the primary modulator to effectively increase the number of levels provided by a given tile size in the halftone modulator. It addresses the issue of reduced contrast ratio at low light levels for small tile sizes and allows the use of smaller PSFs which reduce halo artifacts in the projected image and may be utilized in 3D projecting and viewing.
G09G 3/30 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix using controlled light sources using electroluminescent panels
Smaller halftone tiles are implemented on a first modulator of a dual modulation projection system. This technique uses multiple halftones per frame in the pre-modulator synchronized with a modified bit sequence in the primary modulator to effectively increase the number of levels provided by a given tile size in the halftone modulator. It addresses the issue of reduced contrast ratio at low light levels for small tile sizes and allows the use of smaller PSFs which reduce halo artifacts in the projected image and may be utilized in 3D projecting and viewing.
G09F 9/30 - Indicating arrangements for variable information in which the information is built-up on a support by selection or combination of individual elements in which the desired character or characters are formed by combining individual elements
G09G 3/20 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix
H04N 5/74 - Projection arrangements for image reproduction, e.g. using eidophor
50.
MULTI-HALF-TONE IMAGING AND DUAL MODULATION PROJECTION/DUAL MODULATION LASER PROJECTION
Smaller halftone tiles are implemented on a first modulator of a dual modulation projection system. This technique uses multiple halftones per frame in the pre-modulator synchronized with a modified bit sequence in the primary modulator to effectively increase the number of levels provided by a given tile size in the halftone modulator. It addresses the issue of reduced contrast ratio at low light levels for small tile sizes and allows the use of smaller PSFs which reduce halo artifacts in the projected image and may be utilized in 3D projecting and viewing.
G09F 9/30 - Indicating arrangements for variable information in which the information is built-up on a support by selection or combination of individual elements in which the desired character or characters are formed by combining individual elements
G09G 3/20 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix
H04N 5/74 - Projection arrangements for image reproduction, e.g. using eidophor
51.
COMPANDING APPARATUS AND METHOD TO REDUCE QUANTIZATION NOISE USING ADVANCED SPECTRAL EXTENSION
Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A compression process reduces the original dynamic range of an initial audio signal by dividing the initial audio signal into a plurality of segments using a defined window shape, calculating a wideband gain in the frequency domain using a non-energy based average of frequency domain samples of the initial audio signal, and applying individual gain values to amplify segments of relatively low intensity and attenuate segments of relatively high intensity. The compressed audio signal is then expanded back to substantially the original dynamic range through an expansion process that applies inverse gain values to amplify segments of relatively high intensity and attenuate segments of relatively low intensity. A QMF filterbank is used to analyze the initial audio signal to obtain a frequency domain representation.
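A minimal sketch of the compression half of the companding described above: one wideband gain per short segment, derived from a non-energy (mean-magnitude) average of frequency-domain samples. The gain exponent and segment sizes are illustrative; expansion would apply the inverse gains.

```python
import numpy as np

def compress_segments(segments: np.ndarray, exponent: float = 0.5) -> np.ndarray:
    """Compress dynamics: one wideband gain per segment of QMF-domain samples.

    segments: (S, K) complex frequency-domain samples, S short segments.
    """
    # Non-energy-based average: mean magnitude rather than mean squared magnitude.
    level = np.mean(np.abs(segments), axis=1, keepdims=True) + 1e-12
    gain = level ** (exponent - 1.0)       # <1 for loud segments, >1 for quiet ones
    return segments * gain                 # the expander applies the inverse gains

qmf = np.random.randn(32, 64) + 1j * np.random.randn(32, 64)
qmf[5] *= 10.0                             # a loud transient segment
compressed = compress_segments(qmf)        # transient attenuated, quiet parts boosted
```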
Multiple virtual source locations may be defined for a volume within which audio objects can move. A set-up process for rendering audio data may involve receiving reproduction speaker location data and pre-computing gain values for each of the virtual sources according to the reproduction speaker location data and each virtual source location. The gain values may be stored and used during "run time," during which audio reproduction data are rendered for the speakers of the reproduction environment. During run time, for each audio object, contributions from virtual source locations within an area or volume defined by the audio object position data and the audio object size data may be computed. A set of gain values for each output channel of the reproduction environment may be computed based, at least in part, on the computed contributions. Each output channel may correspond to at least one reproduction speaker of the reproduction environment.
Received audio data may include a first set of frequency coefficients and a second set of frequency coefficients. Spatial parameters for at least part of the second set of frequency coefficients may be estimated, based at least in part on the first set of frequency coefficients. The estimated spatial parameters may be applied to the second set of frequency coefficients to generate a modified second set of frequency coefficients. The first set of frequency coefficients may correspond to a first frequency range (for example, an individual channel frequency range) and the second set of frequency coefficients may correspond to a second frequency range (for example, a coupled channel frequency range). Combined frequency coefficients of a composite coupling channel may be based on frequency coefficients of two or more channels. Cross-correlation coefficients, between frequency coefficients of a first channel and the combined frequency coefficients, may be computed.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
54.
AUDIO ENCODER AND DECODER WITH PROGRAM LOUDNESS AND BOUNDARY METADATA
Apparatus and methods for generating an encoded audio bitstream by including program loudness metadata and audio data in the bitstream, and optionally also program boundary metadata in at least one segment (e.g., frame) of the bitstream. Other aspects are apparatus and methods for decoding such a bitstream, e.g., including by performing adaptive loudness processing of the audio data of an audio program indicated by the bitstream, or authentication and/or validation of metadata and/or audio data of such an audio program. Another aspect is an audio processing unit (e.g., an encoder, decoder, or post-processor) configured (e.g., programmed) to perform any embodiment of the method or which includes a buffer memory which stores at least one frame of an audio bitstream generated in accordance with any embodiment of the method.
Embodiments are directed to speakers and circuits that reflect sound off a ceiling to a listening location at a distance from a speaker. The reflected sound provides height cues to reproduce audio objects that have overhead audio components. The speaker comprises upward firing drivers to reflect sound off of the upper surface and represents a virtual height speaker. A virtual height filter based on a directional hearing model is applied to the upward-firing driver signal to improve the perception of height for audio signals transmitted by the virtual height speaker to provide optimum reproduction of the overhead reflected sound. The virtual height filter may be incorporated as part of a crossover circuit that separates the full band and sends high frequency sound to the upward-firing driver.
Video data are coded in a coding-standard layered bit stream. Given a base layer (BL) and one or more enhancement layer (EL) signals, the BL signal is coded into a coded BL stream using a BL encoder which is compliant to a first coding standard. In response to the BL signal and the EL signal, a reference processing unit (RPU) determines RPU processing parameters. In response to the RPU processing parameters and the BL signal, the RPU generates an inter-layer reference signal. Using an EL encoder which is compliant to a second coding standard, the EL signal is coded into a coded EL stream, where the encoding of the EL signal is based at least in part on the inter-layer reference signal. Receivers with an RPU and video decoders compliant to both the first and the second coding standards may decode both the BL and the EL coded streams.
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
H04N 19/12 - Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
H04N 19/187 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
H04N 19/40 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
H04N 19/46 - Embedding additional information in the video signal during the compression process
H04N 19/61 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
H04N 19/85 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
57.
MULTISTAGE IIR FILTER AND PARALLELIZED FILTERING OF DATA WITH SAME
In some embodiments, a multistage filter is provided whose biquad filter stages are combined with latency between the stages, along with a system (e.g., an audio encoder or decoder) including such a filter and methods for multistage biquad filtering. In typical embodiments, all biquad filter stages of the filter are operable independently to perform fully parallelized processing of data. In some embodiments, the inventive multistage filter includes a buffer memory, at least two biquad filter stages, and a controller coupled and configured to assert a single stream of instructions to the filter stages. Typically, the multistage filter is configured to perform multistage filtering of a block of input samples in a single processing loop with iteration over a sample index but without iteration over a biquadratic filter stage index.
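The key point above, that one sample of latency between stages lets every stage be evaluated at once, can be sketched directly. In the following Python sketch (a minimal illustration, not the patented design), each stage consumes the output its predecessor produced on the previous iteration, so all stages update together with no inner loop over the stage index; the transposed direct-form-II layout is an assumption.

```python
# Minimal sketch: a biquad cascade with one-sample inter-stage latency,
# vectorized over stages inside a single loop over the sample index.
import numpy as np

def parallel_biquad_cascade(x, b, a):
    """b, a: (stages, 3) transposed direct-form-II coefficients, a[:,0]==1."""
    s = b.shape[0]
    lat = s - 1                              # pipeline start-up latency
    z1 = np.zeros(s); z2 = np.zeros(s)       # per-stage delay state
    stage_in = np.zeros(s)                   # this tick's input to each stage
    y = np.empty(len(x) + lat)
    for n in range(len(x) + lat):            # iterate over samples only
        stage_in[0] = x[n] if n < len(x) else 0.0
        out = b[:, 0] * stage_in + z1        # every stage evaluated at once
        z1 = b[:, 1] * stage_in - a[:, 1] * out + z2
        z2 = b[:, 2] * stage_in - a[:, 2] * out
        y[n] = out[-1]
        stage_in[1:] = out[:-1]              # one-sample inter-stage latency
    return y[lat:]                           # drop the start-up latency

# Two identity stages leave the signal unchanged once latency is dropped.
b = np.array([[1.0, 0, 0], [1.0, 0, 0]]); a = np.array([[1.0, 0, 0], [1.0, 0, 0]])
print(parallel_biquad_cascade(np.arange(5.0), b, a))   # [0. 1. 2. 3. 4.]
```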
A method for performing linear mixing on coupled Head-related transfer functions (HRTFs) to determine an interpolated HRTF for any specified arrival direction in a range (e.g., a range spanning at least 60 degrees in a plane, or a full range of 360 degrees in a plane), where the coupled HRTFs have been predetermined to have properties such that linear mixing can be performed thereon (to generate interpolated HRTFs) without introducing significant comb filtering distortion. In some embodiments, the method includes steps of: in response to a signal indicative of a specified arrival direction, performing linear mixing on data indicative of coupled HRTFs of a coupled HRTF set to determine an HRTF for the specified arrival direction; and performing HRTF filtering on an audio input signal using the HRTF for the specified arrival direction.
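A simplified illustration of the direction-interpolation step follows. The HRTF set below is random placeholder data; as the abstract notes, real coupled HRTF sets are specially constructed so that this kind of linear mixing does not introduce significant comb filtering, a property the placeholder data does not have.

```python
# Illustrative sketch: linearly mix the two HRTFs bracketing the specified
# arrival direction, then filter an input signal with the mixed pair.
import numpy as np

def interpolate_hrtf(hrtf_set, angles_deg, target_deg):
    """Linearly mix the two HRTFs bracketing target_deg (angles sorted)."""
    a = np.asarray(angles_deg, float)
    t = target_deg % 360.0
    i = np.searchsorted(a, t) % len(a)
    j = (i - 1) % len(a)
    span = (a[i] - a[j]) % 360.0 or 360.0
    w = ((t - a[j]) % 360.0) / span
    return (1.0 - w) * hrtf_set[j] + w * hrtf_set[i]

def render(mono, hrtf_pair):
    """Filter a mono input with the interpolated left/right pair."""
    return np.stack([np.convolve(mono, h) for h in hrtf_pair])

rng = np.random.default_rng(1)
angles = [0.0, 90.0, 180.0, 270.0]
hrtfs = rng.standard_normal((4, 2, 64))      # (direction, ear, taps)
out = render(rng.standard_normal(480), interpolate_hrtf(hrtfs, angles, 45.0))
print(out.shape)                             # (2, 543)
```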
A method for determining mantissa bit allocation of audio data values of frequency domain audio data to be encoded. The allocation method includes a step of determining masking values for the audio data values, including by performing adaptive low frequency compensation on the audio data of each frequency band of a set of low frequency bands of the audio data. The adaptive low frequency compensation includes steps of: performing tonality detection on the audio data to generate compensation control data indicative of whether each frequency band in the set of low frequency bands has prominent tonal content; and performing low frequency compensation on the audio data in each frequency band in the set of low frequency bands having prominent tonal content as indicated by the compensation control data, but not performing low frequency compensation on the audio data in any other frequency band in the set of low frequency bands.
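The gating logic described above can be sketched as follows. The spectral-flatness tonality test and the fixed 6 dB compensation below are stand-ins chosen for illustration; the actual detector and compensation amounts are defined by the codec.

```python
# Rough sketch: compensate masking values only in low-frequency bands
# that a tonality detector flags as prominently tonal.
import numpy as np

def spectral_flatness(band):
    p = np.abs(band) ** 2 + 1e-12
    return np.exp(np.mean(np.log(p))) / np.mean(p)  # ~1 noise-like, ~0 tonal

def adaptive_low_freq_compensation(masking_db, low_bands, flatness_thresh=0.3):
    """Lower the masking value (exposing more mantissa bits) only in
    low-frequency bands with prominent tonal content."""
    masking_db = masking_db.copy()
    for k, band in enumerate(low_bands):
        if spectral_flatness(band) < flatness_thresh:
            masking_db[k] -= 6.0                    # compensate tonal band
    return masking_db

rng = np.random.default_rng(8)
tone = np.fft.rfft(np.sin(2 * np.pi * 5 * np.arange(256) / 256))[:16]
noise = np.fft.rfft(rng.standard_normal(256))[:16]
print(adaptive_low_freq_compensation(np.zeros(2), [tone, noise]))
```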
G10L 19/032 - Quantisation or dequantisation of spectral components
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
60.
DEVICE AND METHOD OF IMPROVING THE PERCEPTUAL LUMINANCE NONLINEARITY-BASED IMAGE DATA EXCHANGE ACROSS DIFFERENT DISPLAY CAPABILITIES
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
G09G 5/02 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
G09G 5/02 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed
62.
DEVICE AND METHOD OF IMPROVING THE PERCEPTUAL LUMINANCE NONLINEARITY-BASED IMAGE DATA EXCHANGE ACROSS DIFFERENT DISPLAY CAPABILITIES
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
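The transcoding step in these abstracts amounts to a lookup through the code mapping. The sketch below is a hedged illustration: the power-law curves standing in for the reference grayscale display function and the device response are made-up placeholders, and nearest-gray-level matching stands in for whatever mapping a real converter applies.

```python
# Hedged sketch: map reference code values to device-specific code values
# so both drive approximately the same gray level. The EOTF curves below
# are placeholders, not real display functions.
import numpy as np

def transcode(reference_codes, ref_levels, device_levels):
    """ref_levels[c]: gray level the reference function assigns to code c;
    device_levels[d]: gray level the device produces for device code d."""
    target = ref_levels[reference_codes]
    # For each reference code, pick the device code whose output gray
    # level is nearest to the reference gray level.
    return np.abs(device_levels[None, :] - target[:, None]).argmin(axis=1)

ref_levels = np.linspace(0.0, 1.0, 1024) ** 2.4    # placeholder reference curve
device_levels = np.linspace(0.0, 1.0, 256) ** 2.2  # placeholder device curve
codes = np.array([0, 256, 512, 1023])
print(transcode(codes, ref_levels, device_levels)) # device-specific codes
```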
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
G09G 5/02 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed
64.
DEVICE AND METHOD OF IMPROVING THE PERCEPTUAL LUMINANCE NONLINEARITY-BASED IMAGE DATA EXCHANGE ACROSS DIFFERENT DISPLAY CAPABILITIES
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
G09G 5/02 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
G09G 5/02 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals - characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
G09G 3/36 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix by control of light from an independent source using liquid crystals
67.
SYSTEM AND TOOLS FOR ENHANCED 3D AUDIO AUTHORING AND RENDERING
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
Improved tools for authoring and rendering audio reproduction data are provided. Some such authoring tools allow audio reproduction data to be generalized for a wide variety of reproduction environments. Audio reproduction data may be authored by creating metadata for audio objects. The metadata may be created with reference to speaker zones. During the rendering process, the audio reproduction data may be reproduced according to the reproduction speaker layout of a particular reproduction environment.
Improved tools for authoring and rendering audio reproduction data are provided. Some such authoring tools allow audio reproduction data to be generalized for a wide variety of reproduction environments. Audio reproduction data may be authored by creating metadata for audio objects. The metadata may be created with reference to speaker zones. During the rendering process, the audio reproduction data may be reproduced according to the reproduction speaker layout of a particular reproduction environment.
Improved tools for authoring and rendering audio reproduction data are provided. Some such authoring tools allow audio reproduction data to be generalized for a wide variety of reproduction environments. Audio reproduction data may be authored by creating metadata for audio objects. The metadata may be created with reference to speaker zones. During the rendering process, the audio reproduction data may be reproduced according to the reproduction speaker layout of a particular reproduction environment.
The Total Surround Sound System (TSS System)*, as one possible technological audio standard, uses a special digital processor with 8 sound channels. Besides applications in HDTV projection systems, theatres, multi-function halls and the like, it could also be used in creating live performance sound. The TSS concept could be most interesting for sound reinforcement at rock concerts and rock operas, musicals, various multimedia stage shows and the like. *TSS System - pat. pending
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
Improved tools for authoring and rendering audio reproduction data are provided. Some such authoring tools allow audio reproduction data to be generalized for a wide variety of reproduction environments. Audio reproduction data may be authored by creating metadata for audio objects. The metadata may be created with reference to speaker zones. During the rendering process, the audio reproduction data may be reproduced according to the reproduction speaker layout of a particular reproduction environment.
Systems and methods of audio signal processing are provided that relate to improved upmixing, whereby N audio channels are derived from M audio channels, a decorrelated version of the M audio channels and a set of spatial parameters. The set of spatial parameters includes an amplitude parameter, a correlation parameter and a phase parameter. The M audio channels are decorrelated using multiple decorrelation techniques to obtain the decorrelated version of the M audio channels. This can be used, for example, for generating an N audio channel upmix.
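A much-simplified sketch of this parametric upmix idea follows (one channel up to two). A plain delay stands in for a real decorrelator (practical systems use allpass networks), and the way the amplitude, correlation and phase parameters enter the mix is illustrative only.

```python
# Toy 1-to-2 upmix from a channel, its decorrelated version, and a set of
# spatial parameters (amplitude, correlation, phase); all placeholder math.
import numpy as np

def decorrelate(x, delay=37):
    """Toy decorrelator: a short delay."""
    return np.concatenate([np.zeros(delay), x[:-delay]])

def upmix_1_to_2(m, amp, corr, phase):
    """Derive 2 channels from 1 channel plus its decorrelated version."""
    d = decorrelate(m)
    direct = corr
    diffuse = np.sqrt(max(0.0, 1.0 - corr ** 2))
    left = amp * (direct * m + diffuse * d)
    right = amp * (direct * np.cos(phase) * m - diffuse * d)
    return np.stack([left, right])

out = upmix_1_to_2(np.random.default_rng(2).standard_normal(1000),
                   amp=0.9, corr=0.7, phase=0.1)
print(out.shape)   # (2, 1000)
```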
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Systems and methods of audio signal processing are provided that relate to improved upmixing, whereby N audio channels are derived from M audio channels, a decorrelated version of the M audio channels and a set of spatial parameters. The set of spatial parameters includes an amplitude parameter, a correlation parameter and a phase parameter. The M audio channels are decorrelated using multiple decorrelation techniques to obtain the decorrelated version of the M audio channels. This can be used, for example, for generating an N audio channel upmix.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
Techniques are provided to encode and decode image data comprising a tone mapped (TM) image with HDR reconstruction data in the form of luminance ratios and color residual values. In an example embodiment, luminance ratio values and residual values in color channels of a color space are generated on an individual pixel basis based on a high dynamic range (HDR) image and a derivative tone-mapped (TM) image that comprises one or more color alterations that would not be recoverable from the TM image with a luminance ratio image. The TM image with HDR reconstruction data derived from the luminance ratio values and the color-channel residual values may be outputted in an image file to a downstream device, for example, for decoding, rendering, and/or storing. The image file may be decoded to generate a restored HDR image free of the color alterations.
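The per-pixel reconstruction data can be sketched directly. The sketch below assumes linear RGB with Rec. 709 luma weights and defines the luminance ratio as HDR luma over tone-mapped luma; the actual color space, ratio definition and quantization are codec-specific.

```python
# Sketch: compute luminance ratios and color residuals per pixel, then
# invert them to restore the HDR image (exact by construction here).
import numpy as np

LUMA_W = np.array([0.2126, 0.7152, 0.0722])   # Rec. 709 luma weights

def hdr_reconstruction_data(hdr_rgb, tm_rgb, eps=1e-6):
    """Return (luminance ratio, color residuals) for each pixel."""
    ratio = (hdr_rgb @ LUMA_W + eps) / (tm_rgb @ LUMA_W + eps)
    residual = hdr_rgb - tm_rgb * ratio[..., None]   # what scaling misses
    return ratio, residual

def restore_hdr(tm_rgb, ratio, residual):
    """Invert: scale the tone-mapped image and add back the residuals."""
    return tm_rgb * ratio[..., None] + residual

rng = np.random.default_rng(3)
hdr = rng.uniform(0, 100, (4, 4, 3))
tm = np.clip(hdr / 100, 0, 1)
r, res = hdr_reconstruction_data(hdr, tm)
print(np.allclose(restore_hdr(tm, r, res), hdr))     # True
```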
The computational resources that are needed to apply a transform-based filterbank to a limited-bandwidth audio signal are reduced by performing an integrated process of combining real-valued input data into complex-valued data and applying a short transform to the complex-valued data, applying a bank of very short transforms to the output of the integrated process, and deriving a sequence of real-valued output data from the outputs of the bank of very short transforms.
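The first ingredient, combining real-valued input into complex-valued data so a shorter transform does the work, is the classic half-length real-FFT trick, sketched below. The further factorization into a bank of very short transforms described in the abstract is not reproduced here.

```python
# Sketch: pack real samples into complex pairs so one N/2-point complex
# FFT computes the first half of an N-point real FFT.
import numpy as np

def real_fft_via_packing(x):
    """FFT of real x (even length N) using one N/2-point complex FFT."""
    n = len(x)
    z = x[0::2] + 1j * x[1::2]            # combine real data into complex
    Z = np.fft.fft(z)                     # short (N/2-point) transform
    Zc = np.conj(Z[(-np.arange(n // 2)) % (n // 2)])
    even = 0.5 * (Z + Zc)                 # spectrum of even samples
    odd = -0.5j * (Z - Zc)                # spectrum of odd samples
    k = np.arange(n // 2)
    return even + np.exp(-2j * np.pi * k / n) * odd   # bins 0..N/2-1

x = np.random.default_rng(4).standard_normal(16)
print(np.allclose(real_fft_via_packing(x), np.fft.fft(x)[:8]))   # True
```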
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
81.
METHODS AND SYSTEMS FOR GENERATING FILTER COEFFICIENTS AND CONFIGURING FILTERS
Methods for generating a palette of feedback (IIR) filter coefficient sets and using the palette to configure (e.g., adaptively update) a prediction filter which includes a feedback filter, and a system for performing any of the methods. Examples of the system include an encoder, including a prediction filter and configured to encode data indicative of a waveform signal (e.g., samples of an audio signal), and a decoder. In some embodiments, the prediction filter is included in an encoder operable to generate (and assert to a decoder) encoded data including filter coefficient data indicative of the selected IIR coefficient set with which the prediction filter was configured during generation of the encoded data. In some embodiments, the timing with which adaptive updating of prediction filter configuration occurs or is allowed to occur is constrained (e.g., to optimize efficiency of prediction encoding).
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
82.
ADAPTIVE PROCESSING WITH MULTIPLE MEDIA PROCESSING NODES
Techniques for adaptive processing of media data based on separate data specifying a state of the media data are provided. A device in a media processing chain may determine whether a type of media processing has already been performed on an input version of media data. If so, the device may adapt its processing of the media data to disable performing the type of media processing. If not, the device performs the type of media processing. The device may create a state of the media data specifying the type of media processing. The device may communicate the state of the media data and an output version of the media data to a recipient device in the media processing chain, for the purpose of supporting the recipient device's adaptive processing of the media data.
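The decision logic above is simple to sketch. In the following illustration the state is a hypothetical set of processing-type tags rather than a real bitstream syntax, and the node's processing is a stub.

```python
# Minimal sketch of state-driven adaptive processing in a media chain.
def process_node(media, state, node_process, process_type):
    """Apply node_process only if process_type hasn't been done upstream,
    then record it in the state passed to the next node in the chain."""
    if process_type in state:          # already performed: disable/skip
        return media, state
    media = node_process(media)        # perform the processing
    return media, state | {process_type}

# Two nodes in a chain, both capable of loudness normalization.
media, state = [0.5, -0.8, 0.3], set()
norm = lambda m: [x * 0.9 for x in m]
media, state = process_node(media, state, norm, "loudness_norm")  # runs
media, state = process_node(media, state, norm, "loudness_norm")  # skipped
print(media, state)
```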
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 21/0324 - Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude - Details of processing therefor
G10L 25/21 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being power information
83.
ADAPTIVE PROCESSING WITH MULTIPLE MEDIA PROCESSING NODES
Techniques for adaptive processing of media data based on separate data specifying a state of the media data are provided. A device in a media processing chain may determine whether a type of media processing has already been performed on an input version of media data. If so, the device may adapt its processing of the media data to disable performing the type of media processing. If not, the device performs the type of media processing. The device may create a state of the media data specifying the type of media processing. The device may communicate the state of the media data and an output version of the media data to a recipient device in the media processing chain, for the purpose of supporting the recipient device's adaptive processing of the media data.
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
The invention relates to downmixing techniques by which output audio signals are obtained from input audio signals partitioned into subgroups. A variable common gain limiting factor is applied to all downmix coefficients that govern the contributions from the input signals in a subgroup. While preserving the proportions between signal values within a subgroup, the invention makes it possible to limit the gain of different input signal subgroups to different extents, so that relatively more perceptible signals can be limited relatively less. It then becomes possible to achieve a consistent dialogue level while transitioning in a less perceptible fashion between signal portions with and without gain limiting. Embodiments of the invention include a method, a mixing system and a computer-program product.
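The subgroup-wise limiting can be sketched as follows: one common limiting factor scales all downmix coefficients of a subgroup, preserving proportions within it while letting less perceptible subgroups be limited more. The grouping, coefficients and limit values below are invented for illustration.

```python
# Sketch: downmix with a variable common gain limiting factor applied to
# all coefficients governing each input-signal subgroup.
import numpy as np

def limited_downmix(signals, coeffs, subgroups, limits):
    """signals: (channels, samples); coeffs: per-channel downmix gains;
    subgroups: lists of channel indices; limits: common factor per group."""
    out = np.zeros(signals.shape[1])
    for group, limit in zip(subgroups, limits):
        for ch in group:
            out += limit * coeffs[ch] * signals[ch]   # proportions preserved
    return out

rng = np.random.default_rng(5)
sig = rng.standard_normal((4, 8))
mix = limited_downmix(sig, coeffs=[0.7, 0.7, 0.5, 0.5],
                      subgroups=[[0, 1], [2, 3]],   # e.g. dialog vs. effects
                      limits=[1.0, 0.6])            # limit effects more
print(mix.shape)
```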
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
85.
AUDIO STREAM MIXING WITH DIALOG LEVEL NORMALIZATION
A method for mixing of audio signals that allows maintaining of a consistent perceived sound level for the mixed signal by holding the sound level of the dominant signal in the mix constant by adjusting the sound level of the non-dominant signal(s) in relation to the dominant signal. It further includes receiving of a mixing balance input, which denotes the adjustable balance between the main and associated signals. It further includes identification of the dominant signal from the mixing balance input and mixing metadata, from which an appropriate scale factor for the non-dominant signal may also be determined directly from the scaling information, without the need for any analysis or measurement of the audio signals to be mixed. It further includes scaling the non-dominant signal in relation to the dominant signal and combining the scaled non-dominant signal with the dominant signal into a mixed signal.
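A hedged sketch of this metadata-driven mixing follows: the dominant signal's level is held constant and the non-dominant signal is scaled against it using only the balance input and mixing metadata, with no measurement of the signals. The metadata field names and the scale-factor rule are placeholders.

```python
# Sketch: identify the dominant stream from the balance input, derive a
# scale factor from metadata alone, and mix; all rules are placeholders.
import numpy as np

def mix_with_dialnorm(main, assoc, balance_db, meta):
    """balance_db > 0 favors the associated signal; metadata supplies each
    stream's coded level so no signal analysis is needed."""
    level_gap_db = meta["main_level_db"] - meta["assoc_level_db"]
    if balance_db >= 0:                       # associated signal dominates
        dominant, other = assoc, main
        scale_db = -balance_db + level_gap_db
    else:                                     # main signal dominates
        dominant, other = main, assoc
        scale_db = balance_db - level_gap_db
    return dominant + other * 10.0 ** (scale_db / 20.0)

rng = np.random.default_rng(6)
out = mix_with_dialnorm(rng.standard_normal(48), rng.standard_normal(48),
                        balance_db=-6.0,
                        meta={"main_level_db": -24.0, "assoc_level_db": -30.0})
print(out.shape)
```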
A method, an apparatus, a computer readable storage medium configured with instructions for carrying out a method, and logic encoded in one or more computer-readable tangible medium to carry out actions. The method is to decode audio data that includes N.n channels to M.m decoded audio channels, including unpacking metadata and unpacking and decoding frequency domain exponent and mantissa data; determining transform coefficients from the unpacked and decoded frequency domain exponent and mantissa data; inverse transforming the frequency domain data; and in the case M < N, downmixing according to downmixing data, the downmixing carried out efficiently.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
87.
AUDIO DECODER AND DECODING METHOD USING EFFICIENT DOWNMIXING
A method, an apparatus, a computer readable storage medium configured with instructions for carrying out a method, and logic encoded in one or more computer-readable tangible medium to carry out actions. The method is to decode audio data that includes N.n channels to M.m decoded audio channels, including unpacking metadata and unpacking and decoding frequency domain exponent and mantissa data; determining transform coefficients from the unpacked and decoded frequency domain exponent and mantissa data; inverse transforming the frequency domain data; and in the case M < N, downmixing according to downmixing data, the downmixing carried out efficiently.
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
88.
SYSTEM AND METHOD FOR NON-DESTRUCTIVELY NORMALIZING LOUDNESS OF AUDIO SIGNALS WITHIN PORTABLE DEVICES
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
94.
DECODING OF MULTICHANNEL AUDIO ENCODED BIT STREAMS USING ADAPTIVE HYBRID TRANSFORMATION
The processing efficiency of a process used to decode frames of an enhanced AC-3 bit stream is improved by processing each audio block in a frame only once. Audio blocks of encoded data are decoded in block order rather than in channel order. Exemplary decoding processes for enhanced bit stream coding features such as adaptive hybrid transform processing and spectral extension are disclosed.
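The loop-order change described above (each audio block visited once, rather than re-walking the frame once per channel) is illustrated by the toy sketch below; the decode step is a stub, and the frame layout is invented.

```python
# Toy sketch: decode audio blocks in block order rather than channel order,
# so each block in the frame is processed exactly once.
def decode_frame_block_order(frame):
    """frame: list of blocks, each block a dict of per-channel coded data."""
    pcm = {ch: [] for ch in frame[0]}
    for block in frame:                 # outer loop over blocks, once each
        for ch, coded in block.items():
            pcm[ch].extend(decode_block(coded))
    return pcm

def decode_block(coded):
    return [v * 0.5 for v in coded]     # placeholder for the real decode

frame = [{"L": [1, 2], "R": [3, 4]}, {"L": [5, 6], "R": [7, 8]}]
print(decode_frame_block_order(frame))
```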
Method and system for generating output signals for reproduction by two physical speakers in response to input audio signals indicative of sound from multiple source locations including at least two rear locations. Typically, the input signals are indicative of sound from three front locations and two rear locations (left and right surround sources). A virtualizer generates left and right surround outputs useful for driving front loudspeakers to emit sound that a listener perceives as emitting from rear sources. Typically, the virtualizer generates left and right surround outputs by transforming rear source inputs in accordance with a head-related transfer function. To ensure that virtual channels are well heard in the presence of other channels, the virtualizer performs dynamic range compression on rear source inputs. The dynamic range compression is preferably accomplished by amplifying rear source inputs or partially processed versions thereof in a nonlinear way relative to front source inputs.
H04S 3/02 - Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
96.
SYSTEMS AND METHODS FOR APPLYING ADAPTIVE GAMMA IN IMAGE PROCESSING FOR HIGH BRIGHTNESS AND HIGH DYNAMIC RANGE DISPLAYS
Systems and methods of image processing are provided for a display having a light source modulation layer and a display modulation layer. A section of a perceptual curve, such as a DICOM curve, is extracted for each frame of image data, based on a profile of expected luminance on the display modulation layer from light emitted by the light source modulation layer. The section of the perceptual curve may be used to determine a desired-total response curve which maps display modulation layer input control values to corresponding output luminance values. The desired-total response curve and a display modulator-specific response curve may be applied to image data to generate control values for driving the display modulation layer.
G09G 3/36 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix by control of light from an independent source using liquid crystals
G09G 3/34 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix by control of light from an independent source
97.
ENHANCING THE REPRODUCTION OF MULTIPLE AUDIO CHANNELS
This invention relates to the field of multichannel audio. More particularly, the invention relates to methods and apparatus for enhancing the reproduction of multiple audio channels. The channels include channels intended for playback to the front of a listening area and channels intended for playback to the sides and/or rear of the listening area. The methods comprise extracting out-of-phase sound information from a pair of the channels intended for playback to the sides or rear sides of the listening area, and applying the out-of-phase sound information to one or more loudspeakers located above loudspeakers playing back channels intended for playback to the front of the listening area.
In one embodiment the present invention includes a method of improving audibility of speech in a multi-channel audio signal. The method includes comparing a first characteristic and a second characteristic of the multi-channel audio signal to generate an attenuation factor. The first characteristic corresponds to a first channel of the multi-channel audio signal that contains speech and non-speech audio, and the second characteristic corresponds to a second channel of the multi-channel audio signal that contains predominantly non-speech audio. The method further includes adjusting the attenuation factor according to a speech likelihood value to generate an adjusted attenuation factor. The method further includes attenuating the second channel using the adjusted attenuation factor.
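The three steps in this abstract (compare channel characteristics to get an attenuation factor, adjust it by a speech-likelihood value, attenuate the non-speech channel) map onto a short sketch. The power comparison and the 15 dB target margin below are invented parameters, not those of the patented method.

```python
# Sketch: duck the predominantly non-speech channel, scaled back by how
# confident the system is that the first channel actually carries speech.
import numpy as np

def duck_non_speech(speech_ch, other_ch, speech_likelihood, margin_db=15.0):
    p_speech = 10 * np.log10(np.mean(speech_ch ** 2) + 1e-12)
    p_other = 10 * np.log10(np.mean(other_ch ** 2) + 1e-12)
    atten_db = max(0.0, p_other - (p_speech - margin_db))  # raw factor
    atten_db *= speech_likelihood      # adjust: duck less when unsure
    return other_ch * 10.0 ** (-atten_db / 20.0)

rng = np.random.default_rng(7)
quieted = duck_non_speech(0.1 * rng.standard_normal(480),
                          0.5 * rng.standard_normal(480),
                          speech_likelihood=0.8)
print(quieted.shape)
```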
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
99.
METHOD AND APPARATUS FOR MAINTAINING SPEECH AUDIBILITY IN MULTI-CHANNEL AUDIO WITH MINIMAL IMPACT ON SURROUND EXPERIENCE
In one embodiment the present invention includes a method of improving audibility of speech in a multi-channel audio signal. The method includes comparing a first characteristic and a second characteristic of the multi-channel audio signal to generate an attenuation factor. The first characteristic corresponds to a first channel of the multi-channel audio signal that contains speech and non-speech audio, and the second characteristic corresponds to a second channel of the multi-channel audio signal that contains predominantly non-speech audio. The method further includes adjusting the attenuation factor according to a speech likelihood value to generate an adjusted attenuation factor. The method further includes attenuating the second channel using the adjusted attenuation factor.
Luminosity of individual LED light sources is measured and a forward voltage control of each LED is set so that each LED has a pre-determined (e.g., uniform) luminosity at a same modulation level. The LEDs are then driven via a modulation technique such as PWM, PCM, polyphase, etc. according to lighting requirements. The LEDs are, for example, a backlight of a dual modulation HDR LCD display system, and the lighting requirements are local dimming signals for the display.
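As a closing illustration, the calibration described above can be sketched under a deliberately simplified assumption that luminosity scales linearly with the forward-voltage control; real LED behavior is nonlinear and the values below are invented.

```python
# Hypothetical sketch: equalize per-LED luminosity via forward-voltage
# settings, then drive the backlight with a PWM duty from dimming signals.
def calibrate_forward_voltages(measured_lum, target_lum, v_nominal=3.0):
    """Scale each LED's forward-voltage control so that, at the same
    modulation level, every LED hits target_lum (linear model assumed)."""
    return [v_nominal * target_lum / lum for lum in measured_lum]

def pwm_duty(dimming_level, max_level=255):
    """Local-dimming signal -> PWM duty cycle in [0, 1]."""
    return dimming_level / max_level

leds = [430.0, 510.0, 470.0]            # measured luminosity per LED
print(calibrate_forward_voltages(leds, target_lum=450.0))
print(pwm_duty(128))
```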