A method of modelling extended audio objects for audio rendering in a virtual or augmented reality environment is described. The method comprises obtaining an extent representation indicative of a geometric form of an extended audio object and information relating to one or more first audio sources that are associated with the extended audio object. Furthermore, the method comprises obtaining a relative point on the geometric form of the extended audio object based on a user position in the virtual or augmented reality environment. The method also comprises determining an extent parameter for the extent representation based on the user position and the relative point and determining positions of one or more second audio sources, relative to the user position, for modelling the extended audio object. In addition, the method comprises outputting a modified representation of the extended audio object for modelling the extended audio object.
A method for encoding envelope information is provided. In some implementations, the method involves determining a first downmixed signal associated with a downmixed channel associated with an audio signal to be encoded. In some implementations, the method involves determining energy levels of the first downmixed signal for a plurality of frequency bands. In some implementations, the method involves determining whether to encode information indicative of the energy levels in a bitstream. In some implementations, the method involves encoding the determined energy levels. In some implementations, the method involves generating an energy control value indicating that energy levels are encoded. In some implementations, the method involves generating the bitstream, wherein the energy control value and the information indicative of the energy levels are usable by a decoder to adjust energy levels associated with the first downmixed signal.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
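The per-band energy determination and encode/skip decision described above can be sketched as follows. This is a toy illustration only: the magnitude-spectrum input, the band edges, and the 1.5 dB change threshold are assumptions, not details taken from the abstract.

```python
import math

def band_energies(spectrum, band_edges):
    """Mean energy (dB) per frequency band of a downmixed signal.

    Hypothetical illustration: `band_edges` are (start, stop) bin-index
    pairs into a magnitude spectrum; a real codec would operate on an
    MDCT/QMF analysis rather than this toy magnitude list."""
    energies = []
    for start, stop in band_edges:
        e = sum(m * m for m in spectrum[start:stop]) / max(stop - start, 1)
        energies.append(10.0 * math.log10(e + 1e-12))
    return energies

def should_encode(energies, prev_energies, threshold_db=1.5):
    """Encode envelope info only when it changed audibly since the last frame."""
    if prev_energies is None:
        return True  # first frame: nothing to reuse at the decoder
    return any(abs(a - b) > threshold_db for a, b in zip(energies, prev_energies))

spectrum = [1.0, 0.8, 0.5, 0.4, 0.1, 0.05]  # toy magnitude spectrum
bands = [(0, 2), (2, 4), (4, 6)]
env = band_energies(spectrum, bands)
ctrl = should_encode(env, None)             # energy control value for frame 1
```

The encode/skip decision rule here is a plausible stand-in; the abstract only states that the method "determines whether to encode" the energy information.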
3.
MULTI-BAND DUCKING OF AUDIO SIGNALS
A method for multi-band ducking of audio signals is provided. In some implementations, the method involves receiving, at a decoder, an input audio signal, wherein the input audio signal is a downmixed audio signal. In some implementations, the method involves separating the input audio signal into a first set of frequency bands. In some implementations, the method involves determining a set of ducking gains, a ducking gain corresponding to a frequency band of the first set of frequency bands. In some implementations, the method involves generating at least one broadband decorrelated audio signal, wherein ducking gains of the set of ducking gains are applied to at least one of: 1) a second set of frequency bands prior to generating the at least one broadband decorrelated audio signal; or 2) a third set of frequency bands into which the at least one broadband decorrelated audio signal is separated.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
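The core operation of applying one ducking gain per frequency band can be sketched as follows. The band-signal representation and the example gain values are assumptions for illustration; the abstract does not specify how the gains are derived.

```python
def apply_ducking(band_signals, ducking_gains):
    """Apply one ducking gain per frequency band (hypothetical sketch;
    a real system derives the gains from signal analysis and may apply
    them before or after decorrelation, per the abstract)."""
    assert len(band_signals) == len(ducking_gains)
    return [[g * x for x in band] for band, g in zip(band_signals, ducking_gains)]

# Toy example: two bands, duck the low band to half amplitude.
bands = [[1.0, -1.0, 0.5], [0.2, 0.2, 0.2]]
gains = [0.5, 1.0]
ducked = apply_ducking(bands, gains)
```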
4.
PROJECTION SYSTEM AND METHOD OF DRIVING A PROJECTION SYSTEM WITH FIELD MAPPING
A projection system includes a light source configured to emit light in response to image data, a phase light modulator configured to receive the light from the light source and to apply a spatially-varying phase modulation to the light, thereby generating a projection light and steering the light onto a reconstruction field, wherein the reconstruction field is a complex plane on which a reconstruction image is formed, and a controller configured to control the light source, control the phase light modulator, initialize (401) the reconstruction field to an initial value, and iteratively, for each of a plurality of subframes within a frame of the image data: set (402) the reconstruction field to the initial value for the first iteration or to a subsequent-iteration reconstruction field value for any subsequent iteration, map (403) the reconstruction field to a modulation field, wherein the modulation field is a complex plane of the phase light modulator which modulates a phase of the light, set (404) an amplitude of the modulation field to a predetermined value, and map (405) the modulation field, with the amplitude set to the predetermined value, to a subsequent-iteration reconstruction field, wherein the controller is further configured to provide (408) a phase control signal based on the modulation field mapped in the last iteration to the phase light modulator.
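The iterative loop described above closely resembles Gerchberg-Saxton phase retrieval. A minimal sketch follows, assuming (purely for illustration) that the mapping between the reconstruction field and the modulation field is a discrete Fourier transform pair and that the predetermined modulator amplitude is unity; the actual optical mapping in such a system would differ.

```python
import numpy as np

def phase_retrieval(target_amplitude, n_iters=20, seed=0):
    """Gerchberg-Saxton-style iteration mirroring the described loop:
    initialize the reconstruction field, map it to the modulator plane
    (inverse FFT as an assumed stand-in for the optical mapping), force
    the amplitude to 1 (phase-only modulator), and map back."""
    rng = np.random.default_rng(seed)
    field = target_amplitude * np.exp(
        1j * rng.uniform(0.0, 2.0 * np.pi, target_amplitude.shape))
    for _ in range(n_iters):
        modulation = np.fft.ifft2(field)                # reconstruction -> modulator plane
        modulation = np.exp(1j * np.angle(modulation))  # amplitude set to predetermined value (1)
        field = np.fft.fft2(modulation)                 # modulator -> reconstruction plane
        field = target_amplitude * np.exp(1j * np.angle(field))  # impose target amplitude
    return np.angle(modulation)                         # phase control signal

target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0  # toy target reconstruction image
phase = phase_retrieval(target)
```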
A method for performing gain control on audio signals is provided. In some implementations, the method involves determining downmixed signals associated with one or more downmix channels associated with a current frame of an audio signal to be encoded. In some implementations, the method involves determining whether an overload condition exists for an encoder. In some implementations, the method involves determining a gain parameter. In some implementations, the method involves determining at least one gain transition function based on the gain parameter and a gain parameter associated with a preceding frame of the audio signal. In some implementations, the method involves applying the at least one gain transition function to one or more of the downmixed signals. In some implementations, the method involves encoding the downmixed signals in connection with information indicative of gain control applied to the current frame.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/22 - Mode decision, i.e. based on audio signal content versus external parameters
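A gain transition function bridging the preceding frame's gain and the current frame's gain can be sketched as a per-sample ramp. The linear shape and the example values are assumptions; the abstract does not specify the transition shape.

```python
def gain_transition(prev_gain, gain, frame_len):
    """Linear per-sample ramp from the previous frame's gain parameter
    to the current one (one plausible transition function; hypothetical
    illustration, not the codec's actual shape)."""
    if frame_len == 1:
        return [gain]
    step = (gain - prev_gain) / (frame_len - 1)
    return [prev_gain + i * step for i in range(frame_len)]

def apply_gain(samples, prev_gain, gain):
    """Apply the transition function to one downmixed signal frame."""
    ramp = gain_transition(prev_gain, gain, len(samples))
    return [g * x for g, x in zip(ramp, samples)]

# Overload detected: attenuate from unity to 0.5 across a 4-sample frame.
out = apply_gain([1.0, 1.0, 1.0, 1.0], 1.0, 0.5)
```

Ramping across the frame avoids the audible discontinuity that a hard gain switch at the frame boundary would cause.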
Described is a method of audio processing in a HbbTV terminal device. The method includes receiving a decoded broadcast feed including a first audio track, receiving HbbTV content relating to the broadcast feed, the HbbTV content including a second audio track, extracting level-related information from the decoded broadcast feed, wherein the level-related information is embedded in the decoded broadcast feed and enables obtaining an indication of an original audio level of the first audio track, analyzing the first audio track for determining an actual audio level of the first audio track, determining a gain factor based on the actual audio level and the original audio level, and generating a third audio track for output by the HbbTV terminal device based on the first audio track, the second audio track, and the gain factor. Also described is an apparatus for carrying out the method, as well as corresponding programs and computer-readable storage media.
H04N 21/462 - Content or additional data management e.g. creating a master electronic program guide from data received from the Internet and a Head-end or controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
H04H 20/10 - Arrangements for replacing or switching information during the broadcast or during the distribution
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/458 - Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules
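The gain-factor step described above (restoring the first track toward its original level before combining tracks) can be sketched in a few lines. The dB-domain level measure is an assumption; the abstract does not name a specific loudness measure such as LUFS.

```python
def gain_factor_db(original_level_db, actual_level_db):
    """Gain (dB) that restores the first audio track to its original
    level, from the embedded level-related information and the measured
    actual level (hypothetical sketch of the step in the abstract)."""
    return original_level_db - actual_level_db

def db_to_linear(db):
    """Convert a dB gain to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

# Track measured 6 dB quieter than its embedded original level:
g = db_to_linear(gain_factor_db(-23.0, -29.0))
```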
A projection system and method includes a light source configured to emit light in response to image data; a phase light modulator configured to receive the light from the light source and to apply a spatially-varying phase modulation to the light, thereby steering the light and generating a projection light; and a controller configured to dynamically determine, based on at least one of a user input or a sensor signal, a target geometry of a projection surface on which the projection light is projected, determine, based on the target geometry, a phase configuration for a frame of the image data, and provide a phase control signal to the phase light modulator, the phase control signal configured to cause the phase light modulator to generate the projection light in accordance with the phase configuration for the frame.
Disclosed herein are methods, systems, and computer program products for segmenting a binaural recording of speech into parts containing self-speech and parts containing external speech, and processing each category with different settings, to obtain an enhanced overall presentation. The segmentation is based on a combination of: i) feature-based frame-by-frame classification, and ii) detecting dissimilarity by statistical methods. The segmentation information is then used by a speech enhancement chain, where independent settings are used to process the self- and external speech parts.
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 25/87 - Detection of discrete points within a voice signal
A method of audio processing includes performing spatial analysis on a binaural signal to estimate level differences and phase differences characteristic of a binaural filter of the binaural signal, and performing object extraction on the binaural signal using the estimated level and phase differences to generate a left/right main component signal and a left/right residual component signal. The system may process the left/right main and left/right residual components differently using different object processing parameters, e.g. for repositioning, equalization, compression, upmixing, channel remapping or storage, to generate a processed binaural signal that provides an improved listening experience. Repositioning may be based on head tracking sensor data.
H04S 5/00 - Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
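The spatial-analysis step above, estimating per-bin level and phase differences between the left and right channels, can be sketched as follows. An STFT front end producing complex spectra is assumed; the abstract does not fix the analysis transform.

```python
import cmath
import math

def level_and_phase_differences(left_bins, right_bins):
    """Per-bin interaural level difference (dB) and phase difference
    (radians) from complex spectra of a binaural signal (sketch of the
    spatial-analysis step; the small epsilon guards empty bins)."""
    ilds, ipds = [], []
    for l, r in zip(left_bins, right_bins):
        ilds.append(20.0 * math.log10((abs(l) + 1e-12) / (abs(r) + 1e-12)))
        ipds.append(cmath.phase(l * r.conjugate()))
    return ilds, ipds

# Bin 0: left twice as loud; bin 1: left leads right by 90 degrees.
left = [2.0 + 0.0j, 1.0j]
right = [1.0 + 0.0j, 1.0 + 0.0j]
ilds, ipds = level_and_phase_differences(left, right)
```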
10.
METHOD AND APPARATUS FOR PROCESSING OF AUDIO DATA USING A PRE-CONFIGURED GENERATOR
Described herein is a method for setting up a decoder for generating processed audio data from an audio bitstream, the decoder comprising a Generator of a Generative Adversarial Network, GAN, for processing of the audio data, wherein the method includes the steps of (a) pre-configuring the Generator for processing of audio data with a set of parameters for the Generator, the parameters being determined by training, at training time, the Generator using the full concatenated distribution; and (b) pre-configuring the decoder to determine, at decoding time, a truncation mode for modifying the concatenated distribution and to apply the determined truncation mode to the concatenated distribution. Described are further a method of generating processed audio data from an audio bitstream using a Generator of a Generative Adversarial Network, GAN, for processing of the audio data and a respective apparatus. Moreover, described are also respective systems and computer program products.
G10L 19/005 - Correction of errors induced by the transmission channel, if related to the coding algorithm
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G06N 3/04 - Architecture, e.g. interconnection topology
Some methods may involve receiving a first content stream that includes first audio signals, rendering the first audio signals to produce first audio playback signals, generating first calibration signals, generating first modified audio playback signals by inserting the first calibration signals into the first audio playback signals, and causing a loudspeaker system to play back the first modified audio playback signals, to generate first audio device playback sound. The method(s) may involve receiving microphone signals corresponding to at least the first audio device playback sound and to second through Nth audio device playback sound corresponding to second through Nth modified audio playback signals (including second through Nth calibration signals) played back by second through Nth audio devices, extracting second through Nth calibration signals from the microphone signals and estimating at least one acoustic scene metric based, at least partly, on the second through Nth calibration signals.
Some methods involve causing a plurality of audio devices in an audio environment to reproduce audio data, each audio device of the plurality of audio devices including at least one loudspeaker and at least one microphone, determining audio device location data including an audio device location for each audio device of the plurality of audio devices and obtaining microphone data from each audio device of the plurality of audio devices. Some methods involve determining a mutual audibility for each audio device of the plurality of audio devices relative to each other audio device of the plurality of audio devices, determining a user location of a person in the audio environment, determining a user location audibility of each audio device of the plurality of audio devices at the user location and controlling one or more aspects of audio device playback based, at least in part, on the user location audibility.
Disclosed is an audio signal encoding/decoding method that uses an encoding downmix strategy applied at an encoder that is different from a decoding re-mix/upmix strategy applied at a decoder. Based on the type of downmix coding scheme, the method comprises: computing input downmixing gains to be applied to the input audio signal to construct a primary downmix channel; determining downmix scaling gains to scale the primary downmix channel; generating prediction gains based on the input audio signal, the input downmixing gains and the downmix scaling gains; determining residual channel(s) from the side channels by using the primary downmix channel and the prediction gains to generate side channel predictions and subtracting the side channel predictions from the side channels; determining decorrelation gains based on energy in the residual channels; encoding the primary downmix channel, the residual channel(s), the prediction gains and the decorrelation gains into a bitstream; and sending the bitstream to a decoder.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 5/00 - Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
G10L 19/24 - Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
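The prediction/residual step above can be sketched for real-valued sample vectors: predict each side channel as a scaled copy of the primary downmix channel and keep the residual. The least-squares gain choice and the flat (band-free) signal model are illustrative assumptions.

```python
def encode_step(primary, side_channels):
    """Sketch of the prediction/residual step: for each side channel,
    find a prediction gain g (least-squares fit, an assumption), predict
    the side channel as g * primary, and keep the residual."""
    prediction_gains, residuals = [], []
    energy = sum(x * x for x in primary) + 1e-12  # guard against silence
    for side in side_channels:
        g = sum(p * s for p, s in zip(primary, side)) / energy
        prediction_gains.append(g)
        residuals.append([s - g * p for p, s in zip(primary, side)])
    return prediction_gains, residuals

primary = [1.0, 2.0, 3.0]
side = [[2.0, 4.0, 6.0]]  # perfectly predictable side channel
gains, residuals = encode_step(primary, side)
```

When the side channel is well predicted, the residual carries little energy, which is what makes transmitting gains plus residuals cheaper than transmitting the side channel directly.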
A method may involve: receiving direction of arrival (DOA) data corresponding to sound emitted by at least a first smart audio device of the audio environment that includes a first audio transmitter and a first audio receiver, the DOA data corresponding to sound received by at least a second smart audio device of the audio environment that includes a second audio transmitter and a second audio receiver, the DOA data corresponding to sound emitted by at least the second smart audio device and received by at least the first smart audio device; receiving one or more configuration parameters corresponding to the audio environment, to one or more audio devices, or both; and minimizing a cost function based at least in part on the DOA data and the configuration parameter(s), to estimate a position and an orientation of at least the first smart audio device and the second smart audio device.
Method for encoding scene-based audio is provided. In some implementations, the method involves determining, by an encoder, a spatial direction of a dominant sound component in a frame of an input audio signal. In some implementations, the method involves determining rotation parameters based on the determined spatial direction and a direction preference of a coding scheme to be used to encode the input audio signal. In some implementations, the method involves rotating sound components of the frame based on the rotation parameters such that, after being rotated, the dominant sound component has a spatial direction that aligns with the direction preference of the coding scheme. In some implementations, the method involves encoding the rotated sound components of the frame of the input audio signal using the coding scheme in connection with an indication of the rotation parameters or an indication of the spatial direction of the dominant sound component.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
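The rotation step above, aligning the dominant sound direction with the coding scheme's preferred direction, can be sketched in two dimensions on the X/Y first-order ambisonic components. The 2-D restriction and the x-axis as the assumed direction preference are simplifications; a real scene-based codec would use full 3-D rotations.

```python
import math

def rotation_to_align(azimuth):
    """2-D rotation matrix that maps a dominant sound direction at
    `azimuth` (radians) onto the x-axis (the assumed direction
    preference of the coding scheme)."""
    c, s = math.cos(-azimuth), math.sin(-azimuth)
    return [[c, -s], [s, c]]

def rotate_xy(x, y, rot):
    """Apply the rotation to the X/Y sound components of one frame."""
    return (rot[0][0] * x + rot[0][1] * y,
            rot[1][0] * x + rot[1][1] * y)

# Dominant component arriving from 90 degrees rotates onto the x-axis:
rot = rotation_to_align(math.pi / 2)
x, y = rotate_xy(0.0, 1.0, rot)
```

The decoder can invert the rotation from the transmitted rotation parameters (or the signalled spatial direction), so the operation is lossless apart from quantization.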
16.
AUTOMATIC GENERATION AND SELECTION OF TARGET PROFILES FOR DYNAMIC EQUALIZATION OF AUDIO CONTENT
In an embodiment, a method comprises: filtering reference audio content items to separate the reference audio content items into different frequency bands; for each frequency band, extracting a first feature vector from at least a portion of each of the reference audio content items, wherein the first feature vector includes at least one audio characteristic of the reference audio content items; obtaining at least one semantic label from at least a portion of each of the reference audio content items; obtaining a second feature vector consisting of the first feature vectors per frequency band and the at least one semantic label; generating, based on the second feature vector, cluster feature vectors representing centroids of clusters; separating the reference audio content items according to the cluster feature vectors; and computing an average target profile for each cluster based on the reference audio content items in the cluster.
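The clustering step above, grouping per-item feature vectors and averaging a target profile per cluster, can be sketched with a tiny k-means. Initializing centroids from the first k items and the toy 2-D features are assumptions for illustration; the abstract does not specify the clustering algorithm's details.

```python
def kmeans(vectors, k, iters=10):
    """Tiny k-means over per-item feature vectors (audio characteristics
    per band plus an encoded semantic label, per the abstract). The
    returned centroids play the role of cluster feature vectors, and the
    per-cluster mean is the average target profile."""
    centroids = [list(v) for v in vectors[:k]]  # assumed initialization
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            d = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            groups[d.index(min(d))].append(v)
        for i, g in enumerate(groups):
            if g:  # centroid = mean of assigned vectors
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids, groups

# Four toy feature vectors forming two obvious clusters:
feats = [[0.0, 0.1], [5.0, 5.1], [0.1, 0.0], [5.1, 5.0]]
centroids, clusters = kmeans(feats, 2)
```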
Described herein is a computer-implemented deep-learning-based system for determining an indication of an audio quality of an input audio frame. The system comprises at least one inception block configured to receive at least one representation of an input audio frame and to map the at least one representation of the input audio frame into a feature map; and at least one fully connected layer configured to receive a feature map corresponding to the at least one representation of the input audio frame from the at least one inception block, wherein the at least one fully connected layer is configured to determine the indication of the audio quality of the input audio frame. Described are further respective methods of operating and training said system.
G10L 25/60 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G06N 3/04 - Architecture, e.g. interconnection topology
18.
SIGNAL CODING USING A GENERATIVE MODEL AND LATENT DOMAIN QUANTIZATION
The present disclosure provides a decoder configured to receive a finite bitrate stream that includes a quantized latent frame, where the quantized latent frame includes a quantized representation of a current frame of a signal in a latent domain different from a first domain; to generate a reconstructed latent frame from the quantized latent frame; to use a generative neural network model to perform a task for which the generative neural network model has been trained, wherein the task includes to generate parameters for an invertible mapping from the latent domain to the first domain; to reconstruct a current frame of the signal in the first domain, which includes to map the reconstructed latent frame to the first domain by use of the invertible mapping, and to use the reconstructed current frame of the signal in the first domain to update a state of the generative neural network model.
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G06N 3/04 - Architecture, e.g. interconnection topology
19.
A GENERATIVE NEURAL NETWORK MODEL FOR PROCESSING AUDIO SAMPLES IN A FILTER-BANK DOMAIN
A neural network system is provided, implementing a generative model for autoregressively generating a probability distribution for a plurality of current filter-bank samples of an audio signal, wherein the current samples correspond to a current time slot, and each current sample corresponds to a channel of the filter-bank. The system includes a hierarchy of a plurality of neural network processing tiers ordered from a top to a bottom tier, each tier trained to generate conditioning information based on previous filter-bank samples and, for at least each tier but the top tier, also on the conditioning information from a tier higher up in the hierarchy, and an output stage trained to generate the probability distribution based on previous samples for one or more previous time slots and the conditioning information from the lowest processing tier.
G06N 3/04 - Architecture, e.g. interconnection topology
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
A neural network system for predicting frequency coefficients of a media signal, the neural network system comprising a time predicting portion including at least one neural network trained to predict a first set of output variables representing a specific frequency band of a current time frame given coefficients of one or several previous time frames, and a frequency predicting portion including at least one neural network trained to predict a second set of output variables representing a specific frequency band given coefficients of one or several frequency bands adjacent to the specific frequency band in said current time frame. Such a neural network system forms a predictor capable of capturing both temporal and frequency dependencies occurring in time-frequency tiles of a media signal.
G10L 19/04 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
G10L 21/038 - Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
Described is a method of training a deep-learning-based system for sound source separation. The system comprises a separation stage for frame-wise extraction of representations of sound sources from a representation of an audio signal, and a clustering stage for generating, for each frame, a vector indicative of an assignment permutation of extracted frames of representations of sound sources to respective sound sources. The representation of the audio signal is a waveform-based representation. The separation stage is trained using frame-level permutation invariant training. Further, the clustering stage is trained to generate embedding vectors for the frames of the audio signal that allow determination of estimates of respective assignment permutations between extracted sound signals and labels of sound sources that had been used for the frames. Also described is a method of using the deep-learning-based system for sound source separation.
G10L 21/0308 - Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
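The frame-level permutation invariant training criterion named above can be sketched as follows: score every assignment of estimated sources to reference sources and keep the best one. The mean-squared-error distance is an assumption; the abstract does not name the per-source loss.

```python
from itertools import permutations

def pit_loss(estimates, references):
    """Permutation invariant training loss for one frame: mean-squared
    error under the best assignment of estimated sources to reference
    sources (sketch of the training criterion named in the abstract).
    Brute-force over permutations, which is fine for few sources."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    best = None
    for perm in permutations(range(len(references))):
        loss = sum(mse(estimates[i], references[p]) for i, p in enumerate(perm))
        best = loss if best is None or loss < best else best
    return best / len(references)

refs = [[1.0, 0.0], [0.0, 1.0]]
ests = [[0.0, 1.0], [1.0, 0.0]]  # sources swapped: PIT loss is still zero
loss = pit_loss(ests, refs)
```

Because the loss is minimized over permutations, the network is not penalized for emitting the sources in a different order than the labels, which is exactly the ambiguity the clustering stage then resolves.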
The present disclosure relates to a method and system for performing packet loss concealment using a neural network system. The method comprises obtaining a representation of an incomplete audio signal, inputting the representation of the incomplete audio signal to an encoder neural network and outputting a latent representation of a predicted complete audio signal. The latent representation is input to a decoder neural network which outputs a representation of a predicted complete audio signal comprising a reconstruction of the lost portion of the complete audio signal, wherein said encoder neural network and said decoder neural network have been trained with an adversarial neural network.
G10L 19/005 - Correction of errors induced by the transmission channel, if related to the coding algorithm
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
23.
METHOD AND APPARATUS FOR GENERATING AN INTERMEDIATE AUDIO FORMAT FROM AN INPUT MULTICHANNEL AUDIO SIGNAL
Described herein is a method for training a machine learning algorithm. The method may comprise receiving a first input multichannel audio signal. The method may comprise generating, using the machine learning algorithm, an intermediate audio signal based on the first input multichannel audio signal. The method may comprise rendering the intermediate audio signal into a first output multichannel audio signal. Further, the method may comprise improving the machine learning algorithm based on a difference between the first input multichannel audio signal and the first output multichannel audio signal. Described herein are further an apparatus for generating an intermediate audio format from an input multichannel audio signal as well as a respective computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
24.
METHOD AND APPARATUS FOR NEURAL NETWORK BASED PROCESSING OF AUDIO USING SINUSOIDAL ACTIVATION
Described herein is a method of processing an audio signal using a deep-learning-based generator, wherein the method includes the steps of: (a) inputting the audio signal into the generator for processing the audio signal; (b) mapping a time segment of the audio signal to a latent feature space representation, using an encoder stage of the generator; (c) upsampling the latent feature space representation using a decoder stage of the generator, wherein at least one layer of the decoder stage applies sinusoidal activation; and (d) obtaining, as an output from the decoder stage of the generator, a processed audio signal. Described are further a method for training said generator and respective apparatus, systems and computer program products.
In some embodiments, a method comprises: dividing, using at least one processor, an audio input into speech and non-speech segments; for each frame in each non-speech segment, estimating, using the at least one processor, a time-varying noise spectrum of the non-speech segment; for each frame in each speech segment, estimating, using the at least one processor, a speech spectrum of the speech segment; for each frame in each speech segment, identifying one or more non-speech frequency components in the speech spectrum; comparing the one or more non-speech frequency components with one or more corresponding frequency components in a plurality of estimated noise spectra and selecting the estimated noise spectrum from the plurality of estimated noise spectra based on a result of the comparing.
The present invention relates to a method and device for processing a first and a second audio signal representing an input binaural audio signal acquired by a binaural recording device. The present invention further relates to a method for rendering a binaural audio signal on a speaker system. The method for processing a binaural signal comprises extracting audio information from the first audio signal, computing band gains for reducing noise in the first audio signal, and applying the band gains to respective frequency bands of the first audio signal in accordance with a dynamic scaling factor, to provide a first output audio signal. The dynamic scaling factor has a value between zero and one and is selected so as to reduce quality degradation of the first audio signal.
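Applying band gains in accordance with a scaling factor in [0, 1] can be sketched as a blend between unity gain and the full noise-reduction gains. The linear blend is one plausible reading of "in accordance with"; the abstract does not fix the exact rule.

```python
def apply_band_gains(band_signals, band_gains, scaling):
    """Blend noise-reduction band gains with unity according to a
    dynamic scaling factor in [0, 1]: 0 leaves the signal untouched,
    1 applies the full gains (hypothetical sketch of the step in the
    abstract)."""
    assert 0.0 <= scaling <= 1.0
    out = []
    for band, g in zip(band_signals, band_gains):
        eff = (1.0 - scaling) * 1.0 + scaling * g  # effective gain per band
        out.append([eff * x for x in band])
    return out

bands = [[1.0, 1.0], [1.0, 1.0]]
half = apply_band_gains(bands, [0.2, 0.8], 0.5)  # halfway blend
```

Scaling the gains back toward unity trades residual noise for fewer processing artifacts, which matches the stated goal of reducing quality degradation.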
A method comprising receiving a first input bit stream for a first parametrically coded input audio signal, the first input bit stream including data representing a first input core audio signal and a first set including at least one spatial parameter relating to the first parametrically coded input audio signal. A first covariance matrix of the first parametrically coded audio signal is determined based on the spatial parameter(s) of the first set. A modified set including at least one spatial parameter is determined based on the determined first covariance matrix, wherein the modified set is different from the first set. An output core audio signal is determined, which is based on, or constituted by, the first input core audio signal. An output bit stream for a parametrically coded output audio signal is generated, the output bit stream including data representing the output core audio signal and the modified set.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Described is a method of performing automatic audio enhancement on an input audio signal including at least one speech-articulation noise event. The method comprises: segmenting the input audio signal into a number of audio frames; obtaining at least one feature parameter from the audio frames; and determining, based at least in part on the obtained feature parameter, a respective type of the speech-articulation noise event and a respective time-frequency range associated with the speech-articulation noise event within the input audio signal.
G10L 15/04 - Segmentation; Word boundary detection
G10L 21/0264 - Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
G10L 25/93 - Discriminating between voiced and unvoiced parts of speech signals
G10L 21/0308 - Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
G10L 25/09 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being zero crossing rates
G10L 25/21 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being power information
G10L 25/24 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being the cepstrum
G10L 25/84 - Detection of presence or absence of voice signals for discriminating voice from noise
G10L 21/0316 - Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
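The front end of the detection pipeline described above (segmentation plus feature parameters) can be sketched as follows; the two features, zero-crossing rate and log power, are chosen to match the parameter types named in the classification codes and are illustrative only:

```python
import numpy as np

def segment(signal, frame_len, hop):
    """Split the input audio signal into overlapping analysis frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def frame_features(frame):
    """Two classic feature parameters a noise-event classifier could use:
    zero-crossing rate and log power (assumed choices)."""
    frame = np.asarray(frame, dtype=float)
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
    power = float(np.log10(np.mean(frame ** 2) + 1e-12))
    return zcr, power
```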
29.
HUM NOISE DETECTION AND REMOVAL FOR SPEECH AND MUSIC RECORDINGS
Described are methods of processing audio data for hum noise detection and/or removal. The audio data comprises a plurality of frames. One method includes: classifying frames of the audio data as either content frames or noise frames, using one or more content activity detectors; determining a noise spectrum from one or more frames of the audio data that are classified as noise frames; determining one or more hum noise frequencies based on the determined noise spectrum; generating an estimated hum noise signal based on the one or more hum noise frequencies; and removing hum noise from at least one frame of the audio data based on the estimated hum noise signal. Also described are apparatus for carrying out the methods, as well as corresponding programs and computer-readable storage media.
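The detection and removal steps can be sketched spectrally; averaging the noise-frame spectrum, picking peaks, and zeroing the peak bins are crude stand-ins for the abstract's hum-signal estimation and subtraction:

```python
import numpy as np

def detect_hum_bins(noise_frames, n_fft, n_peaks=2):
    """Average the magnitude spectrum over frames classified as noise and
    pick the strongest bins as hum candidates (simplified sketch)."""
    spec = np.zeros(n_fft // 2 + 1)
    for f in noise_frames:
        spec += np.abs(np.fft.rfft(f, n_fft))
    return np.argsort(spec)[-n_peaks:]

def remove_hum(frame, hum_bins, n_fft):
    """Zero the detected hum bins (a crude stand-in for subtracting an
    estimated hum noise signal) and resynthesize the frame."""
    spec = np.fft.rfft(frame, n_fft)
    spec[list(hum_bins)] = 0.0
    return np.fft.irfft(spec, n_fft)[: len(frame)]
```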
Described are methods of processing an audio signal for packet loss concealment. The audio signal comprises a sequence of frames, each frame containing representations of a plurality of audio channels and reconstruction parameters for upmixing the plurality of audio channels to a predefined channel format. One method includes: receiving the audio signal; and generating a reconstructed audio signal in the predefined channel format based on the received audio signal. Generating the reconstructed audio signal comprises: determining whether at least one frame of the audio signal has been lost; and, if a number of consecutively lost frames exceeds a first threshold, fading the reconstructed audio signal to a predefined spatial configuration. Also described is a method of encoding an audio signal. Yet further described are apparatus for carrying out the methods, as well as corresponding programs and computer-readable storage media.
G10L 19/005 - Correction of errors induced by the transmission channel, if related to the coding algorithm
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
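The fade logic of the concealment method above can be sketched as a per-frame gain; the linear fade shape and its length are assumptions, since the abstract only requires fading once the threshold is exceeded:

```python
def fade_gain(consecutive_lost, first_threshold, fade_frames):
    """Per-frame gain toward a predefined spatial configuration: unity
    while the number of consecutively lost frames is at or below the
    first threshold, then a linear fade over fade_frames frames
    (assumed fade shape)."""
    if consecutive_lost <= first_threshold:
        return 1.0
    steps = consecutive_lost - first_threshold
    return max(0.0, 1.0 - steps / fade_frames)
```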
A deep-learning-based system for performing automated multitrack mixing based on a plurality of input audio tracks is described herein. The system comprises one or more instances of a deep-learning-based first network and one or more instances of a deep-learning-based second network. Particularly, the first network is configured to, based on the input audio tracks, generate parameters for use in the automated multitrack mixing. The second network is configured to, based on the parameters, apply signal processing and at least one mixing gain to the input audio tracks, for generating an output mix of the audio tracks.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE - Details of electrophonic musical instruments
H04H 60/04 - Studio equipment; Interconnection of studios
Described is a method of training a neural-network-based system for determining an indication of an audio quality of an audio input. The method includes obtaining, as input, at least one training set comprising audio samples. The audio samples include audio samples of a first type and audio samples of a second type, wherein each of the first type of audio samples is labelled with information indicative of a respective predetermined audio quality metric, and wherein each of the second type of audio samples is labelled with information indicative of a respective audio quality metric relative to that of a reference audio sample. The method further includes: inputting the training set to the neural-network-based system; and iteratively training the system to predict the respective label information of the audio samples in the training set.
G10L 25/69 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for evaluating synthetic or decoded voice signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
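The two label types in the training set above suggest two loss terms; the squared-error and pairwise formulations below are assumptions, since the abstract does not specify the loss:

```python
def absolute_loss(pred, target):
    """Squared error against a predetermined audio quality metric
    (first type of labelled audio sample)."""
    return (pred - target) ** 2

def relative_loss(pred, pred_ref, target_delta):
    """For the second label type, penalize mismatch between the predicted
    quality gap to the reference sample and the labelled relative metric
    (assumed pairwise formulation)."""
    return ((pred - pred_ref) - target_delta) ** 2
```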
33.
PERCEPTUAL OPTIMIZATION OF MAGNITUDE AND PHASE FOR TIME-FREQUENCY AND SOFTMASK SOURCE SEPARATION SYSTEMS
A method comprises: obtaining softmask values for frequency bins of time-frequency tiles representing an audio signal; reducing, or expanding and limiting, the softmask values; and applying the reduced, or expanded and limited, softmask values to the frequency bins to create a time-frequency representation of an estimated target source. An alternative method comprises, for each time-frequency tile: obtaining softmask values; applying the softmask values to the frequency bins to create a time-frequency domain representation of an estimated target source; obtaining a panning parameter estimate and a source phase concentration estimate for the target source; determining, using the panning parameter estimate and the softmask values, a magnitude for the time-frequency representation of the estimated target source; determining, using the panning parameter estimate and the source phase concentration estimate, a phase for the time-frequency representation of the estimated target source; and combining the magnitude and the phase.
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
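The mask shaping and application steps of the first method can be sketched as follows; the power-law shaping is an assumption, since the abstract only says the softmask values are "reduced, or expanded and limited":

```python
import numpy as np

def shape_softmask(mask, gamma, limit=1.0):
    """Reduce (gamma > 1) or expand (gamma < 1) softmask values by a
    power law, then limit to 'limit' (assumed shaping rule)."""
    return np.minimum(np.asarray(mask, dtype=float) ** gamma, limit)

def apply_softmask(tile, mask):
    """Multiply each frequency bin of a time-frequency tile by its mask
    value to estimate the target source."""
    return np.asarray(tile) * np.asarray(mask)
```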
34.
FRAME LOSS CONCEALMENT FOR A LOW-FREQUENCY EFFECTS CHANNEL
A method of generating a substitution frame for a lost audio frame of an audio signal is presented. The method may comprise determining an audio filter based on samples of a valid audio frame preceding the lost audio frame. The method may comprise generating the substitution frame based on the audio filter and the samples of the valid audio frame preceding the lost audio frame. The method may be advantageously applied to a low frequency effects (LFE) channel of a multi-channel audio signal.
G10L 19/005 - Correction of errors induced by the transmission channel, if related to the coding algorithm
G10L 25/12 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being prediction coefficients
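A minimal sketch of the substitution-frame idea above: fit a linear predictor to the last valid frame and extrapolate. The least-squares fit stands in for whatever filter-determination the patent uses (Levinson-Durbin on autocorrelation would be typical):

```python
import numpy as np

def lpc_coeffs(x, order):
    """Least-squares linear predictor fitted to samples of the valid
    frame preceding the lost frame (assumed fitting method)."""
    X = np.array([x[i:i + order] for i in range(len(x) - order)])
    y = np.array(x[order:])
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

def substitution_frame(prev_frame, order, n):
    """Extrapolate n samples past the valid frame with the predictor,
    giving a substitution frame for the lost frame."""
    a = lpc_coeffs(prev_frame, order)
    hist = list(prev_frame)
    out = []
    for _ in range(n):
        s = float(np.dot(a, hist[-order:]))
        out.append(s)
        hist.append(s)
    return out
```

Low prediction orders tend to suffice for an LFE channel, whose content is band-limited to low frequencies.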
35.
METHODS, APPARATUS, AND SYSTEMS FOR DETECTION AND EXTRACTION OF SPATIALLY-IDENTIFIABLE SUBBAND AUDIO SOURCES
In an embodiment, a method comprises: transforming one or more frames of a two-channel time domain audio signal into a time-frequency domain representation including a plurality of time-frequency tiles, wherein the frequency domain of the time-frequency domain representation includes a plurality of frequency bins grouped into subbands. For each time-frequency tile, the method comprises: calculating spatial parameters and a level for the time-frequency tile; modifying the spatial parameters using shift and squeeze parameters; obtaining a softmask value for each frequency bin using the modified spatial parameters, the level and subband information; and applying the softmask values to the time-frequency tile to generate a modified time-frequency tile of an estimated audio source. In an embodiment, a plurality of frames of the time-frequency tiles are assembled into a plurality of chunks, wherein each chunk includes a plurality of subbands, and the method described above is performed on each subband of each chunk.
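The shift-and-squeeze modification and the mapping to a softmask value can be sketched per parameter; both the mapping and the Gaussian window are assumptions suggested by the names in the abstract:

```python
import math

def modify_spatial(param, shift, squeeze):
    """Shift a per-tile spatial parameter (e.g. a panning index) and
    squeeze its spread (assumed interpretation of 'shift and squeeze')."""
    return (param - shift) * squeeze

def softmask_from_spatial(param, target, width):
    """Map the modified parameter to a softmask value in (0, 1] with a
    Gaussian window around the target direction (illustrative choice)."""
    return math.exp(-((param - target) ** 2) / (2.0 * width ** 2))
```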
Described herein is a method of determining parameters for a generative neural network for processing an audio signal, wherein the generative neural network includes an encoder stage mapping to a coded feature space and a decoder stage, each stage including a plurality of convolutional layers with one or more weight coefficients, the method comprising a plurality of cycles with sequential processes of: pruning the weight coefficients of either or both stages based on pruning control information, the pruning control information determining the number of weight coefficients that are pruned for respective convolutional layers; training the pruned generative neural network based on a set of training data; determining a loss for the trained and pruned generative neural network based on a loss function; and determining updated pruning control information based on the determined loss and a target loss. Further described are corresponding apparatus, programs, and computer-readable storage media.
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
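The pruning cycle above can be sketched with two pieces: a per-layer pruning step and an update of the pruning control information from the loss. Magnitude pruning and the simple feedback rule are assumptions; the abstract specifies only that control information sets how many weights are pruned and is updated from the determined and target losses:

```python
def prune_smallest(weights, prune_fraction):
    """Zero out the smallest-magnitude fraction of a layer's weight
    coefficients (assumed magnitude criterion)."""
    n_prune = int(len(weights) * prune_fraction)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

def update_prune_fraction(fraction, loss, target_loss, step=0.05):
    """Prune more aggressively while the loss stays within target,
    back off otherwise (assumed feedback rule)."""
    if loss <= target_loss:
        return min(1.0, fraction + step)
    return max(0.0, fraction - step)
```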
An audio bitstream is decoded into audio objects and audio metadata for the audio objects. The audio objects include a specific audio object. The audio metadata specifies frame-level gains that include a first gain and a second gain respectively for a first audio frame and a second audio frame. It is determined, based on the first and second gains, whether sub-frame gains are to be generated for the specific audio object. If so, a ramp length is determined for a ramp used to generate the sub-frame gains for the specific audio object. The ramp of the ramp length is used to generate the sub-frame gains for the specific audio object. A sound field represented by the audio objects with the sub-frame gains is rendered by audio speakers.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
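The ramp above can be sketched as sub-frame gain interpolation between consecutive frame-level gains; the linear shape is an assumed choice, since the abstract specifies only a ramp of some determined length:

```python
def subframe_gains(g_prev, g_next, ramp_length, n_subframes):
    """Interpolate sub-frame gains with a linear ramp of ramp_length
    sub-frames from the first frame's gain to the second frame's gain,
    holding the target gain afterwards."""
    gains = []
    for k in range(1, n_subframes + 1):
        t = min(k / ramp_length, 1.0)
        gains.append(g_prev + (g_next - g_prev) * t)
    return gains
```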
38.
METHOD AND UNIT FOR PERFORMING DYNAMIC RANGE CONTROL
The present document describes a dynamic range control unit (210) configured to apply dynamic range control, referred to as DRC, to an audio signal (211). The DRC unit (210) is configured to downsample a subband signal (212) derived from the audio signal (211), to provide a downsampled subband signal (321), to determine a DRC gain (329) based on the downsampled subband signal (321), and to apply the DRC gain (329) to the subband signal (212), to provide a compressed subband signal (213) of a compressed audio signal (214).
G10L 21/0316 - Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
G10L 21/0364 - Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H03G 7/00 - Volume compression or expansion in amplifiers
H03G 9/02 - Combinations of two or more types of control, e.g. gain control and tone control in untuned amplifiers
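The downsample-then-compress structure of the DRC unit can be sketched as follows; the peak-based level estimate and the standard static compression curve are illustrative stand-ins for the patent's gain determination:

```python
import numpy as np

def drc_gain(level_db, threshold_db, ratio):
    """Static compression curve: above threshold, the output level grows
    at 1/ratio of the input rate; returns the gain in dB."""
    if level_db <= threshold_db:
        return 0.0
    return (threshold_db - level_db) * (1.0 - 1.0 / ratio)

def compress_subband(subband, decimation, threshold_db, ratio):
    """Downsample the subband signal, derive a DRC gain from the
    downsampled signal's level, and apply it to the full-rate subband."""
    down = np.asarray(subband)[::decimation]
    level_db = 20.0 * np.log10(np.max(np.abs(down)) + 1e-12)
    g_db = drc_gain(level_db, threshold_db, ratio)
    return np.asarray(subband) * (10.0 ** (g_db / 20.0))
```

Computing the gain on the downsampled signal reduces the cost of the level/gain computation while the gain is still applied at the full subband rate.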
39.
METHODS AND APPARATUS FOR UNIFIED SPEECH AND AUDIO DECODING IMPROVEMENTS
Described herein are methods, apparatus and computer products for decoding an encoded MPEG-D USAC bitstream with reduced computational complexity.
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
Described herein is a method for improving dialogue intelligibility during playback of audio data on a playback device, wherein the audio data comprise dialogue audio data, and at least one of music and effects audio data, the method including the steps of: determining a volume mixing ratio based on a volume value for playback; mixing the dialogue audio data and the at least one of music and effects audio data based on said volume mixing ratio; and outputting the mixed audio data for playback. Described are further a respective playback device and a respective computer program product.
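The volume-dependent mixing can be sketched as follows; the linear mapping, its breakpoints, and the attenuation floor are illustrative assumptions, since the abstract only ties the mixing ratio to the playback volume value:

```python
def mix_for_intelligibility(dialogue, background, volume,
                            low_vol=0.2, high_vol=0.8):
    """Attenuate music-and-effects relative to dialogue as the playback
    volume drops, improving dialogue intelligibility at low volumes
    (assumed mapping and breakpoints)."""
    if volume >= high_vol:
        me_gain = 1.0
    elif volume <= low_vol:
        me_gain = 0.5
    else:
        me_gain = 0.5 + 0.5 * (volume - low_vol) / (high_vol - low_vol)
    return [d + me_gain * b for d, b in zip(dialogue, background)]
```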
Computer-implemented methods and devices for combined audio separation and classification are provided. An estimated separated signal is time gated based on a determination of an audio classifier of, at least in part, the original mix of signals before separation. Combined separation, classification, and time gating of both the estimated signal and a residual signal are also provided.
Described herein is a method of generating, in a dynamic range reduced domain, an enhanced multi-channel audio signal from an audio bitstream including a multi-channel audio signal, wherein the multi-channel audio signal comprises two or more channels, and wherein the method includes jointly enhancing the two or more channels of the dynamic range reduced raw multi-channel audio signal using a multi-channel Generator of a Generative Adversarial Network setting. Described herein are further a method for training a multi-channel Generator in a dynamic range reduced domain in a Generative Adversarial Network setting, an apparatus for generating, in a dynamic range reduced domain, an enhanced multi-channel audio signal from an audio bitstream including a multi-channel audio signal, respective systems and a computer program product.
Described herein is a method of processing audio content for rendering in a three-dimensional audio scene, wherein the audio content comprises a sound source at a source position, the method comprising: obtaining a voxelized representation of the three-dimensional audio scene, wherein the voxelized representation indicates volume elements in which sound can propagate and volume elements by which sound is occluded; generating a two-dimensional projection map for the audio scene based on the voxelized representation by applying a projection operation to the voxelized representation that projects onto a horizontal plane; and determining parameters indicating a virtual source position of a virtual sound source based on the source position, a listener position, and the projection map, to simulate, by rendering a virtual source signal from the virtual source position, an impact of acoustic diffraction by the three-dimensional audio scene on a source signal of the sound source at the source position. Described are moreover a corresponding apparatus as well as corresponding computer program products.
A63F 13/54 - Controlling the output signals based on the game progress involving acoustic signals, e.g. for simulating revolutions per minute [RPM] dependent engine sounds in a driving game or reverberation against a virtual wall
Embodiments are disclosed for automatic leveling of speech content. In an embodiment, a method comprises: receiving, using one or more processors, frames of an audio recording including speech and non-speech content; for each frame: determining, using the one or more processors, a speech probability; analyzing, using the one or more processors, a perceptual loudness of the frame; obtaining, using the one or more processors, a target loudness range for the frame; computing, using the one or more processors, gains to apply to the frame based on the target loudness range and the perceptual loudness analysis, where the gains include dynamic gains that change frame-by-frame and that are scaled based on the speech probability; and applying the gains to the frame so that a resulting loudness range of the speech content in the audio recording fits within the target loudness range.
G10L 21/0364 - Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
G10L 17/00 - Speaker identification or verification
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 25/78 - Detection of presence or absence of voice signals
H03G 3/32 - Automatic control in amplifiers having semiconductor devices the control being dependent upon ambient noise level or sound level
A method of audio processing includes generating harmonics in a hybrid complex quadrature mirror filter domain. Generating the harmonics may include multiplication, using a feedback delay loop, and dynamic compression. The harmonics may be generated based on one or more hybrid sub-bands of the complex transform domain signal.
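The multiplication step can be illustrated with a time-domain toy: squaring a tone yields its second harmonic plus a DC term. The patent performs the multiplication per sub-band in the hybrid complex QMF domain, with feedback delay and compression omitted here:

```python
def add_harmonics(x, gain=0.5):
    """Generate harmonics by multiplying the signal with itself and
    mixing the product back with the input (time-domain simplification
    of the patent's per-sub-band processing)."""
    return [s + gain * s * s for s in x]
```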
Described herein is a method for controlling media data playout on a client device, wherein the method includes the steps of: (a) retrieving, by the client device, media data comprising a plurality of segments subdivided into one or more chunks for playout from at least one media server; (b) analyzing a current chunk of the one or more chunks of a current segment; and (c) adapting the playout of the media data in response to the result of the analysis prior to fully retrieving the current chunk. Described herein are further a client device having implemented a media player application configured to perform said method and a computer program product with instructions adapted to cause a device having processing capability to carry out said method.
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
Embodiments are disclosed for noise floor estimation and noise reduction. In an embodiment, a method comprises: obtaining an audio signal; dividing the audio signal into a plurality of buffers; determining time-frequency samples for each buffer of the audio signal; for each buffer and for each frequency, determining a median (or mean) and a measure of an amount of variation of energy based on the samples in the buffer and samples in neighboring buffers that together span a specified time range of the audio signal; combining the median (or mean) and the measure of the amount of variation of energy into a cost function; for each frequency: determining a signal energy of a particular buffer of the audio signal that corresponds to a minimum value of the cost function; selecting the signal energy as the estimated noise floor of the audio signal; and reducing, using the estimated noise floor, noise in the audio signal.
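The per-frequency selection step can be sketched as follows; the one-buffer neighborhood, the standard deviation as the variation measure, and the additive combination are illustrative assumptions:

```python
import numpy as np

def estimate_noise_floor(energies, weight=1.0):
    """For one frequency: combine each buffer's local median energy and
    a measure of variation (here the standard deviation over a small
    neighborhood) into a cost, then return the energy of the buffer
    minimizing that cost as the estimated noise floor."""
    energies = np.asarray(energies, dtype=float)
    costs = []
    for i in range(len(energies)):
        lo, hi = max(0, i - 1), min(len(energies), i + 2)
        nb = energies[lo:hi]
        costs.append(float(np.median(nb) + weight * np.std(nb)))
    i_min = int(np.argmin(costs))
    return float(energies[i_min])
```

Buffers dominated by signal score a high median; buffers on signal boundaries score high variation; steady low-energy buffers, the likely noise floor, minimize the cost.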
A method for adaptive streaming of media content with bitrate switching is described, wherein the media content comprises a plurality of consecutive media segments. The method comprises, at a media streaming server: transmitting a segment of the media content encoded in a first coding mode having a first bitrate; receiving an indication for a coding mode switch to a second coding mode having a second bitrate and, in response, transmitting a transition segment for transitioning between the first coding mode and the second coding mode; and transmitting another segment of the media content encoded in the second coding mode.
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
H04N 21/8543 - Content authoring using a description language, e.g. MHEG [Multimedia and Hypermedia information coding Expert Group] or XML [eXtensible Markup Language]
49.
PROJECTION SYSTEM AND METHOD OF DRIVING A PROJECTION SYSTEM
A projection system, and a method of driving a projection system, are described. The system includes a light source configured to emit light in response to image data; a phase light modulator configured to receive the light from the light source and to apply a spatially-varying phase modulation to the light; and a controller configured to determine, for a frame of the image data, a plurality of phase configurations, respective ones of the plurality of phase configurations corresponding to solutions of a phase algorithm and representing the same image with a different modulation pattern, and to provide a phase control signal to the phase light modulator, the phase control signal configured to cause the phase light modulator to modulate the plurality of phase configurations in a time-divisional manner within a time period of the frame, thereby projecting a series of subframes within the time period.
Embodiments are disclosed for channel-based audio (CBA) (e.g., 22.2-ch audio) to object-based audio (OBA) conversion. The conversion includes converting CBA metadata to object audio metadata (OAMD) and reordering the CBA channels based on channel shuffle information derived in accordance with channel ordering constraints of the OAMD. The OBA with reordered channels is rendered in a playback device using the OAMD or in a source device, such as a set-top box or audio/video recorder. In an embodiment, the CBA metadata includes signaling that indicates a specific OAMD representation to be used in the conversion of the metadata. In an embodiment, pre-computed OAMD is transmitted in a native audio bitstream (e.g., AAC) for transmission (e.g., over HDMI) or for rendering in a source device. In an embodiment, pre-computed OAMD is transmitted in a transport layer bitstream (e.g., ISO BMFF, MPEG4 audio bitstream) to a playback device or source device.
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
The present application describes a method (400) for providing personalized audio to a user. The method (400) comprises receiving (401) a manifest file (140) for a media element from which audio is to be rendered, wherein the manifest file (140) comprises a description (141) for a plurality of different presentations (152) of audio content of the media element. In addition, the method (400) comprises selecting (402) a presentation (152) from the plurality of presentations (152) based on the manifest file (140). The method (400) further comprises receiving (403) a list of audio track objects comprised within the media element, and selecting (404) an audio track object from the list of audio track objects, in dependence of the selected presentation (152).
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/84 - Generation or processing of descriptive data, e.g. content descriptors
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies
52.
METHODS AND DEVICES FOR PERSONALIZING AUDIO CONTENT
The present document describes a method (400) for personalizing audio content. The method (400) comprises receiving (401) a manifest file (140) for the audio content. The manifest file (140) comprises at least one adaptation set (281, 282) referencing an audio bitstream (121), where the audio bitstream (121) comprises a plurality of audio objects (181), and a plurality of different preselection elements (291, 292, 293) for the adaptation set (281, 282), wherein the different preselection elements (291, 292, 293) specify different combinations of the plurality of audio objects (181). The method (400) further comprises selecting (402) a preselection element (291) from the plurality of different preselection elements (291, 292, 293), and causing (403) rendering of an audio signal which depends on the selected preselection element (291).
H04N 21/485 - End-user interface for client configuration
H04N 21/462 - Content or additional data management e.g. creating a master electronic program guide from data received from the Internet and a Head-end or controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission or generating play-lists
A speech separation server comprises a deep-learning encoder with nonlinear activation. The encoder is programmed to take a mixture audio waveform in the time domain, learn generalized patterns from the mixture audio waveform, and generate an encoded representation that effectively characterizes the mixture audio waveform for speech separation.
Described herein is a method of waveform decoding, the method including the steps of: (a) receiving, by a waveform decoder, a bitstream including a finite bitrate representation of a source signal; (b) waveform decoding the finite bitrate representation of the source signal to obtain a waveform approximation of the source signal; (c) providing the waveform approximation of the source signal to a generative model that implements a probability density function, to obtain a probability distribution for a reconstructed signal of the source signal; and (d) generating the reconstructed signal of the source signal based on the probability distribution. Described are further a method and system for waveform coding and a method of training a generative model.
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
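Step (d) of the method above can be illustrated with a toy stand-in: each reconstructed sample is drawn from a conditional distribution centered on the decoded waveform approximation. A Gaussian replaces the generative model's learned probability density here:

```python
import random

def reconstruct(waveform_approx, scale, seed=0):
    """Sample each reconstructed-signal value from a distribution
    conditioned on the waveform approximation (Gaussian used as a toy
    stand-in for the generative model's density)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, scale) for mu in waveform_approx]
```

With scale 0 the reconstruction collapses to the waveform approximation itself; a trained model's conditional density would instead restore detail lost to the finite-bitrate representation.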
Described herein is a method of encoding an audio signal. The method comprises: generating a plurality of subband audio signals based on the audio signal; determining a spectral envelope of the audio signal; for each subband audio signal, determining autocorrelation information for the subband audio signal based on an autocorrelation function of the subband audio signal; and generating an encoded representation of the audio signal, the encoded representation comprising a representation of the spectral envelope of the audio signal and a representation of the autocorrelation information for the plurality of subband audio signals. Further described are methods of decoding the audio signal from the encoded representation, as well as corresponding encoders, decoders, computer programs, and computer-readable recording media.
G10L 25/06 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being correlation coefficients
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
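The per-subband autocorrelation information named above can be sketched as normalized autocorrelation coefficients up to some maximum lag; the normalization by lag-zero energy is an assumed convention:

```python
import numpy as np

def autocorr(x, max_lag):
    """Normalized autocorrelation of one subband signal up to max_lag,
    the kind of per-band information the encoder would represent."""
    x = np.asarray(x, dtype=float)
    r0 = float(np.dot(x, x))
    return [float(np.dot(x[:-k], x[k:])) / r0 for k in range(1, max_lag + 1)]
```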
56.
METHODS AND DEVICES FOR GENERATION AND PROCESSING OF MODIFIED AUDIO BITSTREAMS
Described herein is a method for generating a modified bitstream on a source device, wherein the method includes the steps of: a) receiving, by a receiver, a bitstream including coded media data; b) generating, by an embedder, payload of additional media data and embedding the payload in the bitstream for obtaining, as an output from the embedder, a modified bitstream including the coded media data and the payload of the additional media data; and c) outputting the modified bitstream to a sink device. Described is further a method for processing said modified bitstream on a sink device. Described are moreover a respective source device and sink device as well as a system of a source device and a sink device and respective computer program products.
H04N 21/2389 - Multiplex stream processing, e.g. multiplex stream encrypting
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 7/24 - Systems for the transmission of television signals using pulse code modulation
57.
METHODS AND DEVICES FOR GENERATION AND PROCESSING OF MODIFIED BITSTREAMS
Described herein is a method for generating a modified bitstream on a source device, wherein the method includes the steps of: a) receiving, by a receiver, a bitstream including coded media data; b) generating, by an embedder, payload of additional media data and embedding the payload in the bitstream for obtaining, as an output from the embedder, a modified bitstream including the coded media data and the payload of the additional media data; and c) outputting the modified bitstream to a sink device. Described is further a method for processing said modified bitstream on a sink device. Described are moreover a respective source device and sink device as well as a system of a source device and a sink device and respective computer program products.
A rendering mode may be determined for received audio data, including audio signals and associated spatial data. The audio data may be rendered for reproduction via a set of loudspeakers of an environment according to the rendering mode, to produce rendered audio signals. Rendering the audio data may involve determining relative activation of a set of loudspeakers in an environment. The rendering mode may be variable between a reference spatial mode and one or more distributed spatial modes. The reference spatial mode may have an assumed listening position and orientation. In the distributed spatial mode(s), one or more elements of the audio data may each be rendered in a more spatially distributed manner than in the reference spatial mode and spatial locations of remaining elements of the audio data may be warped such that they span a rendering space of the environment more completely than in the reference spatial mode.
An audio session management method for an audio environment having multiple audio devices may involve receiving, from a first device implementing a first application and by a device implementing an audio session manager, a first route initiation request to initiate a first route for a first audio session. The first route initiation request may indicate a first audio source and a first audio environment destination. The first audio environment destination may correspond with at least a first person in the audio environment, but in some instances will not indicate an audio device. The method may involve establishing a first route corresponding to the first route initiation request. Establishing the first route may involve determining a first location of at least the first person in the audio environment, determining at least one audio device for a first stage of the first audio session and initiating or scheduling the first audio session.
An audio processing method may involve receiving output signals from each microphone of a plurality of microphones in an audio environment, the output signals corresponding to a current utterance of a person and determining, based on the output signals, one or more aspects of context information relating to the person, including an estimated current proximity of the person to one or more microphone locations. The method may involve selecting two or more loudspeaker-equipped audio devices based, at least in part, on the one or more aspects of the context information, determining one or more types of audio processing changes to apply to audio data being rendered to loudspeaker feed signals for the audio devices and causing one or more types of audio processing changes to be applied. In some examples, the audio processing changes have the effect of increasing a speech to echo ratio at one or more microphones.
H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
H04M 1/60 - Substation equipment, e.g. for use by subscribers including speech amplifiers
H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
H04R 3/02 - Circuits for transducers for preventing acoustic reaction
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
Methods for rendering audio for playback by two or more speakers are disclosed. The audio includes one or more audio signals, each with an associated intended perceived spatial position. Relative activation of the speakers may be determined by optimizing a cost function that combines a model of perceived spatial position of the audio signals when played back over the speakers, a measure of proximity of the intended perceived spatial position of the audio signals to positions of the speakers, and one or more additional dynamically configurable functions. The dynamically configurable functions may be based on at least one or more properties of the audio signals, one or more properties of the set of speakers and/or one or more external inputs.
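As a rough illustration of cost-function-based speaker activation (a sketch, not the patented method), one can trade off spatial error against a distance-weighted proximity penalty and solve for the gains in closed form. The function name, the quadratic penalty, and `proximity_w` are assumptions for this sketch.

```python
import numpy as np

def speaker_gains(speaker_pos, target_pos, proximity_w=0.1):
    """Minimize ||sum_i g_i p_i - target||^2 + sum_i w_i g_i^2, where
    w_i grows with each speaker's distance from the target position
    (a hypothetical proximity penalty)."""
    speaker_pos = np.asarray(speaker_pos, dtype=float)
    target_pos = np.asarray(target_pos, dtype=float)
    d = np.linalg.norm(speaker_pos - target_pos, axis=1)
    W = np.diag(proximity_w * d ** 2)
    P = speaker_pos.T                       # 3 x N matrix of speaker positions
    A = P.T @ P + W + 1e-9 * np.eye(len(d))  # small ridge for stability
    g = np.linalg.solve(A, P.T @ target_pos)
    return np.clip(g, 0.0, None)            # activations are non-negative
```

A speaker coincident with the target receives nearly all of the activation, while distant speakers are penalized toward zero.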
A multi-stream rendering system and method may render and play simultaneously a plurality of audio program streams over a plurality of arbitrarily placed loudspeakers. At least one of the program streams may be a spatial mix. The rendering of said spatial mix may be dynamically modified as a function of the simultaneous rendering of one or more additional program streams. The rendering of one or more additional program streams may be dynamically modified as a function of the simultaneous rendering of the spatial mix.
Individual loudspeaker dynamics processing configuration data, for each of a plurality of loudspeakers of a listening environment, may be obtained. Listening environment dynamics processing configuration data may be determined, based on the individual loudspeaker dynamics processing configuration data. Dynamics processing may be performed on received audio data based on the listening environment dynamics processing configuration data, to generate processed audio data. The processed audio data may be rendered for reproduction via a set of loudspeakers that includes at least some of the plurality of loudspeakers, to produce rendered audio signals. The rendered audio signals may be provided to, and reproduced by, the set of loudspeakers.
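One plausible way to derive listening-environment dynamics configuration data from the individual loudspeaker data, shown here as a hypothetical "weakest link" policy over per-band limiter thresholds (the abstract does not specify the combination rule; averaging or weighted blends are equally plausible):

```python
def combined_limiter_thresholds(per_speaker_db):
    """Combine per-loudspeaker, per-band limiter thresholds (in dB)
    into one listening-environment configuration by taking, in each
    frequency band, the most restrictive (minimum) threshold."""
    return [min(band) for band in zip(*per_speaker_db)]
```

Performing dynamics processing once with the combined thresholds, before rendering, avoids individual per-speaker limiters acting independently and distorting the spatial image.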
An audio session management method may involve: determining, by an audio session manager, one or more first media engine capabilities of a first media engine of a first smart audio device, the first media engine being configured for managing one or more audio media streams received by the first smart audio device and for performing first smart audio device signal processing for the one or more audio media streams according to a first media engine sample clock; receiving, by the audio session manager and via a first application communication link, first application control signals from the first application; and controlling the first smart audio device according to the first media engine capabilities, by the audio session manager, via first audio session management control signals transmitted to the first smart audio device via a first smart audio device communication link and without reference to the first media engine sample clock.
The present document discloses a method for playback of media content via a delivery channel. The delivery channel may generally refer to the channels through which audio or video programs are delivered (transmitted) to the user (receiver). The media content may generally comprise consecutive media programs. In particular, for a specific media program within the media content, a respective content type for that specific media program is also provided. The method may comprise receiving an indication of the sensitivity of a media program to playback latency. The method may further comprise receiving at least a portion of the media program. The method may yet further comprise adapting the playback of the media program based on the indication of its sensitivity to playback latency.
H04N 21/61 - Network physical structure; Signal processing
H04N 21/24 - Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth or upstream requests
H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission or generating play-lists
H04N 21/235 - Processing of additional data, e.g. scrambling of additional data or processing content descriptors
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
66.
PRESENTATION INDEPENDENT MASTERING OF AUDIO CONTENT
A method for generating mastered audio content, the method comprising: obtaining an input audio content comprising a number, M1, of audio signals; obtaining a rendered presentation of the input audio content, the rendered presentation comprising a number, M2, of audio signals; obtaining a mastered presentation generated by mastering the rendered presentation; comparing the mastered presentation with the rendered presentation to determine one or more indications of differences between them; and modifying one or more of the audio signals of the input audio content based on the indications of differences to generate the mastered audio content. With this approach, conventional, typically stereo, channel-based mastering tools can be used to provide a mastered version of any input audio content, including object-based immersive audio content.
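The comparison step above could, for example, measure per-band gain differences between the mastered and rendered presentations, which are then applied back onto the input audio signals. The sketch below computes such offsets; splitting the spectrum into equal FFT-bin bands is an assumption for illustration.

```python
import numpy as np

def band_gain_offsets(rendered, mastered, n_bands=4):
    """Per-band RMS gain of the mastered presentation relative to the
    rendered one; applying these offsets to the input audio signals
    would approximate the mastering on the original content."""
    R = np.abs(np.fft.rfft(rendered)) ** 2
    M = np.abs(np.fft.rfft(mastered)) ** 2
    bands = np.array_split(np.arange(len(R)), n_bands)
    return [float(np.sqrt((M[b].sum() + 1e-12) / (R[b].sum() + 1e-12)))
            for b in bands]
```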
The present disclosure relates to a method of processing audio content including directivity information for at least one sound source, the directivity information comprising a first set of first directivity unit vectors representing directivity directions and associated first directivity gains. The disclosure further relates to corresponding methods of encoding and decoding audio content including directivity information for at least one sound source.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
Dialogue enhancement of an audio signal comprises obtaining a set of time-varying parameters configured to estimate a dialogue component present in said audio signal, estimating the dialogue component from the audio signal, applying a compressor only to the estimated dialogue component to generate a processed dialogue component, and applying a user-determined gain to the processed dialogue component to provide an enhanced dialogue component. The processing of the estimated dialogue component may be performed on the decoder side or the encoder side. The invention enables improved dialogue enhancement.
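The chain of estimate, compress, then apply user gain can be sketched as follows. The static compressor curve, threshold, ratio, and all parameter names are assumptions; the actual parameters are time-varying per the abstract.

```python
import numpy as np

def enhanced_dialogue(x, params, gain_db=6.0, thresh_db=-20.0, ratio=4.0):
    """Estimate the dialogue component as a parametric mix of the input
    channels, compress only that estimate, then apply the user gain."""
    d = params @ x                               # dialogue estimate
    thresh = 10.0 ** (thresh_db / 20.0)
    mag = np.maximum(np.abs(d), 1e-12)
    # static compression above the threshold (hypothetical curve)
    comp = np.where(mag > thresh, (thresh / mag) ** (1.0 - 1.0 / ratio), 1.0)
    return d * comp * 10.0 ** (gain_db / 20.0)
```

Because the compressor acts only on the dialogue estimate, the non-dialogue content is untouched when the enhanced component is mixed back.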
Described herein is a method of generating a media bitstream to transmit parameters for updating a neural network implemented in a decoder, wherein the method includes the steps of: (a) determining at least one set of parameters for updating the neural network; (b) encoding the at least one set of parameters and media data to generate the media bitstream; and (c) transmitting the media bitstream to the decoder for updating the neural network with the at least one set of parameters. Described herein are further a method for updating a neural network implemented in a decoder, an apparatus for generating a media bitstream to transmit parameters for updating a neural network implemented in a decoder, an apparatus for updating a neural network implemented in a decoder and computer program products comprising a computer-readable storage medium with instructions adapted to cause the device to carry out said methods when executed by a device having processing capability.
A system and method comprise a light source; a spatial light modulator including a substantially transparent material layer and a phase modulation layer; an imaging device configured to receive light from the light source as reflected by the spatial light modulator, and to generate image data; and a controller. The controller provides a phase-drive signal to the spatial light modulator and determines an attenuating wavefront of the substantially transparent material layer based on the image data.
G09G 3/00 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
G02B 26/06 - Optical devices or arrangements for the control of light using movable or deformable optical elements for controlling the phase of light
G02F 1/01 - Devices or arrangements for the control of the intensity, colour, phase, polarisation or direction of light arriving from an independent light source, e.g. switching, gating or modulating; Non-linear optics for the control of the intensity, phase, polarisation or colour
H03H 1/00 - Constructional details of impedance networks whose electrical mode of operation is not specified or applicable to more than one type of network
G09G 3/36 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix by control of light from an independent source using liquid crystals
71.
METHOD, APPARATUS AND SYSTEM FOR HYBRID SPEECH SYNTHESIS
A method of decoding an original speech signal for hybrid adversarial-parametric speech synthesis comprising: (a) receiving quantized original linear prediction coding parameters estimated by applying linear prediction coding analysis filtering to an original speech signal and a quantized compressed representation of a residual of the original speech signal; (b) dequantizing the original linear prediction coding parameters and the compressed representation of the residual; (c) inputting the dequantized compressed representation of the residual into a decoder part of a Generator for applying adversarial mapping from the compressed residual domain to a fake (first) signal domain; (d) outputting, by the decoder part of the Generator, a fake speech signal; (e) applying linear prediction coding analysis filtering to the fake speech signal for obtaining a corresponding fake residual; (f) reconstructing the original speech signal by applying linear prediction coding cross-synthesis filtering to the fake residual and the dequantized original linear prediction coding parameters.
A method of encoding audio content comprises performing a content analysis of the audio content, generating classification information indicative of a content type of the audio content based on the content analysis, encoding the audio content and the classification information in a bitstream, and outputting the bitstream. A method of decoding audio content from a bitstream including audio content and classification information for the audio content, wherein the classification information is indicative of a content classification of the audio content, comprises receiving the bitstream, decoding the audio content and the classification information, and selecting, based on the classification information, a post processing mode for performing post processing of the decoded audio content. Selecting the post processing mode can involve calculating one or more control weights for post processing of the decoded audio content based on the classification information.
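The final step, calculating control weights from classification information, can be illustrated with a hypothetical mapping. The post-processing module names and formulas below are illustrative assumptions, not taken from the abstract.

```python
def control_weights(p_speech, p_music, p_effects):
    """Map decoded content-classification probabilities to control
    weights for downstream post-processing modules (names assumed)."""
    return {
        "dialogue_enhancer": p_speech,
        "intelligent_eq": max(p_music, p_effects),
        "virtualizer": 1.0 - p_speech,
    }
```

Driving post-processing from weights rather than hard decisions lets the decoder cross-fade smoothly as the content classification changes over time.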
The disclosure herein generally relates to capturing, acoustic pre-processing, encoding, decoding, and rendering of directional audio of an audio scene. In particular, it relates to a device adapted to modify a directional property of a captured directional audio in response to spatial data of a microphone system capturing the directional audio. The disclosure further relates to a rendering device configured to modify a directional property of a received directional audio in response to received spatial data.
There is provided encoding and decoding methods for representing spatial audio that is a combination of directional sound and diffuse sound. An exemplary encoding method includes inter alia creating a single- or multi-channel downmix audio signal by downmixing input audio signals from a plurality of microphones in an audio capture unit capturing the spatial audio; determining first metadata parameters associated with the downmix audio signal, wherein the first metadata parameters are indicative of one or more of: a relative time delay value, a gain value, and a phase value associated with each input audio signal; and combining the created downmix audio signal and the first metadata parameters into a representation of the spatial audio.
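The downmix-plus-first-metadata step can be sketched as aligning each microphone signal to a reference, recording the per-input delay and gain, and averaging into a mono downmix. This is a sketch only: phase parameters, multi-channel downmixes, and the actual metadata syntax are omitted, and the cross-correlation delay estimator is an assumption.

```python
import numpy as np

def downmix_with_metadata(inputs, ref=0):
    """Estimate per-input delay (via cross-correlation against a
    reference input) and gain, record them as metadata, and average
    the aligned signals into a mono downmix."""
    meta, aligned = [], []
    for x in inputs:
        xc = np.correlate(x, inputs[ref], mode="full")
        delay = int(np.argmax(xc)) - (len(x) - 1)
        gain = float(np.sqrt(np.sum(inputs[ref] ** 2) /
                             (np.sum(x ** 2) + 1e-12)))
        meta.append({"delay": delay, "gain": gain})
        aligned.append(gain * np.roll(x, -delay))
    return np.mean(aligned, axis=0), meta
```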
Described herein is a method of decoding an audio or speech signal, the method including the steps of: (a) receiving, by a decoder, a coded bitstream including the audio or speech signal and conditioning information; (b) providing, by a bitstream decoder, decoded conditioning information in a format associated with a first bitrate; (c) converting, by a converter, the decoded conditioning information from the format associated with the first bitrate to a format associated with a second bitrate; and (d) providing, by a generative neural network, a reconstruction of the audio or speech signal according to a probabilistic model conditioned by the conditioning information in the format associated with the second bitrate. Described are further an apparatus for decoding an audio or speech signal, a respective encoder, a system of the encoder and the apparatus for decoding an audio or speech signal as well as a respective computer program product.
G10L 19/24 - Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
The present disclosure relates to the field of audio coding, and in particular to an audio decoder having at least two decoding modes, and associated decoding methods and decoding software for such an audio decoder. In one of the decoding modes, at least one dynamic audio object is mapped to a set of static audio objects, the set of static audio objects corresponding to a predefined speaker configuration. The present disclosure further relates to a corresponding audio encoder, and associated encoding methods and encoding software for such an audio encoder.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
The disclosed embodiments enable converting audio signals captured in various formats by various capture devices into a limited number of formats that can be processed by an audio codec (e.g., an Immersive Voice and Audio Services (IVAS) codec). In an embodiment, a simplification unit of the audio device receives an audio signal captured by one or more audio capture devices coupled to the audio device. The simplification unit determines whether the audio signal is in a format that is supported by an encoding unit of the audio device. Based on the determination, the simplification unit converts the audio signal into a format that is supported by the encoding unit. In an embodiment, if the simplification unit determines that the audio signal is in a spatial format, the simplification unit can convert the audio signal into a spatial "mezzanine" format supported by the encoding unit.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
78.
METHODS AND DEVICES FOR CONTROLLING AUDIO PARAMETERS
A method of controlling headphones having external microphone signal pass-through functionality may involve controlling a display to present a geometric shape on the display and receiving an indication of digit motion from a sensor system associated with the display. The sensor system may include a touch sensor system or a gesture sensor system. The indication may be an indication of a direction of digit motion relative to the display. The method may involve controlling the display to present a sequence of images indicating that the geometric shape either enlarges or contracts, depending on the direction of digit motion and changing a headphone transparency setting according to a current size of the geometric shape. The headphone transparency setting may correspond to an external microphone signal gain setting and/or a media signal gain setting of the headphones.
Described herein is a method of low-bitrate coding of audio data and generating enhancement metadata for controlling audio enhancement of the low-bitrate coded audio data at a decoder side, including the steps of: (a) core encoding original audio data at a low bitrate to obtain encoded audio data; (b) generating enhancement metadata to be used for controlling a type and/or amount of audio enhancement at the decoder side after core decoding the encoded audio data; and (c) outputting the encoded audio data and the enhancement metadata. Described is further an encoder configured to perform said method. Described is moreover a method for generating enhanced audio data from low-bitrate coded audio data based on enhancement metadata and a decoder configured to perform said method.
G10L 19/24 - Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
80.
METHODS, APPARATUS AND SYSTEMS FOR GENERATION, TRANSPORTATION AND PROCESSING OF IMMEDIATE PLAYOUT FRAMES (IPFS)
Described herein is an audio decoder for decoding a bitstream of encoded audio data, wherein the bitstream of encoded audio data represents a sequence of audio sample values and comprises a plurality of frames, wherein each frame comprises associated encoded audio sample values, the audio decoder comprising: a determiner configured to determine whether a frame of the bitstream of encoded audio data is an immediate playout frame comprising encoded audio sample values associated with a current frame and additional information; and an initializer configured to initialize the decoder if the determiner determines that the frame is an immediate playout frame, wherein initializing the decoder comprises decoding the encoded audio sample values comprised by the additional information before decoding the encoded audio sample values associated with the current frame. Described are further a method for decoding said bitstream of encoded audio data as well as an audio encoder, a system of audio encoders and a method for generating said bitstream of encoded audio data with immediate playout frames. Described are moreover also an apparatus for generating immediate playout frames in a bitstream of encoded audio data or for removing immediate playout frames from a bitstream of encoded audio data and respective non-transitory digital storage media.
Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A method of processing an audio signal includes the following operations. A system receives an audio signal. The system determines that a first frame of the audio signal includes a sparse transient signal. The system determines that a second frame of the audio signal includes a dense transient signal. The system compresses/expands (compands) the audio signal using a companding rule that applies a first companding exponent to the first frame of the audio signal and applies a second companding exponent to the second frame of the audio signal, each companding exponent being used to derive a respective degree of dynamic range compression and expansion for a corresponding frame. The system then provides the companded audio signal to a downstream device.
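A per-frame companding gain derived from an exponent can be sketched as below. The specific exponent values and the gain law `g = rms**(exponent - 1)` are assumptions for illustration; the abstract only states that each exponent determines a degree of compression and expansion.

```python
import numpy as np

def compand_frame(frame, exponent):
    """Companding gain g = rms**(exponent - 1): exponents below 1
    attenuate loud frames and boost quiet ones, shrinking the dynamic
    range before coding (the decoder applies the inverse to expand)."""
    rms = float(np.sqrt(np.mean(frame ** 2))) + 1e-12
    return frame * rms ** (exponent - 1.0)

def compand(signal, frame_len, classify, sparse_exp=0.65, dense_exp=0.5):
    """Choose the exponent per frame from a transient classifier, as the
    abstract does for sparse versus dense transient frames."""
    out = []
    for i in range(0, len(signal), frame_len):
        f = signal[i:i + frame_len]
        e = sparse_exp if classify(f) == "sparse" else dense_exp
        out.append(compand_frame(f, e))
    return np.concatenate(out)
```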
Described herein is a method for creating object-based audio content from a text input for use in audio books and/or audio play, the method including the steps of: a) receiving the text input; b) performing a semantic analysis of the received text input; c) synthesizing speech and effects based on one or more results of the semantic analysis to generate one or more audio objects; d) generating metadata for the one or more audio objects; and e) creating the object-based audio content including the one or more audio objects and the metadata. Described herein are further a computer-based system including one or more processors configured to perform said method and a computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
G10L 15/18 - Speech classification or search using natural language modelling
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 13/10 - Prosody rules derived from text; Stress or intonation
A method of playing out media from a media engine run on a receiving apparatus, the method comprising: at the receiving apparatus, receiving a media data structure comprising audio or video content formatted in a plurality of layers, including at least a first layer comprising the audio or video content encoded according to an audio or video encoding scheme respectively, and a second layer encapsulating the encoded content in one or more media containers according to a media container format; determining that at least one of the media containers further encapsulates runnable code for processing at least some of the formatting of the media data structure in order to support playout of the audio or video content by the media engine; running the code on a code engine of the receiving apparatus in order to perform the processing of the media data structure for input to the media engine.
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
H04N 21/439 - Processing of audio elementary streams
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
The present document describes a method (500) for generating a bitstream (101), wherein the bitstream (101) comprises a sequence of superframes (400) for a sequence of frames of an immersive audio signal (111). The method (500) comprises, repeatedly for the sequence of superframes (400), inserting (501) coded audio data (206) for one or more frames of one or more downmix channel signals (203) derived from the immersive audio signal (111), into data fields (411, 421, 412, 422) of a superframe (400); and inserting (502) metadata (202, 205) for reconstructing one or more frames of the immersive audio signal (111) from the coded audio data (206), into a metadata field (403) of the superframe (400).
H04S 3/00 - Systems employing more than two channels, e.g. quadraphonic
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
86.
METHODS AND DEVICES FOR ENCODING AND/OR DECODING IMMERSIVE AUDIO SIGNALS
The present document describes a method (700) for encoding a multi-channel input signal (201). The method (700) comprises determining (701) a plurality of downmix channel signals (203) from the multi-channel input signal (201) and performing (702) energy compaction of the plurality of downmix channel signals (203) to provide a plurality of compacted channel signals (404). Furthermore, the method (700) comprises determining (703) joint coding metadata (205) based on the plurality of compacted channel signals (404) and based on the multi-channel input signal (201), wherein the joint coding metadata (205) is such that it allows upmixing of the plurality of compacted channel signals (404) to an approximation of the multi-channel input signal (201). In addition, the method (700) comprises encoding (704) the plurality of compacted channel signals (404) and the joint coding metadata (205).
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 19/04 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
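The energy compaction of the downmix channels in the abstract above can be illustrated as a KLT/PCA rotation, one classical realization of energy compaction (the abstract does not name the transform). The orthogonal rotation doubles as joint-coding metadata allowing upmixing back to an approximation of the input.

```python
import numpy as np

def compact_channels(downmix):
    """Rotate the downmix channels so energy concentrates in the
    leading compacted channels.  Returns the compacted signals and the
    orthogonal rotation, which serves as metadata for reconstruction."""
    cov = downmix @ downmix.T / downmix.shape[1]
    w, V = np.linalg.eigh(cov)
    V = V[:, np.argsort(w)[::-1]]       # strongest component first
    return V.T @ downmix, V
```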
Methods are described for generating an AV bitstream (e.g., an MPEG-2 transport stream or a bitstream segment having an adaptive streaming format) such that the AV bitstream includes at least one video I-frame synchronized with at least one audio I-frame, e.g., by re-authoring at least one video or audio frame (as a re-authored I-frame or a re-authored P-frame). Typically, a segment of content of the AV bitstream which includes the re-authored frame starts with an I-frame and includes at least one subsequent P-frame. Other aspects are methods for adapting such an AV bitstream, audio/video processing units configured to perform any embodiment of the inventive method, and audio/video processing units which include a buffer memory which stores at least one segment of an AV bitstream generated in accordance with any embodiment of the inventive method.
H04N 21/61 - Network physical structure; Signal processing
H04N 21/647 - Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load or bridging bet
88.
METHODS AND SYSTEMS FOR STREAMING MEDIA DATA OVER A CONTENT DELIVERY NETWORK
The present document describes a method (900) for establishing control information for a control policy of a client (102) for streaming data (103) from at least one server (101, 701). The method (900) comprises performing (901) a message passing process between a server agent of the server (101, 701) and a client agent of the client (102), in order to iteratively establish control information. Furthermore, the method (900) comprises generating (902) a convergence event for the message passing process to indicate that the control information has been established.
A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
G10L 21/0388 - Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques - Details of processing therefor
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 19/24 - Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
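The spectral-translation branch of the high-frequency reconstruction in the abstract above can be sketched as a "copy-up" patch: lowband bins are replicated above the crossover and shaped by an envelope gain. This is a bare-bones illustration; real reconstruction transmits per-band envelope metadata and may use harmonic transposition instead, as selected by the flag.

```python
import numpy as np

def spectral_translation(low_bins, xover, total, gain=1.0):
    """Regenerate a highband by copying lowband bins above the
    crossover bin and applying an envelope gain (single scalar here
    for simplicity; per-band gains in practice)."""
    spec = np.zeros(total, dtype=complex)
    spec[:xover] = low_bins[:xover]
    for k in range(xover, total):
        spec[k] = gain * low_bins[(k - xover) % xover]
    return spec
```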
90.
METHODS, APPARATUS AND SYSTEMS FOR ENCODING AND DECODING OF DIRECTIONAL SOUND SOURCES
Some disclosed methods involve encoding or decoding directional audio data. Some encoding methods may involve receiving a mono audio signal corresponding to an audio object and a representation of a radiation pattern corresponding to the audio object. The radiation pattern may include sound levels corresponding to a plurality of sample times, a plurality of frequency bands and a plurality of directions. The methods may involve encoding the mono audio signal and encoding the source radiation pattern to determine radiation pattern metadata. Encoding the radiation pattern may involve determining a spherical harmonic transform of the representation of the radiation pattern and compressing the spherical harmonic transform to obtain encoded radiation pattern metadata.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 5/00 - Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
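A minimal sketch of the spherical harmonic transform step from the abstract above, for a single (sample time, frequency band) cell and truncated at order 1. The basis normalization and the least-squares fit are assumptions; the order and the subsequent compression/quantization are omitted.

```python
import numpy as np

def sh_basis_order1(dirs):
    """Real spherical harmonics up to order 1 evaluated at unit
    direction vectors (normalization convention is an assumption)."""
    x, y, z = np.asarray(dirs, dtype=float).T
    c0 = np.sqrt(1.0 / (4.0 * np.pi))
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.stack([np.full(len(dirs), c0), c1 * y, c1 * z, c1 * x], axis=1)

def encode_radiation(dirs, gains):
    """Least-squares fit of SH coefficients to measured directional
    gains; compressing the coefficients would yield the metadata."""
    coeffs, *_ = np.linalg.lstsq(sh_basis_order1(dirs), gains, rcond=None)
    return coeffs
```

An omnidirectional source projects entirely onto the order-0 coefficient, so truncating the expansion costs nothing for such sources.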
91.
METHODS, APPARATUS AND SYSTEMS FOR THREE DEGREES OF FREEDOM (3DOF+) EXTENSION OF MPEG-H 3D AUDIO
Described is a method of processing position information indicative of an object position of an audio object, wherein the object position is usable for rendering of the audio object, that comprises: obtaining listener orientation information indicative of an orientation of a listener's head; obtaining listener displacement information indicative of a displacement of the listener's head; determining the object position from the position information; modifying the object position based on the listener displacement information by applying a translation to the object position; and further modifying the modified object position based on the listener orientation information. Further described is a corresponding apparatus for processing position information indicative of an object position of an audio object, wherein the object position is usable for rendering of the audio object.
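The translate-then-rotate position update described above can be sketched as follows. Representing the head orientation as a 3x3 listener-to-world rotation matrix is an assumption (a quaternion is equally common), as is the function name.

```python
import numpy as np

def to_listener_frame(obj_pos, head_displacement, head_rot):
    """Apply a translation compensating the listener's head
    displacement, then counter-rotate by the head orientation so the
    object position is expressed in the listener's frame."""
    p = np.asarray(obj_pos, dtype=float) - np.asarray(head_displacement,
                                                      dtype=float)
    return head_rot.T @ p
```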
The present disclosure relates to a method of decoding audio scene content from a bitstream by a decoder that includes an audio renderer with one or more rendering tools. The method comprises receiving the bitstream, decoding a description of an audio scene from the bitstream, determining one or more effective audio elements from the description of the audio scene, determining effective audio element information indicative of effective audio element positions of the one or more effective audio elements from the description of the audio scene, decoding a rendering mode indication from the bitstream, wherein the rendering mode indication is indicative of whether the one or more effective audio elements represent a sound field obtained from pre-rendered audio elements and should be rendered using a predetermined rendering mode, and in response to the rendering mode indication indicating that the one or more effective audio elements represent the sound field obtained from pre-rendered audio elements and should be rendered using the predetermined rendering mode, rendering the one or more effective audio elements using the predetermined rendering mode, wherein rendering the one or more effective audio elements using the predetermined rendering mode takes into account the effective audio element information, and wherein the predetermined rendering mode defines a predetermined configuration of the rendering tools for controlling an impact of an acoustic environment of the audio scene on the rendering output. The disclosure further relates to a method of generating audio scene content and a method of encoding audio scene content into a bitstream.
The present disclosure relates to methods, apparatus and systems for encoding an audio signal into a bitstream, in particular at an encoder, comprising: encoding or including audio signal data associated with 3DoF audio rendering into one or more first bitstream parts of the bitstream, and encoding or including metadata associated with 6DoF audio rendering into one or more second bitstream parts of the bitstream. The present disclosure further relates to methods, apparatus and systems for decoding an audio signal and audio rendering based on the bitstream.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/24 - Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
94.
METHOD AND APPARATUS FOR PROCESSING OF AUXILIARY MEDIA STREAMS EMBEDDED IN A MPEG-H 3D AUDIO STREAM
The disclosure relates to methods, apparatus and systems for side load processing of packetized media streams. In an embodiment, the apparatus comprises: a receiver for receiving a bitstream, and a splitter for identifying a packet type in the bitstream and splitting the bitstream, based on the identified value of the packet type, into a main stream and an auxiliary stream.
H04N 21/439 - Processing of audio elementary streams
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 21/4363 - Adapting the video stream to a specific local network, e.g. a IEEE 1394 or Bluetooth® network
H04N 21/485 - End-user interface for client configuration
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams or extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
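The splitter behaviour described above amounts to routing packets by their type value. A minimal sketch, assuming a hypothetical (type, payload) packet representation rather than the actual MHAS packet format:

```python
def split_streams(packets, aux_types):
    """Route packets into a main stream and an auxiliary stream based on
    each packet's type value. `packets` is an iterable of (ptype, payload)
    tuples; `aux_types` is the set of type values to side-load."""
    main, aux = [], []
    for ptype, payload in packets:
        (aux if ptype in aux_types else main).append((ptype, payload))
    return main, aux
```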
A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
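The flag described above selects between two highband regeneration strategies. The following toy sketch contrasts them on magnitude spectra: spectral translation copies lowband bins upward unchanged, while a (much simplified) second-order harmonic transposition maps bin k to bin 2k, preserving harmonic spacing. Real SBR/HFR processing is considerably more involved; names and the bin-level model are assumptions.

```python
import numpy as np

def regenerate_highband(low_mag, n_high, use_transposition):
    """Toy highband regeneration from a lowband magnitude spectrum,
    switched by a flag as in the decoding method above."""
    high = np.zeros(n_high)
    if use_transposition:
        for k, m in enumerate(low_mag):
            if 2 * k < n_high:          # 2nd-order transposition: k -> 2k
                high[2 * k] += m
    else:
        for i in range(n_high):         # translation: copy bins cyclically
            high[i] = low_mag[i % len(low_mag)]
    return high
```

For a lowband with partials at bins 1 and 3, transposition places energy at bins 2 and 6, whereas translation repeats the 2-bin spacing.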
The present document describes a method (400) for encoding a soundfield representation (SR) input signal (101, 301) describing a soundfield at a reference position, wherein the SR input signal (101, 301) comprises a plurality of channels for a plurality of different directivity patterns of the soundfield at the reference position. The method (400) comprises extracting (401) one or more audio objects (103, 303) from the SR input signal (101, 301). Furthermore, the method (400) comprises determining (402) a residual signal (102, 302) based on the SR input signal (101, 301) and based on the one or more audio objects (103, 303). The method (400) also comprises performing joint coding of the one or more audio objects (103, 303) and/or the residual signal (102, 302). In addition, the method (400) comprises generating (403) a bitstream (701) based on data generated in the context of joint coding of the one or more audio objects (103, 303) and/or the residual signal (102, 302).
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
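The object-extraction and residual steps described above can be sketched for a first-order ambisonic (FOA) input: estimate one dominant direction, beamform the object signal towards it, then subtract the object's re-encoding to obtain the residual. The SN3D normalisation, W/Y/Z/X channel order, and single-object assumption are illustrative choices, not the claimed method.

```python
import numpy as np

def extract_object_and_residual(foa):
    """Toy extraction of one audio object from an FOA signal (channels
    W, Y, Z, X, SN3D-normalised, as an assumption) plus a residual."""
    w, y, z, x = foa
    d = np.array([np.mean(w * x), np.mean(w * y), np.mean(w * z)])
    d /= np.linalg.norm(d) + 1e-12                    # dominant unit direction
    obj = 0.5 * (w + d[0] * x + d[1] * y + d[2] * z)  # cardioid beam at d
    re_enc = np.stack([obj, d[1] * obj, d[2] * obj, d[0] * obj])
    residual = foa - re_enc                           # what the object missed
    return obj, d, residual
```

For a single plane wave the beam recovers the source signal exactly and the residual vanishes.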
97.
METHOD AND SYSTEM FOR HANDLING LOCAL TRANSITIONS BETWEEN LISTENING POSITIONS IN A VIRTUAL REALITY ENVIRONMENT
A method (910) for rendering an audio signal in a virtual reality rendering environment (180) is described. The method (910) comprises rendering (911) an origin audio signal of an audio source (311, 312, 313) from an origin source position on an origin sphere (114) around an origin listening position (301) of a listener (181). Furthermore, the method (910) comprises determining (912) that the listener (181) moves from the origin listening position (301) to a destination listening position (302). In addition, the method (910) comprises determining (913) a destination source position of the audio source (311, 312, 313) on a destination sphere (114) around the destination listening position (302) based on the origin source position, and determining (914) a destination audio signal of the audio source (311, 312, 313) based on the origin audio signal. Furthermore, the method (910) comprises rendering (915) the destination audio signal of the audio source (311, 312, 313) from the destination source position on the destination sphere (114) around the destination listening position (302).
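The geometry of the local transition above can be sketched as follows: fix the source at its world position on the origin sphere, re-project it onto the sphere around the destination listening position, and derive a distance gain. The 1/r free-field attenuation and the function names are assumptions for illustration.

```python
import numpy as np

def destination_source(origin_listener, dest_listener, origin_dir, radius=1.0):
    """Re-project a source from the origin sphere onto the destination
    sphere and return (destination direction, distance gain)."""
    world = np.asarray(origin_listener, float) + radius * np.asarray(origin_dir, float)
    v = world - np.asarray(dest_listener, float)
    dist = np.linalg.norm(v)
    dest_dir = v / dist          # unit direction on the destination sphere
    gain = radius / dist         # simple 1/r distance attenuation (assumed)
    return dest_dir, gain
```

Moving one radius directly away from the source halves its amplitude in this model.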
The present disclosure relates to an apparatus for decoding an encoded Unified Audio and Speech stream. The apparatus comprises a core decoder for decoding the encoded Unified Audio and Speech stream. The core decoder includes a fast Fourier transform, FFT, module implementation based on a Cooley-Tukey algorithm. The FFT module is configured to determine a discrete Fourier transform, DFT. Determining the DFT involves recursively breaking down the DFT into smaller FFTs based on the Cooley-Tukey algorithm, using radix-4 if the number of points of the FFT is a power of 4 and using mixed radix if the number is not a power of 4. Performing the smaller FFTs involves applying twiddle factors. Applying the twiddle factors involves referring to pre-computed values for the twiddle factors. The present disclosure further relates to an apparatus for decoding an encoded Unified Audio and Speech stream, in which the core decoder is configured to decode an LPC filter that has been quantized using a line spectral frequency, LSF, representation from the Unified Audio and Speech stream. Decoding the LPC filter from the Unified Audio and Speech stream comprises computing a first-stage approximation of an LSF vector, reconstructing a residual LSF vector, if an absolute quantization mode has been used for quantizing the LPC filter, determining inverse LSF weights for inverse weighting of the residual LSF vector by referring to pre-computed values for the inverse LSF weights or their respective corresponding LSF weights, inverse weighting the residual LSF vector by the determined inverse LSF weights, and calculating the LPC filter based on the inversely-weighted residual LSF vector and the first-stage approximation of the LSF vector. The present disclosure further relates to corresponding methods and storage media.
G06F 17/14 - Fourier, Walsh or analogous domain transformations
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
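The FFT strategy described above, radix-4 for power-of-4 lengths and mixed radix otherwise, with cached twiddle factors, can be sketched recursively. This is an illustration of the general Cooley-Tukey decimation-in-time decomposition, not the decoder's actual kernel; the naive O(p·n) combination step is kept for clarity.

```python
import cmath

_twiddles = {}   # pre-computed twiddle-factor cache, keyed by (n, k)

def _twiddle(n, k):
    """Look up (or lazily compute) the twiddle factor e^{-2*pi*i*k/n}."""
    if (n, k) not in _twiddles:
        _twiddles[(n, k)] = cmath.exp(-2j * cmath.pi * k / n)
    return _twiddles[(n, k)]

def _is_power_of_4(n):
    # single set bit in an even position <=> power of 4
    return n > 0 and (n & (n - 1)) == 0 and (n & 0x55555555) != 0

def _smallest_factor(n):
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n

def fft(x):
    """Recursive Cooley-Tukey DFT: radix 4 when the length is a power of 4,
    otherwise mixed radix via the smallest prime factor of the length."""
    n = len(x)
    if n == 1:
        return list(x)
    p = 4 if _is_power_of_4(n) else _smallest_factor(n)
    m = n // p
    sub = [fft(x[q::p]) for q in range(p)]     # p interleaved sub-FFTs
    # X[k] = sum_q W_n^{qk} * Sub_q[k mod m]
    return [sum(_twiddle(n, q * k) * sub[q][k % m] for q in range(p))
            for k in range(n)]
```

A unit impulse transforms to an all-ones spectrum, and a constant signal to a single DC bin, which gives a quick sanity check.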
The present disclosure relates to an apparatus for decoding an encoded Unified Audio and Speech stream. The apparatus comprises a core decoder for decoding the encoded Unified Audio and Speech stream. The core decoder includes an upmixing unit adapted to perform mono to stereo upmixing. The upmixing unit includes a decorrelator unit D adapted to apply a decorrelation filter to an input signal. The decorrelator unit is adapted to determine filter coefficients for the decorrelation filter by referring to pre-computed values. The present disclosure further relates to an apparatus for encoding a Unified Audio and Speech stream, as well as to corresponding methods and storage media.
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
H04S 3/02 - Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
G10L 19/025 - Detection of transients or attacks for time/frequency resolution switching
G10H 7/00 - Instruments in which the tones are synthesised from a data store, e.g. computer organs
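A decorrelator that looks up pre-computed filter coefficients, as described above, can be sketched as a delay followed by a cascade of first-order allpass sections. The coefficient table below is a placeholder, not the actual values the decoder would ship with.

```python
import numpy as np

# Hypothetical pre-computed coefficient table (placeholder values).
_PRECOMPUTED = {"a": np.array([0.5, -0.3]), "delay": 3}

def decorrelate(x, table=_PRECOMPUTED):
    """Sketch of a decorrelation filter: delay, then first-order allpass
    sections H(z) = (a + z^-1) / (1 + a z^-1) with looked-up coefficients."""
    y = np.concatenate([np.zeros(table["delay"]), np.asarray(x, float)])[: len(x)]
    for a in table["a"]:
        out = np.empty_like(y)
        x_prev = y_prev = 0.0
        for i, xi in enumerate(y):     # y[n] = a*x[n] + x[n-1] - a*y[n-1]
            out[i] = a * xi + x_prev - a * y_prev
            x_prev, y_prev = xi, out[i]
        y = out
    return y
```

Because the sections are allpass, the output decorrelates from the input while (up to truncation) preserving its energy.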
100.
METHOD AND SYSTEM FOR HANDLING GLOBAL TRANSITIONS BETWEEN LISTENING POSITIONS IN A VIRTUAL REALITY ENVIRONMENT
A method (900) for rendering audio in a virtual reality rendering environment (180) is described. The method (900) comprises rendering (901) an origin audio signal of an origin audio source (113) of an origin audio scene (111) from an origin source position on a sphere (114) around a listening position (201) of a listener (181). Furthermore, the method (900) comprises determining (902) that the listener (181) moves from the listening position (201) within the origin audio scene (111) to a listening position (202) within a different destination audio scene (112). In addition, the method (900) comprises applying (903) a fade-out gain to the origin audio signal to determine a modified origin audio signal, and rendering (903) the modified origin audio signal of the origin audio source (113) from the origin source position on the sphere (114) around the listening position (201, 202).
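The fade-out gain applied during the global transition above can be sketched as one half of a cross-fade between the origin and destination scenes; the linear ramp and complementary fade-in are assumptions for illustration, not the claimed method.

```python
import numpy as np

def global_transition(origin_sig, dest_sig, n_fade):
    """Apply a fade-out gain to the origin scene's signal and a
    complementary fade-in to the destination scene's, summed per sample."""
    g = np.linspace(1.0, 0.0, n_fade)   # fade-out gain for the origin scene
    return origin_sig[:n_fade] * g + dest_sig[:n_fade] * (1.0 - g)
```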