Nuance Communications, Inc.

United States of America

1-100 of 109 for Nuance Communications, Inc.

Sort by

Query

Patent
United States - USPTO
Excluding Subsidiaries

Aggregations

Reset Report

Date

IPC Class

Status

103

Found results for

patents

1 2 Next Page

1. System and Method for Spectral Pooling in Streaming Speech Processing

Application Number	18162186
Status	Pending
Filing Date	2023-01-31
First Publication Date	2024-04-18
Owner	Nuance Communications, Inc. (USA)
Inventor	Weninger, Felix Albesano, Dario Zhan, Puming

Abstract

A method, computer program product, and computing system for inserting a spectral pooling layer into a neural network of a speech processing system. An output of a hidden layer of the neural network is filtered using the spectral pooling layer with a non-integer stride. The filtered output is provided to a subsequent hidden layer of the neural network.

IPC Classes ?

G10L 15/16 - Speech classification or search using artificial neural networks
G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation

2. INTERACTIVE VOICE RESPONSE SYSTEMS HAVING IMAGE ANALYSIS

Application Number	17816957
Status	Pending
Filing Date	2022-08-02
First Publication Date	2024-02-08
Owner	Nuance Communications, Inc. (USA)
Inventor	Chawla, Akash Degroot, Jenny Vovk, Sergey A.

Abstract

An interactive voice response system is provided that includes an interactive voice recognition module, an image collection module, and a data extraction module. The image collection module communicates with the voice recognition module and the user device. The extraction module communicates with the image collection module. The voice recognition module collects speech data from a user of the user device and provides an indication to the image collection module when the speech data includes complex data. The image collection module, in response to the indication, communicates with the user device in a text message. The text message includes a link that, when activated, opens a camera on the user device. The image collection module, in response to receiving an image having the complex data from the camera, communicates the image to the extraction module, which extracts the complex data from the image as textual data.

IPC Classes ?

G06V 30/41 - Analysis of document content
G06V 30/146 - Aligning or centering of the image pick-up or image-field
G06V 30/19 - Recognition using electronic means
H04M 3/493 - Interactive information services, e.g. directory enquiries
H04L 51/18 - Commands or executable codes
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog

3. Automated Clinical Documentation System and Method

Application Number	17210292
Status	Pending
Filing Date	2021-03-23
First Publication Date	2022-02-17
Owner	Nuance Communications, Inc. (USA)
Inventor	Gallopyn, Guido Remi Marcel Sharma, Dushyant Jost, Uwe Helmut Owen, Donald E. Naylor, Patrick Nour-Eldin, Amr Almendro Barreda, Daniel Paulino Öz, Mehmet Mert Erskine, Garret N.

Abstract

A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information. A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information. A computer-implemented method, computer program product, and computing system for compartmentalizing a virtual assistant is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a core functionality module. One or more additional functionalities are added to the compartmentalized virtual assistant on an as-needed basis. A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information. A computer-implemented method, computer program product, and computing system for compartmentalizing a virtual assistant is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a core functionality module. One or more additional functionalities are added to the compartmentalized virtual assistant on an as-needed basis. A computer-implemented method, computer program product, and computing system for functionality module communication is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a plurality of functionality modules. At least a portion of the encounter information may be processed via a first functionality module of the plurality of functionality modules to generate a first result. The first result may be provided to a second functionality module of the plurality of functionality modules. The first result set may be processed via the second functionality module to generate a second result. A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information. A computer-implemented method, computer program product, and computing system for compartmentalizing a virtual assistant is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a core functionality module. One or more additional functionalities are added to the compartmentalized virtual assistant on an as-needed basis. A computer-implemented method, computer program product, and computing system for functionality module communication is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a plurality of functionality modules. At least a portion of the encounter information may be processed via a first functionality module of the plurality of functionality modules to generate a first result. The first result may be provided to a second functionality module of the plurality of functionality modules. The first result set may be processed via the second functionality module to generate a second result. A computer-implemented method, computer program product, and computing system for synchronizing machine vision and audio is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes machine vision encounter information and audio encounter information. The machine vision encounter information and the audio encounter information are temporally-aligned to produce a temporarily-aligned encounter recording.

IPC Classes ?

G16H 15/00 - ICT specially adapted for medical reports, e.g. generation or transmission thereof
G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
G10L 21/0208 - Noise filtering
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

4. Automated Clinical Documentation System and Method

Application Number	17210253
Status	Pending
Filing Date	2021-03-23
First Publication Date	2021-08-05
Owner	Nuance Communications, Inc. (USA)
Inventor	Owen, Donald E. Erskine, Garret N. Öz, Mehmet Mert Jost, Uwe Helmut Almendro Barreda, Daniel Paulino Sharma, Dushyant Gallopyn, Guido Remi Marcel Nour-Eldin, Amr Naylor, Patrick A.

Abstract

A computer-implemented method, computer program product, and computing system for rendering content is executed on a computing device and includes receiving a request to render content during a user encounter. If it is determined that the content includes sensitive content, a complete version of the content is rendered on a first device (wherein the complete version of the content includes the sensitive content) and a limited version of the content on a second device (wherein the limited version of the content excludes the sensitive content). A computer-implemented method, computer program product, and computing system for rendering content is executed on a computing device and includes receiving a request to render content during a user encounter. If it is determined that the content includes sensitive content, a complete version of the content is rendered on a first device (wherein the complete version of the content includes the sensitive content) and a limited version of the content on a second device (wherein the limited version of the content excludes the sensitive content). A modular ACD system is configured to automate clinical documentation and includes a machine vision system configured to obtain machine vision encounter information concerning a user encounter. An audio recording system is configured to obtain audio encounter information concerning the user encounter. A compute system is configured to receive the machine vision encounter information and the audio encounter information. A computer-implemented method, computer program product, and computing system for rendering content is executed on a computing device and includes receiving a request to render content during a user encounter. If it is determined that the content includes sensitive content, a complete version of the content is rendered on a first device (wherein the complete version of the content includes the sensitive content) and a limited version of the content on a second device (wherein the limited version of the content excludes the sensitive content). A modular ACD system is configured to automate clinical documentation and includes a machine vision system configured to obtain machine vision encounter information concerning a user encounter. An audio recording system is configured to obtain audio encounter information concerning the user encounter. A compute system is configured to receive the machine vision encounter information and the audio encounter information. A computer-implemented method, computer program product, and computing system for automating diarization is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. An encounter transcript is generated based, at least in part, upon the first portion of the encounter information and the at least a second portion of the encounter information. A computer-implemented method, computer program product, and computing system for rendering content is executed on a computing device and includes receiving a request to render content during a user encounter. If it is determined that the content includes sensitive content, a complete version of the content is rendered on a first device (wherein the complete version of the content includes the sensitive content) and a limited version of the content on a second device (wherein the limited version of the content excludes the sensitive content). A modular ACD system is configured to automate clinical documentation and includes a machine vision system configured to obtain machine vision encounter information concerning a user encounter. An audio recording system is configured to obtain audio encounter information concerning the user encounter. A compute system is configured to receive the machine vision encounter information and the audio encounter information. A computer-implemented method, computer program product, and computing system for automating diarization is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. An encounter transcript is generated based, at least in part, upon the first portion of the encounter information and the at least a second portion of the encounter information. A computer-implemented method, computer program product, and computing system for automating role assignment is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to associate a first portion of the encounter information with a first encounter participant. A first role is assigned to the first encounter participant.

IPC Classes ?

H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules

5. Automated Clinical Documentation System and Method

Application Number	17210233
Status	Pending
Filing Date	2021-03-23
First Publication Date	2021-07-29
Owner	Nuance Communications, Inc. (USA)
Inventor	Owen, Donald E. Erskine, Garret N. Gallopyn, Guido Remi Marcel Öz, Mehmet Mert Almendro Barreda, Daniel Paulino

Abstract

A computer-implemented method, computer program product, and computing system for automated clinical documentation is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to generate an encounter transcript. At least a portion of the encounter transcript is processed to populate at least a portion of a record associated with the user encounter. A computer-implemented method, computer program product, and computing system for automated clinical documentation is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to generate an encounter transcript. At least a portion of the encounter transcript is processed to populate at least a portion of a record associated with the user encounter. A computer-implemented method, computer program product, and computing system for automating an intake process is executed on a computing device and includes prompting a user to provide encounter information via a virtual assistant during a pre-visit portion of a user encounter. Encounter information is obtained from the user in response to the prompting by the virtual assistant. A computer-implemented method, computer program product, and computing system for automated clinical documentation is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to generate an encounter transcript. At least a portion of the encounter transcript is processed to populate at least a portion of a record associated with the user encounter. A computer-implemented method, computer program product, and computing system for automating an intake process is executed on a computing device and includes prompting a user to provide encounter information via a virtual assistant during a pre-visit portion of a user encounter. Encounter information is obtained from the user in response to the prompting by the virtual assistant. A computer-implemented method, computer program product, and computing system for automating a follow-up process is executed on a computing device and includes prompting a user to provide encounter information via a virtual assistant during a post-visit portion of a user encounter. Encounter information is obtained from the user in response to the prompting by the virtual assistant. A computer-implemented method, computer program product, and computing system for automated clinical documentation is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to generate an encounter transcript. At least a portion of the encounter transcript is processed to populate at least a portion of a record associated with the user encounter. A computer-implemented method, computer program product, and computing system for automating an intake process is executed on a computing device and includes prompting a user to provide encounter information via a virtual assistant during a pre-visit portion of a user encounter. Encounter information is obtained from the user in response to the prompting by the virtual assistant. A computer-implemented method, computer program product, and computing system for automating a follow-up process is executed on a computing device and includes prompting a user to provide encounter information via a virtual assistant during a post-visit portion of a user encounter. Encounter information is obtained from the user in response to the prompting by the virtual assistant. A computer-implemented method, computer program product, and computing system for automating a monitoring process is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to determine if the encounter information is indicative of a potential situation. An inquiry is initiated concerning the potential situation.

IPC Classes ?

G16H 15/00 - ICT specially adapted for medical reports, e.g. generation or transmission thereof
G16H 80/00 - ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
G16H 20/30 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to physical therapies or activities, e.g. physiotherapy, acupressure or exercising
A61B 5/00 - Measuring for diagnostic purposes ; Identification of persons
G06T 1/00 - General purpose image data processing
G16H 30/20 - ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS

6. Automated Clinical Documentation System and Method

Application Number	17210300
Status	Pending
Filing Date	2021-03-23
First Publication Date	2021-07-29
Owner	Nuance Communications, Inc. (USA)
Inventor	Owen, Donald E. Erskine, Garret N. Öz, Mehmet Mert Almendro Barreda, Daniel Paulino

Abstract

A computer-implemented method, computer program product, and computing system for visual diarization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. A computer-implemented method, computer program product, and computing system for visual diarization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. A computer-implemented method, computer program product, and computing system for visual compartmentalization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter portion, and associate at least a second portion of the encounter information with at least a second encounter portion. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. A computer-implemented method, computer program product, and computing system for visual diarization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. A computer-implemented method, computer program product, and computing system for visual compartmentalization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter portion, and associate at least a second portion of the encounter information with at least a second encounter portion. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. A computer-implemented method, computer program product, and computing system for reactive encounter scanning is executed on a computing device and includes obtaining encounter information of a user encounter. A request is received from a user concerning a specific condition. In response to receiving the request, the encounter information is processed to determine if the encounter information is indicative of the specific condition and to generate a result set. The result set is provided to the user. A computer-implemented method, computer program product, and computing system for visual diarization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. A computer-implemented method, computer program product, and computing system for visual compartmentalization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter portion, and associate at least a second portion of the encounter information with at least a second encounter portion. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. A computer-implemented method, computer program product, and computing system for reactive encounter scanning is executed on a computing device and includes obtaining encounter information of a user encounter. A request is received from a user concerning a specific condition. In response to receiving the request, the encounter information is processed to determine if the encounter information is indicative of the specific condition and to generate a result set. The result set is provided to the user. A computer-implemented method, computer program product, and computing system for proactive encounter scanning is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is proactively processed to determine if the encounter information is indicative of one or more conditions and to generate one or more result sets. The one or more result sets are provided to the user.

IPC Classes ?

G16H 40/20 - ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
G16H 15/00 - ICT specially adapted for medical reports, e.g. generation or transmission thereof
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G16H 30/20 - ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
G06F 16/248 - Presentation of query results
G09B 19/00 - Teaching not covered by other main groups of this subclass
G06F 3/16 - Sound input; Sound output
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G06F 40/40 - Processing or translation of natural language
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

7. Spectral estimation of room acoustic parameters

Application Number	16084771
Grant Number	10403300
Status	In Force
Filing Date	2016-03-17
First Publication Date	2019-03-14
Grant Date	2019-09-03
Owner	Nuance Communications, Inc. (USA)
Inventor	Wolff, Tobias Desiraju, Naveen Kumar

Abstract

60, and can further estimate an additional parameter, such as Direct-to-Reverberant Ratio (DRR). The prediction filter may be adapted during a period of reverberation by minimizing a cost function. Adaptation can include using a gradient descent approach, which can operate according to a step size provided by an adaptation controller configured to determine the period of reverberation. One or more microphones can provide the signals. The reverberation parameters estimated can be applied to a reverberation suppressor, with an estimator that does not require a training phase and without relying on assumptions of the user's position relative to the microphones.

IPC Classes ?

G10L 21/0224 - Processing in the time domain
G10L 21/0208 - Noise filtering
G10L 21/0216 - Noise filtering characterised by the method used for estimating noise
H04R 3/00 - Circuits for transducers
H04R 5/027 - Spatial or constructional arrangements of microphones, e.g. in dummy heads
H04R 5/04 - Circuit arrangements
G01H 7/00 - Measuring reverberation time

8. User dedicated automatic speech recognition

Application Number	15876545
Grant Number	10789950
Status	In Force
Filing Date	2018-01-22
First Publication Date	2018-06-07
Grant Date	2020-09-29
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Wolff, Tobias Buck, Markus Haulick, Tim

Abstract

A multi-mode voice controlled user interface is described. The user interface is adapted to conduct a speech dialog with one or more possible speakers and includes a broad listening mode which accepts speech inputs from the possible speakers without spatial filtering, and a selective listening mode which limits speech inputs to a specific speaker using spatial filtering. The user interface switches listening modes in response to one or more switching cues.

IPC Classes ?

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 15/28 - Constructional details of speech recognition systems
G06F 3/16 - Sound input; Sound output
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
G10L 21/0216 - Noise filtering characterised by the method used for estimating noise

9. System and method for speech enhancement using a coherent to diffuse sound ratio

Application Number	15535245
Grant Number	10242690
Status	In Force
Filing Date	2014-12-12
First Publication Date	2017-11-16
Grant Date	2019-03-26
Owner	Nuance Communications, Inc. (USA)
Inventor	Wolff, Tobias Matheja, Timo Buck, Markus

Abstract

Embodiments of the present disclosure may include a system and method for speech enhancement using the coherent to diffuse sound ratio. Embodiments may include receiving an audio signal at one or more microphones and controlling one or more adaptive filters of a beamformer using a coherent to diffuse ratio (“CDR”).

IPC Classes ?

G10L 21/0216 - Noise filtering characterised by the method used for estimating noise
G10L 21/0208 - Noise filtering
H04B 1/62 - TRANSMISSION - Details of transmission systems not characterised by the medium used for transmission for providing a predistortion of the signal in the transmitter and corresponding correction in the receiver, e.g. for improving the signal/noise ratio
H04R 3/00 - Circuits for transducers

10. System and method for generating a self-steering beamformer

Application Number	15535264
Grant Number	10924846
Status	In Force
Filing Date	2014-12-12
First Publication Date	2017-11-09
Grant Date	2021-02-16
Owner	Nuance Communications, Inc. (USA)
Inventor	Wolff, Tobias Buck, Markus

Abstract

A system and method for generating a self-steering beamformer is provided. Embodiments may include receiving, at one or more microphones, a first audio signal and adapting one or more blocking filters based upon, at least in part, the first audio signal. Embodiments may also include generating, using the one or more blocking filters, one or more noise reference signals. Embodiments may further include providing the one or more noise reference signals to an adaptive interference canceller to reduce a beamformer output power level.

IPC Classes ?

H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
G10L 21/0208 - Noise filtering
G10L 21/0216 - Noise filtering characterised by the method used for estimating noise
H04R 1/24 - Structural combinations of separate transducers or of parts of the same transducer and responsive respectively to two or more frequency ranges
G10L 21/0272 - Voice signal separating

11. Text message generation for emergency services as a backup to voice communications

Application Number	15134733
Grant Number	09930502
Status	In Force
Filing Date	2016-04-21
First Publication Date	2016-08-11
Grant Date	2018-03-27
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Basore, David L. Lawser, John Jutten

Abstract

A mobile device may detect when a calling party dials an emergency service to request emergency assistance. Following input of the dialed digits, the device may automatically generate a text message in addition to initiating a voice call, both of which may be transmitted over a wireless data network. The wireless network nay correlate the two calls as originating from the same emergency situation and may attempt to deliver the two calls to a Public Services Answering Position (PSAP) at an appropriate emergency center. If the PSAP does not receive a voice call, the PSAP may communicate with the device via test messaging.

IPC Classes ?

H04W 4/14 - Short messaging services, e.g. short message service [SMS] or unstructured supplementary service data [USSD]
H04W 4/22 - Emergency connection handling
H04W 4/12 - Messaging; Mailboxes; Announcements

12. Voice commerce

Application Number	14855334
Grant Number	09626703
Status	In Force
Filing Date	2015-09-15
First Publication Date	2016-03-17
Grant Date	2017-04-18
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Kennewick, Sr., Michael R.

Abstract

In certain implementations, a system for facilitating voice commerce is provided. A user input comprising a natural language utterance related to a product or service to be purchased may be received. A first product or service that is to be purchased may be determined based on the utterance. First payment information that is to be used to purchase the first product or service may be obtained. First shipping information that is to be used to deliver the first product or service may be obtained. A purchase transaction for the first product or service may completed based on the first payment information and the first shipping information without further user input, after the receipt of utterance, that identifies a product or service type or a product or service, seller information, payment information, shipping information, or other information related to purchasing the first product or service.

IPC Classes ?

G06Q 30/06 - Buying, selling or leasing transactions
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog

13. Task switching in dialogue processing

Application Number	14478121
Grant Number	09607102
Status	In Force
Filing Date	2014-09-05
First Publication Date	2016-03-10
Grant Date	2017-03-28
Owner	Nuance Communications, Inc. (USA)
Inventor	Lavallee, Jean-Francois Goussard, Jacques-Olivier Beaufort, Richard

Abstract

Disclosed methods and systems are directed to task switching in dialog processing. The methods and systems may include activating a primary task, receiving, one or more ambiguous natural language commands, and identifying a first candidate task for each of the one or more ambiguous natural language commands. The methods and system may also include identifying, for each of the one or more ambiguous natural language commands and based on one or more rules, a second candidate task of the plurality of tasks corresponding to the ambiguous natural language command, determining whether to modify at least one of the one or more rules-based task switching rules based on whether a quality metric satisfies a threshold quantity, and when the second quality metric satisfies the threshold quantity, changing the task switching rule for the corresponding candidate task from a rules-based model to the optimized statistical based task switching model.

IPC Classes ?

G06F 17/40 - Data acquisition and logging
G06F 17/30 - Information retrieval; Database structures therefor
G10L 15/00 - Speech recognition
G06F 3/16 - Sound input; Sound output

14. System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements

Application Number	14836606
Grant Number	09406078
Status	In Force
Filing Date	2015-08-26
First Publication Date	2015-12-17
Grant Date	2016-08-02
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Freeman, Tom Kennewick, Mike

Abstract

The system and method described herein may use various natural language models to deliver targeted advertisements and/or provide natural language processing based on advertisements. In one implementation, an advertisement associated with a product or service may be provided for presentation to a user. A natural language utterance of the user may be received. The natural language utterance may be interpreted based on the advertisement and, responsive to the existence of a pronoun in the natural language utterance, a determination of whether the pronoun refers to one or more of the product or service or a provider of the product or service may be effectuated.

IPC Classes ?

G10L 15/26 - Speech to text systems
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G10L 15/18 - Speech classification or search using natural language modelling
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

15. System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

Application Number	14698183
Grant Number	09305547
Status	In Force
Filing Date	2015-04-28
First Publication Date	2015-08-27
Grant Date	2016-04-05
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Ljolje, Andrej Conkie, Alistair D. Syrdal, Ann K.

Abstract

Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.

IPC Classes ?

G10L 15/04 - Segmentation; Word boundary detection
G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G10L 15/07 - Adaptation to the speaker
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]

16. Online maximum-likelihood mean and variance normalization for speech recognition

Application Number	14640912
Grant Number	09280979
Status	In Force
Filing Date	2015-03-06
First Publication Date	2015-08-06
Grant Date	2016-03-08
Owner	Nuance Communications, Inc. (USA)
Inventor	Willett, Daniel

Abstract

A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after some first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.

IPC Classes ?

G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/08 - Speech classification or search
G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 15/34 - Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing

17. Techniques for evaluation, building and/or retraining of a classification model

Application Number	14686099
Grant Number	09311609
Status	In Force
Filing Date	2015-04-14
First Publication Date	2015-08-06
Grant Date	2016-04-12
Owner	Nuance Communications, Inc. (USA)
Inventor	Marcheret, Etienne

Abstract

Techniques for evaluation and/or retraining of a classification model built using labeled training data. In some aspects, a classification model having a first set of weights is retrained by using unlabeled input to reweight the labeled training data to have a second set of weights, and by retraining the classification model using the labeled training data weighted according to the second set of weights. In some aspects, a classification model is evaluated by building a similarity model that represents similarities between unlabeled input and the labeled training data and using the similarity model to evaluate the labeled training data to identify a subset of the plurality of items of labeled training data that is more similar to the unlabeled input than a remainder of the labeled training data.

IPC Classes ?

G06N 99/00 - Subject matter not provided for in other groups of this subclass
G06N 7/00 - Computing arrangements based on specific mathematical models

18. Multiple web-based content category searching in mobile search application

Application Number	14570404
Grant Number	09619572
Status	In Force
Filing Date	2014-12-15
First Publication Date	2015-04-09
Grant Date	2017-04-11
Owner	Nuance Communications, Inc. (USA)
Inventor	Phillips, Michael S. Nguyen, John N.

Abstract

In embodiments of the present invention improved capabilities are described for multiple web-based content category searching for web content on a mobile communication facility comprising capturing speech presented by a user using a resident capture facility on the mobile communication facility; transmitting at least a portion of the captured speech as data through a wireless communication facility to a speech recognition facility; generating speech-to-text results for the captured speech utilizing the speech recognition facility; and transmitting the text results and a plurality of formatting rules specifying how search text may be used to form a query for a search capability on the mobile communications facility, wherein each formatting rule is associated with a category of content to be searched.

IPC Classes ?

G06F 17/30 - Information retrieval; Database structures therefor
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/26 - Speech to text systems
G10L 25/48 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use

19. Dealing with switch latency in speech recognition

Application Number	14537418
Grant Number	09495956
Status	In Force
Filing Date	2014-11-10
First Publication Date	2015-03-12
Grant Date	2016-11-15
Owner	Nuance Communications, Inc. (USA)
Inventor	Meisel, William S. Phillips, Michael S. Nguyen, John N.

Abstract

In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.

IPC Classes ?

G10L 15/08 - Speech classification or search
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G06F 3/16 - Sound input; Sound output
G10L 17/22 - Interactive procedures; Man-machine interfaces

20. Method and system for dictionary noise removal

Application Number	14010903
Grant Number	09336195
Status	In Force
Filing Date	2013-08-27
First Publication Date	2015-03-05
Grant Date	2016-05-10
Owner	Nuance Communications, Inc. (USA)
Inventor	Barrett, Neil D.

Abstract

A method and system of removing noise from a dictionary using a weighted graph is presented. The method can include mapping, by a noise reducing agent executing on a processor, a plurality of dictionaries to a plurality of vertices of a graphical representation, wherein the plurality of vertices is connected by weighted edges representing noise. The plurality of dictionaries may further comprise a plurality of entries, wherein each entry further comprises a plurality of tokens. The method can include selecting a subset of the weighted edges, constructing an acyclic graphical representation from the selected subset of weighted edges, and determining an ordering based on the acyclic graphical representation. The selected subset of weighted edges may approximate a solution to the Maximum Acyclic Subgraph problem. The method can include removing noise from the plurality of dictionaries according to the determined ordering.

IPC Classes ?

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G06F 17/30 - Information retrieval; Database structures therefor
G06F 19/00 - Digital computing or data processing equipment or methods, specially adapted for specific applications (specially adapted for specific functions G06F 17/00;data processing systems or methods specially adapted for administrative, commercial, financial, managerial, supervisory or forecasting purposes G06Q;healthcare informatics G16H)

21. System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements

Application Number	14537598
Grant Number	09269097
Status	In Force
Filing Date	2014-11-10
First Publication Date	2015-03-05
Grant Date	2016-02-23
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Freeman, Tom Kennewick, Mike

Abstract

IPC Classes ?

G10L 15/18 - Speech classification or search using natural language modelling
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G10L 15/26 - Speech to text systems
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

22. Text message generation for emergency services as a backup to voice communications

Application Number	14478048
Grant Number	09351142
Status	In Force
Filing Date	2014-09-05
First Publication Date	2014-12-18
Grant Date	2016-05-24
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Basore, David L. Lawser, John Jutten

Abstract

A mobile device may detect when a calling party dials an emergency service to request emergency assistance. Following input of the dialed digits, the device may automatically generate a text message in addition to initiating a voice call, both of which may be transmitted over a wireless data network. The wireless network may correlate the two calls as originating from the same emergency situation and may attempt to deliver the two calls to a Public Services Answering Position (PSAP) at an appropriate emergency center. If the PSAP does not receive a voice call, the PSAP may communicate with the device via test messaging.

IPC Classes ?

H04W 4/22 - Emergency connection handling
H04W 4/12 - Messaging; Mailboxes; Announcements

23. System and method for providing network coordinated conversational services

Application Number	14448216
Grant Number	09761241
Status	In Force
Filing Date	2014-07-31
First Publication Date	2014-11-20
Grant Date	2017-09-12
Owner	Nuance Communications, Inc. (USA)
Inventor	Maes, Stephane H. Gopalakrishnan, Ponani S.

Abstract

A system and method for providing automatic and coordinated sharing of conversational resources, e.g., functions and arguments, between network-connected servers and devices and their corresponding applications. In one aspect, a system for providing automatic and coordinated sharing of conversational resources includes a network having a first and second network device, the first and second network device each comprising a set of conversational resources, a dialog manager for managing a conversation and executing calls requesting a conversational service, and a communication stack for communicating messages over the network using conversational protocols, wherein the conversational protocols establish coordinated network communication between the dialog managers of the first and second network device to automatically share the set of conversational resources of the first and second network device, when necessary, to perform their respective requested conversational service.

IPC Classes ?

G10L 15/00 - Speech recognition
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
H04L 29/06 - Communication control; Communication processing characterised by a protocol
H04L 12/24 - Arrangements for maintenance or administration

24. Machine translation using global lexical selection and sentence reconstruction

Application Number	14336297
Grant Number	09323745
Status	In Force
Filing Date	2014-07-21
First Publication Date	2014-11-06
Grant Date	2016-04-26
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Bangalore, Srinivas Haffner, Patrick Kanthak, Stephan

Abstract

Disclosed are systems, methods, and computer-readable media for performing translations from a source language to a target language. The method comprises receiving a source phrase, generating a target bag of words based on a global lexical selection of words that loosely couples the source words/phrases and target words/phrases, and reconstructing a target phrase or sentence by considering all permutations of words with a conditional probability greater than a threshold.

IPC Classes ?

G06F 17/28 - Processing or translating of natural language
G11B 27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
G11B 27/28 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
H04N 5/445 - Receiver circuitry for displaying additional information
H04N 5/45 - Picture in picture
H04N 5/765 - Interface circuits between an apparatus for recording and another apparatus
H04N 21/232 - Content retrieval operation within server, e.g. reading video streams from disk arrays
H04N 21/233 - Processing of audio elementary streams
H04N 21/235 - Processing of additional data, e.g. scrambling of additional data or processing content descriptors
H04N 21/258 - Client or end-user data management, e.g. managing client capabilities, user preferences or demographics or processing of multiple end-users preferences to derive collaborative data
H04N 21/482 - End-user interface for program selection
H04N 21/81 - Monomedia components thereof
H04N 21/84 - Generation or processing of descriptive data, e.g. content descriptors
H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
H04N 21/8547 - Content authoring involving timestamps for synchronizing content
H04N 21/2662 - Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
G10L 15/26 - Speech to text systems

25. Method for determining a set of filter coefficients for an acoustic echo compensator

Application Number	14314106
Grant Number	09264805
Status	In Force
Filing Date	2014-06-25
First Publication Date	2014-10-16
Grant Date	2016-02-16
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Buck, Markus Schmidt, Gerhard Uwe Wolff, Tobias

Abstract

Methods and apparatus for beamforming and performing echo compensation for the beamformed signal with an echo canceller including calculating a set of filter coefficients as an estimate for a new steering direction without a complete adaptation of the echo canceller.

IPC Classes ?

H04R 3/00 - Circuits for transducers
G10L 21/0208 - Noise filtering
H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers

26. System and method for handling missing speech data

Application Number	14299745
Grant Number	09305546
Status	In Force
Filing Date	2014-06-09
First Publication Date	2014-09-25
Grant Date	2016-04-05
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Ljolje, Andrej Conkie, Alistair D.

Abstract

Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.

IPC Classes ?

G10L 15/00 - Speech recognition
G10L 15/18 - Speech classification or search using natural language modelling
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech

27. Biometric authorization for real time access control

Application Number	13787774
Grant Number	09348988
Status	In Force
Filing Date	2013-03-06
First Publication Date	2014-09-11
Grant Date	2016-05-24
Owner	Nuance Communications, Inc. (USA)
Inventor	Dykstra-Erickson, Elizabeth Ann Daniel, Susan Dawnstarr Mauro, David Andrew

Abstract

A method of providing biometric authorization comprising enabling a user to log into an account, and determining whether there is a hold on the account. When there is a hold on the account, informing the user of the hold, and enabling the user to respond to a transaction that caused the hold. The method, in one embodiment further comprising prompting the user to enter a biometric authentication, in conjunction with the response, and processing the unblock request in real-time upon receiving and validating the biometric authentication.

IPC Classes ?

G06F 7/04 - Identity comparison, i.e. for like or unlike values
G06F 21/32 - User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints

28. Speaker localization

Application Number	14178309
Grant Number	09622003
Status	In Force
Filing Date	2014-02-12
First Publication Date	2014-09-04
Grant Date	2017-04-11
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Schmidt, Gerhard Uwe Wolff, Tobias Buck, Markus Valbuena, Olga Gonzalez Wirsching, Gunther

Abstract

Methods and apparatus for determining phase shift information between the first and second microphone signals for a sound signal, and determining an angle of incidence of the sound in relation to the first and second positions of the first and second microphones from the phase shift information of a band-limited test signal received by the first and second microphones for a frequency range of interest.

IPC Classes ?

H04R 3/00 - Circuits for transducers
H04R 29/00 - Monitoring arrangements; Testing arrangements
G10L 21/0272 - Voice signal separating
G10L 21/0216 - Noise filtering characterised by the method used for estimating noise

29. Machine translation using global lexical selection and sentence reconstruction

Application Number	11686681
Grant Number	08788258
Status	In Force
Filing Date	2007-03-15
First Publication Date	2014-07-22
Grant Date	2014-07-22
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Bangalore, Srinivas Haffner, Patrick Kanthak, Stephan

Abstract

IPC Classes ?

G06F 17/28 - Processing or translating of natural language

30. Beamforming pre-processing for speaker localization

Application Number	14176351
Grant Number	09414159
Status	In Force
Filing Date	2014-02-10
First Publication Date	2014-06-05
Grant Date	2016-08-09
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Wolff, Tobias Buck, Markus Schmidt, Gerhard Uwe

Abstract

Methods and apparatus to beamform a first plurality of microphone signals using at least one beamforming weight to obtain a first beamformed signal, beamform a second plurality of microphone signals using the at least one beamforming weight to obtain a second beamformed signal, and adjust the at least one beamforming weight so that the power density of at least one perturbation component present in the first or the second plurality of microphone signals is reduced.

IPC Classes ?

H04R 3/00 - Circuits for transducers

31. Text message generation for emergency services as a backup to voice communications

Application Number	13689396
Grant Number	08874070
Status	In Force
Filing Date	2012-11-29
First Publication Date	2014-05-29
Grant Date	2014-10-28
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Basore, David L. Lawser, John Jutten

Abstract

A mobile device may detect when a calling party dials an emergency service to request emergency assistance. Following input of the dialed digits, the device may automatically generate a text message in addition to initiating a voice call, both of which may be transmitted over a wireless data network. The wireless network may correlate the two calls as originating from the same emergency situation and may attempt to deliver the two calls to a Public Services Answering Position (PSAP) at an appropriate emergency center. If the PSAP does not receive a voice call, the PSAP may communicate with the device via test messaging.

IPC Classes ?

H04M 11/04 - Telephonic communication systems specially adapted for combination with other electrical systems with alarm systems, e.g. fire, police or burglar alarm systems
H04W 4/22 - Emergency connection handling

32. Accuracy improvement of spoken queries transcription using co-occurrence information

Application Number	14156788
Grant Number	09330661
Status	In Force
Filing Date	2014-01-16
First Publication Date	2014-05-15
Grant Date	2016-05-03
Owner	Nuance Communications, Inc. (USA)
Inventor	Mamou, Jonathan Sethy, Abhinav Ramabhadran, Bhuvana Hoory, Ron Vozila, Paul Joseph Bodenstab, Nathan

Abstract

Techniques disclosed herein include systems and methods for voice-enabled searching. Techniques include a co-occurrence based approach to improve accuracy of the 1-best hypothesis for non-phrase voice queries, as well as for phrased voice queries. A co-occurrence model is used in addition to a statistical natural language model and acoustic model to recognize spoken queries, such as spoken queries for searching a search engine. Given an utterance and an associated list of automated speech recognition n-best hypotheses, the system rescores the different hypotheses using co-occurrence information. For each hypothesis, the system estimates a frequency of co-occurrence within web documents. Combined scores from a speech recognizer and a co-occurrence engine can be combined to select a best hypothesis with a lower word error rate.

IPC Classes ?

G10L 15/00 - Speech recognition
G10L 15/16 - Speech classification or search using artificial neural networks
G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/26 - Speech to text systems
G10L 15/04 - Segmentation; Word boundary detection
G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]
G10L 15/28 - Constructional details of speech recognition systems
G10L 15/18 - Speech classification or search using natural language modelling
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups
G10L 15/08 - Speech classification or search
G06F 7/00 - Methods or arrangements for processing data by operating upon the order or content of the data handled
G06F 17/30 - Information retrieval; Database structures therefor

33. System and method for delivering targeted advertisements and tracking advertisement interactions in voice recognition contexts

Application Number	14016757
Grant Number	08886536
Status	In Force
Filing Date	2013-09-03
First Publication Date	2014-01-09
Grant Date	2014-11-11
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Freeman, Tom Kennwick, Mike

Abstract

The system and method described herein may use various natural language models to deliver targeted advertisements and track advertisement interactions in voice recognition contexts. In particular, in response to an input device receiving an utterance, a conversational language processor may select and deliver one or more advertisements targeted to a user that spoke the utterance based on cognitive models associated with the user, various users having similar characteristics to the user, an environment in which the user spoke the utterance, or other criteria. Further, subsequent interaction with the targeted advertisements may be tracked to build and refine the cognitive models and thereby enhance the information used to deliver targeted advertisements in response to subsequent utterances.

IPC Classes ?

G10L 15/18 - Speech classification or search using natural language modelling
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G06Q 30/00 - Commerce
G10L 15/26 - Speech to text systems
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising

34. System and method for a cooperative conversational voice user interface

Application Number	13987645
Grant Number	09015049
Status	In Force
Filing Date	2013-08-19
First Publication Date	2013-12-19
Grant Date	2015-04-21
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Baldwin, Larry Freeman, Tom Tjalve, Michael Ebersold, Blane Weider, Chris

Abstract

A cooperative conversational voice user interface is provided. The cooperative conversational voice user interface may build upon short-term and long-term shared knowledge to generate one or more explicit and/or implicit hypotheses about an intent of a user utterance. The hypotheses may be ranked based on varying degrees of certainty, and an adaptive response may be generated for the user. Responses may be worded based on the degrees of certainty and to frame an appropriate domain for a subsequent utterance. In one implementation, misrecognitions may be tolerated, and conversational course may be corrected based on subsequent utterances and/or responses.

IPC Classes ?

G10L 15/00 - Speech recognition
G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G06F 3/16 - Sound input; Sound output
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

35. Automatic updating of confidence scoring functionality for speech recognition systems with respect to a receiver operating characteristic curve

Application Number	13977174
Grant Number	09330665
Status	In Force
Filing Date	2011-01-07
First Publication Date	2013-10-17
Grant Date	2016-05-03
Owner	Nuance Communications, Inc. (USA)
Inventor	Morales, Nicolas Connolly, Dermot Halberstadt, Andrew

Abstract

Automatically adjusting confidence scoring functionality is described for a speech recognition engine. Operation of the speech recognition system is revised so as to change an associated receiver operating characteristic (ROC) curve describing performance of the speech recognition system with respect to rates of false acceptance (FA) versus correct acceptance (CA). Then a confidence scoring functionality related to recognition reliability for a given input utterance is automatically adjusted such that where the ROC curve is better for a given operating point after revising the operation of the speech recognition system, the adjusting reflects a double gain constraint to maintain FA and CA rates at least as good as before revising operation of the speech recognition system.

IPC Classes ?

G10L 15/01 - Assessment or evaluation of speech recognition systems
G10L 15/065 - Adaptation
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

36. Integrating multimedia and voicemail

Application Number	13868278
Grant Number	09313624
Status	In Force
Filing Date	2013-04-23
First Publication Date	2013-09-05
Grant Date	2016-04-12
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Shaw, Venson M. Silverman, Alexander E.

Abstract

Integrated multimedia voicemail systems and methods allow the creation of voicemail with associated multimedia content. A user can compose a voicemail and select or create multimedia content to be associated with the voicemail. A user can associate files, webpage addresses, applications, and user-created content with a voicemail. A user may operate an interface on a user device to select content and instruct a voicemail system to associate such content with a voicemail. The voicemail with integrated multimedia content may be an originating voicemail or a voicemail in response to another voicemail.

IPC Classes ?

H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems
H04W 4/12 - Messaging; Mailboxes; Announcements
H04M 3/53 - Centralised arrangements for recording incoming messages
H04L 12/58 - Message switching systems
H04M 1/725 - Cordless telephones

37. Message translations

Application Number	13755903
Grant Number	08688433
Status	In Force
Filing Date	2013-01-31
First Publication Date	2013-06-06
Grant Date	2014-04-01
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Davis, Joel A. Kent, Jr., Larry G. Daniell, W. Todd Daigle, Brian K.

Abstract

Systems for translating text messages in an instant messaging system comprise a translation engine for translating text messages into a preferred language of a recipient of the text messages. The systems are preferably configured to send and receive the text messages and to determine whether the text messages that are received in a source language are in the preferred language of the recipients so that the text messages are displayed in the preferred language of the recipients of the text messages. Other systems and methods are also provided.

IPC Classes ?

G06F 17/28 - Processing or translating of natural language

38. System and method for structuring speech recognized text into a pre-selected document format

Application Number	13718568
Grant Number	09396166
Status	In Force
Filing Date	2012-12-18
First Publication Date	2013-05-02
Grant Date	2016-07-19
Owner	Nuance Communications, Inc. (USA)
Inventor	Rosen, Lee Roe, Ed Poust, Wade

Abstract

A system for creating a structured report using a template having at least one predetermined heading and formatting data associated with each heading. The steps include recording a voice file, creating a speech recognized text file corresponding to the voice file, identifying the location of each heading in the text file, and the text corresponding thereto, populating the template with the identified text corresponding to each heading, and formatting the populated template to create the structured report.

IPC Classes ?

G06F 17/21 - Text processing
G10L 15/26 - Speech to text systems
G06F 17/24 - Editing, e.g. insert/delete

39. Automated sentence planning in a task classification system

Application Number	13470913
Grant Number	08620669
Status	In Force
Filing Date	2012-05-14
First Publication Date	2013-02-14
Grant Date	2013-12-31
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Walker, Marilyn A. Rambow, Owen Christopher Rogati, Monica

Abstract

Disclosed is a task classification system that interacts with a user. The task classification system may include a recognizer that may recognize symbols in the user's input communication, and a natural language understanding unit that may determine whether the user's input communication can be understood. If the user's input communication can be understood, the natural language understanding unit may generate understanding data. The system may also include a communicative goal generator that may generate communicative goals based on the symbols recognized by the recognizer and understanding data from the natural language understanding unit. The generated communicative goals may be related to information needed to be obtained from the user. The system may further include a sentence planning unit that may automatically plan one or more sentences based on the communicative goals generated by the communicative goal generator with at least one of the sentences plans being output to the user.

IPC Classes ?

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 15/00 - Speech recognition
G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups

40. Acoustic localization of a speaker

Application Number	13478941
Grant Number	09338549
Status	In Force
Filing Date	2012-05-23
First Publication Date	2012-11-22
Grant Date	2016-05-10
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Haulick, Tim Schmidt, Gerhard Uwe Buck, Markus Wolff, Tobias

Abstract

A system locates a speaker in a room containing a loudspeaker and a microphone array. The loudspeaker transmits a sound that is partly reflected by a speaker. The microphone array detects the reflected sound and converts the sound into a microphone array, the speaker's distance from the microphone array, or both, based on the characteristics of the microphone signals.

IPC Classes ?

G01S 3/80 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic, or infrasonic waves
H04R 3/00 - Circuits for transducers
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
G01S 3/808 - Systems for determining direction or deviation from predetermined direction using transducers spaced apart and measuring phase or time difference between signals therefrom, i.e. path-difference systems
G01S 5/30 - Determining absolute distances from a plurality of spaced points of known location
G01S 15/00 - Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
G01S 7/52 - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES - Details of systems according to groups , , of systems according to group
G01S 15/42 - Simultaneous measurement of distance and other coordinates
G01S 15/87 - Combinations of sonar systems
H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
G01S 13/00 - Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
G01S 7/292 - Extracting wanted echo-signals
H04R 29/00 - Monitoring arrangements; Testing arrangements
H04M 1/60 - Substation equipment, e.g. for use by subscribers including speech amplifiers
G01S 13/42 - Simultaneous measurement of distance and other coordinates
G01S 5/02 - Position-fixing by co-ordinating two or more direction or position-line determinations; Position-fixing by co-ordinating two or more distance determinations using radio waves
G01S 15/06 - Systems determining position data of a target
G01S 3/802 - Systems for determining direction or deviation from predetermined direction
G01S 7/523 - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES - Details of systems according to groups , , of systems according to group - Details of pulse systems
H04B 7/08 - Diversity systems; Multi-antenna systems, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the receiving station
G01S 3/04 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using radio waves - Details
G01S 5/18 - Position-fixing by co-ordinating two or more direction or position-line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
G10K 11/34 - Sound-focusing or directing, e.g. scanning using electrical steering of transducer arrays, e.g. beam steering
G01S 3/00 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
G10K 15/08 - Arrangements for producing a reverberation or echo sound

41. Method and system for automatic transcription prioritization

Application Number	13354142
Grant Number	08407050
Status	In Force
Filing Date	2012-01-19
First Publication Date	2012-06-28
Grant Date	2013-03-26
Owner	Nuance Communications, Inc. (USA)
Inventor	Kobal, Jeffrey S. Dhanakshirur, Girish

Abstract

A visual toolkit for prioritizing speech transcription is provided. The toolkit can include a logger (102) for capturing information from a speech recognition system, a processor (104) for determining an accuracy rating of the information, and a visual display (106) for categorizing the information and prioritizing a transcription of the information based on the accuracy rating. The prioritizing identifies spoken utterances having a transcription priority in view of the recognized result. The visual display can include a transcription category (156) having a modifiable textbox entry with a text entry initially corresponding to a text of the recognized result, and an accept button (157) for validating a transcription of the recognized result. The categories can be automatically ranked by the accuracy rating in an ordered priority for increasing an efficiency of transcription.

IPC Classes ?

G10L 15/26 - Speech to text systems

42. System and method for delivering targeted advertisements and tracking advertisement interactions in voice recognition contexts

Application Number	13371870
Grant Number	08527274
Status	In Force
Filing Date	2012-02-13
First Publication Date	2012-06-14
Grant Date	2013-09-03
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Freeman, Tom Kennewick, Mike

Abstract

IPC Classes ?

G10L 15/18 - Speech classification or search using natural language modelling
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

43. System and method for isolating and processing common dialog cues

Application Number	11246604
Grant Number	08185400
Status	In Force
Filing Date	2005-10-07
First Publication Date	2012-05-22
Grant Date	2012-05-22
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Goffin, Vincent J. Parthasarathy, Sarangarajan

Abstract

A method, system and machine-readable medium are provided. Speech input is received at a speech recognition component and recognized output is produced. A common dialog cue from the received speech input or input from a second source is recognized. An action is performed corresponding to the recognized common dialog cue. The performed action includes sending a communication from the speech recognition component to the speech generation component while bypassing a dialog component.

IPC Classes ?

G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 17/00 - Speaker identification or verification
G10L 15/26 - Speech to text systems
G10L 15/00 - Speech recognition
G10L 15/28 - Constructional details of speech recognition systems
G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech

44. Text entry with word prediction, completion, or correction supplemented by search of shared corpus

Application Number	12943856
Grant Number	09626429
Status	In Force
Filing Date	2010-11-10
First Publication Date	2012-05-10
Grant Date	2017-04-18
Owner	Nuance Communications, Inc. (USA)
Inventor	Unruh, Erland

Abstract

Searching a shared corpus is used to supplement word prediction, completion, and/or correction of text entry. A user input device at a client device receives user entry of text input comprising a string of symbols. The client device wirelessly transmits instructions to a remote site to conduct a search of a corpus using the string as a contiguous search term. From the remote site, the client device receives results of the search, including multiple sets of one or more words, each set occurring in the corpus immediately after the search term. The client device uses the received sets in word prediction, completion, and/or correction.

IPC Classes ?

G06F 17/30 - Information retrieval; Database structures therefor
G06F 3/023 - Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

45. Multi-state barge-in models for spoken dialog systems

Application Number	13279443
Grant Number	08612234
Status	In Force
Filing Date	2011-10-24
First Publication Date	2012-04-26
Grant Date	2013-12-17
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Ljolje, Andrej

Abstract

A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.

IPC Classes ?

G10L 15/00 - Speech recognition

46. Voicemail system and method for providing voicemail to text message conversion

Application Number	11954267
Grant Number	08139726
Status	In Force
Filing Date	2007-12-12
First Publication Date	2012-03-20
Grant Date	2012-03-20
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Zetterberg, Carl Peter

Abstract

A method and system for allowing a calling party to send a voicemail message as a text message. A calling party leaves a voicemail message and that message is converted from voice to a text message. If the calling party wishes to confirm the conversion, the text message is then converted to a voicemail message. The converted voicemail message is presented to the calling party so that the calling party can review and edit the message. The calling party can review and edit any portion of the converted voicemail message. The edits of the voicemail message are applied and the voicemail message is converted to a new text message. If the calling party wishes to further review and edit the text message, it is converted to a new voicemail; otherwise the text message is sent to the called party.

IPC Classes ?

H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems

47. System and method for a cooperative conversational voice user interface

Application Number	13251712
Grant Number	08515765
Status	In Force
Filing Date	2011-10-03
First Publication Date	2012-01-26
Grant Date	2013-08-20
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Baldwin, Larry Freeman, Tom Tjalve, Michael Ebersold, Blane Weider, Chris

Abstract

IPC Classes ?

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

48. Multi-pass echo residue detection with speech application intelligence

Application Number	13236968
Grant Number	08244529
Status	In Force
Filing Date	2011-09-20
First Publication Date	2012-01-12
Grant Date	2012-08-14
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Wong, Ngai Chiu

Abstract

A method is provided for multi-pass echo residue detection. The method includes detecting audio data, and determining whether the audio data is recognized as speech. Additionally, the method categorizes the audio data recognized as speech as including an acceptable level of residual echo, and categorizes categorizing unrecognizable audio data as including an unacceptable level of residual echo. Furthermore, the method determines whether the unrecognizable audio data contains a user input, and also determines whether a duration of the user input is at least a predetermined duration, and when the user input is at least the predetermined duration, the method extracts the predetermined duration of the user input from a total duration of the user input.

IPC Classes ?

G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
G10L 11/02 - Detection of presence or absence of speech signals

49. Method and system for using input signal quality in speech recognition

Application Number	13205775
Grant Number	08190430
Status	In Force
Filing Date	2011-08-09
First Publication Date	2012-01-05
Grant Date	2012-05-29
Owner	Nuance Communications, Inc. (USA)
Inventor	Doyle, John Pickering, John Brian

Abstract

A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.

IPC Classes ?

G10L 15/00 - Speech recognition

50. Method and system for identifying and correcting accent-induced speech recognition difficulties

Application Number	13228879
Grant Number	08285546
Status	In Force
Filing Date	2011-09-09
First Publication Date	2011-12-29
Grant Date	2012-10-09
Owner	Nuance Communications, Inc. (USA)
Inventor	Reich, David E.

Abstract

A system for use in speech recognition includes an acoustic module accessing a plurality of distinct-language acoustic models, each based upon a different language; a lexicon module accessing at least one lexicon model; and a speech recognition output module. The speech recognition output module generates a first speech recognition output using a first model combination that combines one of the plurality of distinct-language acoustic models with the at least one lexicon model. In response to a threshold determination, the speech recognition output module generates a second speech recognition output using a second model combination that combines a different one of the plurality of distinct-language acoustic models with the at least one distinct-language lexicon model.

IPC Classes ?

G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]
G10L 15/00 - Speech recognition
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/10 - Speech classification or search using distance or distortion measures between unknown speech and reference templates
G10L 15/28 - Constructional details of speech recognition systems
G10L 17/00 - Speaker identification or verification

51. Automated sentence planning in a task classification system

Application Number	13230254
Grant Number	08180647
Status	In Force
Filing Date	2011-09-12
First Publication Date	2011-12-29
Grant Date	2012-05-15
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Walker, Marilyn A. Rambow, Owen Christopher Rogati, Monica

Abstract

The invention relates to a task classification system (900) that interacts with a user. The task classification system (900) may include a recognizer (920) that may recognize symbols in the user's input communication, and a natural language understanding unit (900) that may determine whether the user's input communication can be understood. If the user's input communication can be understood, the natural language understanding unit (930) may generate understanding data. The system may also include a communicative goal generator that may generate communicative goals based on the symbols recognized by the recognizer (920) and understanding data from the natural language understanding unit (930). The generated communicative goals may be related to information needed to be obtained from the user. The system may further include a sentence planning unit (120) that may automatically plan one or more sentences based on the communicative goals generated by the communicative goal generator with at least one of the sentences plans being output to the user.

IPC Classes ?

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

52. Method for automated sentence planning in a task classification system

Application Number	13110628
Grant Number	08209186
Status	In Force
Filing Date	2011-05-18
First Publication Date	2011-09-08
Grant Date	2012-06-26
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Walker, Marilyn A. Rambow, Owen Christopher Rogati, Monica

Abstract

The invention relates to a method for sentence planning (120) in a task classification system that interacts with a user. The method may include recognizing symbols in the user's input communication and determining whether the user's input communication can be understood. If the user's communication can be understood, understanding data may be generated (220). The method may further include generating communicative goals (3010) based on the recognized symbols and understanding data. The generated communicative goals (3010) may be related to information needed to be obtained form the user. The method may also include automatically planning one or more sentences (3020) based on the generated communicative goals and outputting at least one of the sentence plans to the user (3080).

IPC Classes ?

G10L 21/06 - Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

53. Integrating multimedia and voicemail

Application Number	12606503
Grant Number	08447261
Status	In Force
Filing Date	2009-10-27
First Publication Date	2011-04-28
Grant Date	2013-05-21
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Shaw, Venson M. Silverman, Alexander E.

Abstract

IPC Classes ?

H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems
H04M 11/10 - Telephonic communication systems specially adapted for combination with other electrical systems with dictation recording and playback systems
H04M 1/00 - Substation equipment, e.g. for use by subscribers

54. System and method for improving robustness of speech recognition using vocal tract length normalization codebooks

Application Number	12869039
Grant Number	08160875
Status	In Force
Filing Date	2010-08-26
First Publication Date	2010-12-23
Grant Date	2012-04-17
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Gilbert, Mazin

Abstract

Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for a each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook.

IPC Classes ?

G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

55. System and method for selecting and presenting advertisements based on natural language processing of voice-based input

Application Number	12847564
Grant Number	08145489
Status	In Force
Filing Date	2010-07-30
First Publication Date	2010-11-25
Grant Date	2012-03-27
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Freeman, Tom Kennewick, Mike

Abstract

A system and method for selecting and presenting advertisements based on natural language processing of voice-based inputs is provided. A user utterance may be received at an input device, and a conversational, natural language processor may identify a request from the utterance. At least one advertisement may be selected and presented to the user based on the identified request. The advertisement may be presented as a natural language response, thereby creating a conversational feel to the presentation of advertisements. The request and the user's subsequent interaction with the advertisement may be tracked to build user statistical profiles, thus enhancing subsequent selection and presentation of advertisements.

IPC Classes ?

G10L 15/18 - Speech classification or search using natural language modelling
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G06Q 30/00 - Commerce

56. Automatic setting of reminders in telephony using speech recognition

Application Number	12465731
Grant Number	08145274
Status	In Force
Filing Date	2009-05-14
First Publication Date	2010-11-18
Grant Date	2012-03-27
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Gandhi, Salil P. Kottawar, Saidas T. Macias, Mike V. Mahajan, Sandip D.

Abstract

Systems and methods for automatically setting reminders. A method for automatically setting reminders includes receiving utterances, determining whether the utterances match a stored phrase, and in response to determining that there is a match, automatically setting a reminder in a mobile communication device. Various filters can be applied to determine whether or not to set a reminder. Examples of suitable filters include location, date/time, callee's phone number, etc.

IPC Classes ?

H04B 1/38 - Transceivers, i.e. devices in which transmitter and receiver form a structural unit and in which at least one part is used for functions of transmitting and receiving

57. Automated sentence planning in a task classification system

Application Number	12789883
Grant Number	08185401
Status	In Force
Filing Date	2010-05-28
First Publication Date	2010-09-23
Grant Date	2012-05-22
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Walker, Marilyn A. Rambow, Owen Christopher Rogati, Monica

Abstract

The invention relates to a system that interacts with a user in an automated dialog system (100). The system may include a communicative goal generator (210) that generates communicative goals based on a first communication received from the user. The generated communicative goals (210) may be related to information needed to be obtained from the user. The system may further include a sentence planning unit (220) that automatically plans one or more sentences based on the communicative goals generated by the communicative goal generator (210). At least one of the planned sentences may be then output to the user (230).

IPC Classes ?

G10L 21/06 - Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

58. Method for determining a set of filter coefficients for an acoustic echo compensator

Application Number	12708172
Grant Number	08787560
Status	In Force
Filing Date	2010-02-18
First Publication Date	2010-08-26
Grant Date	2014-07-22
Owner	Nuance Communications, Inc. (USA)
Inventor	Buck, Markus Schmidt, Gerhard Wolff, Tobias

Abstract

The invention provides a method for determining a set of filter coefficients for an acoustic echo compensator in a beamformer arrangement. The acoustic echo compensator compensates for echoes within the beamformed signal. A plurality of sets of filter coefficients for the acoustic echo compensator is provided. Each set of filter coefficients corresponds to one of a predetermined number of steering directions of the beamformer arrangement. The predetermined number of steering directions is equal to or greater than the number of microphones in the microphone array. For a current steering direction, a current set of filter coefficients for the acoustic echo compensator is determined based on the provided sets of filter coefficients.

IPC Classes ?

H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
G01S 15/00 - Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
H04R 3/00 - Circuits for transducers

59. Speech recognition of a list entry

Application Number	12706245
Grant Number	08532990
Status	In Force
Filing Date	2010-02-16
First Publication Date	2010-08-19
Grant Date	2013-09-10
Owner	Nuance Communications, Inc. (USA)
Inventor	Hillebrecht, Christian Schwarz, Markus

Abstract

The present invention relates to a method of generating a candidate list from a list of entries in accordance with a string of subword units corresponding to a speech input in a speech recognition system, the list of entries including plural list entries each comprising at least one fragment having one or more subword units. For each list entry, the fragments of the list entry are compared with the string of subword units. A matching score for each of the compared fragments based on the comparison is determined. The matching score for a fragment is further based on a comparison of at least one other fragment of the same list entry with the string of subword units. A total score for each list entry is determined based on the matching scores for the compared fragments of the respective list entry. A candidate list with the best matching entries from the list of entries based on the total scores of the list entries is generated.

IPC Classes ?

G10L 15/00 - Speech recognition

60. System and method for enhancing speech recognition accuracy

Application Number	12339802
Grant Number	08160879
Status	In Force
Filing Date	2008-12-19
First Publication Date	2010-06-24
Grant Date	2012-04-17
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Czahor, Michael

Abstract

Systems, computer-implemented methods, and computer-readable media for enhancing speech recognition accuracy. The method includes dividing a system dialog turn into segments based on timing of probable user responses, generating a weighted grammar for each segment, exclusively activating the weighted grammar generated for a current segment of the dialog turn during the current segment of the dialog turn, and recognizing user speech received during the current segment using the activated weighted grammar generated for the current segment. The method can further include assigning probability to the weighted grammar based on historical user responses and activating each weighted grammar is based on the assigned probability. Weighted grammars can be generated based on a user profile. A weighted grammar can be generated for two or more segments. Exclusively activating each weighted grammar can include a transition period blending the previously activated grammar and the grammar to be activated.

IPC Classes ?

G10L 15/18 - Speech classification or search using natural language modelling

61. User intention based on N-best list of recognition hypotheses for utterances in a dialog

Application Number	12325786
Grant Number	08140328
Status	In Force
Filing Date	2008-12-01
First Publication Date	2010-06-03
Grant Date	2012-03-20
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Williams, Jason

Abstract

Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for using alternate recognition hypotheses to improve whole-dialog understanding accuracy. The method includes receiving an utterance as part of a user dialog, generating an N-best list of recognition hypotheses for the user dialog turn, selecting an underlying user intention based on a belief distribution across the generated N-best list and at least one contextually similar N-best list, and responding to the user based on the selected underlying user intention. Selecting an intention can further be based on confidence scores associated with recognition hypotheses in the generated N-best lists, and also on the probability of a user's action given their underlying intention. A belief or cumulative confidence score can be assigned to each inferred user intention.

IPC Classes ?

G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]

62. Method and device for locating a sound source

Application Number	12547681
Grant Number	08194500
Status	In Force
Filing Date	2009-08-26
First Publication Date	2010-03-04
Grant Date	2012-06-05
Owner	Nuance Communications, Inc. (USA)
Inventor	Wolff, Tobias Buck, Markus Schmidt, Gerhard Valbuena, Olga González Wirsching, Günther

Abstract

A method of locating a sound source based on sound received at an array of microphones comprises the steps of determining a correlation function of signals provided by microphones of the array and establishing a direction in which the sound source is located based on at least one eigenvector of a matrix having matrix elements which are determined based on the correlation function. The correlation function has first and second frequency components associated with a first and second frequency band, respectively. The first frequency component is determined based on signals from microphones having a first distance, and the second frequency component is determined based on signals from microphones having a second distance different from the first distance.

IPC Classes ?

G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
G01S 3/80 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic, or infrasonic waves
H04R 3/00 - Circuits for transducers

63. Method and apparatus for providing voice control for accessing teleconference services

Application Number	12553700
Grant Number	08184792
Status	In Force
Filing Date	2009-09-03
First Publication Date	2009-12-31
Grant Date	2012-05-22
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Croak, Marian Eslambolchi, Hossein

Abstract

A method and apparatus for providing access to teleconference services using voice recognition technology to receive information on packet networks such as Voice over Internet Protocol (VoIP) and Service over Internet Protocol (SoIP) networks are disclosed. In one embodiment, the service provider enables a caller to enter access information for accessing a conference service using at least one natural language response.

IPC Classes ?

H04M 3/42 - Systems providing special services or facilities to subscribers

64. Method and system for training a text-to-speech synthesis system using a specific domain speech database

Application Number	12540441
Grant Number	08135591
Status	In Force
Filing Date	2009-08-13
First Publication Date	2009-12-03
Grant Date	2012-03-13
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Schroeter, Horst Juergen

Abstract

A method and system are disclosed that train a text-to-speech synthesis system for use in speech synthesis. The method includes generating a speech database of audio files comprising domain-specific voices having various prosodies, and training a text-to-speech synthesis system using the speech database by selecting audio segments having a prosody based on at least one dialog state. The system includes a processor, a speech database of audio files, and modules for implementing the method.

IPC Classes ?

G10L 13/00 - Speech synthesis; Text to speech systems

65. Low latency real-time vocal tract length normalization

Application Number	12490634
Grant Number	08909527
Status	In Force
Filing Date	2009-06-24
First Publication Date	2009-10-15
Grant Date	2014-12-09
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Goffin, Vincent Ljolje, Andrej Saraclar, Murat

Abstract

A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.

IPC Classes ?

G10L 15/12 - Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit

66. System for distinguishing desired audio signals from noise

Application Number	12269837
Grant Number	08131544
Status	In Force
Filing Date	2008-11-12
First Publication Date	2009-09-10
Grant Date	2012-03-06
Owner	Nuance Communications, Inc. (USA)
Inventor	Herbig, Tobias Gaupp, Oliver Gerl, Franz

Abstract

A system distinguishes a primary audio source and background noise to improve the quality of an audio signal. A speech signal from a microphone may be improved by identifying and dampening background noise to enhance speech. Stochastic models may be used to model speech and to model background noise. The models may determine which portions of the signal are speech and which portions are noise. The distinction may be used to improve the signal's quality, and for speaker identification or verification.

IPC Classes ?

G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech

67. Voice response system

Application Number	12253849
Grant Number	08145494
Status	In Force
Filing Date	2008-10-17
First Publication Date	2009-06-04
Grant Date	2012-03-27
Owner	Nuance Communications, Inc. (USA)
Inventor	Horioka, Masaru Atake, Yoshinori Tahara, Yoshinori

Abstract

A voice response system attempts to respond to spoken user input and to provide computer-generated responses. If the system decides it cannot provide valid responses, the current state of user session is determined and forwarded to a human operator for further action. The system maintains a recorded history of the session in the form of a dialog history log. The dialog history and information as to the reliability of past speech recognition efforts is employed in making the current state determination. The system includes formatting rules for controlling the display of information presented to the human operator.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

68. System and method for conducting a search using a wireless mobile device

Application Number	12350848
Grant Number	08285273
Status	In Force
Filing Date	2009-01-08
First Publication Date	2009-05-07
Grant Date	2012-10-09
Owner	Nuance Communications, Inc. (USA)
Inventor	Roth, Daniel L.

Abstract

A method and system are provided by which a wireless mobile device takes a vocally entered query and transmits it in a text message format over a wireless network to a search engine; receives search results based on the query from the search engine over the wireless network; and displays the search results.

IPC Classes ?

H04W 4/00 - Services specially adapted for wireless communication networks; Facilities therefor

69. Voice conversion method and system

Application Number	12240148
Grant Number	08234110
Status	In Force
Filing Date	2008-09-29
First Publication Date	2009-04-02
Grant Date	2012-07-31
Owner	Nuance Communications, Inc. (USA)
Inventor	Meng, Fan Ping Qin, Yong Shi, Qin Shuang, Zhi Wei

Abstract

A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.

IPC Classes ?

G10L 19/06 - Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

70. Creation and use of application-generic class-based statistical language models for automatic speech recognition

Application Number	11845015
Grant Number	08135578
Status	In Force
Filing Date	2007-08-24
First Publication Date	2009-02-26
Grant Date	2012-03-13
Owner	Nuance Communications, Inc. (USA)
Inventor	Hébert, Matthieu

Abstract

A method of creating an application-generic class-based SLM includes, for each of a plurality of speech applications, parsing a corpus of utterance transcriptions to produce a first output set, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application. The method further includes, for each of the plurality of speech applications, replacing each of the grammar tags in the first output set with a class identifier of an application-generic class, to produce a second output set. The method further includes processing the resulting second output sets with a statistical language model (SLM) trainer to generate an application-generic class-based SLM.

IPC Classes ?

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

71. Using speech recognition results based on an unstructured language model in a mobile communication facility application

Application Number	12184375
Grant Number	08886540
Status	In Force
Filing Date	2008-08-01
First Publication Date	2009-01-29
Grant Date	2014-11-11
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Cerra, Joseph P. Nguyen, John N. Phillips, Michael S. Shu, Han Mischke, Alexandra Beth

Abstract

A method and system for entering information into a software application resident on a mobile communication facility is provided. The method and system may include recording speech presented by a user using a mobile communication facility resident capture facility, transmitting the recording through a wireless communication facility to a speech recognition facility, transmitting information relating to the software application to the speech recognition facility, generating results utilizing the speech recognition facility using an unstructured language model based at least in part on the information relating to the software application and the recording, transmitting the results to the mobile communications facility, loading the results into the software application and simultaneously displaying the results as a set of words and as a set of application results based on those words.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

72. System and method of performing user-specific automatic speech recognition

Application Number	12207175
Grant Number	08145481
Status	In Force
Filing Date	2008-09-09
First Publication Date	2009-01-01
Grant Date	2012-03-27
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Gajic, Bojana Narayanan, Shrikanth Sambasivan Parthasarathy, Sarangarajan Rose, Richard Cameron Rosenberg, Aaron Edward

Abstract

Speech recognition models are dynamically re-configurable based on user information, application information, background information such as background noise and transducer information such as transducer response characteristics to provide users with alternate input modes to keyboard text entry. Word recognition lattices are generated for each data field of an application and dynamically concatenated into a single word recognition lattice. A language model is applied to the concatenated word recognition lattice to determine the relationships between the word recognition lattices and repeated until the generated word recognition lattices are acceptable or differ from a predetermined value only by a threshold amount. These techniques of dynamic re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants as well environments such as office, home or vehicle while maintaining the accuracy of the speech recognition.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

73. Method and device for providing speech-to-text encoding and telephony service

Application Number	12200292
Grant Number	08265931
Status	In Force
Filing Date	2008-08-28
First Publication Date	2008-12-25
Grant Date	2012-09-11
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Caldwell, Charles David Harlow, John Bruce Sayko, Robert J. Shaye, Norman

Abstract

A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.

IPC Classes ?

G10L 15/26 - Speech to text systems
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations

74. Method and system for speech based document history tracking

Application Number	12096068
Grant Number	08140338
Status	In Force
Filing Date	2006-11-10
First Publication Date	2008-12-18
Grant Date	2012-03-20
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Grobauer, Gerhard Papai, Miklos

Abstract

A method and a system of history tracking corrections in a speech based document are disclosed. The speech based document comprises one or more sections of text recognized or transcribed from sections of speech, wherein the sections of speech are dictated by a user and processed by a speech recognizer in a speech recognition system into corresponding sections of text of the speech based document. The method comprises associating of at least one speech attribute (14) to each section of text in the speech based document, said speech attribute (14) comprising information related to said section of text, respectively; presenting said speech based document on a presenting unit (8); detecting an action being performed within any of said sections of text; and updating information of said speech attributes (14) related to the kind of action detected on one of said sections of text for updating said speech based document, whereby said updated information of said speech attributes (14) is used for history tracking corrections of said speech based document.

IPC Classes ?

G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
G10L 15/26 - Speech to text systems
G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements

75. Speech recognition system with huge vocabulary

Application Number	12096046
Grant Number	08140336
Status	In Force
Filing Date	2006-12-06
First Publication Date	2008-11-27
Grant Date	2012-03-20
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Saffer, Zsolt

Abstract

The invention deals with speech recognition, such as a system for recognizing words in continuous speech. A speech recognition system is disclosed which is capable of recognizing a huge number of words, and in principle even an unlimited number of words. The speech recognition system comprises a word recognizer for deriving a best path through a word graph, and wherein words are assigned to the speech based on the best path. The word score being obtained from applying a phonemic language model to each word of the word graph. Moreover, the invention deals with an apparatus and a method for identifying words from a sound block and to computer readable code for implementing the method.

IPC Classes ?

G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/04 - Segmentation; Word boundary detection
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

76. Acoustic localization of a speaker

Application Number	12104836
Grant Number	08204248
Status	In Force
Filing Date	2008-04-17
First Publication Date	2008-11-20
Grant Date	2012-06-19
Owner	Nuance Communications, Inc. (USA)
Inventor	Haulick, Tim Schmidt, Gerhard Uwe Buck, Markus Wolff, Tobias

Abstract

A system locates a speaker in a room containing a loudspeaker and a microphone array. The loudspeaker transmits a sound that is partly reflected by a speaker. The microphone array detects the reflected sound and converts the sound into a microphone signal. A processor determines the speaker's direction relative to the microphone array, the speaker's distance from the microphone array, or both, based on the characteristics of the microphone signals.

IPC Classes ?

H04R 3/00 - Circuits for transducers

77. Categorization of information using natural language processing and predefined templates

Application Number	12121527
Grant Number	08185553
Status	In Force
Filing Date	2008-05-15
First Publication Date	2008-10-16
Grant Date	2012-05-22
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Carus, Alwin B. Ogrinc, Harry J.

Abstract

A computer implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises of identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

IPC Classes ?

G06F 17/30 - Information retrieval; Database structures therefor

78. Method for dialog management

Application Number	12140805
Grant Number	08600747
Status	In Force
Filing Date	2008-06-17
First Publication Date	2008-10-09
Grant Date	2013-12-03
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Abella, Alicia Gorin, Allen Louis

Abstract

A spoken dialog system and method having a dialog management module are disclosed. The dialog management module includes a plurality of dialog motivators for handling various operations during a spoken dialog. The dialog motivators comprise an error handling, disambiguation, assumption, confirmation, missing information, and continuation. The spoken dialog system uses the assumption dialog motivator in either a-priori or a-posteriori modes. A-priori assumption is based on predefined requirements for the call flow and a-posteriori assumption can work with the confirmation dialog motivator to assume the content of received user input and confirm received user input.

IPC Classes ?

G10L 15/00 - Speech recognition
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
G06F 9/46 - Multiprogramming arrangements
G06F 9/44 - Arrangements for executing specific programs
G06F 17/20 - Handling natural language data
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 15/26 - Speech to text systems
H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems

79. Natural error handling in speech recognition

Application Number	12135452
Grant Number	08355920
Status	In Force
Filing Date	2008-06-09
First Publication Date	2008-10-02
Grant Date	2013-01-15
Owner	Nuance Communications, Inc. (USA)
Inventor	Gopinath, Ramesh A. Maison, Benoit Wu, Brian C.

Abstract

A user interface, and associated techniques, that permit a fast and efficient way of correcting speech recognition errors, or of diminishing their impact. The user may correct mistakes in a natural way, essentially by repeating the information that was incorrectly recognized previously. Such a mechanism closely approximates what human-to-human dialogue would be in similar circumstances. Such a system fully takes advantage of all the information provided by the user, and on its own estimates the quality of the recognition in order to determine the correct sequence of words in the fewest number of steps.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

80. Method and apparatus for data capture using a voice activated workstation

Application Number	12089033
Grant Number	08165876
Status	In Force
Filing Date	2006-09-04
First Publication Date	2008-09-25
Grant Date	2012-04-24
Owner	Nuance Communications, Inc. (USA)
Inventor	Emam, Ossama Gamal, Khaled

Abstract

A method and apparatus for capturing data in a workstation, wherein a large number of data associated with a sample which is viewed, by a user, through an optical device, such as a microscope, is to be entered in a computer related file. The optical device can be moved to a data-sampling position utilizing voice commands. A pointer can then be moved to an appropriate place in the file to receive the data relating to the data-sampling position. Data can be then entered in the appropriate position utilizing a voice command. The steps of moving the pointer and entering the data can then be repeated until all data is provided with respect to the data-sampling positions.

IPC Classes ?

G10L 15/00 - Speech recognition

81. Invoking tapered prompts in a multimodal application

Application Number	11678920
Grant Number	08150698
Status	In Force
Filing Date	2007-02-26
First Publication Date	2008-08-28
Grant Date	2012-04-03
Owner	Nuance Communications, Inc. (USA)
Inventor	Ativanichayaphong, Soonthorn Cross, Jr., Charles W. Mccobb, Gerald M.

Abstract

Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

82. System and method for selecting and presenting advertisements based on natural language processing of voice-based input

Application Number	11671526
Grant Number	07818176
Status	In Force
Filing Date	2007-02-06
First Publication Date	2008-08-07
Grant Date	2010-10-19
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Freeman, Tom Kennewick, Mike

Abstract

IPC Classes ?

G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
G10L 15/18 - Speech classification or search using natural language modelling
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G06Q 30/00 - Commerce

83. Method and an apparatus to disambiguate requests

Application Number	11701811
Grant Number	08175248
Status	In Force
Filing Date	2007-02-02
First Publication Date	2008-08-07
Grant Date	2012-05-08
Owner	Nuance Communications, Inc. (USA)
Inventor	Agarwal, Rajeev Ardman, David Master, Muneeb Mauro, David Andrew Raman, Vijay R. Ulug, Amy E. Valli, Zulfikar

Abstract

A method and an apparatus to disambiguate requests are presented. In one embodiment, the method includes receiving a request for information from a user. Then data is retrieved from a back-end database in response to the request. Based on a predetermined configuration of a disambiguation system and the data retrieved, the ambiguity within the request is dynamically resolved.

IPC Classes ?

H04M 3/42 - Systems providing special services or facilities to subscribers

84. Method and apparatus for recognizing and reacting to user personality in accordance with speech recognition system

Application Number	12055952
Grant Number	08719035
Status	In Force
Filing Date	2008-03-26
First Publication Date	2008-07-24
Grant Date	2014-05-06
Owner	Nuance Communications, Inc. (USA)
Inventor	Stewart, Osamuyimen Thompson Dai, Liwei

Abstract

Techniques are disclosed for recognizing user personality in accordance with a speech recognition system. For example, a technique for recognizing a personality trait associated with a user interacting with a speech recognition system includes the following steps/operations. One or more decoded spoken utterances of the user are obtained. The one or more decoded spoken utterances are generated by the speech recognition system. The one or more decoded spoken utterances are analyzed to determine one or more linguistic attributes (morphological and syntactic filters) that are associated with the one or more decoded spoken utterances. The personality trait associated with the user is then determined based on the analyzing step/operation.

IPC Classes ?

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 15/00 - Speech recognition
G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups
G09B 3/00 - Manually- or mechanically-operated teaching appliances working with questions and answers
G09B 7/00 - Electrically-operated teaching apparatus or devices working with questions and answers
G09B 19/00 - Teaching not covered by other main groups of this subclass
G09B 19/04 - Speaking
G09B 17/04 - Teaching reading for increasing the rate of reading; Reading rate control
G09B 1/00 - Manually- or mechanically-operated educational appliances using elements forming or bearing symbols, signs, pictures, or the like which are arranged or adapted to be arranged in one or more particular ways

85. Software program and method for providing promotions on a phone prior to call connection

Application Number	11636334
Grant Number	08160552
Status	In Force
Filing Date	2006-12-08
First Publication Date	2008-06-12
Grant Date	2012-04-17
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Stone, Kevin M.

Abstract

The present invention includes a method and software application for providing a promotion to a user on a phone. The software application resides on a user's phone and “listens” for phone numbers dialed by a user. In response to the user dialing a phone number, the software determines whether a promotion or an offer for a promotion should be provided to the user. In response to determining to play or offer to play a promotion to the user, the software application on the phone effectively “intercepts” the call and plays to the user either a promotion or an offer to hear about a promotion prior to placing an outbound voice call. The software application may retrieve the promotion from local memory or may connect with a remote server to download an applicable promotion.

IPC Classes ?

H04M 3/42 - Systems providing special services or facilities to subscribers

86. Web integrated interactive voice response

Application Number	11961005
Grant Number	08204184
Status	In Force
Filing Date	2007-12-20
First Publication Date	2008-05-08
Grant Date	2012-06-19
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Gao, Xiaofeng Scott, David Zellner, Sam

Abstract

One embodiment of a representative system for web integrated interactive voice response includes an interactive voice response system adapted to provide a plurality of voice menus to a user over a telephone and a graphical user interface system adapted to provide a plurality of menus in a graphical format to the user over a network connection. Information provided in the voice menus corresponds to information provided in the menus in the graphical format and is responsive to commands received by the graphical user interface system from the user. Other systems and methods are also provided.

IPC Classes ?

H04M 11/06 - Simultaneous speech and data transmission, e.g. telegraphic transmission over the same conductors
G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

87. Methods for voice activated dialing

Application Number	11959822
Grant Number	08150001
Status	In Force
Filing Date	2007-12-19
First Publication Date	2008-05-01
Grant Date	2012-04-03
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Bishop, Michael Koch, Robert

Abstract

Methods for routing a call based on voice activated dialing (VAD). A VAD device module may respond to a VAD instruction, or to a call received with a VAD instruction with a corresponding call destination number obtained from a personal VAD directory. If the personal VAD directory fails to include the call destination number, the VAD device module may route the call or initiate a call through a gateway to a VAD network module. The VAD network module may obtain call destination information from the VAD instruction, and may use the call destination information obtain the call destination number. The VAD network module may obtain additional information from the call or other source, and use the additional information to obtain the call destination number. The call then is routed to the call destination number. The call destination number may be added to the personal VAD directory.

IPC Classes ?

H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations

88. System and method for a cooperative conversational voice user interface

Application Number	11580926
Grant Number	08073681
Status	In Force
Filing Date	2006-10-16
First Publication Date	2008-04-17
Grant Date	2011-12-06
Owner	VB ASSETS, LLC (USA) NUANCE COMMUNICATIONS, INC. (USA) VB ASSETS, LLC (USA)
Inventor	Baldwin, Larry Freeman, Tom Tjalve, Michael Ebersold, Blane Weider, Chris

Abstract

IPC Classes ?

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

89. Establishing a preferred mode of interaction between a user and a multimodal application

Application Number	11530599
Grant Number	08145493
Status	In Force
Filing Date	2006-09-11
First Publication Date	2008-03-13
Grant Date	2012-03-27
Owner	Nuance Communications, Inc. (USA)
Inventor	Cross, Jr., Charles W. Pike, Hilary A.

Abstract

Establishing a preferred mode of interaction between a user and a multimodal application, including evaluating, by a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, user modal preference, and dynamically configuring multimodal content of the multimodal application in dependence upon the evaluation of user modal preference.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

90. Method and apparatus for recognizing a user personality trait based on a number of compound words used by the user

Application Number	11436295
Grant Number	08150692
Status	In Force
Filing Date	2006-05-18
First Publication Date	2007-11-22
Grant Date	2012-04-03
Owner	Nuance Communications, Inc. (USA)
Inventor	Stewart, Osamuyimen Thompson Dai, Liwei

Abstract

Techniques for recognizing a personality trait associated with a user. Input from the user is analyzed to determine a number of words, including a number of compound words. The personality trait associated with the user is determined based, at least in part, on the number of compound words exceeding a threshold.

IPC Classes ?

G10L 15/08 - Speech classification or search

91. Mass-scale, user-independent, device-independent voice messaging system

Application Number	11673746
Grant Number	08903053
Status	In Force
Filing Date	2007-02-12
First Publication Date	2007-06-07
Grant Date	2014-12-02
Owner	Nuance Communications, Inc. (USA)
Inventor	Doulton, Daniel Michael

Abstract

A mass-scale, user-independent, device-independent, voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer implemented sub-systems and also (ii) a network connection to human operators providing transcription and quality control; the system being adapted to optimize the effectiveness of the human operators by further comprising 3 core sub-systems, namely (i) a pre-processing front end that determines an appropriate conversion strategy; (ii) one or more conversion resources; and (iii) a quality control sub-system.

IPC Classes ?

H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
H04M 3/51 - Centralised call answering arrangements requiring operator intervention
H04M 3/493 - Interactive information services, e.g. directory enquiries
H04M 3/533 - Voice mail systems
G10L 15/26 - Speech to text systems

92. System and method for conducting a search using a wireless mobile device

Application Number	11263601
Grant Number	07477909
Status	In Force
Filing Date	2005-10-31
First Publication Date	2007-05-03
Grant Date	2009-01-13
Owner	Nuance Communications, Inc. (USA)
Inventor	Roth, Daniel Lawrence

Abstract

IPC Classes ?

H04N 7/173 - Analogue secrecy systems; Analogue subscription systems with two-way working, e.g. subscriber sending a programme selection signal
G06F 17/30 - Information retrieval; Database structures therefor
H04M 3/00 - Automatic or semi-automatic exchanges
H04Q 7/20 -

93. Method, system and apparatus for data reuse

Application Number	11545414
Grant Number	08370734
Status	In Force
Filing Date	2006-10-10
First Publication Date	2007-02-15
Grant Date	2013-02-05
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Boone, Keith W. Chaparala, Sunitha Fordyce, Cameron Gervais, Sean Manoukian, Roubik Ogrinc, Harry J. Titemore, Robert G. Hopkins, Jeffrey G.

Abstract

A system and method may be disclosed for facilitating the creation or modification of a document by providing a mechanism for locating relevant data from external sources and organizing and incorporating some or all of said data into the document. In the method for reusing data, there may be a set of documents that may be queried, where each document may be divided into a plurality of sections. A plurality of section text groups may be formed based on the set of documents, where each section text group may be associated with a respective section from the plurality of sections and each section group includes a plurality of items. Each item may be associated with a respective section from each document of the set of documents. A selected item within a selected section text group may be focused. The selected item may be extracted to a current document. The current document may be exported to a host application.

IPC Classes ?

G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions

94. Methods and apparatus for automatically extending the voice vocabulary of mobile communications devices

Application Number	11347666
Grant Number	08160884
Status	In Force
Filing Date	2006-02-03
First Publication Date	2006-08-03
Grant Date	2012-04-17
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Roth, Daniel L. Cohen, Jordan Behrakis, Elias P.

Abstract

The invention is a method of improving the performance of a speech recognizer. The method generally involves: providing a lexicon for the speech recognizer; monitoring a user's interaction with a network; accessing a plurality of words associated with the monitored interaction; and including the plurality of words in the lexicon.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

95. Speech signal processing with combined noise reduction and echo compensation

Application Number	11218687
Grant Number	07747001
Status	In Force
Filing Date	2005-09-02
First Publication Date	2006-07-13
Grant Date	2010-06-29
Owner	Nuance Communications, Inc. (USA)
Inventor	Kellermann, Walter Herbordt, Wolfgang

Abstract

A speech signal processing system combines acoustic noise reduction and echo cancellation to enhance acoustic performance. The speech signal processing system may be used in vehicles or other environments where noise-suppressed communication is desirable. The system includes an adaptive beamforming signal processing unit, an adaptive echo compensating unit to reduce acoustic echoes, and an adaptation unit to combine noise reduction and adaptive echo compensating.

IPC Classes ?

H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic

96. System and method of providing an automated data-collection in spoken dialog systems

Application Number	11029798
Grant Number	08185399
Status	In Force
Filing Date	2005-01-05
First Publication Date	2006-07-06
Grant Date	2012-05-22
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Di Fabbrizio, Giuseppe Hakkani-Tur, Dilek Z. Rahim, Mazin G. Renger, Bernard S. Tur, Gokhan

Abstract

The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

97. System and method for providing network coordinated conversational services

Application Number	11303768
Grant Number	07519536
Status	In Force
Filing Date	2005-12-16
First Publication Date	2006-05-25
Grant Date	2009-04-14
Owner	Nuance Communications, Inc. (USA)
Inventor	Maes, Stephane H. Gopalakrishnan, Ponani

Abstract

IPC Classes ?

G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
G06F 15/16 - Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs

98. Method and system of generating a speech signal with overlayed random frequency signal

Application Number	10957222
Grant Number	07558389
Status	In Force
Filing Date	2004-10-01
First Publication Date	2006-04-06
Grant Date	2009-07-07
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Desimone, Joseph

Abstract

A method and apparatus utilizing prosody modification of a speech signal output by a text-to-speech (TTS) system to substantially prevent an interactive voice response (IVR) system from understanding the speech signal without significantly degrading the speech signal with respect to human understanding. The present invention involves modifying the prosody of the speech output signal by using the prosody of the user's response to a prompt. In addition, a randomly generated overlay frequency is used to modify the speech signal to further prevent an IVR system from recognizing the TTS output. The randomly generated frequency may be periodically changed using an overlay timer that changes the random frequency signal at a predetermined intervals.

IPC Classes ?

H04L 9/00 - Arrangements for secret or secure communications; Network security protocols
H04N 7/167 - Systems rendering the television signal unintelligible and subsequently intelligible
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

99. Combined speech recognition and sound recording

Application Number	11005568
Grant Number	07505911
Status	In Force
Filing Date	2004-12-05
First Publication Date	2005-07-21
Grant Date	2009-03-17
Owner	NUANCE COMMUNICATIONS, INC. (USA)
Inventor	Roth, Daniel L. Cohen, Jordan R. Johnston, David F. Porter, Edward W.

Abstract

A handheld device with both large-vocabulary speech recognition and audio recoding allows users to switch between at least two of the following three modes: (1) recording audio without corresponding speech recognition; (2) recording with speech recognition; and (3) speech recognition without audio recording. A handheld device with both large-vocabulary speech recognition and audio recoding enables a user to select a portion of previously recorded sound and have speech recognition performed upon it. A system enables a user to search for a text label associated with portions of unrecognized recorded sound by uttering the label's words. A large-vocabulary system allows users to switch between playing back recorded audio and speech recognition with a single input, with successive audio playbacks automatically starting slightly before the end of prior playback. And a cell phone that allows both large-vocabulary speech recognition and audio recording and playback.

IPC Classes ?

G01L 21/06 - Vacuum gauges having a compression chamber in which gas, whose pressure is to be measured, is compressed wherein the chamber is closed by liquid; Vacuum gauges of the McLeod type actuated by rotating or inverting the measuring device

100. Electronic device and user interface and input method therefor

Application Number	10719576
Grant Number	08136050
Status	In Force
Filing Date	2003-11-21
First Publication Date	2005-05-26
Grant Date	2012-03-13
Owner	Nuance Communications, Inc. (USA)
Inventor	Sacher, Heiko K. Romera, Maria E. Nagel, Jens

Abstract

A portable electronic device (100,400) and user interface (425) are operated using a method including initiating entry of a content string; determining the most probable completion alternative or a content prediction using a personalized and learning database (430); displaying the most probable completion alternative or next content prediction; determining whether a user has accepted the most probable completion alternative or next content prediction; and adding the most probable completion alternative or next content prediction to the content string upon user acceptance.

IPC Classes ?

G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]

1 2 Next Page