Nuance Communications, Inc.

United States of America

1-100 of 109 results for Nuance Communications, Inc.
Query: Patent, United States - USPTO, Excluding Subsidiaries
Date
New (last 4 weeks) 1
2024 April (MTD) 1
2024 February 1
2024 (YTD) 2
2022 1
IPC Class
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility 23
G06F 17/27 - Automatic analysis, e.g. parsing, orthography correction 21
G10L 15/00 - Speech recognition 14
G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups G10L 15/00-G10L 21/00 13
G10L 15/26 - Speech to text systems 13
Status
Pending 6
Registered / In Force 103

1.

System and Method for Spectral Pooling in Streaming Speech Processing

      
Application Number 18162186
Status Pending
Filing Date 2023-01-31
First Publication Date 2024-04-18
Owner Nuance Communications, Inc. (USA)
Inventor
  • Weninger, Felix
  • Albesano, Dario
  • Zhan, Puming

Abstract

A method, computer program product, and computing system for inserting a spectral pooling layer into a neural network of a speech processing system. An output of a hidden layer of the neural network is filtered using the spectral pooling layer with a non-integer stride. The filtered output is provided to a subsequent hidden layer of the neural network.
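
For readers unfamiliar with spectral pooling, the following minimal NumPy sketch shows one common way to realize a non-integer stride by truncating DFT bins along the time axis; the function name, stride value, and feature dimensions are illustrative and not taken from the patent.

```python
# Illustrative sketch only (not the patented implementation): FFT-based
# spectral pooling along the time axis with a non-integer stride, i.e.
# resampling T frames down to round(T / stride) frames by keeping only
# the lowest DFT bins.
import numpy as np

def spectral_pool(hidden, stride=1.5):
    """hidden: (T, D) output of a hidden layer; returns (T_out, D)."""
    T, _ = hidden.shape
    T_out = max(1, int(round(T / stride)))        # non-integer stride -> arbitrary T_out
    spec = np.fft.rfft(hidden, axis=0)            # (T//2 + 1, D) frequency bins
    spec = spec[: T_out // 2 + 1]                 # low-pass: keep the lowest bins
    pooled = np.fft.irfft(spec, n=T_out, axis=0)  # back to the time domain, T_out frames
    return pooled * (T_out / T)                   # rescale for the changed length

# Example: 100 frames of 80-dim features pooled with stride 1.5 -> 67 frames
frames = np.random.randn(100, 80)
print(spectral_pool(frames, stride=1.5).shape)    # (67, 80)
```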

IPC Classes

  • G10L 15/16 - Speech classification or search using artificial neural networks
  • G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation

2.

INTERACTIVE VOICE RESPONSE SYSTEMS HAVING IMAGE ANALYSIS

      
Application Number 17816957
Status Pending
Filing Date 2022-08-02
First Publication Date 2024-02-08
Owner Nuance Communications, Inc. (USA)
Inventor
  • Chawla, Akash
  • Degroot, Jenny
  • Vovk, Sergey A.

Abstract

An interactive voice response system is provided that includes an interactive voice recognition module, an image collection module, and a data extraction module. The image collection module communicates with the voice recognition module and the user device. The extraction module communicates with the image collection module. The voice recognition module collects speech data from a user of the user device and provides an indication to the image collection module when the speech data includes complex data. The image collection module, in response to the indication, communicates with the user device in a text message. The text message includes a link that, when activated, opens a camera on the user device. The image collection module, in response to receiving an image having the complex data from the camera, communicates the image to the extraction module, which extracts the complex data from the image as textual data.
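
The module interaction described above can be sketched roughly as follows; all class, function, and parameter names (ImageCollectionModule, run_ocr, looks_complex, the phone number and URL) are hypothetical stand-ins, not APIs from the patent or any product.

```python
# Hypothetical, simplified sketch of the described module interaction.
import re

def looks_complex(utterance: str) -> bool:
    # Toy heuristic: long digit/letter mixes (e.g. policy numbers) are
    # treated as "complex data" better captured from an image.
    return bool(re.search(r"[A-Za-z0-9-]{12,}", utterance.replace(" ", "")))

def run_ocr(image_bytes: bytes) -> str:
    # Stand-in for a real OCR engine.
    return image_bytes.decode("utf-8", errors="ignore")

class ImageCollectionModule:
    def __init__(self, send_sms, extract_text):
        self.send_sms = send_sms          # callable(phone, message)
        self.extract_text = extract_text  # callable(image_bytes) -> str

    def request_image(self, phone, upload_url):
        # Text the caller a link that opens the device camera.
        self.send_sms(phone, f"Please photograph the document: {upload_url}")

    def on_image_received(self, image_bytes):
        # Convert the captured image into textual data.
        return self.extract_text(image_bytes)

# Wiring and a toy run
outbox = []
module = ImageCollectionModule(lambda p, m: outbox.append((p, m)), run_ocr)
if looks_complex("policy number A B 1 2 3 4 5 6 7 8 9 0 X Y"):
    module.request_image("+15551234567", "https://example.invalid/upload/123")
print(outbox)                                       # queued SMS with camera link
print(module.on_image_received(b"AB1234567890XY"))  # "AB1234567890XY"
```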

IPC Classes

  • G06V 30/41 - Analysis of document content
  • G06V 30/146 - Aligning or centering of the image pick-up or image-field
  • G06V 30/19 - Recognition using electronic means
  • H04M 3/493 - Interactive information services, e.g. directory enquiries
  • H04L 51/18 - Commands or executable codes
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog

3.

Automated Clinical Documentation System and Method

      
Application Number 17210292
Status Pending
Filing Date 2021-03-23
First Publication Date 2022-02-17
Owner Nuance Communications, Inc. (USA)
Inventor
  • Gallopyn, Guido Remi Marcel
  • Sharma, Dushyant
  • Jost, Uwe Helmut
  • Owen, Donald E.
  • Naylor, Patrick
  • Nour-Eldin, Amr
  • Almendro Barreda, Daniel Paulino
  • Öz, Mehmet Mert
  • Erskine, Garret N.

Abstract

A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information.

A computer-implemented method, computer program product, and computing system for compartmentalizing a virtual assistant is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a core functionality module. One or more additional functionalities are added to the compartmentalized virtual assistant on an as-needed basis.

A computer-implemented method, computer program product, and computing system for functionality module communication is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a plurality of functionality modules. At least a portion of the encounter information may be processed via a first functionality module of the plurality of functionality modules to generate a first result. The first result may be provided to a second functionality module of the plurality of functionality modules. The first result may be processed via the second functionality module to generate a second result.

A computer-implemented method, computer program product, and computing system for synchronizing machine vision and audio is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes machine vision encounter information and audio encounter information. The machine vision encounter information and the audio encounter information are temporally aligned to produce a temporally-aligned encounter recording.
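
The "functionality module communication" idea, where a first module's result feeds a second module, can be illustrated with a toy pipeline; the module names and data fields below are invented for illustration.

```python
# Minimal, hypothetical sketch of functionality-module communication in a
# compartmentalized virtual assistant: each module receives the results
# produced so far and adds its own.
from typing import Callable, List

Module = Callable[[dict], dict]

def transcribe(encounter: dict) -> dict:
    # First functionality module: produce a first result (a toy transcript).
    encounter["transcript"] = encounter["audio"].lower()
    return encounter

def summarize(encounter: dict) -> dict:
    # Second functionality module: process the first result into a second result.
    encounter["summary"] = encounter["transcript"][:40] + "..."
    return encounter

def run_pipeline(encounter: dict, modules: List[Module]) -> dict:
    # Modules are added on an as-needed basis and invoked in order.
    for module in modules:
        encounter = module(encounter)
    return encounter

result = run_pipeline({"audio": "PATIENT REPORTS MILD HEADACHE SINCE MONDAY MORNING"},
                      [transcribe, summarize])
print(result["summary"])
```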

IPC Classes

  • G16H 15/00 - ICT specially adapted for medical reports, e.g. generation or transmission thereof
  • G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
  • G10L 21/0208 - Noise filtering
  • G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

4.

Automated Clinical Documentation System and Method

      
Application Number 17210253
Status Pending
Filing Date 2021-03-23
First Publication Date 2021-08-05
Owner Nuance Communications, Inc. (USA)
Inventor
  • Owen, Donald E.
  • Erskine, Garret N.
  • Öz, Mehmet Mert
  • Jost, Uwe Helmut
  • Almendro Barreda, Daniel Paulino
  • Sharma, Dushyant
  • Gallopyn, Guido Remi Marcel
  • Nour-Eldin, Amr
  • Naylor, Patrick A.

Abstract

A computer-implemented method, computer program product, and computing system for rendering content is executed on a computing device and includes receiving a request to render content during a user encounter. If it is determined that the content includes sensitive content, a complete version of the content is rendered on a first device (wherein the complete version of the content includes the sensitive content) and a limited version of the content on a second device (wherein the limited version of the content excludes the sensitive content).

A modular ACD system is configured to automate clinical documentation and includes a machine vision system configured to obtain machine vision encounter information concerning a user encounter. An audio recording system is configured to obtain audio encounter information concerning the user encounter. A compute system is configured to receive the machine vision encounter information and the audio encounter information.

A computer-implemented method, computer program product, and computing system for automating diarization is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. An encounter transcript is generated based, at least in part, upon the first portion of the encounter information and the at least a second portion of the encounter information.

A computer-implemented method, computer program product, and computing system for automating role assignment is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to associate a first portion of the encounter information with a first encounter participant. A first role is assigned to the first encounter participant.
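
A toy sketch of the complete-versus-limited rendering idea follows; the field names and the set of "sensitive" fields are invented for illustration.

```python
# Hypothetical sketch: one render request yields a complete version for a
# first device and a limited version (sensitive fields removed) for a second.
SENSITIVE_FIELDS = {"diagnosis", "medications", "ssn"}

def split_versions(content: dict):
    complete = dict(content)                      # first device: everything
    limited = {k: v for k, v in content.items()
               if k not in SENSITIVE_FIELDS}      # second device: sensitive data excluded
    return complete, limited

record = {"patient": "J. Doe", "visit_date": "2021-03-23",
          "diagnosis": "hypertension", "medications": "lisinopril"}
clinician_view, waiting_room_view = split_versions(record)
print(sorted(clinician_view))     # ['diagnosis', 'medications', 'patient', 'visit_date']
print(sorted(waiting_room_view))  # ['patient', 'visit_date']
```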

IPC Classes

  • H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
  • G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
  • G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules

5.

Automated Clinical Documentation System and Method

      
Application Number 17210233
Status Pending
Filing Date 2021-03-23
First Publication Date 2021-07-29
Owner Nuance Communications, Inc. (USA)
Inventor
  • Owen, Donald E.
  • Erskine, Garret N.
  • Gallopyn, Guido Remi Marcel
  • Öz, Mehmet Mert
  • Almendro Barreda, Daniel Paulino

Abstract

A computer-implemented method, computer program product, and computing system for automated clinical documentation is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to generate an encounter transcript. At least a portion of the encounter transcript is processed to populate at least a portion of a record associated with the user encounter.

A computer-implemented method, computer program product, and computing system for automating an intake process is executed on a computing device and includes prompting a user to provide encounter information via a virtual assistant during a pre-visit portion of a user encounter. Encounter information is obtained from the user in response to the prompting by the virtual assistant.

A computer-implemented method, computer program product, and computing system for automating a follow-up process is executed on a computing device and includes prompting a user to provide encounter information via a virtual assistant during a post-visit portion of a user encounter. Encounter information is obtained from the user in response to the prompting by the virtual assistant.

A computer-implemented method, computer program product, and computing system for automating a monitoring process is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is processed to determine if the encounter information is indicative of a potential situation. An inquiry is initiated concerning the potential situation.

IPC Classes

  • G16H 15/00 - ICT specially adapted for medical reports, e.g. generation or transmission thereof
  • G16H 80/00 - ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
  • G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
  • G16H 20/30 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to physical therapies or activities, e.g. physiotherapy, acupressure or exercising
  • A61B 5/00 - Measuring for diagnostic purposes ; Identification of persons
  • G06T 1/00 - General purpose image data processing
  • G16H 30/20 - ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS

6.

Automated Clinical Documentation System and Method

      
Application Number 17210300
Status Pending
Filing Date 2021-03-23
First Publication Date 2021-07-29
Owner Nuance Communications, Inc. (USA)
Inventor
  • Owen, Donald E.
  • Erskine, Garret N.
  • Öz, Mehmet Mert
  • Almendro Barreda, Daniel Paulino

Abstract

A computer-implemented method, computer program product, and computing system for visual diarization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter participant, and associate at least a second portion of the encounter information with at least a second encounter participant. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information.

A computer-implemented method, computer program product, and computing system for visual compartmentalization of a user encounter is executed on a computing device and includes obtaining encounter information of the user encounter. The encounter information is processed to: associate a first portion of the encounter information with a first encounter portion, and associate at least a second portion of the encounter information with at least a second encounter portion. A visual representation of the encounter information is rendered. A first visual representation of the first portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information. At least a second visual representation of the at least a second portion of the encounter information is rendered that is temporally-aligned with the visual representation of the encounter information.

A computer-implemented method, computer program product, and computing system for reactive encounter scanning is executed on a computing device and includes obtaining encounter information of a user encounter. A request is received from a user concerning a specific condition. In response to receiving the request, the encounter information is processed to determine if the encounter information is indicative of the specific condition and to generate a result set. The result set is provided to the user.

A computer-implemented method, computer program product, and computing system for proactive encounter scanning is executed on a computing device and includes obtaining encounter information of a user encounter. The encounter information is proactively processed to determine if the encounter information is indicative of one or more conditions and to generate one or more result sets. The one or more result sets are provided to the user.

IPC Classes

  • G16H 40/20 - ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
  • G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
  • G16H 15/00 - ICT specially adapted for medical reports, e.g. generation or transmission thereof
  • G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
  • G16H 30/20 - ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
  • G06F 16/248 - Presentation of query results
  • G09B 19/00 - Teaching not covered by other main groups of this subclass
  • G06F 3/16 - Sound input; Sound output
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G06F 40/40 - Processing or translation of natural language
  • G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 specially adapted for particular use for comparison or discrimination
  • G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

7.

Spectral estimation of room acoustic parameters

      
Application Number 16084771
Grant Number 10403300
Status In Force
Filing Date 2016-03-17
First Publication Date 2019-03-14
Grant Date 2019-09-03
Owner Nuance Communications, Inc. (USA)
Inventor
  • Wolff, Tobias
  • Desiraju, Naveen Kumar

Abstract

A prediction filter is used to estimate room acoustic parameters of reverberation, such as the reverberation time (T60), and can further estimate an additional parameter, such as Direct-to-Reverberant Ratio (DRR). The prediction filter may be adapted during a period of reverberation by minimizing a cost function. Adaptation can include using a gradient descent approach, which can operate according to a step size provided by an adaptation controller configured to determine the period of reverberation. One or more microphones can provide the signals. The reverberation parameters estimated can be applied to a reverberation suppressor, with an estimator that does not require a training phase and without relying on assumptions of the user's position relative to the microphones.
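
As a rough illustration of adapting a prediction filter by gradient descent with a gated step size, here is a toy sketch on a sub-band power envelope; the cost function, gating rule, and all constants are assumptions, not the patented estimator.

```python
# Rough sketch (not the patented estimator): adapt a linear prediction filter
# on a sub-band power envelope by gradient descent on a squared-error cost,
# with the step size gated by a toy "adaptation controller" that only adapts
# during reverberation-dominated (decaying) frames.
import numpy as np

def adapt_prediction_filter(envelope, order=10, mu=0.01, rev_threshold=0.0):
    """envelope: per-frame sub-band power (1-D array); returns filter coeffs."""
    w = np.zeros(order)
    for t in range(order, len(envelope)):
        past = envelope[t - order:t][::-1]        # most recent frame first
        err = envelope[t] - w @ past              # prediction error (cost = err**2 / 2)
        # Adaptation controller: adapt only while the envelope is decaying,
        # i.e. during a presumed reverberation tail.
        step = mu if envelope[t] - envelope[t - 1] < rev_threshold else 0.0
        w += step * err * past                    # gradient-descent update
    return w

# Toy decaying envelope (exponential reverberation tail plus noise)
rng = np.random.default_rng(0)
env = np.exp(-0.05 * np.arange(400)) + 0.01 * rng.random(400)
print(adapt_prediction_filter(env).round(3))
```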

IPC Classes

8.

User dedicated automatic speech recognition

      
Application Number 15876545
Grant Number 10789950
Status In Force
Filing Date 2018-01-22
First Publication Date 2018-06-07
Grant Date 2020-09-29
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Wolff, Tobias
  • Buck, Markus
  • Haulick, Tim

Abstract

A multi-mode voice controlled user interface is described. The user interface is adapted to conduct a speech dialog with one or more possible speakers and includes a broad listening mode which accepts speech inputs from the possible speakers without spatial filtering, and a selective listening mode which limits speech inputs to a specific speaker using spatial filtering. The user interface switches listening modes in response to one or more switching cues.

IPC Classes

  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/28 - Constructional details of speech recognition systems
  • G06F 3/16 - Sound input; Sound output
  • G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 specially adapted for particular use for comparison or discrimination
  • G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
  • G10L 21/0216 - Noise filtering characterised by the method used for estimating noise

9.

System and method for speech enhancement using a coherent to diffuse sound ratio

      
Application Number 15535245
Grant Number 10242690
Status In Force
Filing Date 2014-12-12
First Publication Date 2017-11-16
Grant Date 2019-03-26
Owner Nuance Communications, Inc. (USA)
Inventor
  • Wolff, Tobias
  • Matheja, Timo
  • Buck, Markus

Abstract

Embodiments of the present disclosure may include a system and method for speech enhancement using the coherent to diffuse sound ratio. Embodiments may include receiving an audio signal at one or more microphones and controlling one or more adaptive filters of a beamformer using a coherent to diffuse ratio (“CDR”).
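
A simplified two-microphone CDR estimate can be derived under strong assumptions (broadside target with coherence 1, ideal diffuse noise with sinc coherence, Welch cross-spectra); the sketch below shows only that simplified estimator and does not reproduce the patented method or its use for controlling the beamformer's adaptive filters.

```python
# Simplified two-microphone CDR sketch under the assumptions stated above.
import numpy as np

def welch_csd(a, b, fs, nfft):
    # Minimal Welch cross-spectral density (Hann window, 50% overlap).
    hop, win = nfft // 2, np.hanning(nfft)
    segs = range(0, min(len(a), len(b)) - nfft + 1, hop)
    acc = np.zeros(nfft // 2 + 1, dtype=complex)
    for s in segs:
        A = np.fft.rfft(win * a[s:s + nfft])
        B = np.fft.rfft(win * b[s:s + nfft])
        acc += A * np.conj(B)
    f = np.fft.rfftfreq(nfft, 1.0 / fs)
    return f, acc / max(len(list(segs)), 1)

def cdr_broadside(x1, x2, fs=16000, mic_dist=0.08, nfft=512, c=343.0):
    f, Pxy = welch_csd(x1, x2, fs, nfft)
    _, P11 = welch_csd(x1, x1, fs, nfft)
    _, P22 = welch_csd(x2, x2, fs, nfft)
    gamma_x = Pxy / np.sqrt(P11 * P22 + 1e-12)     # observed coherence
    gamma_n = np.sinc(2 * f * mic_dist / c)        # diffuse-field coherence model
    # Mixture coherence gamma_x = (CDR*1 + gamma_n) / (CDR + 1)  =>  solve for CDR.
    cdr = np.real((gamma_n - gamma_x) / (gamma_x - 1 + 1e-12))
    return f, np.maximum(cdr, 0.0)

rng = np.random.default_rng(1)
noise = rng.standard_normal((2, 16000))            # uncorrelated noise: low CDR expected
f, cdr = cdr_broadside(noise[0], noise[1])
print(cdr[:5].round(2))
```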

IPC Classes

  • G10L 21/0216 - Noise filtering characterised by the method used for estimating noise
  • G10L 21/0208 - Noise filtering
  • H04B 1/62 - TRANSMISSION - Details of transmission systems not characterised by the medium used for transmission for providing a predistortion of the signal in the transmitter and corresponding correction in the receiver, e.g. for improving the signal/noise ratio
  • H04R 3/00 - Circuits for transducers

10.

System and method for generating a self-steering beamformer

      
Application Number 15535264
Grant Number 10924846
Status In Force
Filing Date 2014-12-12
First Publication Date 2017-11-09
Grant Date 2021-02-16
Owner Nuance Communications, Inc. (USA)
Inventor
  • Wolff, Tobias
  • Buck, Markus

Abstract

A system and method for generating a self-steering beamformer is provided. Embodiments may include receiving, at one or more microphones, a first audio signal and adapting one or more blocking filters based upon, at least in part, the first audio signal. Embodiments may also include generating, using the one or more blocking filters, one or more noise reference signals. Embodiments may further include providing the one or more noise reference signals to an adaptive interference canceller to reduce a beamformer output power level.
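
The structure described, a blocking branch producing noise references that an adaptive interference canceller subtracts to reduce output power, resembles a generalized sidelobe canceller; the two-microphone NLMS sketch below is a generic GSC for illustration, not the patented self-steering design.

```python
# Hedged two-microphone GSC sketch: fixed beam, blocking branch, NLMS canceller.
import numpy as np

def gsc_two_mic(x1, x2, order=32, mu=0.1, eps=1e-6):
    fixed = 0.5 * (x1 + x2)          # fixed beamformer (simple average)
    blocked = x1 - x2                # blocking branch: cancels the (broadside) target
    w = np.zeros(order)
    out = np.zeros_like(fixed)
    for n in range(order, len(fixed)):
        ref = blocked[n - order:n][::-1]      # recent noise-reference samples
        y = w @ ref                            # interference estimate
        out[n] = fixed[n] - y                  # cancel interference from the beam
        norm = ref @ ref + eps
        w += (mu / norm) * out[n] * ref        # NLMS update: minimize output power
    return out

# Toy example: a common "target" plus an interferer that differs between mics
rng = np.random.default_rng(0)
target = np.sin(2 * np.pi * 200 * np.arange(8000) / 8000.0)
interf = rng.standard_normal(8000)
y = gsc_two_mic(target + interf, target + 0.8 * interf)
print(round(float(np.var(y[1000:])), 3))
```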

IPC Classes

  • H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
  • G10L 21/0208 - Noise filtering
  • G10L 21/0216 - Noise filtering characterised by the method used for estimating noise
  • H04R 1/24 - Structural combinations of separate transducers or of parts of the same transducer and responsive respectively to two or more frequency ranges
  • G10L 21/0272 - Voice signal separating

11.

Text message generation for emergency services as a backup to voice communications

      
Application Number 15134733
Grant Number 09930502
Status In Force
Filing Date 2016-04-21
First Publication Date 2016-08-11
Grant Date 2018-03-27
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Basore, David L.
  • Lawser, John Jutten

Abstract

A mobile device may detect when a calling party dials an emergency service to request emergency assistance. Following input of the dialed digits, the device may automatically generate a text message in addition to initiating a voice call, both of which may be transmitted over a wireless data network. The wireless network may correlate the two calls as originating from the same emergency situation and may attempt to deliver the two calls to a Public Services Answering Position (PSAP) at an appropriate emergency center. If the PSAP does not receive a voice call, the PSAP may communicate with the device via text messaging.

IPC Classes

  • H04W 4/14 - Short messaging services, e.g. short message service [SMS] or unstructured supplementary service data [USSD]
  • H04W 4/22 - Emergency connection handling
  • H04W 4/12 - Messaging; Mailboxes; Announcements

12.

Voice commerce

      
Application Number 14855334
Grant Number 09626703
Status In Force
Filing Date 2015-09-15
First Publication Date 2016-03-17
Grant Date 2017-04-18
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor Kennewick, Sr., Michael R.

Abstract

In certain implementations, a system for facilitating voice commerce is provided. A user input comprising a natural language utterance related to a product or service to be purchased may be received. A first product or service that is to be purchased may be determined based on the utterance. First payment information that is to be used to purchase the first product or service may be obtained. First shipping information that is to be used to deliver the first product or service may be obtained. A purchase transaction for the first product or service may be completed based on the first payment information and the first shipping information without further user input, after the receipt of the utterance, that identifies a product or service type or a product or service, seller information, payment information, shipping information, or other information related to purchasing the first product or service.

IPC Classes

  • G06Q 30/06 - Buying, selling or leasing transactions
  • G10L 15/18 - Speech classification or search using natural language modelling
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog

13.

Task switching in dialogue processing

      
Application Number 14478121
Grant Number 09607102
Status In Force
Filing Date 2014-09-05
First Publication Date 2016-03-10
Grant Date 2017-03-28
Owner Nuance Communications, Inc. (USA)
Inventor
  • Lavallee, Jean-Francois
  • Goussard, Jacques-Olivier
  • Beaufort, Richard

Abstract

Disclosed methods and systems are directed to task switching in dialog processing. The methods and systems may include activating a primary task, receiving one or more ambiguous natural language commands, and identifying a first candidate task for each of the one or more ambiguous natural language commands. The methods and systems may also include identifying, for each of the one or more ambiguous natural language commands and based on one or more rules, a second candidate task of the plurality of tasks corresponding to the ambiguous natural language command, determining whether to modify at least one of the rules-based task switching rules based on whether a quality metric satisfies a threshold quantity, and, when the quality metric satisfies the threshold quantity, changing the task switching rule for the corresponding candidate task from a rules-based model to an optimized statistical task switching model.

IPC Classes

14.

System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements

      
Application Number 14836606
Grant Number 09406078
Status In Force
Filing Date 2015-08-26
First Publication Date 2015-12-17
Grant Date 2016-08-02
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Freeman, Tom
  • Kennewick, Mike

Abstract

The system and method described herein may use various natural language models to deliver targeted advertisements and/or provide natural language processing based on advertisements. In one implementation, an advertisement associated with a product or service may be provided for presentation to a user. A natural language utterance of the user may be received. The natural language utterance may be interpreted based on the advertisement and, responsive to the existence of a pronoun in the natural language utterance, a determination of whether the pronoun refers to one or more of the product or service or a provider of the product or service may be effectuated.

IPC Classes

  • G10L 15/26 - Speech to text systems
  • G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
  • G10L 15/18 - Speech classification or search using natural language modelling
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthography correction

15.

System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

      
Application Number 14698183
Grant Number 09305547
Status In Force
Filing Date 2015-04-28
First Publication Date 2015-08-27
Grant Date 2016-04-05
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Ljolje, Andrej
  • Conkie, Alistair D.
  • Syrdal, Ann K.

Abstract

Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
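
The weighted-sum restructuring can be written compactly as p_custom(x | phoneme) = Σ_q w(phoneme, q) · p_native(x | q); the toy sketch below uses one-dimensional Gaussians and an invented weight table purely for illustration.

```python
# Toy sketch of the restructuring idea: keep the pronouncing dictionary, but
# model each dictionary phoneme as a weighted sum of the native acoustic
# models of the phonemes the new speaker plausibly produces.
import numpy as np

def gauss_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Native acoustic models (1-D toy "models" per phoneme): (mean, variance)
native_models = {"ih": (0.0, 1.0), "iy": (1.5, 0.8), "eh": (-1.2, 1.1)}

# Weights estimated from the new speaker's lattice of plausible phonemes,
# e.g. this speaker's /ih/ is often realised closer to /iy/.
restructure_weights = {"ih": {"ih": 0.6, "iy": 0.3, "eh": 0.1}}

def custom_likelihood(x, phoneme):
    # p_custom(x | phoneme) = sum_q w(phoneme, q) * p_native(x | q)
    return sum(w * gauss_pdf(x, *native_models[q])
               for q, w in restructure_weights[phoneme].items())

print(round(custom_likelihood(0.8, "ih"), 4))
```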

IPC Classes

  • G10L 15/04 - Segmentation; Word boundary detection
  • G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
  • G10L 15/07 - Adaptation to the speaker
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]

16.

Online maximum-likelihood mean and variance normalization for speech recognition

      
Application Number 14640912
Grant Number 09280979
Status In Force
Filing Date 2015-03-06
First Publication Date 2015-08-06
Grant Date 2016-03-08
Owner Nuance Communications, Inc. (USA)
Inventor Willett, Daniel

Abstract

A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after some first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.
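
A minimal sketch of the online normalization step, assuming statistics are accumulated from the preceding vectors after a warm-up threshold; the maximum-likelihood re-estimation from partial decoding results described in the abstract is not modelled.

```python
# Minimal sketch of online mean/variance normalization: after a warm-up of
# `threshold` frames, each incoming speech vector is normalized using
# statistics of the preceding vectors only, then folded into those statistics.
import numpy as np

def online_mvn(frames, threshold=10, eps=1e-8):
    frames = np.asarray(frames, dtype=float)
    out = frames.copy()
    count = 0
    s1 = np.zeros(frames.shape[1])   # running sum
    s2 = np.zeros(frames.shape[1])   # running sum of squares
    for t, x in enumerate(frames):
        if t >= threshold:
            mean = s1 / count
            var = np.maximum(s2 / count - mean ** 2, eps)
            out[t] = (x - mean) / np.sqrt(var)   # adjust the current vector
        s1 += x                                   # then update the statistics
        s2 += x ** 2
        count += 1
    return out

feats = np.random.default_rng(2).normal(5.0, 2.0, size=(200, 13))
print(online_mvn(feats)[50:].mean(axis=0).round(1))   # roughly centred after warm-up
```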

IPC Classes

  • G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/08 - Speech classification or search
  • G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
  • G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
  • G10L 15/34 - Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing

17.

Techniques for evaluation, building and/or retraining of a classification model

      
Application Number 14686099
Grant Number 09311609
Status In Force
Filing Date 2015-04-14
First Publication Date 2015-08-06
Grant Date 2016-04-12
Owner Nuance Communications, Inc. (USA)
Inventor Marcheret, Etienne

Abstract

Techniques for evaluation and/or retraining of a classification model built using labeled training data. In some aspects, a classification model having a first set of weights is retrained by using unlabeled input to reweight the labeled training data to have a second set of weights, and by retraining the classification model using the labeled training data weighted according to the second set of weights. In some aspects, a classification model is evaluated by building a similarity model that represents similarities between unlabeled input and the labeled training data and using the similarity model to evaluate the labeled training data to identify a subset of the plurality of items of labeled training data that is more similar to the unlabeled input than a remainder of the labeled training data.
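
One simple way to realize "reweight the labeled training data using unlabeled input" is a kernel-similarity weight per training item followed by weighted retraining; the sketch below uses an RBF kernel and scikit-learn's sample_weight argument as illustrative choices, not the patented similarity model.

```python
# Illustrative sketch only: reweight labeled training items by how similar
# they are to the unlabeled input, then retrain the classifier with those
# weights (the "second set of weights").
import numpy as np
from sklearn.linear_model import LogisticRegression

def similarity_weights(X_train, X_unlabeled, bandwidth=1.0):
    # Average RBF kernel between each training item and the unlabeled pool.
    d2 = ((X_train[:, None, :] - X_unlabeled[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)
    return w / w.mean()                       # normalize to mean 1

rng = np.random.default_rng(3)
X_train = rng.normal(0.0, 1.0, (200, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
X_unlab = rng.normal(1.0, 0.5, (100, 2))      # unlabeled input from a shifted region

model = LogisticRegression().fit(X_train, y_train)                 # first (uniform) weights
model_rw = LogisticRegression().fit(X_train, y_train,
                                    sample_weight=similarity_weights(X_train, X_unlab))
print(model.coef_.round(2), model_rw.coef_.round(2))
```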

IPC Classes

  • G06N 99/00 - Subject matter not provided for in other groups of this subclass
  • G06N 7/00 - Computing arrangements based on specific mathematical models

18.

Multiple web-based content category searching in mobile search application

      
Application Number 14570404
Grant Number 09619572
Status In Force
Filing Date 2014-12-15
First Publication Date 2015-04-09
Grant Date 2017-04-11
Owner Nuance Communications, Inc. (USA)
Inventor
  • Phillips, Michael S.
  • Nguyen, John N.

Abstract

In embodiments of the present invention improved capabilities are described for multiple web-based content category searching for web content on a mobile communication facility comprising capturing speech presented by a user using a resident capture facility on the mobile communication facility; transmitting at least a portion of the captured speech as data through a wireless communication facility to a speech recognition facility; generating speech-to-text results for the captured speech utilizing the speech recognition facility; and transmitting the text results and a plurality of formatting rules specifying how search text may be used to form a query for a search capability on the mobile communications facility, wherein each formatting rule is associated with a category of content to be searched.

IPC Classes

  • G06F 17/30 - Information retrieval; Database structures therefor
  • G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
  • G10L 15/26 - Speech to text systems
  • G10L 25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 specially adapted for particular use

19.

Dealing with switch latency in speech recognition

      
Application Number 14537418
Grant Number 09495956
Status In Force
Filing Date 2014-11-10
First Publication Date 2015-03-12
Grant Date 2016-11-15
Owner Nuance Communications, Inc. (USA)
Inventor
  • Meisel, William S.
  • Phillips, Michael S.
  • Nguyen, John N.

Abstract

In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.

IPC Classes

  • G10L 15/08 - Speech classification or search
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G06F 3/16 - Sound input; Sound output
  • G10L 17/22 - Interactive procedures; Man-machine interfaces

20.

Method and system for dictionary noise removal

      
Application Number 14010903
Grant Number 09336195
Status In Force
Filing Date 2013-08-27
First Publication Date 2015-03-05
Grant Date 2016-05-10
Owner Nuance Communications, Inc. (USA)
Inventor Barrett, Neil D.

Abstract

A method and system of removing noise from a dictionary using a weighted graph is presented. The method can include mapping, by a noise reducing agent executing on a processor, a plurality of dictionaries to a plurality of vertices of a graphical representation, wherein the plurality of vertices is connected by weighted edges representing noise. The plurality of dictionaries may further comprise a plurality of entries, wherein each entry further comprises a plurality of tokens. The method can include selecting a subset of the weighted edges, constructing an acyclic graphical representation from the selected subset of weighted edges, and determining an ordering based on the acyclic graphical representation. The selected subset of weighted edges may approximate a solution to the Maximum Acyclic Subgraph problem. The method can include removing noise from the plurality of dictionaries according to the determined ordering.
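
A small sketch of the graph step under simple assumptions: a greedy vertex ordering approximates the Maximum Acyclic Subgraph by keeping only forward edges, and that ordering could then drive noise removal; the edge weights and heuristic are invented for illustration.

```python
# Hypothetical sketch: order dictionary vertices greedily so the kept
# (forward) edges form an acyclic subgraph approximating the Maximum
# Acyclic Subgraph, then use the ordering to decide removal order.
def greedy_acyclic_order(vertices, weighted_edges):
    """weighted_edges: dict {(u, v): weight} of noise between dictionaries."""
    remaining, order = set(vertices), []
    while remaining:
        def score(v):   # out-weight minus in-weight among remaining vertices
            out_w = sum(w for (a, b), w in weighted_edges.items() if a == v and b in remaining)
            in_w = sum(w for (a, b), w in weighted_edges.items() if b == v and a in remaining)
            return out_w - in_w
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    kept = {e: w for e, w in weighted_edges.items()
            if order.index(e[0]) < order.index(e[1])}   # forward edges only -> acyclic
    return order, kept

edges = {("A", "B"): 3.0, ("B", "C"): 2.0, ("C", "A"): 1.0, ("A", "C"): 0.5}
order, kept = greedy_acyclic_order(["A", "B", "C"], edges)
print(order)   # ['A', 'B', 'C']
print(kept)    # cycle-closing edge ('C', 'A') is dropped
```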

IPC Classes

  • G06F 17/27 - Automatic analysis, e.g. parsing, orthography correction
  • G06F 17/30 - Information retrieval; Database structures therefor
  • G06F 19/00 - Digital computing or data processing equipment or methods, specially adapted for specific applications (specially adapted for specific functions G06F 17/00; data processing systems or methods specially adapted for administrative, commercial, financial, managerial, supervisory or forecasting purposes G06Q; healthcare informatics G16H)

21.

System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements

      
Application Number 14537598
Grant Number 09269097
Status In Force
Filing Date 2014-11-10
First Publication Date 2015-03-05
Grant Date 2016-02-23
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Freeman, Tom
  • Kennewick, Mike

Abstract

The system and method described herein may use various natural language models to deliver targeted advertisements and/or provide natural language processing based on advertisements. In one implementation, an advertisement associated with a product or service may be provided for presentation to a user. A natural language utterance of the user may be received. The natural language utterance may be interpreted based on the advertisement and, responsive to the existence of a pronoun in the natural language utterance, a determination of whether the pronoun refers to one or more of the product or service or a provider of the product or service may be effectuated.

IPC Classes

  • G10L 15/18 - Speech classification or search using natural language modelling
  • G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
  • G10L 15/26 - Speech to text systems
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthography correction

22.

Text message generation for emergency services as a backup to voice communications

      
Application Number 14478048
Grant Number 09351142
Status In Force
Filing Date 2014-09-05
First Publication Date 2014-12-18
Grant Date 2016-05-24
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Basore, David L.
  • Lawser, John Jutten

Abstract

A mobile device may detect when a calling party dials an emergency service to request emergency assistance. Following input of the dialed digits, the device may automatically generate a text message in addition to initiating a voice call, both of which may be transmitted over a wireless data network. The wireless network may correlate the two calls as originating from the same emergency situation and may attempt to deliver the two calls to a Public Services Answering Position (PSAP) at an appropriate emergency center. If the PSAP does not receive a voice call, the PSAP may communicate with the device via text messaging.

IPC Classes

  • H04W 4/22 - Emergency connection handling
  • H04W 4/12 - Messaging; Mailboxes; Announcements

23.

System and method for providing network coordinated conversational services

      
Application Number 14448216
Grant Number 09761241
Status In Force
Filing Date 2014-07-31
First Publication Date 2014-11-20
Grant Date 2017-09-12
Owner Nuance Communications, Inc. (USA)
Inventor
  • Maes, Stephane H.
  • Gopalakrishnan, Ponani S.

Abstract

A system and method for providing automatic and coordinated sharing of conversational resources, e.g., functions and arguments, between network-connected servers and devices and their corresponding applications. In one aspect, a system for providing automatic and coordinated sharing of conversational resources includes a network having a first and second network device, the first and second network device each comprising a set of conversational resources, a dialog manager for managing a conversation and executing calls requesting a conversational service, and a communication stack for communicating messages over the network using conversational protocols, wherein the conversational protocols establish coordinated network communication between the dialog managers of the first and second network device to automatically share the set of conversational resources of the first and second network device, when necessary, to perform their respective requested conversational service.

IPC Classes

  • G10L 15/00 - Speech recognition
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • H04L 29/06 - Communication control; Communication processing characterised by a protocol
  • H04L 12/24 - Arrangements for maintenance or administration

24.

Machine translation using global lexical selection and sentence reconstruction

      
Application Number 14336297
Grant Number 09323745
Status In Force
Filing Date 2014-07-21
First Publication Date 2014-11-06
Grant Date 2016-04-26
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Bangalore, Srinivas
  • Haffner, Patrick
  • Kanthak, Stephan

Abstract

Disclosed are systems, methods, and computer-readable media for performing translations from a source language to a target language. The method comprises receiving a source phrase, generating a target bag of words based on a global lexical selection of words that loosely couples the source words/phrases and target words/phrases, and reconstructing a target phrase or sentence by considering all permutations of words with a conditional probability greater than a threshold.
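
The two stages, global lexical selection into a target bag of words followed by reconstruction over permutations, can be illustrated with toy probability tables and a toy bigram language model; the values and the brute-force permutation search below are for illustration only (a real system would not enumerate permutations of long sentences).

```python
# Toy sketch of bag-of-words selection plus permutation-based reconstruction.
from itertools import permutations

# P(target word is present | source sentence), pretend classifier outputs
p_target = {"the": 0.9, "cat": 0.85, "sleeps": 0.7, "dog": 0.2, "runs": 0.1}

# Toy bigram language model used to score candidate orderings
bigram = {("<s>", "the"): 0.5, ("the", "cat"): 0.4, ("cat", "sleeps"): 0.3,
          ("sleeps", "</s>"): 0.6}

def select_bag(probs, threshold=0.5):
    # Global lexical selection: keep words above the probability threshold.
    return [w for w, p in probs.items() if p > threshold]

def lm_score(words, floor=1e-4):
    score, prev = 1.0, "<s>"
    for w in list(words) + ["</s>"]:
        score *= bigram.get((prev, w), floor)
        prev = w
    return score

bag = select_bag(p_target)                   # ['the', 'cat', 'sleeps']
best = max(permutations(bag), key=lm_score)  # feasible only for small bags
print(bag, "->", " ".join(best))             # the cat sleeps
```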

IPC Classes

  • G06F 17/28 - Processing or translating of natural language
  • G11B 27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
  • G11B 27/28 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
  • H04N 5/445 - Receiver circuitry for displaying additional information
  • H04N 5/45 - Picture in picture
  • H04N 5/765 - Interface circuits between an apparatus for recording and another apparatus
  • H04N 21/232 - Content retrieval operation within server, e.g. reading video streams from disk arrays
  • H04N 21/233 - Processing of audio elementary streams
  • H04N 21/235 - Processing of additional data, e.g. scrambling of additional data or processing content descriptors
  • H04N 21/258 - Client or end-user data management, e.g. managing client capabilities, user preferences or demographics or processing of multiple end-users preferences to derive collaborative data
  • H04N 21/482 - End-user interface for program selection
  • H04N 21/81 - Monomedia components thereof
  • H04N 21/84 - Generation or processing of descriptive data, e.g. content descriptors
  • H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
  • H04N 21/8547 - Content authoring involving timestamps for synchronizing content
  • H04N 21/2662 - Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
  • G10L 15/26 - Speech to text systems

25.

Method for determining a set of filter coefficients for an acoustic echo compensator

      
Application Number 14314106
Grant Number 09264805
Status In Force
Filing Date 2014-06-25
First Publication Date 2014-10-16
Grant Date 2016-02-16
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Buck, Markus
  • Schmidt, Gerhard Uwe
  • Wolff, Tobias

Abstract

Methods and apparatus for beamforming and performing echo compensation for the beamformed signal with an echo canceller including calculating a set of filter coefficients as an estimate for a new steering direction without a complete adaptation of the echo canceller.

IPC Classes  ?

  • H04R 3/00 - Circuits for transducers
  • G10L 21/0208 - Noise filtering
  • H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g.  for suppressing echoes for one or both directions of traffic
  • H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers

26.

System and method for handling missing speech data

      
Application Number 14299745
Grant Number 09305546
Status In Force
Filing Date 2014-06-09
First Publication Date 2014-09-25
Grant Date 2016-04-05
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Ljolje, Andrej
  • Conkie, Alistair D.

Abstract

Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.

IPC Classes  ?

  • G10L 15/00 - Speech recognition
  • G10L 15/18 - Speech classification or search using natural language modelling
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech

27.

Biometric authorization for real time access control

      
Application Number 13787774
Grant Number 09348988
Status In Force
Filing Date 2013-03-06
First Publication Date 2014-09-11
Grant Date 2016-05-24
Owner Nuance Communications, Inc. (USA)
Inventor
  • Dykstra-Erickson, Elizabeth Ann
  • Daniel, Susan Dawnstarr
  • Mauro, David Andrew

Abstract

A method of providing biometric authorization comprises enabling a user to log into an account and determining whether there is a hold on the account. When there is a hold on the account, the method includes informing the user of the hold and enabling the user to respond to the transaction that caused the hold. In one embodiment, the method further comprises prompting the user to enter a biometric authentication in conjunction with the response, and processing the unblock request in real time upon receiving and validating the biometric authentication.

IPC Classes  ?

  • G06F 7/04 - Identity comparison, i.e. for like or unlike values
  • G06F 21/32 - User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints

28.

Speaker localization

      
Application Number 14178309
Grant Number 09622003
Status In Force
Filing Date 2014-02-12
First Publication Date 2014-09-04
Grant Date 2017-04-11
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Schmidt, Gerhard Uwe
  • Wolff, Tobias
  • Buck, Markus
  • Valbuena, Olga Gonzalez
  • Wirsching, Gunther

Abstract

Methods and apparatus for determining phase shift information between the first and second microphone signals for a sound signal, and determining an angle of incidence of the sound in relation to the first and second positions of the first and second microphones from the phase shift information of a band-limited test signal received by the first and second microphones for a frequency range of interest.
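
The geometric relationship behind this kind of two-microphone localization can be sketched as follows: the phase shift at a known test frequency gives an inter-microphone time delay, and the delay, microphone spacing, and speed of sound give the angle of incidence. The function below is an illustrative far-field approximation, not the patented procedure.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def angle_of_incidence(phase_shift_rad, frequency_hz, mic_distance_m):
    """Estimate the direction of arrival from the inter-microphone phase shift
    of a narrow-band test signal (far-field, free-field assumption)."""
    time_delay = phase_shift_rad / (2.0 * math.pi * frequency_hz)
    sin_theta = SPEED_OF_SOUND * time_delay / mic_distance_m
    sin_theta = max(-1.0, min(1.0, sin_theta))  # clamp numerical overshoot
    return math.degrees(math.asin(sin_theta))

# Example: 0.4 rad phase shift at 1 kHz with microphones 10 cm apart.
print(angle_of_incidence(0.4, 1000.0, 0.10))
```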

IPC Classes  ?

  • H04R 3/00 - Circuits for transducers
  • H04R 29/00 - Monitoring arrangements; Testing arrangements
  • G10L 21/0272 - Voice signal separating
  • G10L 21/0216 - Noise filtering characterised by the method used for estimating noise

29.

Machine translation using global lexical selection and sentence reconstruction

      
Application Number 11686681
Grant Number 08788258
Status In Force
Filing Date 2007-03-15
First Publication Date 2014-07-22
Grant Date 2014-07-22
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Bangalore, Srinivas
  • Haffner, Patrick
  • Kanthak, Stephan

Abstract

Disclosed are systems, methods, and computer-readable media for performing translations from a source language to a target language. The method comprises receiving a source phrase, generating a target bag of words based on a global lexical selection of words that loosely couples the source words/phrases and target words/phrases, and reconstructing a target phrase or sentence by considering all permutations of words with a conditional probability greater than a threshold.

IPC Classes  ?

  • G06F 17/28 - Processing or translating of natural language

30.

Beamforming pre-processing for speaker localization

      
Application Number 14176351
Grant Number 09414159
Status In Force
Filing Date 2014-02-10
First Publication Date 2014-06-05
Grant Date 2016-08-09
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Wolff, Tobias
  • Buck, Markus
  • Schmidt, Gerhard Uwe

Abstract

Methods and apparatus to beamform a first plurality of microphone signals using at least one beamforming weight to obtain a first beamformed signal, beamform a second plurality of microphone signals using the at least one beamforming weight to obtain a second beamformed signal, and adjust the at least one beamforming weight so that the power density of at least one perturbation component present in the first or the second plurality of microphone signals is reduced.

IPC Classes  ?

31.

Text message generation for emergency services as a backup to voice communications

      
Application Number 13689396
Grant Number 08874070
Status In Force
Filing Date 2012-11-29
First Publication Date 2014-05-29
Grant Date 2014-10-28
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Basore, David L.
  • Lawser, John Jutten

Abstract

A mobile device may detect when a calling party dials an emergency service to request emergency assistance. Following input of the dialed digits, the device may automatically generate a text message in addition to initiating a voice call, both of which may be transmitted over a wireless data network. The wireless network may correlate the two calls as originating from the same emergency situation and may attempt to deliver the two calls to a Public Services Answering Position (PSAP) at an appropriate emergency center. If the PSAP does not receive a voice call, the PSAP may communicate with the device via text messaging.

IPC Classes  ?

  • H04M 11/04 - Telephonic communication systems specially adapted for combination with other electrical systems with alarm systems, e.g. fire, police or burglar alarm systems
  • H04W 4/22 - Emergency connection handling

32.

Accuracy improvement of spoken queries transcription using co-occurrence information

      
Application Number 14156788
Grant Number 09330661
Status In Force
Filing Date 2014-01-16
First Publication Date 2014-05-15
Grant Date 2016-05-03
Owner Nuance Communications, Inc. (USA)
Inventor
  • Mamou, Jonathan
  • Sethy, Abhinav
  • Ramabhadran, Bhuvana
  • Hoory, Ron
  • Vozila, Paul Joseph
  • Bodenstab, Nathan

Abstract

Techniques disclosed herein include systems and methods for voice-enabled searching. Techniques include a co-occurrence based approach to improve accuracy of the 1-best hypothesis for non-phrase voice queries, as well as for phrased voice queries. A co-occurrence model is used in addition to a statistical natural language model and acoustic model to recognize spoken queries, such as spoken queries for searching a search engine. Given an utterance and an associated list of automated speech recognition n-best hypotheses, the system rescores the different hypotheses using co-occurrence information. For each hypothesis, the system estimates a frequency of co-occurrence within web documents. Scores from a speech recognizer and a co-occurrence engine can be combined to select a best hypothesis with a lower word error rate.
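
A minimal sketch of the rescoring idea, assuming the co-occurrence information has already been reduced to a per-hypothesis frequency estimate; the interpolation weight and all data below are invented for illustration.

```python
def rescore(nbest, cooccurrence_freq, weight=0.3):
    """nbest: list of (hypothesis, asr_score) pairs.
    cooccurrence_freq: hypothesis -> estimated co-occurrence frequency in the corpus.
    Returns the hypothesis with the best interpolated score."""
    def combined(item):
        hyp, asr_score = item
        return (1 - weight) * asr_score + weight * cooccurrence_freq.get(hyp, 0.0)
    return max(nbest, key=combined)[0]

nbest = [("night core times", 0.62), ("new york times", 0.58)]
freq = {"new york times": 0.9, "night core times": 0.01}
print(rescore(nbest, freq))
```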

IPC Classes  ?

  • G10L 15/00 - Speech recognition
  • G10L 15/16 - Speech classification or search using artificial neural networks
  • G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 15/26 - Speech to text systems
  • G10L 15/04 - Segmentation; Word boundary detection
  • G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]
  • G10L 15/28 - Constructional details of speech recognition systems
  • G10L 15/18 - Speech classification or search using natural language modelling
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups
  • G10L 15/08 - Speech classification or search
  • G06F 7/00 - Methods or arrangements for processing data by operating upon the order or content of the data handled
  • G06F 17/30 - Information retrieval; Database structures therefor

33.

System and method for delivering targeted advertisements and tracking advertisement interactions in voice recognition contexts

      
Application Number 14016757
Grant Number 08886536
Status In Force
Filing Date 2013-09-03
First Publication Date 2014-01-09
Grant Date 2014-11-11
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Freeman, Tom
  • Kennewick, Mike

Abstract

The system and method described herein may use various natural language models to deliver targeted advertisements and track advertisement interactions in voice recognition contexts. In particular, in response to an input device receiving an utterance, a conversational language processor may select and deliver one or more advertisements targeted to a user that spoke the utterance based on cognitive models associated with the user, various users having similar characteristics to the user, an environment in which the user spoke the utterance, or other criteria. Further, subsequent interaction with the targeted advertisements may be tracked to build and refine the cognitive models and thereby enhance the information used to deliver targeted advertisements in response to subsequent utterances.

IPC Classes  ?

  • G10L 15/18 - Speech classification or search using natural language modelling
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G06Q 30/00 - Commerce
  • G10L 15/26 - Speech to text systems
  • G06Q 30/02 - Marketing; Price estimation or determination; Fundraising

34.

System and method for a cooperative conversational voice user interface

      
Application Number 13987645
Grant Number 09015049
Status In Force
Filing Date 2013-08-19
First Publication Date 2013-12-19
Grant Date 2015-04-21
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Baldwin, Larry
  • Freeman, Tom
  • Tjalve, Michael
  • Ebersold, Blane
  • Weider, Chris

Abstract

A cooperative conversational voice user interface is provided. The cooperative conversational voice user interface may build upon short-term and long-term shared knowledge to generate one or more explicit and/or implicit hypotheses about an intent of a user utterance. The hypotheses may be ranked based on varying degrees of certainty, and an adaptive response may be generated for the user. Responses may be worded based on the degrees of certainty and to frame an appropriate domain for a subsequent utterance. In one implementation, misrecognitions may be tolerated, and conversational course may be corrected based on subsequent utterances and/or responses.

IPC Classes  ?

  • G10L 15/00 - Speech recognition
  • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G06F 3/16 - Sound input; Sound output
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

35.

Automatic updating of confidence scoring functionality for speech recognition systems with respect to a receiver operating characteristic curve

      
Application Number 13977174
Grant Number 09330665
Status In Force
Filing Date 2011-01-07
First Publication Date 2013-10-17
Grant Date 2016-05-03
Owner Nuance Communications, Inc. (USA)
Inventor
  • Morales, Nicolas
  • Connolly, Dermot
  • Halberstadt, Andrew

Abstract

Automatically adjusting confidence scoring functionality is described for a speech recognition engine. Operation of the speech recognition system is revised so as to change an associated receiver operating characteristic (ROC) curve describing performance of the speech recognition system with respect to rates of false acceptance (FA) versus correct acceptance (CA). Then a confidence scoring functionality related to recognition reliability for a given input utterance is automatically adjusted such that where the ROC curve is better for a given operating point after revising the operation of the speech recognition system, the adjusting reflects a double gain constraint to maintain FA and CA rates at least as good as before revising operation of the speech recognition system.
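
One way to picture the operating-point constraint: given the revised system's ROC samples and the previous FA/CA rates, keep only thresholds that are at least as good on both rates. The function and numbers below are a hypothetical rendering of that idea, not the patented adjustment procedure.

```python
def choose_threshold(roc_points, fa_old, ca_old):
    """roc_points: list of (confidence_threshold, false_accept_rate, correct_accept_rate)
    measured for the revised recognizer. Return a threshold that keeps both rates at
    least as good as the previous operating point (a toy rendering of the double-gain idea)."""
    candidates = [(t, fa, ca) for t, fa, ca in roc_points if fa <= fa_old and ca >= ca_old]
    if not candidates:
        return None  # no operating point dominates the old one
    # Prefer the candidate with the highest correct-accept rate, breaking ties on lower FA.
    return max(candidates, key=lambda x: (x[2], -x[1]))[0]

roc = [(0.2, 0.10, 0.97), (0.4, 0.06, 0.95), (0.6, 0.03, 0.90)]
print(choose_threshold(roc, fa_old=0.08, ca_old=0.93))
```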

IPC Classes  ?

  • G10L 15/01 - Assessment or evaluation of speech recognition systems
  • G10L 15/065 - Adaptation
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

36.

Integrating multimedia and voicemail

      
Application Number 13868278
Grant Number 09313624
Status In Force
Filing Date 2013-04-23
First Publication Date 2013-09-05
Grant Date 2016-04-12
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Shaw, Venson M.
  • Silverman, Alexander E.

Abstract

Integrated multimedia voicemail systems and methods allow the creation of voicemail with associated multimedia content. A user can compose a voicemail and select or create multimedia content to be associated with the voicemail. A user can associate files, webpage addresses, applications, and user-created content with a voicemail. A user may operate an interface on a user device to select content and instruct a voicemail system to associate such content with a voicemail. The voicemail with integrated multimedia content may be an originating voicemail or a voicemail in response to another voicemail.

IPC Classes  ?

  • H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems
  • H04W 4/12 - Messaging; Mailboxes; Announcements
  • H04M 3/53 - Centralised arrangements for recording incoming messages
  • H04L 12/58 - Message switching systems
  • H04M 1/725 - Cordless telephones

37.

Message translations

      
Application Number 13755903
Grant Number 08688433
Status In Force
Filing Date 2013-01-31
First Publication Date 2013-06-06
Grant Date 2014-04-01
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Davis, Joel A.
  • Kent, Jr., Larry G.
  • Daniell, W. Todd
  • Daigle, Brian K.

Abstract

Systems for translating text messages in an instant messaging system comprise a translation engine for translating text messages into a preferred language of a recipient of the text messages. The systems are preferably configured to send and receive the text messages and to determine whether the text messages that are received in a source language are in the preferred language of the recipients so that the text messages are displayed in the preferred language of the recipients of the text messages. Other systems and methods are also provided.

IPC Classes  ?

  • G06F 17/28 - Processing or translating of natural language

38.

System and method for structuring speech recognized text into a pre-selected document format

      
Application Number 13718568
Grant Number 09396166
Status In Force
Filing Date 2012-12-18
First Publication Date 2013-05-02
Grant Date 2016-07-19
Owner Nuance Communications, Inc. (USA)
Inventor
  • Rosen, Lee
  • Roe, Ed
  • Poust, Wade

Abstract

A system for creating a structured report using a template having at least one predetermined heading and formatting data associated with each heading. The steps include recording a voice file, creating a speech recognized text file corresponding to the voice file, identifying the location of each heading in the text file, and the text corresponding thereto, populating the template with the identified text corresponding to each heading, and formatting the populated template to create the structured report.
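
A toy version of the heading-location step, assuming the headings are known strings that appear verbatim in the recognized text; the regular-expression split and the sample transcript are illustrative only.

```python
import re

def structure_report(transcript, headings):
    """Split a speech-recognized transcript at each known heading and return a
    heading -> text mapping that can then be poured into the report template."""
    pattern = "(" + "|".join(re.escape(h) for h in headings) + ")"
    parts = re.split(pattern, transcript)
    report, current = {}, None
    for part in parts:
        if part in headings:
            current = part
            report[current] = ""
        elif current is not None:
            report[current] += part.strip()
    return report

text = "history patient reports cough exam lungs clear"
print(structure_report(text, ["history", "exam"]))
```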

IPC Classes  ?

39.

Automated sentence planning in a task classification system

      
Application Number 13470913
Grant Number 08620669
Status In Force
Filing Date 2012-05-14
First Publication Date 2013-02-14
Grant Date 2013-12-31
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Walker, Marilyn A.
  • Rambow, Owen Christopher
  • Rogati, Monica

Abstract

Disclosed is a task classification system that interacts with a user. The task classification system may include a recognizer that may recognize symbols in the user's input communication, and a natural language understanding unit that may determine whether the user's input communication can be understood. If the user's input communication can be understood, the natural language understanding unit may generate understanding data. The system may also include a communicative goal generator that may generate communicative goals based on the symbols recognized by the recognizer and understanding data from the natural language understanding unit. The generated communicative goals may be related to information needed to be obtained from the user. The system may further include a sentence planning unit that may automatically plan one or more sentences based on the communicative goals generated by the communicative goal generator with at least one of the sentence plans being output to the user.

IPC Classes  ?

  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G10L 15/00 - Speech recognition
  • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups

40.

Acoustic localization of a speaker

      
Application Number 13478941
Grant Number 09338549
Status In Force
Filing Date 2012-05-23
First Publication Date 2012-11-22
Grant Date 2016-05-10
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Haulick, Tim
  • Schmidt, Gerhard Uwe
  • Buck, Markus
  • Wolff, Tobias

Abstract

A system locates a speaker in a room containing a loudspeaker and a microphone array. The loudspeaker transmits a sound that is partly reflected by the speaker. The microphone array detects the reflected sound and converts it into microphone signals, and the system determines the speaker's direction relative to the microphone array, the speaker's distance from the microphone array, or both, based on the characteristics of the microphone signals.

IPC Classes  ?

  • G01S 3/80 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic, or infrasonic waves
  • H04R 3/00 - Circuits for transducers
  • H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
  • G01S 3/808 - Systems for determining direction or deviation from predetermined direction using transducers spaced apart and measuring phase or time difference between signals therefrom, i.e. path-difference systems
  • G01S 5/30 - Determining absolute distances from a plurality of spaced points of known location
  • G01S 15/00 - Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
  • G01S 7/52 - Details of systems according to groups G01S 13/00, G01S 15/00, G01S 17/00 of systems according to group G01S 15/00
  • G01S 15/42 - Simultaneous measurement of distance and other coordinates
  • G01S 15/87 - Combinations of sonar systems
  • H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
  • G01S 13/00 - Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
  • G01S 7/292 - Extracting wanted echo-signals
  • H04R 29/00 - Monitoring arrangements; Testing arrangements
  • H04M 1/60 - Substation equipment, e.g. for use by subscribers including speech amplifiers
  • G01S 13/42 - Simultaneous measurement of distance and other coordinates
  • G01S 5/02 - Position-fixing by co-ordinating two or more direction or position-line determinations; Position-fixing by co-ordinating two or more distance determinations using radio waves
  • G01S 15/06 - Systems determining position data of a target
  • G01S 3/802 - Systems for determining direction or deviation from predetermined direction
  • G01S 7/523 - Details of pulse systems
  • H04B 7/08 - Diversity systems; Multi-antenna systems, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the receiving station
  • G01S 3/04 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using radio waves - Details
  • G01S 5/18 - Position-fixing by co-ordinating two or more direction or position-line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
  • G10K 11/34 - Sound-focusing or directing, e.g. scanning using electrical steering of transducer arrays, e.g. beam steering
  • G01S 3/00 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
  • G10K 15/08 - Arrangements for producing a reverberation or echo sound

41.

Method and system for automatic transcription prioritization

      
Application Number 13354142
Grant Number 08407050
Status In Force
Filing Date 2012-01-19
First Publication Date 2012-06-28
Grant Date 2013-03-26
Owner Nuance Communications, Inc. (USA)
Inventor
  • Kobal, Jeffrey S.
  • Dhanakshirur, Girish

Abstract

A visual toolkit for prioritizing speech transcription is provided. The toolkit can include a logger (102) for capturing information from a speech recognition system, a processor (104) for determining an accuracy rating of the information, and a visual display (106) for categorizing the information and prioritizing a transcription of the information based on the accuracy rating. The prioritizing identifies spoken utterances having a transcription priority in view of the recognized result. The visual display can include a transcription category (156) having a modifiable textbox entry with a text entry initially corresponding to a text of the recognized result, and an accept button (157) for validating a transcription of the recognized result. The categories can be automatically ranked by the accuracy rating in an ordered priority for increasing an efficiency of transcription.

IPC Classes  ?

42.

System and method for delivering targeted advertisements and tracking advertisement interactions in voice recognition contexts

      
Application Number 13371870
Grant Number 08527274
Status In Force
Filing Date 2012-02-13
First Publication Date 2012-06-14
Grant Date 2013-09-03
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Freeman, Tom
  • Kennewick, Mike

Abstract

The system and method described herein may use various natural language models to deliver targeted advertisements and track advertisement interactions in voice recognition contexts. In particular, in response to an input device receiving an utterance, a conversational language processor may select and deliver one or more advertisements targeted to a user that spoke the utterance based on cognitive models associated with the user, various users having similar characteristics to the user, an environment in which the user spoke the utterance, or other criteria. Further, subsequent interaction with the targeted advertisements may be tracked to build and refine the cognitive models and thereby enhance the information used to deliver targeted advertisements in response to subsequent utterances.

IPC Classes  ?

  • G10L 15/18 - Speech classification or search using natural language modelling
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

43.

System and method for isolating and processing common dialog cues

      
Application Number 11246604
Grant Number 08185400
Status In Force
Filing Date 2005-10-07
First Publication Date 2012-05-22
Grant Date 2012-05-22
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Goffin, Vincent J.
  • Parthasarathy, Sarangarajan

Abstract

A method, system and machine-readable medium are provided. Speech input is received at a speech recognition component and recognized output is produced. A common dialog cue from the received speech input or input from a second source is recognized. An action is performed corresponding to the recognized common dialog cue. The performed action includes sending a communication from the speech recognition component to the speech generation component while bypassing a dialog component.

IPC Classes  ?

  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
  • G10L 17/00 - Speaker identification or verification
  • G10L 15/26 - Speech to text systems
  • G10L 15/00 - Speech recognition
  • G10L 15/28 - Constructional details of speech recognition systems
  • G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech

44.

Text entry with word prediction, completion, or correction supplemented by search of shared corpus

      
Application Number 12943856
Grant Number 09626429
Status In Force
Filing Date 2010-11-10
First Publication Date 2012-05-10
Grant Date 2017-04-18
Owner Nuance Communications, Inc. (USA)
Inventor Unruh, Erland

Abstract

Searching a shared corpus is used to supplement word prediction, completion, and/or correction of text entry. A user input device at a client device receives user entry of text input comprising a string of symbols. The client device wirelessly transmits instructions to a remote site to conduct a search of a corpus using the string as a contiguous search term. From the remote site, the client device receives results of the search, including multiple sets of one or more words, each set occurring in the corpus immediately after the search term. The client device uses the received sets in word prediction, completion, and/or correction.
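
A minimal in-memory sketch of the corpus search, assuming the corpus is a plain token string and the typed input is used as a contiguous search term; in the patent the search runs at a remote site over a shared corpus, which is not modeled here.

```python
from collections import Counter

def predict_next_words(corpus, typed, max_words=1, top_n=3):
    """Find the word sequences that most often follow the typed string in the corpus,
    mimicking the search-based completion described above (toy in-memory corpus)."""
    tokens = corpus.split()
    typed_tokens = typed.split()
    n = len(typed_tokens)
    followers = Counter()
    for i in range(len(tokens) - n - max_words + 1):
        if tokens[i:i + n] == typed_tokens:
            followers[" ".join(tokens[i + n:i + n + max_words])] += 1
    return [w for w, _ in followers.most_common(top_n)]

corpus = "meet me at the office meet me at the cafe meet me at the office"
print(predict_next_words(corpus, "meet me at the"))
```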

IPC Classes  ?

  • G06F 17/30 - Information retrieval; Database structures therefor
  • G06F 3/023 - Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

45.

Multi-state barge-in models for spoken dialog systems

      
Application Number 13279443
Grant Number 08612234
Status In Force
Filing Date 2011-10-24
First Publication Date 2012-04-26
Grant Date 2013-12-17
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Ljolje, Andrej

Abstract

A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.

IPC Classes  ?

46.

Voicemail system and method for providing voicemail to text message conversion

      
Application Number 11954267
Grant Number 08139726
Status In Force
Filing Date 2007-12-12
First Publication Date 2012-03-20
Grant Date 2012-03-20
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Zetterberg, Carl Peter

Abstract

A method and system for allowing a calling party to send a voicemail message as a text message. A calling party leaves a voicemail message and that message is converted from voice to a text message. If the calling party wishes to confirm the conversion, the text message is then converted to a voicemail message. The converted voicemail message is presented to the calling party so that the calling party can review and edit the message. The calling party can review and edit any portion of the converted voicemail message. The edits of the voicemail message are applied and the voicemail message is converted to a new text message. If the calling party wishes to further review and edit the text message, it is converted to a new voicemail; otherwise the text message is sent to the called party.

IPC Classes  ?

  • H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems

47.

System and method for a cooperative conversational voice user interface

      
Application Number 13251712
Grant Number 08515765
Status In Force
Filing Date 2011-10-03
First Publication Date 2012-01-26
Grant Date 2013-08-20
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Baldwin, Larry
  • Freeman, Tom
  • Tjalve, Michael
  • Ebersold, Blane
  • Weider, Chris

Abstract

A cooperative conversational voice user interface is provided. The cooperative conversational voice user interface may build upon short-term and long-term shared knowledge to generate one or more explicit and/or implicit hypotheses about an intent of a user utterance. The hypotheses may be ranked based on varying degrees of certainty, and an adaptive response may be generated for the user. Responses may be worded based on the degrees of certainty and to frame an appropriate domain for a subsequent utterance. In one implementation, misrecognitions may be tolerated, and conversational course may be corrected based on subsequent utterances and/or responses.

IPC Classes  ?

  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

48.

Multi-pass echo residue detection with speech application intelligence

      
Application Number 13236968
Grant Number 08244529
Status In Force
Filing Date 2011-09-20
First Publication Date 2012-01-12
Grant Date 2012-08-14
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Wong, Ngai Chiu

Abstract

A method is provided for multi-pass echo residue detection. The method includes detecting audio data, and determining whether the audio data is recognized as speech. Additionally, the method categorizes the audio data recognized as speech as including an acceptable level of residual echo, and categorizes unrecognizable audio data as including an unacceptable level of residual echo. Furthermore, the method determines whether the unrecognizable audio data contains a user input, and also determines whether a duration of the user input is at least a predetermined duration, and when the user input is at least the predetermined duration, the method extracts the predetermined duration of the user input from a total duration of the user input.

IPC Classes  ?

  • G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
  • G10L 11/02 - Detection of presence or absence of speech signals

49.

Method and system for using input signal quality in speech recognition

      
Application Number 13205775
Grant Number 08190430
Status In Force
Filing Date 2011-08-09
First Publication Date 2012-01-05
Grant Date 2012-05-29
Owner Nuance Communications, Inc. (USA)
Inventor
  • Doyle, John
  • Pickering, John Brian

Abstract

A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.
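
The threshold adaptation can be pictured as a simple mapping from a quality measurement to a rejection threshold. The SNR breakpoints and threshold bounds below are invented for illustration; the patent does not fix particular values.

```python
def rejection_threshold(snr_db, low_snr=10.0, high_snr=30.0,
                        min_thresh=0.3, max_thresh=0.7):
    """Map an input-quality measurement (here just SNR in dB) to a rejection threshold:
    lower quality -> lower threshold, higher quality -> higher threshold.
    The numbers are illustrative, not the patented tuning."""
    if snr_db <= low_snr:
        return min_thresh
    if snr_db >= high_snr:
        return max_thresh
    frac = (snr_db - low_snr) / (high_snr - low_snr)
    return min_thresh + frac * (max_thresh - min_thresh)

for snr in (5, 20, 35):
    print(snr, rejection_threshold(snr))
```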

IPC Classes  ?

50.

Method and system for identifying and correcting accent-induced speech recognition difficulties

      
Application Number 13228879
Grant Number 08285546
Status In Force
Filing Date 2011-09-09
First Publication Date 2011-12-29
Grant Date 2012-10-09
Owner Nuance Communications, Inc. (USA)
Inventor Reich, David E.

Abstract

A system for use in speech recognition includes an acoustic module accessing a plurality of distinct-language acoustic models, each based upon a different language; a lexicon module accessing at least one lexicon model; and a speech recognition output module. The speech recognition output module generates a first speech recognition output using a first model combination that combines one of the plurality of distinct-language acoustic models with the at least one lexicon model. In response to a threshold determination, the speech recognition output module generates a second speech recognition output using a second model combination that combines a different one of the plurality of distinct-language acoustic models with the at least one distinct-language lexicon model.

IPC Classes  ?

  • G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]
  • G10L 15/00 - Speech recognition
  • G10L 15/18 - Speech classification or search using natural language modelling
  • G10L 15/10 - Speech classification or search using distance or distortion measures between unknown speech and reference templates
  • G10L 15/28 - Constructional details of speech recognition systems
  • G10L 17/00 - Speaker identification or verification

51.

Automated sentence planning in a task classification system

      
Application Number 13230254
Grant Number 08180647
Status In Force
Filing Date 2011-09-12
First Publication Date 2011-12-29
Grant Date 2012-05-15
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Walker, Marilyn A.
  • Rambow, Owen Christopher
  • Rogati, Monica

Abstract

The invention relates to a task classification system (900) that interacts with a user. The task classification system (900) may include a recognizer (920) that may recognize symbols in the user's input communication, and a natural language understanding unit (930) that may determine whether the user's input communication can be understood. If the user's input communication can be understood, the natural language understanding unit (930) may generate understanding data. The system may also include a communicative goal generator that may generate communicative goals based on the symbols recognized by the recognizer (920) and understanding data from the natural language understanding unit (930). The generated communicative goals may be related to information needed to be obtained from the user. The system may further include a sentence planning unit (120) that may automatically plan one or more sentences based on the communicative goals generated by the communicative goal generator with at least one of the sentence plans being output to the user.

IPC Classes  ?

  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

52.

Method for automated sentence planning in a task classification system

      
Application Number 13110628
Grant Number 08209186
Status In Force
Filing Date 2011-05-18
First Publication Date 2011-09-08
Grant Date 2012-06-26
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Walker, Marilyn A.
  • Rambow, Owen Christopher
  • Rogati, Monica

Abstract

The invention relates to a method for sentence planning (120) in a task classification system that interacts with a user. The method may include recognizing symbols in the user's input communication and determining whether the user's input communication can be understood. If the user's communication can be understood, understanding data may be generated (220). The method may further include generating communicative goals (3010) based on the recognized symbols and understanding data. The generated communicative goals (3010) may be related to information needed to be obtained from the user. The method may also include automatically planning one or more sentences (3020) based on the generated communicative goals and outputting at least one of the sentence plans to the user (3080).

IPC Classes  ?

  • G10L 21/06 - Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

53.

Integrating multimedia and voicemail

      
Application Number 12606503
Grant Number 08447261
Status In Force
Filing Date 2009-10-27
First Publication Date 2011-04-28
Grant Date 2013-05-21
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Shaw, Venson M.
  • Silverman, Alexander E.

Abstract

Integrated multimedia voicemail systems and methods allow the creation of voicemail with associated multimedia content. A user can compose a voicemail and select or create multimedia content to be associated with the voicemail. A user can associate files, webpage addresses, applications, and user-created content with a voicemail. A user may operate an interface on a user device to select content and instruct a voicemail system to associate such content with a voicemail. The voicemail with integrated multimedia content may be an originating voicemail or a voicemail in response to another voicemail.

IPC Classes  ?

  • H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems
  • H04M 11/10 - Telephonic communication systems specially adapted for combination with other electrical systems with dictation recording and playback systems
  • H04M 1/00 - Substation equipment, e.g. for use by subscribers

54.

System and method for improving robustness of speech recognition using vocal tract length normalization codebooks

      
Application Number 12869039
Grant Number 08160875
Status In Force
Filing Date 2010-08-26
First Publication Date 2010-12-23
Grant Date 2012-04-17
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Gilbert, Mazin

Abstract

Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises (1) selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook.
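
A compact sketch of the codebook-selection step, assuming each codebook has been reduced to a single centroid vector and its speaker's warping factor; the Euclidean distance and the sample vectors are illustrative stand-ins for the acoustic distance used in the patent.

```python
import math

def nearest_codebook(sample_vector, codebooks):
    """codebooks: list of dicts with 'warp' (vocal tract length factor) and 'centroid'.
    Pick the codebook whose centroid is acoustically closest to the incoming sample,
    then return its warping factor for normalizing the sample (toy Euclidean distance)."""
    best = min(codebooks, key=lambda cb: math.dist(sample_vector, cb["centroid"]))
    return best["warp"]

codebooks = [
    {"warp": 0.94, "centroid": [1.0, 0.2, -0.3]},
    {"warp": 1.00, "centroid": [0.5, 0.1, 0.0]},
    {"warp": 1.06, "centroid": [-0.2, 0.4, 0.3]},
]
print(nearest_codebook([0.45, 0.05, 0.05], codebooks))
```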

IPC Classes  ?

  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

55.

System and method for selecting and presenting advertisements based on natural language processing of voice-based input

      
Application Number 12847564
Grant Number 08145489
Status In Force
Filing Date 2010-07-30
First Publication Date 2010-11-25
Grant Date 2012-03-27
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Freeman, Tom
  • Kennewick, Mike

Abstract

A system and method for selecting and presenting advertisements based on natural language processing of voice-based inputs is provided. A user utterance may be received at an input device, and a conversational, natural language processor may identify a request from the utterance. At least one advertisement may be selected and presented to the user based on the identified request. The advertisement may be presented as a natural language response, thereby creating a conversational feel to the presentation of advertisements. The request and the user's subsequent interaction with the advertisement may be tracked to build user statistical profiles, thus enhancing subsequent selection and presentation of advertisements.

IPC Classes  ?

  • G10L 15/18 - Speech classification or search using natural language modelling
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G06Q 30/00 - Commerce

56.

Automatic setting of reminders in telephony using speech recognition

      
Application Number 12465731
Grant Number 08145274
Status In Force
Filing Date 2009-05-14
First Publication Date 2010-11-18
Grant Date 2012-03-27
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Gandhi, Salil P.
  • Kottawar, Saidas T.
  • Macias, Mike V.
  • Mahajan, Sandip D.

Abstract

Systems and methods for automatically setting reminders. A method for automatically setting reminders includes receiving utterances, determining whether the utterances match a stored phrase, and in response to determining that there is a match, automatically setting a reminder in a mobile communication device. Various filters can be applied to determine whether or not to set a reminder. Examples of suitable filters include location, date/time, callee's phone number, etc.
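
A minimal sketch of the matching-and-filtering flow, with hypothetical trigger phrases and a placeholder filter; the patent's filters (location, date/time, callee number) would plug in where the lambda is.

```python
def maybe_set_reminder(utterance, stored_phrases, filters):
    """Check a recognized utterance against stored trigger phrases and optional
    filters; return a reminder record, or None if no reminder should be set."""
    matched = next((p for p in stored_phrases if p in utterance.lower()), None)
    if matched is None or not all(f(utterance) for f in filters):
        return None
    return {"trigger": matched, "text": utterance}

phrases = ["remind me", "don't forget"]
filters = [lambda u: True]  # e.g. only during business hours, only for certain callees
print(maybe_set_reminder("Please remind me to call the bank", phrases, filters))
```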

IPC Classes  ?

  • H04B 1/38 - Transceivers, i.e. devices in which transmitter and receiver form a structural unit and in which at least one part is used for functions of transmitting and receiving

57.

Automated sentence planning in a task classification system

      
Application Number 12789883
Grant Number 08185401
Status In Force
Filing Date 2010-05-28
First Publication Date 2010-09-23
Grant Date 2012-05-22
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Walker, Marilyn A.
  • Rambow, Owen Christopher
  • Rogati, Monica

Abstract

The invention relates to a system that interacts with a user in an automated dialog system (100). The system may include a communicative goal generator (210) that generates communicative goals based on a first communication received from the user. The generated communicative goals (210) may be related to information needed to be obtained from the user. The system may further include a sentence planning unit (220) that automatically plans one or more sentences based on the communicative goals generated by the communicative goal generator (210). At least one of the planned sentences may be then output to the user (230).

IPC Classes  ?

  • G10L 21/06 - Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

58.

Method for determining a set of filter coefficients for an acoustic echo compensator

      
Application Number 12708172
Grant Number 08787560
Status In Force
Filing Date 2010-02-18
First Publication Date 2010-08-26
Grant Date 2014-07-22
Owner Nuance Communications, Inc. (USA)
Inventor
  • Buck, Markus
  • Schmidt, Gerhard
  • Wolff, Tobias

Abstract

The invention provides a method for determining a set of filter coefficients for an acoustic echo compensator in a beamformer arrangement. The acoustic echo compensator compensates for echoes within the beamformed signal. A plurality of sets of filter coefficients for the acoustic echo compensator is provided. Each set of filter coefficients corresponds to one of a predetermined number of steering directions of the beamformer arrangement. The predetermined number of steering directions is equal to or greater than the number of microphones in the microphone array. For a current steering direction, a current set of filter coefficients for the acoustic echo compensator is determined based on the provided sets of filter coefficients.
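
One plausible reading of the claim is interpolating between the coefficient sets of the nearest predefined steering directions; the patent itself does not fix the combination rule, so the linear interpolation below is only an illustrative sketch.

```python
def coefficients_for_direction(direction_deg, coefficient_sets):
    """coefficient_sets: dict mapping a predefined steering direction (degrees) to a
    list of echo-canceller filter coefficients. Estimate coefficients for an arbitrary
    steering direction by linear interpolation between the two nearest predefined
    directions (one hypothetical combination rule)."""
    directions = sorted(coefficient_sets)
    if direction_deg <= directions[0]:
        return coefficient_sets[directions[0]]
    if direction_deg >= directions[-1]:
        return coefficient_sets[directions[-1]]
    for lo, hi in zip(directions, directions[1:]):
        if lo <= direction_deg <= hi:
            w = (direction_deg - lo) / (hi - lo)
            return [(1 - w) * a + w * b
                    for a, b in zip(coefficient_sets[lo], coefficient_sets[hi])]

sets = {0: [0.9, 0.1], 45: [0.7, 0.3], 90: [0.4, 0.6]}
print(coefficients_for_direction(30, sets))
```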

IPC Classes  ?

  • H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g.  for suppressing echoes for one or both directions of traffic
  • G01S 15/00 - Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
  • H04R 3/00 - Circuits for transducers

59.

Speech recognition of a list entry

      
Application Number 12706245
Grant Number 08532990
Status In Force
Filing Date 2010-02-16
First Publication Date 2010-08-19
Grant Date 2013-09-10
Owner Nuance Communications, Inc. (USA)
Inventor
  • Hillebrecht, Christian
  • Schwarz, Markus

Abstract

The present invention relates to a method of generating a candidate list from a list of entries in accordance with a string of subword units corresponding to a speech input in a speech recognition system, the list of entries including plural list entries each comprising at least one fragment having one or more subword units. For each list entry, the fragments of the list entry are compared with the string of subword units. A matching score for each of the compared fragments based on the comparison is determined. The matching score for a fragment is further based on a comparison of at least one other fragment of the same list entry with the string of subword units. A total score for each list entry is determined based on the matching scores for the compared fragments of the respective list entry. A candidate list with the best matching entries from the list of entries based on the total scores of the list entries is generated.
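
A rough sketch of the scoring scheme, using difflib's string similarity as a stand-in for the subword comparison: each fragment gets its own matching score, that score is blended with how the entry's other fragments match (here a simple average bonus), and the per-fragment scores are summed into a total per entry. The weights and example entries are invented.

```python
from difflib import SequenceMatcher

def entry_score(entry_fragments, recognized_subwords):
    """Score a list entry by comparing each of its fragments with the recognized
    subword string; each fragment's score also reflects how the entry's other
    fragments match, then the scores are summed into a total for the entry."""
    frag_scores = [SequenceMatcher(None, f, recognized_subwords).ratio()
                   for f in entry_fragments]
    avg = sum(frag_scores) / len(frag_scores)
    return sum(0.7 * s + 0.3 * avg for s in frag_scores)

entries = {"MAIN STREET": ["MAIN", "STREET"], "MAINZ": ["MAINZ"]}
spoken = "MAIN STRIT"
candidates = sorted(entries, key=lambda e: entry_score(entries[e], spoken), reverse=True)
print(candidates)  # best-matching entries first, i.e. the candidate list
```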

IPC Classes  ?

60.

System and method for enhancing speech recognition accuracy

      
Application Number 12339802
Grant Number 08160879
Status In Force
Filing Date 2008-12-19
First Publication Date 2010-06-24
Grant Date 2012-04-17
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Czahor, Michael

Abstract

Systems, computer-implemented methods, and computer-readable media for enhancing speech recognition accuracy. The method includes dividing a system dialog turn into segments based on timing of probable user responses, generating a weighted grammar for each segment, exclusively activating the weighted grammar generated for a current segment of the dialog turn during the current segment of the dialog turn, and recognizing user speech received during the current segment using the activated weighted grammar generated for the current segment. The method can further include assigning a probability to each weighted grammar based on historical user responses, with each weighted grammar activated based on its assigned probability. Weighted grammars can be generated based on a user profile. A weighted grammar can be generated for two or more segments. Exclusively activating each weighted grammar can include a transition period blending the previously activated grammar and the grammar to be activated.

IPC Classes  ?

  • G10L 15/18 - Speech classification or search using natural language modelling

61.

User intention based on N-best list of recognition hypotheses for utterances in a dialog

      
Application Number 12325786
Grant Number 08140328
Status In Force
Filing Date 2008-12-01
First Publication Date 2010-06-03
Grant Date 2012-03-20
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Williams, Jason

Abstract

Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for using alternate recognition hypotheses to improve whole-dialog understanding accuracy. The method includes receiving an utterance as part of a user dialog, generating an N-best list of recognition hypotheses for the user dialog turn, selecting an underlying user intention based on a belief distribution across the generated N-best list and at least one contextually similar N-best list, and responding to the user based on the selected underlying user intention. Selecting an intention can further be based on confidence scores associated with recognition hypotheses in the generated N-best lists, and also on the probability of a user's action given their underlying intention. A belief or cumulative confidence score can be assigned to each inferred user intention.
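
A minimal sketch of accumulating belief across turns, assuming each turn contributes an N-best list of (hypothesized intention, confidence) pairs; the patent's probabilistic model of user actions is reduced here to a simple sum of confidences.

```python
from collections import defaultdict

def update_belief(nbest_lists):
    """Accumulate confidence mass for each candidate intention across the N-best
    lists of several dialog turns and return the most believed intention together
    with the normalized belief distribution."""
    belief = defaultdict(float)
    for nbest in nbest_lists:
        for hypothesis, confidence in nbest:
            belief[hypothesis] += confidence
    total = sum(belief.values())
    return max(belief, key=belief.get), {h: s / total for h, s in belief.items()}

turns = [
    [("boston", 0.5), ("austin", 0.4)],
    [("austin", 0.6), ("boston", 0.3)],
]
print(update_belief(turns))
```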

IPC Classes  ?

  • G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]

62.

Method and device for locating a sound source

      
Application Number 12547681
Grant Number 08194500
Status In Force
Filing Date 2009-08-26
First Publication Date 2010-03-04
Grant Date 2012-06-05
Owner Nuance Communications, Inc. (USA)
Inventor
  • Wolff, Tobias
  • Buck, Markus
  • Schmidt, Gerhard
  • Valbuena, Olga González
  • Wirsching, Günther

Abstract

A method of locating a sound source based on sound received at an array of microphones comprises the steps of determining a correlation function of signals provided by microphones of the array and establishing a direction in which the sound source is located based on at least one eigenvector of a matrix having matrix elements which are determined based on the correlation function. The correlation function has first and second frequency components associated with a first and second frequency band, respectively. The first frequency component is determined based on signals from microphones having a first distance, and the second frequency component is determined based on signals from microphones having a second distance different from the first distance.

IPC Classes  ?

  • G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
  • G01S 3/80 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic, or infrasonic waves
  • H04R 3/00 - Circuits for transducers

63.

Method and apparatus for providing voice control for accessing teleconference services

      
Application Number 12553700
Grant Number 08184792
Status In Force
Filing Date 2009-09-03
First Publication Date 2009-12-31
Grant Date 2012-05-22
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Croak, Marian
  • Eslambolchi, Hossein

Abstract

A method and apparatus for providing access to teleconference services using voice recognition technology to receive information on packet networks such as Voice over Internet Protocol (VoIP) and Service over Internet Protocol (SoIP) networks are disclosed. In one embodiment, the service provider enables a caller to enter access information for accessing a conference service using at least one natural language response.

IPC Classes  ?

  • H04M 3/42 - Systems providing special services or facilities to subscribers

64.

Method and system for training a text-to-speech synthesis system using a specific domain speech database

      
Application Number 12540441
Grant Number 08135591
Status In Force
Filing Date 2009-08-13
First Publication Date 2009-12-03
Grant Date 2012-03-13
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Schroeter, Horst Juergen

Abstract

A method and system are disclosed that train a text-to-speech synthesis system for use in speech synthesis. The method includes generating a speech database of audio files comprising domain-specific voices having various prosodies, and training a text-to-speech synthesis system using the speech database by selecting audio segments having a prosody based on at least one dialog state. The system includes a processor, a speech database of audio files, and modules for implementing the method.
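
To make the prosody-by-dialog-state selection concrete, here is a minimal Python sketch that picks the candidate audio segment whose pitch and speaking rate are closest to a target prosody for the current dialog state; the PROSODY_BY_STATE table, the unit fields, and the distance measure are assumptions for illustration, not the patent's method.

    # Hypothetical prosody targets per dialog state (pitch in Hz, rate in syl/s).
    PROSODY_BY_STATE = {"apology": {"pitch": 180.0, "rate": 3.5},
                        "confirmation": {"pitch": 220.0, "rate": 4.5}}

    def pick_unit(candidates, dialog_state):
        # Choose the recorded segment whose prosody is closest to the
        # target prosody for the current dialog state.
        target = PROSODY_BY_STATE[dialog_state]
        def distance(unit):
            return (abs(unit["pitch"] - target["pitch"]) +
                    abs(unit["rate"] - target["rate"]))
        return min(candidates, key=distance)

    units = [{"file": "a.wav", "pitch": 175.0, "rate": 3.4},
             {"file": "b.wav", "pitch": 230.0, "rate": 4.8}]
    print(pick_unit(units, "apology")["file"])    # -> a.wav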

IPC Classes  ?

  • G10L 13/00 - Speech synthesis; Text to speech systems

65.

Low latency real-time vocal tract length normalization

      
Application Number 12490634
Grant Number 08909527
Status In Force
Filing Date 2009-06-24
First Publication Date 2009-10-15
Grant Date 2014-12-09
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Goffin, Vincent
  • Ljolje, Andrej
  • Saraclar, Murat

Abstract

A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
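
A minimal Python sketch of the warping-factor search described above: warp the spectral data with each candidate factor, score it against a speech model, and keep the best-scoring factor. The grid of factors, the simple linear warp, and the score_against_model callback are placeholders for whatever the recognizer actually uses.

    import numpy as np

    def warp_spectrum(spec, alpha):
        # Simple linear frequency warp with edge clipping; real VTLN often
        # uses a piecewise-linear or bilinear mapping instead.
        n = len(spec)
        src = np.clip(np.arange(n) * alpha, 0, n - 1)
        return np.interp(src, np.arange(n), spec)

    def best_warp_factor(spec, score_against_model,
                         warps=np.arange(0.88, 1.13, 0.02)):
        # Try each candidate warping factor, score the warped spectrum
        # against the speech model, and keep the best one.
        best_alpha, best_score = 1.0, -np.inf
        for alpha in warps:
            score = score_against_model(warp_spectrum(spec, alpha))
            if score > best_score:
                best_alpha, best_score = alpha, score
        return best_alpha

    spec = np.abs(np.random.default_rng(1).standard_normal(128))
    model = np.ones(128)
    print(best_warp_factor(spec, lambda s: -np.sum((s - model) ** 2)))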

IPC Classes  ?

  • G10L 15/12 - Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit

66.

System for distinguishing desired audio signals from noise

      
Application Number 12269837
Grant Number 08131544
Status In Force
Filing Date 2008-11-12
First Publication Date 2009-09-10
Grant Date 2012-03-06
Owner Nuance Communications, Inc. (USA)
Inventor
  • Herbig, Tobias
  • Gaupp, Oliver
  • Gerl, Franz

Abstract

A system distinguishes a primary audio source from background noise to improve the quality of an audio signal. A speech signal from a microphone may be improved by identifying and dampening background noise to enhance the speech. Stochastic models may be used to model speech and to model background noise. The models may determine which portions of the signal are speech and which are noise. The distinction may be used to improve the signal's quality, and for speaker identification or verification.
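
As a toy version of using stochastic models to separate speech from noise, the sketch below labels frames by comparing the log-likelihood of each frame's log-energy under a speech model and a noise model; a real system would use richer models (e.g. GMMs over spectral features), and all the means and variances here are invented.

    import numpy as np

    def label_frames(log_energies, speech_mean, speech_var, noise_mean, noise_var):
        # Compare each frame's log-energy likelihood under a speech model
        # and a noise model (single Gaussians here, for illustration).
        def loglik(x, mu, var):
            return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        e = np.asarray(log_energies, dtype=float)
        return np.where(loglik(e, speech_mean, speech_var) >
                        loglik(e, noise_mean, noise_var), "speech", "noise")

    print(label_frames([2.0, 8.5, 9.1, 1.5],
                       speech_mean=9.0, speech_var=1.0,
                       noise_mean=2.0, noise_var=1.0))
    # -> ['noise' 'speech' 'speech' 'noise']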

IPC Classes  ?

  • G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech

67.

Voice response system

      
Application Number 12253849
Grant Number 08145494
Status In Force
Filing Date 2008-10-17
First Publication Date 2009-06-04
Grant Date 2012-03-27
Owner Nuance Communications, Inc. (USA)
Inventor
  • Horioka, Masaru
  • Atake, Yoshinori
  • Tahara, Yoshinori

Abstract

A voice response system attempts to respond to spoken user input and to provide computer-generated responses. If the system decides it cannot provide valid responses, the current state of the user session is determined and forwarded to a human operator for further action. The system maintains a recorded history of the session in the form of a dialog history log. The dialog history and information about the reliability of past speech recognition efforts are employed in making the current state determination. The system includes formatting rules for controlling the display of information presented to the human operator.

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

68.

System and method for conducting a search using a wireless mobile device

      
Application Number 12350848
Grant Number 08285273
Status In Force
Filing Date 2009-01-08
First Publication Date 2009-05-07
Grant Date 2012-10-09
Owner Nuance Communications, Inc. (USA)
Inventor Roth, Daniel L.

Abstract

A method and system are provided by which a wireless mobile device takes a vocally entered query and transmits it in a text message format over a wireless network to a search engine; receives search results based on the query from the search engine over the wireless network; and displays the search results.

IPC Classes  ?

  • H04W 4/00 - Services specially adapted for wireless communication networks; Facilities therefor

69.

Voice conversion method and system

      
Application Number 12240148
Grant Number 08234110
Status In Force
Filing Date 2008-09-29
First Publication Date 2009-04-02
Grant Date 2012-07-31
Owner Nuance Communications, Inc. (USA)
Inventor
  • Meng, Fan Ping
  • Qin, Yong
  • Shi, Qin
  • Shuang, Zhi Wei

Abstract

A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to obtain speech information; performing spectral conversion based on said speech information to obtain at least a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker using at least said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction based at least on the replaced spectrum.

IPC Classes  ?

  • G10L 19/06 - Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

70.

Creation and use of application-generic class-based statistical language models for automatic speech recognition

      
Application Number 11845015
Grant Number 08135578
Status In Force
Filing Date 2007-08-24
First Publication Date 2009-02-26
Grant Date 2012-03-13
Owner Nuance Communications, Inc. (USA)
Inventor Hébert, Matthieu

Abstract

A method of creating an application-generic class-based SLM includes, for each of a plurality of speech applications, parsing a corpus of utterance transcriptions to produce a first output set, in which expressions identified in the corpus are replaced with corresponding grammar tags from a grammar that is specific to the application. The method further includes, for each of the plurality of speech applications, replacing each of the grammar tags in the first output set with a class identifier of an application-generic class to produce a second output set. The method further includes processing the resulting second output sets with a statistical language model (SLM) trainer to generate an application-generic class-based SLM.
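
A small Python sketch of the two-pass rewrite the abstract describes, applied to a single transcription: application-specific expressions are first replaced with grammar tags, which are then mapped to application-generic class identifiers. The tiny grammar, tag names, and class names are hypothetical; the resulting class-tagged transcriptions would then be fed to an SLM trainer, which is not shown.

    import re

    # Tiny stand-ins for an application-specific grammar and the mapping
    # from its tags to application-generic classes.
    APP_GRAMMAR = {r"\b(?:boston|new york|chicago)\b": "<city_app>",
                   r"\b(?:monday|tuesday|friday)\b": "<weekday_app>"}
    GENERIC_CLASS = {"<city_app>": "<CITY>", "<weekday_app>": "<WEEKDAY>"}

    def to_generic_classes(utterance):
        out = utterance.lower()
        for pattern, tag in APP_GRAMMAR.items():   # first output set
            out = re.sub(pattern, tag, out)
        for tag, cls in GENERIC_CLASS.items():     # second output set
            out = out.replace(tag, cls)
        return out

    print(to_generic_classes("I want to fly to Boston on Friday"))
    # -> i want to fly to <CITY> on <WEEKDAY>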

IPC Classes  ?

  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

71.

Using speech recognition results based on an unstructured language model in a mobile communication facility application

      
Application Number 12184375
Grant Number 08886540
Status In Force
Filing Date 2008-08-01
First Publication Date 2009-01-29
Grant Date 2014-11-11
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Cerra, Joseph P.
  • Nguyen, John N.
  • Phillips, Michael S.
  • Shu, Han
  • Mischke, Alexandra Beth

Abstract

A method and system for entering information into a software application resident on a mobile communication facility is provided. The method and system may include recording speech presented by a user using a mobile communication facility resident capture facility, transmitting the recording through a wireless communication facility to a speech recognition facility, transmitting information relating to the software application to the speech recognition facility, generating results utilizing the speech recognition facility using an unstructured language model based at least in part on the information relating to the software application and the recording, transmitting the results to the mobile communications facility, loading the results into the software application and simultaneously displaying the results as a set of words and as a set of application results based on those words.

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

72.

System and method of performing user-specific automatic speech recognition

      
Application Number 12207175
Grant Number 08145481
Status In Force
Filing Date 2008-09-09
First Publication Date 2009-01-01
Grant Date 2012-03-27
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Gajic, Bojana
  • Narayanan, Shrikanth Sambasivan
  • Parthasarathy, Sarangarajan
  • Rose, Richard Cameron
  • Rosenberg, Aaron Edward

Abstract

Speech recognition models are dynamically re-configurable based on user information, application information, background information such as background noise, and transducer information such as transducer response characteristics, to provide users with alternate input modes to keyboard text entry. Word recognition lattices are generated for each data field of an application and dynamically concatenated into a single word recognition lattice. A language model is applied to the concatenated word recognition lattice to determine the relationships between the word recognition lattices, and the process is repeated until the generated word recognition lattices are acceptable or differ from a predetermined value only by a threshold amount. These techniques of dynamic re-configurable speech recognition allow speech recognition to be deployed on small devices such as mobile phones and personal digital assistants, as well as in environments such as the office, home, or vehicle, while maintaining the accuracy of the speech recognition.

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

73.

Method and device for providing speech-to-text encoding and telephony service

      
Application Number 12200292
Grant Number 08265931
Status In Force
Filing Date 2008-08-28
First Publication Date 2008-12-25
Grant Date 2012-09-11
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Caldwell, Charles David
  • Harlow, John Bruce
  • Sayko, Robert J.
  • Shaye, Norman

Abstract

A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.

IPC Classes  ?

  • G10L 15/26 - Speech to text systems
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations

74.

Method and system for speech based document history tracking

      
Application Number 12096068
Grant Number 08140338
Status In Force
Filing Date 2006-11-10
First Publication Date 2008-12-18
Grant Date 2012-03-20
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Grobauer, Gerhard
  • Papai, Miklos

Abstract

A method and a system of history tracking corrections in a speech based document are disclosed. The speech based document comprises one or more sections of text recognized or transcribed from sections of speech, wherein the sections of speech are dictated by a user and processed by a speech recognizer in a speech recognition system into corresponding sections of text of the speech based document. The method comprises associating of at least one speech attribute (14) to each section of text in the speech based document, said speech attribute (14) comprising information related to said section of text, respectively; presenting said speech based document on a presenting unit (8); detecting an action being performed within any of said sections of text; and updating information of said speech attributes (14) related to the kind of action detected on one of said sections of text for updating said speech based document, whereby said updated information of said speech attributes (14) is used for history tracking corrections of said speech based document.

IPC Classes  ?

  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
  • G10L 15/26 - Speech to text systems
  • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements

75.

Speech recognition system with huge vocabulary

      
Application Number 12096046
Grant Number 08140336
Status In Force
Filing Date 2006-12-06
First Publication Date 2008-11-27
Grant Date 2012-03-20
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Saffer, Zsolt

Abstract

The invention deals with speech recognition, such as a system for recognizing words in continuous speech. A speech recognition system is disclosed which is capable of recognizing a huge number of words, and in principle even an unlimited number of words. The speech recognition system comprises a word recognizer for deriving a best path through a word graph, wherein words are assigned to the speech based on the best path. The word score is obtained by applying a phonemic language model to each word of the word graph. Moreover, the invention deals with an apparatus and a method for identifying words from a sound block, and with computer-readable code for implementing the method.

IPC Classes  ?

  • G10L 15/18 - Speech classification or search using natural language modelling
  • G10L 15/04 - Segmentation; Word boundary detection
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

76.

Acoustic localization of a speaker

      
Application Number 12104836
Grant Number 08204248
Status In Force
Filing Date 2008-04-17
First Publication Date 2008-11-20
Grant Date 2012-06-19
Owner Nuance Communications, Inc. (USA)
Inventor
  • Haulick, Tim
  • Schmidt, Gerhard Uwe
  • Buck, Markus
  • Wolff, Tobias

Abstract

A system locates a speaker in a room containing a loudspeaker and a microphone array. The loudspeaker transmits a sound that is partly reflected by a speaker. The microphone array detects the reflected sound and converts the sound into a microphone signal. A processor determines the speaker's direction relative to the microphone array, the speaker's distance from the microphone array, or both, based on the characteristics of the microphone signals.

IPC Classes  ?

77.

Categorization of information using natural language processing and predefined templates

      
Application Number 12121527
Grant Number 08185553
Status In Force
Filing Date 2008-05-15
First Publication Date 2008-10-16
Grant Date 2012-05-22
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Carus, Alwin B.
  • Ogrinc, Harry J.

Abstract

A computer implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, and classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

IPC Classes  ?

  • G06F 17/30 - Information retrieval; Database structures therefor

78.

Method for dialog management

      
Application Number 12140805
Grant Number 08600747
Status In Force
Filing Date 2008-06-17
First Publication Date 2008-10-09
Grant Date 2013-12-03
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Abella, Alicia
  • Gorin, Allen Louis

Abstract

A spoken dialog system and method having a dialog management module are disclosed. The dialog management module includes a plurality of dialog motivators for handling various operations during a spoken dialog. The dialog motivators comprise error handling, disambiguation, assumption, confirmation, missing information, and continuation. The spoken dialog system uses the assumption dialog motivator in either a-priori or a-posteriori modes. A-priori assumption is based on predefined requirements for the call flow, and a-posteriori assumption can work with the confirmation dialog motivator to assume the content of received user input and confirm it.

IPC Classes  ?

  • G10L 15/00 - Speech recognition
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
  • G06F 9/46 - Multiprogramming arrangements
  • G06F 9/44 - Arrangements for executing specific programs
  • G06F 17/20 - Handling natural language data
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 15/26 - Speech to text systems
  • H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
  • H04M 11/00 - Telephonic communication systems specially adapted for combination with other electrical systems

79.

Natural error handling in speech recognition

      
Application Number 12135452
Grant Number 08355920
Status In Force
Filing Date 2008-06-09
First Publication Date 2008-10-02
Grant Date 2013-01-15
Owner Nuance Communications, Inc. (USA)
Inventor
  • Gopinath, Ramesh A.
  • Maison, Benoit
  • Wu, Brian C.

Abstract

A user interface and associated techniques permit a fast and efficient way of correcting speech recognition errors, or of diminishing their impact. The user may correct mistakes in a natural way, essentially by repeating the information that was incorrectly recognized previously. Such a mechanism closely approximates what human-to-human dialogue would be in similar circumstances. Such a system fully takes advantage of all the information provided by the user, and on its own estimates the quality of the recognition in order to determine the correct sequence of words in the fewest number of steps.

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

80.

Method and apparatus for data capture using a voice activated workstation

      
Application Number 12089033
Grant Number 08165876
Status In Force
Filing Date 2006-09-04
First Publication Date 2008-09-25
Grant Date 2012-04-24
Owner Nuance Communications, Inc. (USA)
Inventor
  • Emam, Ossama
  • Gamal, Khaled

Abstract

A method and apparatus for capturing data in a workstation, wherein a large amount of data associated with a sample that is viewed by a user through an optical device, such as a microscope, is to be entered in a computer-related file. The optical device can be moved to a data-sampling position using voice commands. A pointer can then be moved to an appropriate place in the file to receive the data relating to the data-sampling position. Data can then be entered in the appropriate position using a voice command. The steps of moving the pointer and entering the data can then be repeated until all data is provided with respect to the data-sampling positions.

IPC Classes  ?

81.

Invoking tapered prompts in a multimodal application

      
Application Number 11678920
Grant Number 08150698
Status In Force
Filing Date 2007-02-26
First Publication Date 2008-08-28
Grant Date 2012-04-03
Owner Nuance Communications, Inc. (USA)
Inventor
  • Ativanichayaphong, Soonthorn
  • Cross, Jr., Charles W.
  • Mccobb, Gerald M.

Abstract

Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

82.

System and method for selecting and presenting advertisements based on natural language processing of voice-based input

      
Application Number 11671526
Grant Number 07818176
Status In Force
Filing Date 2007-02-06
First Publication Date 2008-08-07
Grant Date 2010-10-19
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Freeman, Tom
  • Kennewick, Mike

Abstract

A system and method for selecting and presenting advertisements based on natural language processing of voice-based inputs is provided. A user utterance may be received at an input device, and a conversational, natural language processor may identify a request from the utterance. At least one advertisement may be selected and presented to the user based on the identified request. The advertisement may be presented as a natural language response, thereby creating a conversational feel to the presentation of advertisements. The request and the user's subsequent interaction with the advertisement may be tracked to build user statistical profiles, thus enhancing subsequent selection and presentation of advertisements.

IPC Classes  ?

  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
  • G10L 15/18 - Speech classification or search using natural language modelling
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G06Q 30/00 - Commerce

83.

Method and an apparatus to disambiguate requests

      
Application Number 11701811
Grant Number 08175248
Status In Force
Filing Date 2007-02-02
First Publication Date 2008-08-07
Grant Date 2012-05-08
Owner Nuance Communications, Inc. (USA)
Inventor
  • Agarwal, Rajeev
  • Ardman, David
  • Master, Muneeb
  • Mauro, David Andrew
  • Raman, Vijay R.
  • Ulug, Amy E.
  • Valli, Zulfikar

Abstract

A method and an apparatus to disambiguate requests are presented. In one embodiment, the method includes receiving a request for information from a user. Then data is retrieved from a back-end database in response to the request. Based on a predetermined configuration of a disambiguation system and the data retrieved, the ambiguity within the request is dynamically resolved.

IPC Classes  ?

  • H04M 3/42 - Systems providing special services or facilities to subscribers

84.

Method and apparatus for recognizing and reacting to user personality in accordance with speech recognition system

      
Application Number 12055952
Grant Number 08719035
Status In Force
Filing Date 2008-03-26
First Publication Date 2008-07-24
Grant Date 2014-05-06
Owner Nuance Communications, Inc. (USA)
Inventor
  • Stewart, Osamuyimen Thompson
  • Dai, Liwei

Abstract

Techniques are disclosed for recognizing user personality in accordance with a speech recognition system. For example, a technique for recognizing a personality trait associated with a user interacting with a speech recognition system includes the following steps/operations. One or more decoded spoken utterances of the user are obtained. The one or more decoded spoken utterances are generated by the speech recognition system. The one or more decoded spoken utterances are analyzed to determine one or more linguistic attributes (morphological and syntactic filters) that are associated with the one or more decoded spoken utterances. The personality trait associated with the user is then determined based on the analyzing step/operation.

IPC Classes  ?

  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G10L 15/00 - Speech recognition
  • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups
  • G09B 3/00 - Manually- or mechanically-operated teaching appliances working with questions and answers
  • G09B 7/00 - Electrically-operated teaching apparatus or devices working with questions and answers
  • G09B 19/00 - Teaching not covered by other main groups of this subclass
  • G09B 19/04 - Speaking
  • G09B 17/04 - Teaching reading for increasing the rate of reading; Reading rate control
  • G09B 1/00 - Manually- or mechanically-operated educational appliances using elements forming or bearing symbols, signs, pictures, or the like which are arranged or adapted to be arranged in one or more particular ways

85.

Software program and method for providing promotions on a phone prior to call connection

      
Application Number 11636334
Grant Number 08160552
Status In Force
Filing Date 2006-12-08
First Publication Date 2008-06-12
Grant Date 2012-04-17
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Stone, Kevin M.

Abstract

The present invention includes a method and software application for providing a promotion to a user on a phone. The software application resides on a user's phone and “listens” for phone numbers dialed by a user. In response to the user dialing a phone number, the software determines whether a promotion or an offer for a promotion should be provided to the user. In response to determining to play or offer to play a promotion to the user, the software application on the phone effectively “intercepts” the call and plays to the user either a promotion or an offer to hear about a promotion prior to placing an outbound voice call. The software application may retrieve the promotion from local memory or may connect with a remote server to download an applicable promotion.

IPC Classes  ?

  • H04M 3/42 - Systems providing special services or facilities to subscribers

86.

Web integrated interactive voice response

      
Application Number 11961005
Grant Number 08204184
Status In Force
Filing Date 2007-12-20
First Publication Date 2008-05-08
Grant Date 2012-06-19
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Gao, Xiaofeng
  • Scott, David
  • Zellner, Sam

Abstract

One embodiment of a representative system for web integrated interactive voice response includes an interactive voice response system adapted to provide a plurality of voice menus to a user over a telephone and a graphical user interface system adapted to provide a plurality of menus in a graphical format to the user over a network connection. Information provided in the voice menus corresponds to information provided in the menus in the graphical format and is responsive to commands received by the graphical user interface system from the user. Other systems and methods are also provided.

IPC Classes  ?

  • H04M 11/06 - Simultaneous speech and data transmission, e.g. telegraphic transmission over the same conductors
  • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

87.

Methods for voice activated dialing

      
Application Number 11959822
Grant Number 08150001
Status In Force
Filing Date 2007-12-19
First Publication Date 2008-05-01
Grant Date 2012-04-03
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Bishop, Michael
  • Koch, Robert

Abstract

Methods for routing a call based on voice activated dialing (VAD). A VAD device module may respond to a VAD instruction, or to a call received with a VAD instruction, with a corresponding call destination number obtained from a personal VAD directory. If the personal VAD directory fails to include the call destination number, the VAD device module may route the call or initiate a call through a gateway to a VAD network module. The VAD network module may obtain call destination information from the VAD instruction, and may use the call destination information to obtain the call destination number. The VAD network module may also obtain additional information from the call or another source, and use the additional information to obtain the call destination number. The call is then routed to the call destination number. The call destination number may be added to the personal VAD directory.
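
A minimal Python sketch of the fallback the abstract describes: look the spoken name up in the personal VAD directory first, fall back to the network-side lookup, and cache the result locally. The function and parameter names are assumptions for illustration only.

    def resolve_destination(spoken_name, personal_directory, network_lookup):
        # Personal VAD directory first; fall back to the VAD network module
        # and cache any number it returns.
        number = personal_directory.get(spoken_name)
        if number is None:
            number = network_lookup(spoken_name)
            if number is not None:
                personal_directory[spoken_name] = number
        return number

    directory = {"mom": "+15551230000"}
    print(resolve_destination("office", directory, lambda name: "+15559870000"))
    print(directory)   # "office" is now cached in the personal directory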

IPC Classes  ?

  • H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations

88.

System and method for a cooperative conversational voice user interface

      
Application Number 11580926
Grant Number 08073681
Status In Force
Filing Date 2006-10-16
First Publication Date 2008-04-17
Grant Date 2011-12-06
Owner
  • VB ASSETS, LLC (USA)
  • NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Baldwin, Larry
  • Freeman, Tom
  • Tjalve, Michael
  • Ebersold, Blane
  • Weider, Chris

Abstract

A cooperative conversational voice user interface is provided. The cooperative conversational voice user interface may build upon short-term and long-term shared knowledge to generate one or more explicit and/or implicit hypotheses about an intent of a user utterance. The hypotheses may be ranked based on varying degrees of certainty, and an adaptive response may be generated for the user. Responses may be worded based on the degrees of certainty and to frame an appropriate domain for a subsequent utterance. In one implementation, misrecognitions may be tolerated, and conversational course may be corrected based on subsequent utterances and/or responses.

IPC Classes  ?

  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00

89.

Establishing a preferred mode of interaction between a user and a multimodal application

      
Application Number 11530599
Grant Number 08145493
Status In Force
Filing Date 2006-09-11
First Publication Date 2008-03-13
Grant Date 2012-03-27
Owner Nuance Communications, Inc. (USA)
Inventor
  • Cross, Jr., Charles W.
  • Pike, Hilary A.

Abstract

Establishing a preferred mode of interaction between a user and a multimodal application, including evaluating, by a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, user modal preference, and dynamically configuring multimodal content of the multimodal application in dependence upon the evaluation of user modal preference.

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

90.

Method and apparatus for recognizing a user personality trait based on a number of compound words used by the user

      
Application Number 11436295
Grant Number 08150692
Status In Force
Filing Date 2006-05-18
First Publication Date 2007-11-22
Grant Date 2012-04-03
Owner Nuance Communications, Inc. (USA)
Inventor
  • Stewart, Osamuyimen Thompson
  • Dai, Liwei

Abstract

Techniques for recognizing a personality trait associated with a user. Input from the user is analyzed to determine a number of words, including a number of compound words. The personality trait associated with the user is determined based, at least in part, on the number of compound words exceeding a threshold.

IPC Classes  ?

91.

Mass-scale, user-independent, device-independent voice messaging system

      
Application Number 11673746
Grant Number 08903053
Status In Force
Filing Date 2007-02-12
First Publication Date 2007-06-07
Grant Date 2014-12-02
Owner Nuance Communications, Inc. (USA)
Inventor Doulton, Daniel Michael

Abstract

A mass-scale, user-independent, device-independent voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer implemented sub-systems and also (ii) a network connection to human operators providing transcription and quality control; the system being adapted to optimize the effectiveness of the human operators by further comprising three core sub-systems, namely (i) a pre-processing front end that determines an appropriate conversion strategy; (ii) one or more conversion resources; and (iii) a quality control sub-system.

IPC Classes  ?

  • H04M 1/64 - Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
  • H04M 3/51 - Centralised call answering arrangements requiring operator intervention
  • H04M 3/493 - Interactive information services, e.g. directory enquiries
  • H04M 3/533 - Voice mail systems
  • G10L 15/26 - Speech to text systems

92.

System and method for conducting a search using a wireless mobile device

      
Application Number 11263601
Grant Number 07477909
Status In Force
Filing Date 2005-10-31
First Publication Date 2007-05-03
Grant Date 2009-01-13
Owner Nuance Communications, Inc. (USA)
Inventor Roth, Daniel Lawrence

Abstract

A method and system are provided by which a wireless mobile device takes a vocally entered query and transmits it in a text message format over a wireless network to a search engine; receives search results based on the query from the search engine over the wireless network; and displays the search results.

IPC Classes  ?

  • H04N 7/173 - Analogue secrecy systems; Analogue subscription systems with two-way working, e.g. subscriber sending a programme selection signal
  • G06F 17/30 - Information retrieval; Database structures therefor
  • H04M 3/00 - Automatic or semi-automatic exchanges
  • H04Q 7/20 -

93.

Method, system and apparatus for data reuse

      
Application Number 11545414
Grant Number 08370734
Status In Force
Filing Date 2006-10-10
First Publication Date 2007-02-15
Grant Date 2013-02-05
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Boone, Keith W.
  • Chaparala, Sunitha
  • Fordyce, Cameron
  • Gervais, Sean
  • Manoukian, Roubik
  • Ogrinc, Harry J.
  • Titemore, Robert G.
  • Hopkins, Jeffrey G.

Abstract

A system and method may be disclosed for facilitating the creation or modification of a document by providing a mechanism for locating relevant data from external sources and organizing and incorporating some or all of said data into the document. In the method for reusing data, there may be a set of documents that may be queried, where each document may be divided into a plurality of sections. A plurality of section text groups may be formed based on the set of documents, where each section text group may be associated with a respective section from the plurality of sections and each section text group includes a plurality of items. Each item may be associated with a respective section from each document of the set of documents. A selected item within a selected section text group may be brought into focus. The selected item may be extracted to a current document. The current document may be exported to a host application.

IPC Classes  ?

  • G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions

94.

Methods and apparatus for automatically extending the voice vocabulary of mobile communications devices

      
Application Number 11347666
Grant Number 08160884
Status In Force
Filing Date 2006-02-03
First Publication Date 2006-08-03
Grant Date 2012-04-17
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Roth, Daniel L.
  • Cohen, Jordan
  • Behrakis, Elias P.

Abstract

The invention is a method of improving the performance of a speech recognizer. The method generally involves: providing a lexicon for the speech recognizer; monitoring a user's interaction with a network; accessing a plurality of words associated with the monitored interaction; and including the plurality of words in the lexicon.
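
A minimal Python sketch of the idea: harvest words from text the user has viewed over the network and add any unknown ones to the recognizer lexicon. The guess_pronunciation helper is hypothetical; a real system would use a letter-to-sound module.

    import re

    def guess_pronunciation(word):
        # Placeholder: a real system would run letter-to-sound rules here.
        return list(word)

    def extend_lexicon(lexicon, viewed_text):
        # Add previously unseen words from the monitored content to the lexicon.
        for word in set(re.findall(r"[a-z']+", viewed_text.lower())):
            if word not in lexicon:
                lexicon[word] = guess_pronunciation(word)
        return lexicon

    lex = {"call": ["k", "ao", "l"]}
    extend_lexicon(lex, "Call Dr. Smith about the quarterly invoice")
    print(sorted(lex))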

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

95.

Speech signal processing with combined noise reduction and echo compensation

      
Application Number 11218687
Grant Number 07747001
Status In Force
Filing Date 2005-09-02
First Publication Date 2006-07-13
Grant Date 2010-06-29
Owner Nuance Communications, Inc. (USA)
Inventor
  • Kellermann, Walter
  • Herbordt, Wolfgang

Abstract

A speech signal processing system combines acoustic noise reduction and echo cancellation to enhance acoustic performance. The speech signal processing system may be used in vehicles or other environments where noise-suppressed communication is desirable. The system includes an adaptive beamforming signal processing unit, an adaptive echo compensating unit to reduce acoustic echoes, and an adaptation unit to combine noise reduction and adaptive echo compensation.

IPC Classes  ?

  • H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g.  for suppressing echoes for one or both directions of traffic

96.

System and method of providing an automated data-collection in spoken dialog systems

      
Application Number 11029798
Grant Number 08185399
Status In Force
Filing Date 2005-01-05
First Publication Date 2006-07-06
Grant Date 2012-05-22
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Di Fabbrizio, Giuseppe
  • Hakkani-Tur, Dilek Z.
  • Rahim, Mazin G.
  • Renger, Bernard S.
  • Tur, Gokhan

Abstract

The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
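
One plausible reading of the thresholds in the abstract, as a Python sketch: accept the classification above the acceptance threshold, re-prompt between the two thresholds, and hand off to a human below the rejection threshold. The threshold values and the recognize/classify callables are placeholders.

    def handle_first_turn(utterance, recognize, classify,
                          accept_threshold=0.8, reject_threshold=0.3):
        # recognize: ASR engine stand-in; classify: SLU stand-in returning
        # (label, confidence).
        text = recognize(utterance)
        label, confidence = classify(text)
        if confidence >= accept_threshold:
            return ("handle", label, text)        # usable, well-classified turn
        if confidence >= reject_threshold:
            return ("reprompt", None, text)       # ask the user again
        return ("transfer_to_human", None, text)  # likely task-specific

    print(handle_first_turn("billing please",
                            recognize=lambda audio: audio,
                            classify=lambda text: ("billing", 0.55)))
    # -> ('reprompt', None, 'billing please')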

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
  • G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction

97.

System and method for providing network coordinated conversational services

      
Application Number 11303768
Grant Number 07519536
Status In Force
Filing Date 2005-12-16
First Publication Date 2006-05-25
Grant Date 2009-04-14
Owner Nuance Communications, Inc. (USA)
Inventor
  • Maes, Stephane H.
  • Gopalakrishnan, Ponani

Abstract

A system and method for providing automatic and coordinated sharing of conversational resources, e.g., functions and arguments, between network-connected servers and devices and their corresponding applications. In one aspect, a system for providing automatic and coordinated sharing of conversational resources includes a network having a first and second network device, the first and second network device each comprising a set of conversational resources, a dialog manager for managing a conversation and executing calls requesting a conversational service, and a communication stack for communicating messages over the network using conversational protocols, wherein the conversational protocols establish coordinated network communication between the dialog managers of the first and second network device to automatically share the set of conversational resources of the first and second network device, when necessary, to perform their respective requested conversational service.

IPC Classes  ?

  • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 11/00 - Determination or detection of speech or audio characteristics not restricted to a single one of groups ; G10L 15/00-G10L 21/00
  • G06F 15/16 - Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs

98.

Method and system of generating a speech signal with overlayed random frequency signal

      
Application Number 10957222
Grant Number 07558389
Status In Force
Filing Date 2004-10-01
First Publication Date 2006-04-06
Grant Date 2009-07-07
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor Desimone, Joseph

Abstract

A method and apparatus utilizing prosody modification of a speech signal output by a text-to-speech (TTS) system to substantially prevent an interactive voice response (IVR) system from understanding the speech signal without significantly degrading the speech signal with respect to human understanding. The present invention involves modifying the prosody of the speech output signal by using the prosody of the user's response to a prompt. In addition, a randomly generated overlay frequency is used to modify the speech signal to further prevent an IVR system from recognizing the TTS output. The randomly generated frequency may be periodically changed using an overlay timer that changes the random frequency signal at predetermined intervals.
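
A rough NumPy sketch of the random-frequency overlay: mix a low-level tone whose frequency is re-drawn at fixed intervals into the TTS waveform. The interval length, level, and frequency range are assumptions, and the prosody-modification part of the patent is not shown.

    import numpy as np

    def overlay_random_tone(signal, fs, low_hz=50.0, high_hz=300.0,
                            level=0.02, interval_s=0.5, seed=None):
        # Add a low-level tone whose frequency is re-drawn every interval_s
        # seconds (a rough stand-in for the patent's overlay timer).
        rng = np.random.default_rng(seed)
        out = np.array(signal, dtype=float, copy=True)
        t = np.arange(len(out)) / fs
        step = int(interval_s * fs)
        for start in range(0, len(out), step):
            f = rng.uniform(low_hz, high_hz)
            seg = slice(start, min(start + step, len(out)))
            out[seg] += level * np.sin(2 * np.pi * f * t[seg])
        return out

    fs = 8000
    t = np.arange(0, 1.0, 1 / fs)
    masked = overlay_random_tone(0.3 * np.sin(2 * np.pi * 200 * t), fs, seed=0)
    print(masked.shape)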

IPC Classes  ?

  • H04L 9/00 - Arrangements for secret or secure communications; Network security protocols
  • H04N 7/167 - Systems rendering the television signal unintelligible and subsequently intelligible
  • G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

99.

Combined speech recognition and sound recording

      
Application Number 11005568
Grant Number 07505911
Status In Force
Filing Date 2004-12-05
First Publication Date 2005-07-21
Grant Date 2009-03-17
Owner NUANCE COMMUNICATIONS, INC. (USA)
Inventor
  • Roth, Daniel L.
  • Cohen, Jordan R.
  • Johnston, David F.
  • Porter, Edward W.

Abstract

A handheld device with both large-vocabulary speech recognition and audio recording allows users to switch between at least two of the following three modes: (1) recording audio without corresponding speech recognition; (2) recording with speech recognition; and (3) speech recognition without audio recording. A handheld device with both large-vocabulary speech recognition and audio recording enables a user to select a portion of previously recorded sound and have speech recognition performed upon it. A system enables a user to search for a text label associated with portions of unrecognized recorded sound by uttering the label's words. A large-vocabulary system allows users to switch between playing back recorded audio and speech recognition with a single input, with successive audio playbacks automatically starting slightly before the end of the prior playback. Also disclosed is a cell phone that allows both large-vocabulary speech recognition and audio recording and playback.

IPC Classes  ?

  • G01L 21/06 - Vacuum gauges having a compression chamber in which gas, whose pressure is to be measured, is compressed wherein the chamber is closed by liquid; Vacuum gauges of the McLeod type actuated by rotating or inverting the measuring device

100.

Electronic device and user interface and input method therefor

      
Application Number 10719576
Grant Number 08136050
Status In Force
Filing Date 2003-11-21
First Publication Date 2005-05-26
Grant Date 2012-03-13
Owner Nuance Communications, Inc. (USA)
Inventor
  • Sacher, Heiko K.
  • Romera, Maria E.
  • Nagel, Jens

Abstract

A portable electronic device (100,400) and user interface (425) are operated using a method including initiating entry of a content string; determining the most probable completion alternative or a content prediction using a personalized and learning database (430); displaying the most probable completion alternative or next content prediction; determining whether a user has accepted the most probable completion alternative or next content prediction; and adding the most probable completion alternative or next content prediction to the content string upon user acceptance.
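
A minimal Python sketch of a personalized, learning completion store in the spirit of the abstract: it counts which full entries follow each prefix and suggests the most frequent one. How the real database (430) stores and ranks completions is not specified here.

    from collections import Counter, defaultdict

    class CompletionStore:
        # Personalized, learning store: counts which full entries follow
        # each prefix and proposes the most frequent one.
        def __init__(self):
            self.counts = defaultdict(Counter)

        def learn(self, accepted_entry):
            for i in range(1, len(accepted_entry)):
                self.counts[accepted_entry[:i]][accepted_entry] += 1

        def suggest(self, prefix):
            candidates = self.counts.get(prefix)
            return candidates.most_common(1)[0][0] if candidates else None

    store = CompletionStore()
    store.learn("meeting at noon")
    store.learn("meeting at noon")
    store.learn("meeting agenda")
    print(store.suggest("meeting a"))   # -> meeting at noon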

IPC Classes  ?

  • G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]