G10L 11/00
|
Determination or detection of speech or audio characteristics not restricted to a single one of groups G10L 15/00-G10L 21/00 |
G10L 11/02
|
Detection of presence or absence of speech signals |
G10L 11/04
|
Pitch determination of speech signals |
G10L 11/06
|
Discriminating between voiced and unvoiced parts of speech signals (G10L 11/04 takes precedence) |
G10L 13/00
|
Speech synthesis; Text to speech systems |
G10L 13/02
|
Methods for producing synthetic speech; Speech synthesisers |
G10L 13/04
|
Methods for producing synthetic speech; Speech synthesisers - Details of speech synthesis systems, e.g. synthesiser structure or memory management |
G10L 13/06
|
Elementary speech units used in speech synthesisers; Concatenation rules |
G10L 13/07
|
Concatenation rules |
G10L 13/08
|
Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination |
G10L 13/10
|
Prosody rules derived from text; Stress or intonation |
G10L 13/027
|
Concept to speech synthesisers; Generation of natural phrases from machine-based concepts |
G10L 13/033
|
Voice editing, e.g. manipulating the voice of the synthesiser |
G10L 13/047
|
Architecture of speech synthesisers |
G10L 15/00
|
Speech recognition |
G10L 15/01
|
Assessment or evaluation of speech recognition systems |
G10L 15/02
|
Feature extraction for speech recognition; Selection of recognition unit |
G10L 15/04
|
Segmentation; Word boundary detection |
G10L 15/05
|
Word boundary detection |
G10L 15/06
|
Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice |
G10L 15/07
|
Adaptation to the speaker |
G10L 15/08
|
Speech classification or search |
G10L 15/10
|
Speech classification or search using distance or distortion measures between unknown speech and reference templates |
G10L 15/12
|
Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW] |
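As an illustrative aside, the dynamic time warping [DTW] technique named by G10L 15/12 can be sketched as a simple dynamic program aligning two 1-D feature sequences (a minimal sketch for orientation only, not part of the classification scheme; the function name is hypothetical):

```python
def dtw_distance(a, b):
    """DTW distance via the classic O(len(a)*len(b)) dynamic program
    over absolute differences between scalar features."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]
```

In template-matching recognisers, `a` and `b` would be per-frame feature values (e.g. energies) of an unknown utterance and a reference template; DTW tolerates tempo differences that a rigid frame-by-frame distance would penalise.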
G10L 15/14
|
Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM] |
G10L 15/16
|
Speech classification or search using artificial neural networks |
G10L 15/18
|
Speech classification or search using natural language modelling |
G10L 15/19
|
Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules |
G10L 15/20
|
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech |
G10L 15/22
|
Procedures used during a speech recognition process, e.g. man-machine dialog |
G10L 15/24
|
Speech recognition using non-acoustical features |
G10L 15/25
|
Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis |
G10L 15/26
|
Speech to text systems |
G10L 15/28
|
Constructional details of speech recognition systems |
G10L 15/30
|
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications |
G10L 15/32
|
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems |
G10L 15/34
|
Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing |
G10L 15/065
|
Adaptation |
G10L 15/183
|
Speech classification or search using natural language modelling using context dependencies, e.g. language models |
G10L 15/187
|
Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams |
G10L 15/193
|
Formal grammars, e.g. finite state automata, context free grammars or word networks |
G10L 15/197
|
Probabilistic grammars, e.g. word n-grams |
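The word n-grams cited by G10L 15/197 reduce, in the simplest bigram case, to conditional counts. A minimal maximum-likelihood sketch (illustrative only; the function name is hypothetical and no smoothing is applied):

```python
from collections import Counter

def bigram_probs(tokens):
    """Maximum-likelihood bigram model: P(w2 | w1) = count(w1 w2) / count(w1)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count each word only in the "history" position (all tokens but the last)
    unigram_counts = Counter(tokens[:-1])
    return {(w1, w2): c / unigram_counts[w1]
            for (w1, w2), c in pair_counts.items()}
```

A recogniser would combine such language-model probabilities with acoustic scores to rank competing word-sequence hypotheses; real systems smooth the counts to assign nonzero probability to unseen pairs.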
G10L 17/00
|
Speaker identification or verification |
G10L 17/02
|
Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction |
G10L 17/04
|
Training, enrolment or model building |
G10L 17/06
|
Decision making techniques; Pattern matching strategies |
G10L 17/08
|
Use of distortion metrics or a particular distance between probe pattern and reference templates |
G10L 17/10
|
Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems |
G10L 17/12
|
Score normalisation |
G10L 17/14
|
Use of phonemic categorisation or speech recognition prior to speaker recognition or verification |
G10L 17/16
|
Hidden Markov models [HMM] |
G10L 17/18
|
Artificial neural networks; Connectionist approaches |
G10L 17/20
|
Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions |
G10L 17/22
|
Interactive procedures; Man-machine interfaces |
G10L 17/24
|
the user being prompted to utter a password or a predefined phrase |
G10L 17/26
|
Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices |
G10L 19/00
|
Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis |
G10L 19/002
|
Dynamic bit allocation |
G10L 19/03
|
Spectral prediction for preventing pre-echo; Temporary noise shaping [TNS], e.g. in MPEG2 or MPEG4 |
G10L 19/04
|
Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques |
G10L 19/005
|
Correction of errors induced by the transmission channel, if related to the coding algorithm |
G10L 19/06
|
Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients |
G10L 19/07
|
Line spectrum pair [LSP] vocoders |
G10L 19/008
|
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing |
G10L 19/09
|
Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor |
G10L 19/10
|
Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation |
G10L 19/012
|
Comfort noise or silence coding |
G10L 19/13
|
Residual excited linear prediction [RELP] |
G10L 19/14
|
Details not provided for in groups G10L 19/06-G10L 19/12, e.g. gain coding, post filtering design or vocoder structure |
G10L 19/16
|
Vocoder architecture |
G10L 19/018
|
Audio watermarking, i.e. embedding inaudible data in the audio signal |
G10L 19/20
|
Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding |
G10L 19/022
|
Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring |
G10L 19/24
|
Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding |
G10L 19/025
|
Detection of transients or attacks for time/frequency resolution switching |
G10L 19/26
|
Pre-filtering or post-filtering |
G10L 19/028
|
Noise substitution, e.g. substituting non-tonal spectral components by noisy source |
G10L 19/032
|
Quantisation or dequantisation of spectral components |
G10L 19/035
|
Scalar quantisation |
G10L 19/038
|
Vector quantisation, e.g. TwinVQ audio |
G10L 19/083
|
Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being an excitation gain |
G10L 19/087
|
Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC |
G10L 19/093
|
Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using sinusoidal excitation models |
G10L 19/097
|
Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using prototype waveform decomposition or prototype waveform interpolative [PWI] coders |
G10L 19/107
|
Sparse pulse excitation, e.g. by using algebraic codebook |
G10L 19/113
|
Regular pulse excitation |
G10L 19/125
|
Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP] |
G10L 19/135
|
Vector sum excited linear prediction [VSELP] |
G10L 21/00
|
Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility |
G10L 21/01
|
Correction of time axis |
G10L 21/02
|
Speech enhancement, e.g. noise reduction or echo cancellation |
G10L 21/003
|
Changing voice quality, e.g. pitch or formants |
G10L 21/04
|
Time compression or expansion |
G10L 21/06
|
Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids |
G10L 21/007
|
Changing voice quality, e.g. pitch or formants characterised by the process used |
G10L 21/10
|
Transforming into visible information |
G10L 21/12
|
Transforming into visible information by displaying time domain information |
G10L 21/013
|
Adapting to target pitch |
G10L 21/14
|
Transforming into visible information by displaying frequency domain information |
G10L 21/16
|
Transforming into a non-visible representation |
G10L 21/18
|
Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids - Details of the transformation process |
G10L 21/028
|
Voice signal separating using properties of sound source |
G10L 21/034
|
Automatic adjustment |
G10L 21/038
|
Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques |
G10L 21/043
|
Time compression or expansion by changing speed |
G10L 21/045
|
Time compression or expansion by changing speed using thinning out or insertion of a waveform |
G10L 21/047
|
Time compression or expansion by changing speed using thinning out or insertion of a waveform characterised by the type of waveform to be thinned out or inserted |
G10L 21/049
|
Time compression or expansion by changing speed using thinning out or insertion of a waveform characterised by the interconnection of waveforms |
G10L 21/055
|
Time compression or expansion for synchronising with other signals, e.g. video signals |
G10L 21/057
|
Time compression or expansion for improving intelligibility |
G10L 21/0208
|
Noise filtering |
G10L 21/0216
|
Noise filtering characterised by the method used for estimating noise |
G10L 21/0224
|
Processing in the time domain |
G10L 21/0232
|
Processing in the frequency domain |
G10L 21/0264
|
Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques |
G10L 21/0272
|
Voice signal separating |
G10L 21/0308
|
Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques |
G10L 21/0316
|
Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude |
G10L 21/0324
|
Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude - Details of processing therefor |
G10L 21/0332
|
Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude - Details of processing therefor involving modification of waveforms |
G10L 21/0356
|
Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for synchronising with other signals, e.g. video signals |
G10L 21/0364
|
Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility |
G10L 21/0388
|
Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques - Details of processing therefor |
G10L 23/00
|
Speech analysis not provided for in other groups of this subclass |
G10L 25/00
|
Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 |
G10L 25/03
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters |
G10L 25/06
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being correlation coefficients |
G10L 25/09
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being zero crossing rates |
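The zero crossing rate parameter named by G10L 25/09 is among the cheapest speech features to compute: the fraction of adjacent sample pairs whose signs differ. A minimal sketch (illustrative only; the function name is hypothetical):

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in `frame` whose signs differ.
    Zero is treated as nonnegative."""
    crossings = sum(
        1 for x, y in zip(frame, frame[1:])
        if (x >= 0) != (y >= 0)
    )
    return crossings / (len(frame) - 1)
```

High-frequency (e.g. fricative) frames cross zero often while voiced frames cross rarely, which is why this parameter appears in voiced/unvoiced and speech/silence discrimination.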
G10L 25/12
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being prediction coefficients |
G10L 25/15
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being formant information |
G10L 25/18
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band |
G10L 25/21
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being power information |
G10L 25/24
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being the cepstrum |
G10L 25/27
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique |
G10L 25/30
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks |
G10L 25/33
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using fuzzy logic |
G10L 25/36
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using chaos theory |
G10L 25/39
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using genetic algorithms |
G10L 25/45
|
Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of analysis window |
G10L 25/48
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use |
G10L 25/51
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination |
G10L 25/54
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for retrieval |
G10L 25/57
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for processing of video signals |
G10L 25/60
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals |
G10L 25/63
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state |
G10L 25/66
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition |
G10L 25/69
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for evaluating synthetic or decoded voice signals |
G10L 25/72
|
Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for transmitting results of analysis |
G10L 25/75
|
Speech or voice analysis techniques not restricted to a single one of groups for modelling vocal tract parameters |
G10L 25/78
|
Detection of presence or absence of voice signals |
G10L 25/81
|
Detection of presence or absence of voice signals for discriminating voice from music |
G10L 25/84
|
Detection of presence or absence of voice signals for discriminating voice from noise |
G10L 25/87
|
Detection of discrete points within a voice signal |
G10L 25/90
|
Pitch determination of speech signals |
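One common family of pitch-determination methods under G10L 25/90 picks the autocorrelation peak within a plausible fundamental-frequency band. A minimal sketch under that assumption (illustrative only; the function name and band limits are hypothetical):

```python
def estimate_pitch(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate fundamental frequency (Hz) of a voiced frame by the
    lag of the largest autocorrelation value in [fmin, fmax]."""
    lag_min = int(sample_rate / fmax)          # shortest candidate period
    lag_max = int(sample_rate / fmin)          # longest candidate period
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, min(lag_max, len(frame) - 1) + 1):
        corr = sum(frame[i] * frame[i + lag]
                   for i in range(len(frame) - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag if best_lag else 0.0
```

Practical pitch trackers add voicing decisions, normalisation, and octave-error correction; this bare form illustrates only the core correlation search.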
G10L 25/93
|
Discriminating between voiced and unvoiced parts of speech signals |
G10L 99/00
|
Subject matter not provided for in other groups of this subclass |