|
Results for patents
1.
|
SYSTEMS AND METHODS FOR UNSUPERVISED PARAPHRASE MINING
Application number |
18366890 |
Status |
Pending |
Filing date |
2023-08-08 |
First publication date |
2024-01-18 |
Owner |
Recruit Co., Ltd. (Japan)
|
Inventor(s) |
- Golshan, Behzad
- Chen, Chen
- Tan, Wang-Chiew
- Ma, Danni
|
Abstract
Disclosed embodiments relate to aligning pairs of sentences. Techniques can include receiving a plurality of sentences; generating a graph for each of at least two sentences of the plurality of sentences, wherein generating a graph for each sentence of the at least two sentences comprises: identifying one or more tokens for the sentence; and connecting via edges the one or more tokens; generating a combined graph for the at least two sentences wherein generating a combined graph comprises: aligning the identified tokens of the at least two sentences of the plurality of sentences; identifying matching and non-matching tokens between the at least two sentences based on the alignment; and merging matching tokens into a combined graph node.
IPC classes
- G06F 40/35 - Discourse or dialogue representation
- G06F 40/268 - Morphological analysis
- G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
- G06F 18/2323 - Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
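
As a rough illustration of the abstract above, the following Python sketch builds a graph per sentence, aligns the tokens of two sentences, and merges matching tokens into shared nodes. Whitespace tokenisation, the chain-shaped per-sentence graph, and the use of `difflib.SequenceMatcher` for alignment are assumptions made for this example; the abstract does not name a tokeniser or an alignment method, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def sentence_graph(sentence):
    """Return (tokens, edges); edges chain consecutive token positions.

    Assumption: lowercase whitespace tokens and a simple chain graph.
    """
    tokens = sentence.lower().split()
    edges = [(i, i + 1) for i in range(len(tokens) - 1)]
    return tokens, edges


def combined_graph(sent_a, sent_b):
    """Merge two sentence graphs, sharing one node per matching token."""
    tokens_a, edges_a = sentence_graph(sent_a)
    tokens_b, edges_b = sentence_graph(sent_b)

    # Assumption: a longest-matching-block alignment stands in for
    # whatever alignment method the disclosed embodiments use.
    matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)

    # By default every token keeps its own node, tagged by sentence.
    node_a = {i: ("a", i) for i in range(len(tokens_a))}
    node_b = {j: ("b", j) for j in range(len(tokens_b))}

    # Matching tokens are merged into a single combined node.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            shared = ("ab", block.a + k)
            node_a[block.a + k] = shared
            node_b[block.b + k] = shared

    labels = {}
    for i, tok in enumerate(tokens_a):
        labels[node_a[i]] = tok
    for j, tok in enumerate(tokens_b):
        labels[node_b[j]] = tok

    edges = {(node_a[u], node_a[v]) for u, v in edges_a}
    edges |= {(node_b[u], node_b[v]) for u, v in edges_b}
    return labels, edges


labels, edges = combined_graph(
    "The room was very clean", "The room was really spotless"
)
merged = sorted(tok for node, tok in labels.items() if node[0] == "ab")
```

In this sketch the non-matching tokens remain as separate branches hanging off the merged nodes, which is where candidate paraphrases would show up.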
|
2.
|
SYSTEMS AND METHODS FOR GENERALIZED ENTITY MATCHING
Application number |
17660813 |
Status |
Pending |
Filing date |
2022-04-26 |
First publication date |
2023-10-26 |
Owner |
Recruit Co., Ltd. (Japan)
|
Inventor(s) |
- Wang, Jin
- Li, Yuliang
- Hirota, Wataru
|
Abstract
Disclosed embodiments relate to generalized entity matching. Techniques can include receiving a data pair of two entities that may be pre-processed to have parsable data structures, and serializing the data pair into a sequence of tokens based on data structure of each entity in the data pair. Techniques can further include encoding the serialized data pair to include topic attributes that may be mapped to data in the data pair and the topic of the mapped data matches the topic represented by topic attribute and the data in the data pair is concatenated. Techniques can further include pooling attributes in the data pair based on contextualized attributed representations of each encoded entity in the data pair and schema of each entity of the data pairs, where the contextual attribute representations are based on a first token of each encoded attribute in the sequence of tokens, and predicting matching labels between the data pairs based on pooled attributes.
IPC classes
- G06F 40/40 - Processing or translation of natural language
- G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
- G06F 40/205 - Parsing
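
The serialization step in the abstract above can be sketched in Python as follows. This is a minimal example under stated assumptions: entities are flat dictionaries, and the marker tokens `[CLS]`, `[SEP]`, `[COL]`, and `[VAL]` are illustrative choices that the abstract does not specify. The sketch also records the position of the first token of each attribute, since the abstract bases the contextual attribute representations on that token. All function names are hypothetical.

```python
def serialize_entity(entity):
    """Flatten one entity (a dict) into tokens.

    Returns the tokens and the index of the first token of each attribute.
    Assumption: [COL]/[VAL] markers delimit attribute names and values.
    """
    tokens, attr_starts = [], {}
    for name, value in entity.items():
        attr_starts[name] = len(tokens)
        tokens += ["[COL]", name, "[VAL]", *str(value).split()]
    return tokens, attr_starts


def serialize_pair(left, right):
    """Serialize a data pair into one token sequence."""
    left_tokens, left_starts = serialize_entity(left)
    right_tokens, right_starts = serialize_entity(right)
    tokens = ["[CLS]", *left_tokens, "[SEP]", *right_tokens, "[SEP]"]

    # Shift attribute positions to their place in the joint sequence:
    # the left entity follows [CLS]; the right follows [CLS], left, [SEP].
    offset = len(left_tokens) + 2
    starts = {("left", k): v + 1 for k, v in left_starts.items()}
    starts.update({("right", k): v + offset for k, v in right_starts.items()})
    return tokens, starts


tokens, starts = serialize_pair(
    {"title": "iPhone 12", "brand": "Apple"},
    {"name": "Apple iPhone 12"},
)
```

An encoder's output at each index in `starts` could then be pooled per attribute before predicting a match label; that encoder is outside the scope of this sketch.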
|
3.
|
Systems and methods for enhanced review comprehension using domain-specific knowledgebases
Application number |
18295735 |
Patent number |
11934783 |
Status |
Granted - in force |
Filing date |
2023-04-04 |
First publication date |
2023-09-07 |
Grant date |
2024-03-19 |
Owner |
RECRUIT CO., LTD. (Japan)
|
Inventor(s) |
- Suhara, Yoshihiko
- Golshan, Behzad
- Li, Yuliang
- Chen, Chen
- Wang, Xiaolan
- Li, Jinfeng
- Tan, Wang-Chiew
- Demiralp, Çağatay
- Traylor, Aaron
|
Abstract
Disclosed embodiments relate to natural language processing. Techniques can include receiving input text, extracting, from the input text, at least one modifier and aspect pair, receiving data from a knowledgebase, based on the at least one modifier and aspect pair and commonsense data, generating one or more premise embeddings, converting the input text into tokens, generating at least one vector for one or more of the tokens based on an analysis of the tokens, combining the at least one vector with the one or more premise embeddings to create at least one combined vector, and analyzing the at least one combined vector, wherein the analysis generates an output indicative of a feature of the input text.
IPC classes
- G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
- G06F 16/35 - Clustering; Classification
- G06F 18/211 - Selection of the most significant subset of features
- G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks
|
4.
|
META-LEARNING DATA AUGMENTATION FRAMEWORK
Application number |
17246354 |
Status |
Pending |
Filing date |
2021-04-30 |
First publication date |
2022-11-03 |
Owner |
Recruit Co., Ltd. (Japan)
|
Inventor(s) |
- Li, Yuliang
- Wang, Xiaolan
- Miao, Zhengjie
|
Abstract
Disclosed embodiments relate to generating training data for a machine learning model. Techniques can include accessing a machine learning model from a machine learning model repository and identifying a data set associated with the machine learning model. The identified data set is utilized to generate a set of data augmentation operators. The data augmentation operators are applied to a selected sequence of tokens associated with the machine learning model to generate sequences of tokens. A subset of the sequences of tokens is selected and stored in a training data repository. The stored sequences of tokens are provided to the machine learning model as training data.
IPC classes
- G06N 20/00 - Machine learning
- G06N 5/02 - Knowledge representation; Symbolic representation
- G06F 16/21 - Design, administration or maintenance of databases
- G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
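
The apply-then-select flow in the abstract above might look roughly like the Python sketch below. The two operators (token deletion and adjacent-token swap) and the `score` callback used for selection are assumptions chosen for illustration; the abstract does not say which operators are generated from the data set or how the subset is chosen, and all function names are hypothetical.

```python
import random


def delete_token(tokens, rng):
    """Hypothetical operator: drop one randomly chosen token."""
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def swap_tokens(tokens, rng):
    """Hypothetical operator: swap one randomly chosen adjacent pair."""
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens) - 1)
    out = list(tokens)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def augment(tokens, operators, per_operator, rng):
    """Apply each operator several times to one selected token sequence."""
    return [op(tokens, rng) for op in operators for _ in range(per_operator)]


def select_subset(candidates, score, keep):
    """Keep the `keep` best distinct candidates under `score`.

    Assumption: `score` is a placeholder for whatever selection
    criterion an actual system would use.
    """
    unique = {tuple(c) for c in candidates}
    ranked = sorted(unique, key=lambda c: (-score(c), c))
    return [list(c) for c in ranked[:keep]]


rng = random.Random(0)
tokens = "the staff was friendly".split()
candidates = augment(tokens, [delete_token, swap_tokens], 3, rng)
training_data = select_subset(candidates, score=len, keep=2)
```

Here `score=len` simply prefers longer sequences; it is only a stand-in so the example runs end to end.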
|
5.
|
SYSTEMS AND METHODS FOR SEMI-SUPERVISED EXTRACTION OF TEXT CLASSIFICATION INFORMATION
Application number |
17151088 |
Status |
Pending |
Filing date |
2021-01-15 |
First publication date |
2022-07-21 |
Owner |
Recruit Co., Ltd. (Japan)
|
Inventor(s) |
- Miao, Zhengjie
- Li, Yuliang
- Wang, Xiaolan
- Tan, Wang-Chiew
|
Abstract
Disclosed embodiments relate to extracting classification information from input text. Techniques can include obtaining input text, identifying a plurality of tokens in the input text, pre-training a machine learning model, determining tagging information of the plurality of tokens using a first classification layer of the machine learning model, pairing sequences of tokens using the tagging information associated with the plurality of tokens, wherein the paired sequences of tokens are determined by a second classification layer, determining one or more attribute classifiers to apply to the one or more paired sequences, wherein the attribute classifiers are determined by a third classification layer of the machine learning model, evaluating sentiments of the paired sequences, wherein the sentiments of the paired sequences are determined by a fourth classification layer of the language machine learning model, aggregating sentiments of the paired sequences associated with an attribute classifier, and storing the aggregated sentiments.
IPC classes
- G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
- G06N 20/00 - Machine learning
- G06N 5/04 - Inference or reasoning models
- G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
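
The final aggregation step of the abstract above can be sketched in a few lines of Python. The sketch assumes the four classification layers have already produced tuples of (aspect span, opinion span, attribute, sentiment); that tuple layout and the function name are hypothetical, and the model itself is not shown.

```python
from collections import Counter, defaultdict


def aggregate_sentiments(extractions):
    """Count sentiments per attribute.

    Assumption: `extractions` holds (aspect, opinion, attribute, sentiment)
    tuples produced upstream by the tagging, pairing, attribute, and
    sentiment layers.
    """
    summary = defaultdict(Counter)
    for _aspect, _opinion, attribute, sentiment in extractions:
        summary[attribute][sentiment] += 1
    return {attr: dict(counts) for attr, counts in summary.items()}


summary = aggregate_sentiments([
    ("room", "spotless", "cleanliness", "positive"),
    ("bathroom", "dirty", "cleanliness", "negative"),
    ("sheets", "fresh", "cleanliness", "positive"),
    ("staff", "rude", "service", "negative"),
])
```

The resulting dictionary is what would be written to storage in the "storing the aggregated sentiments" step.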
|
6.
|
SYSTEMS AND METHODS FOR MULTILINGUAL SENTENCE EMBEDDINGS
Application number |
17008569 |
Status |
Pending |
Filing date |
2020-08-31 |
First publication date |
2022-03-03 |
Owner |
Recruit Co., Ltd. (Japan)
|
Inventor(s) |
- Hirota, Wataru
- Suhara, Yoshihiko
- Golshan, Behzad
- Tan, Wang-Chiew
|
Abstract
Disclosed embodiments relate to natural language processing. Techniques can include obtaining an encoding model, obtaining a first sentence in a first language and a label associated with the first sentence, obtaining a second sentence in a second language, encoding the first sentence and second sentence using the encoding model, determining the intent of the first encoded sentence, determining the language of the first encoded sentence and the language of the second encoded sentence, and updating the encoding model based on the determined intent of the first encoded sentence, the label, the determined language of the first encoded sentence, and the determined language of the second encoded sentence.
|
7.
|
Systems and methods for enhanced review comprehension using domain-specific knowledgebases
Application number |
17008572 |
Patent number |
11620448 |
Status |
Granted - in force |
Filing date |
2020-08-31 |
First publication date |
2022-03-03 |
Grant date |
2023-04-04 |
Owner |
RECRUIT CO., LTD. (Japan)
|
Inventor(s) |
- Suhara, Yoshihiko
- Golshan, Behzad
- Li, Yuliang
- Chen, Chen
- Wang, Xiaolan
- Li, Jinfeng
- Tan, Wang-Chiew
- Demiralp, Çağatay
- Traylor, Aaron
|
Abstract
Disclosed embodiments relate to natural language processing. Techniques can include receiving input text, extracting, from the input text, at least one modifier and aspect pair, receiving data from a knowledgebase, based on the at least one modifier and aspect pair and commonsense data, generating one or more premise embeddings, converting the input text into tokens, generating at least one vector for one or more of the tokens based on an analysis of the tokens, combining the at least one vector with the one or more premise embeddings to create at least one combined vector, and analyzing the at least one combined vector, wherein the analysis generates an output indicative of a feature of the input text.
IPC classes
- G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
- G06F 16/35 - Clustering; Classification
- G06K 9/62 - Methods or arrangements for recognition using electronic means
- G06N 7/00 - Computing arrangements based on specific mathematical models
|
8.
|
Systems and methods for unsupervised paraphrase mining
Application number |
17008563 |
Patent number |
11741312 |
Status |
Granted - in force |
Filing date |
2020-08-31 |
First publication date |
2022-03-03 |
Grant date |
2023-08-29 |
Owner |
RECRUIT CO., LTD. (Japan)
|
Inventor(s) |
- Golshan, Behzad
- Chen, Chen
- Tan, Wang-Chiew
- Ma, Danni
|
Abstract
Disclosed embodiments relate to aligning pairs of sentences. Techniques can include receiving a plurality of sentences; generating a graph for each of at least two sentences of the plurality of sentences, wherein generating a graph for each sentence of the at least two sentences comprises: identifying one or more tokens for the sentence; and connecting via edges the one or more tokens; generating a combined graph for the at least two sentences wherein generating a combined graph comprises: aligning the identified tokens of the at least two sentences of the plurality of sentences; identifying matching and non-matching tokens between the at least two sentences based on the alignment; and merging matching tokens into a combined graph node.
IPC classes
- G06F 40/35 - Discourse or dialogue representation
- G06F 40/268 - Morphological analysis
- G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
- G06F 18/2323 - Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
|
|