42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing non-downloadable software that provides integrated business management intelligence by combining information from various databases and presenting it in a user interface; providing non-downloadable software for database deployment, querying, reporting, and importing and exporting data; providing non-downloadable software for creating, managing, editing, and operating databases; providing non-downloadable software for development of computer software applications; providing non-downloadable software for development, monitoring and management of databases and data warehouses for use by database administrators and software developers.
The technology is generally directed to personalizing digital components for users based on a merged preference profile. The merged preference profile may be generated based on declared and inferred preferences of the users. The declared and inferred preferences may be preferences related to brands, topics, types of products, etc. The merged preference profile may be used to personalize the digital components output to the user. For example, when identifying digital components to be selected, at least some of the digital components may be selected based on the merged preference profile of the user. At least a portion of the selected digital components may be digital components with a subject matching the preferences of the user, thereby improving the digital components output to the user.
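The merging and selection logic described above can be sketched as follows; the function names, the declared-versus-inferred weighting, and the single-score-per-topic representation are illustrative assumptions, not the claimed implementation.

```python
def merge_preferences(declared, inferred, declared_weight=0.7):
    """Merge declared and inferred preference scores into one profile.

    Declared preferences (explicitly set by the user) are weighted more
    heavily than inferred ones (derived from user behavior).
    """
    merged = {}
    for topic in set(declared) | set(inferred):
        d = declared.get(topic, 0.0)
        i = inferred.get(topic, 0.0)
        merged[topic] = declared_weight * d + (1 - declared_weight) * i
    return merged

def select_components(components, profile, k=2):
    """Rank candidate digital components by how well their subject
    matches the merged profile, returning the top k."""
    return sorted(components,
                  key=lambda c: profile.get(c["subject"], 0.0),
                  reverse=True)[:k]
```

A component whose subject scores highly in the merged profile is surfaced ahead of components the user has shown no interest in.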
A display system includes a waveguide with tunable waveguide gratings having a pitch (i.e., a distance between adjacent grating features) that can be adjusted to optimize efficiency for each color of light and a light engine to time multiplex projection of each color of a frame synchronously with the grating adjustments. Light of each color of a frame is projected toward the waveguide while the gratings are tuned to the projected color, and the multiplexing frequency is higher than the human vision refresh rate, such that the discontinuity in projection of the different colors of light is not noticeable.
A method (600) includes receiving training data including transcribed speech utterances (304) spoken in a general domain, modified speech utterances (305) in a target domain, and unspoken textual utterances (320) in the target domain. The modified speech utterances include utterances spoken in the target domain that have been modified to obfuscate one or more classes of sensitive information recited in the utterances. The method also includes generating a corresponding alignment output (402) for each unspoken textual utterance of the received training data using an alignment model (400). The method also includes training a speech recognition model (200) on the alignment outputs corresponding to the unspoken textual utterances, the un-transcribed speech utterances, and the transcribed speech utterances to teach the speech recognition model to learn to recognize speech in the target domain and phrases within the one or more classes of sensitive information.
A method (500) includes, for each of a plurality of training samples (420), processing, using an RNN-T model (200), a corresponding sequence of acoustic frames (422) to obtain an n-best list of speech recognition hypotheses (236), and, for each hypothesis of the n-best list, determining a corresponding number of word errors relative to a corresponding ground-truth transcription (424). For a top-ranked hypothesis from the n-best list, the method includes determining a first loss based on the corresponding ground-truth transcription. The method includes identifying, as an oracle hypothesis, the speech recognition hypothesis from the n-best list having the smallest corresponding number of word errors relative to the corresponding ground-truth transcription, and determining a second loss for the oracle hypothesis based on the corresponding ground-truth transcription. The method includes determining a corresponding combined loss (432) based on the first and second losses, and training the model based on the corresponding combined loss.
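The loss computation above can be illustrated with a small sketch. The word-error computation is standard word-level edit distance; the per-hypothesis loss (word errors normalized by reference length) and the weighting `alpha` are toy stand-ins for the model's actual differentiable losses.

```python
def word_errors(hyp, ref):
    """Word-level edit distance between a hypothesis and a reference."""
    h, r = hyp.split(), ref.split()
    d = list(range(len(r) + 1))            # single-row DP over the reference
    for i, hw in enumerate(h, 1):
        prev, d[0] = d[0], i
        for j, rw in enumerate(r, 1):
            prev, d[j] = d[j], min(d[j] + 1,        # deletion
                                   d[j - 1] + 1,    # insertion
                                   prev + (hw != rw))  # substitution/match
    return d[len(r)]

def combined_loss(n_best, ref, alpha=0.5):
    """Combine a loss on the top-ranked hypothesis with a loss on the
    oracle hypothesis (the one with the fewest word errors)."""
    ref_len = len(ref.split())
    errs = [word_errors(h, ref) for h in n_best]
    first = errs[0] / ref_len              # top-ranked hypothesis loss
    oracle = min(errs) / ref_len           # oracle hypothesis loss
    return alpha * first + (1 - alpha) * oracle
```

Training on the combined loss rewards both the hypothesis the model currently ranks first and the best hypothesis the model is capable of producing.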
Systems and methods for universal handwriting recognition are disclosed herein. The method can include receiving data representing handwriting, including at least one style characteristic of the handwriting, and processing the data representing the handwriting with one or more handwriting recognition models. The method can also include receiving one or more strokes associated with the handwriting from the one or more handwriting recognition models and providing the strokes to a software application. The method can further include generating a visual representation of the handwriting in the software application based on the strokes and generating a style for future digital handwriting in the software application, the style being generated based on the strokes and the at least one style characteristic of the handwriting.
G06V 30/244 - Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font
Training and/or utilizing a Speech-To-Speech Translation (S2ST) system that can be used to generate, based on processing source audio data that captures a spoken utterance in a source language, target audio data that includes a synthetic spoken utterance that is spoken in a target language and that corresponds, both linguistically and para-linguistically, to the spoken utterance in the source language. Implementations that are directed to training the S2ST system utilize an unsupervised approach, with monolingual speech data, in training the S2ST system.
A query may be received, e.g., from a client device operated by a user. Contextual information associated with the user or the client device may be retrieved. Generative model (GM) output may be generated based on processing, using a generative model, data indicative of the query and the contextual information. Synthetic queries may be generated using the GM output, and search result documents (SRDs) may be selected. State data indicative of the query, the contextual information, one or more of the synthetic queries, and the set of search result documents may be processed to identify a classification of the query. Based on the classification, downstream GM(s) may be selected and used to generate one or more additional GM outputs.
Some implementations relate to utilizing both a large language model (LLM) and a visual language model (VLM) in generating, and at least selectively refining, a plan for the execution of a long-horizon robotic task. Various implementations include processing, using the LLM, a free-form natural language instruction that describes the robotic task, to generate LLM output. In some implementations, the LLM output can reflect natural language that indicates one or more sub-tasks to perform the task. Additional or alternative implementations include processing, using the VLM, one or more instances of vision data capturing the environment of the robot and a VLM prompt that is based on the LLM output, to generate a task-conditioned description of the environment. Some implementations additionally or alternatively use the task-conditioned description of the environment to refine the one or more sub-tasks based on the current environment of the robot.
A method (600) of training an accent recognition model (204) includes receiving a corpus of training utterances (242) spoken across various accents, each training utterance in the corpus including training audio features (244) characterizing the training utterance, and executing a training process (300) to train the accent recognition model on the corpus of training utterances to teach the accent recognition model to learn how to predict accent representations (232) from the training audio features. The accent recognition model includes one or more strided convolution layers (210), a stack of multi-headed attention layers (220), and a pooling layer (230) configured to generate a corresponding accent representation.
An apparatus includes a light engine to emit light having a first polarization state and a polarization gradient layer to convert a portion of the light having the first polarization state to light having a second polarization state. The apparatus also includes a waveguide including an incoupler to incouple light having the first polarization state at a first diffraction efficiency and light having the second polarization state at a second diffraction efficiency different than the first diffraction efficiency. The light having the first polarization state exiting the polarization gradient layer is incident on the edge of the incoupler that falls closer to the direction of light propagation within the waveguide.
A model (200) includes an encoder (210) to receive acoustic frames (110) and generate, at each of a plurality of output steps, a higher-order feature representation (212) for a corresponding acoustic frame. The model also includes a multi-output HAT decoder (220) to generate at each output step a probability distribution over possible speech recognition hypotheses (242), and an indication (232) of whether the output step corresponds to an auxiliary token associated with a particular auxiliary task. The model is trained by a JEIT training process (400) based on: a paired training data set (415) including paired audio data (422) and transcriptions (424), the transcriptions annotated with ground-truth auxiliary tokens (426) associated with the particular auxiliary task; and an unpaired training data set (435) including textual utterances (442) not paired with any corresponding audio data, the textual utterances annotated with the ground-truth auxiliary tokens (444) associated with the particular auxiliary task.
A method (500) includes receiving a sequence of acoustic frames (110) characterizing an utterance (106). During a first pass (301), the method includes generating first-pass audio encodings (222) based on the sequence of acoustic frames using a stack of mask-conformer blocks (300) of an acoustic encoder (220), generating a first-pass transcription (120a) of the utterance based on the first-pass audio encodings using a speech recognition decoder (230), and generating a first-pass masked output sequence (344) using a mask-predict decoder (340) of the acoustic encoder. During a second pass (302), the method includes generating second-pass audio encodings (224) by performing cross-attention on the sequence of acoustic frames and the masked first-pass transcription using the stack of mask-conformer blocks of the acoustic encoder and generating a second-pass transcription (120b) of the utterance based on the second-pass audio encodings using the speech recognition decoder.
Aspects of the disclosed technology include techniques and mechanisms for an efficient error correction coding scheme that can detect and correct data errors that may occur in a memory. In general, the scheme comprises segmenting the data that would be transferred as part of a data request into different parts and applying error correction codes to the separate parts.
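As a concrete (and deliberately simple) instance of segment-and-encode, the sketch below splits data into 4-bit parts and protects each with a Hamming(7,4) code, which can correct a single-bit error per segment independently. The actual coding scheme in the disclosure is not specified at this level; this only illustrates the segmentation idea.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits [d1, d2, d3, d4] as a 7-bit Hamming codeword
    laid out as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4          # parity over codeword positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4          # parity over positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4          # parity over positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(code):
    """Decode a 7-bit codeword, correcting any single-bit error."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the corrupted bit
    return [c[2], c[4], c[5], c[6]]

def protect(bits):
    """Segment data into 4-bit parts (length must be a multiple of 4)
    and encode each part separately, so a single-bit error in any
    segment can be corrected without touching the others."""
    return [hamming74_encode(bits[i:i + 4]) for i in range(0, len(bits), 4)]
```

Encoding segments separately bounds the blast radius of an error to one part of the transferred data.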
A joint segmenting and ASR model (200) includes an encoder (210) to receive acoustic frames (110) and generate, at each of a plurality of output steps, a higher order feature representation (214) for a corresponding acoustic frame. The model also includes a decoder (220) to generate based on the higher order feature representation at each output step a probability distribution (224) over possible speech recognition hypotheses, and an indication (232) of whether the output step corresponds to an end of segment (EOS). The model is trained on a set of training samples (415), each training sample (424) including audio data (422) characterizing multiple segments of long-form speech; and a corresponding transcription (424), the corresponding transcription annotated with ground-truth EOS labels obtained via distillation from a language model teacher (510) that receives the corresponding transcription as input and injects the ground-truth EOS labels into the corresponding transcription between semantically complete segments.
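The teacher's EOS injection can be approximated with a toy heuristic: here sentence-final punctuation stands in for the language-model teacher's judgment of where a semantically complete segment ends. The `<eos>` token string is an illustrative choice.

```python
import re

def inject_eos(transcription, eos="<eos>"):
    """Toy stand-in for the language-model teacher: treat sentence-final
    punctuation as the boundary of a semantically complete segment and
    inject an EOS label between segments."""
    segments = re.split(r'(?<=[.?!])\s+', transcription.strip())
    return f" {eos} ".join(segments)
```

The annotated transcription then supervises the decoder's end-of-segment indication alongside its speech recognition output.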
Methods, systems, and apparatus for learning quantum systems via out-of-time-ordered correlators. In one aspect, a method includes measuring, by a control and measurement system, an out-of-time-ordered correlator value for a quantum system that includes a plurality of qubits, where the plurality of qubits comprises a probe qubit and one or more other qubits. To measure the out-of-time-ordered correlator value, the probe qubit is prepared in an initial state. Forward time evolution is performed on the quantum system for a time t. A unitary operator is applied to one or more qubits in the quantum system. Backward time evolution is performed on the quantum system for the time t, and the probe qubit is measured to obtain the out-of-time-ordered correlator value. A classical computing device processes the measured out-of-time-ordered correlator value to determine properties of the quantum system.
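The measurement sequence described above (forward evolution, applied unitary, backward evolution, probe readout) corresponds to evaluating an out-of-time-ordered correlator, which a small exact simulation can illustrate. The two-qubit Hamiltonian, the operator choices, and the function names below are arbitrary examples, not the disclosed system.

```python
import numpy as np

# Single-qubit Paulis; qubit 0 plays the role of the probe.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def evolve(H, t):
    """U(t) = exp(-i H t) for a Hermitian H, via eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return v @ np.diag(np.exp(-1j * w * t)) @ v.conj().T

def otoc(H, W, B, psi0, t):
    """F(t) = <psi0| W(t)^† B^† W(t) B |psi0> with W(t) = U^† W U,
    mirroring the forward-evolve / apply-unitary / backward-evolve /
    measure-probe sequence of the protocol."""
    U = evolve(H, t)
    Wt = U.conj().T @ W @ U
    return psi0.conj() @ (Wt.conj().T @ B.conj().T @ Wt @ B @ psi0)

# Two qubits: probe observable B on qubit 0, butterfly W on qubit 1,
# mixed-field Ising Hamiltonian so the correlator is nontrivial.
H = np.kron(Z, Z) + np.kron(X, I2) + np.kron(I2, X)
W = np.kron(I2, X)
B = np.kron(Z, I2)
psi0 = np.zeros(4, dtype=complex)
psi0[0] = 1.0                       # |00> initial state
```

At t = 0 the operators commute and F(0) = 1; the decay of |F(t)| as the butterfly operator spreads onto the probe qubit is what diagnoses scrambling in the quantum system.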
18.
SYSTEMS AND METHODS FOR DIGITAL INK GENERATION AND EDITING
Systems and methods for editing and generating digital ink. The present technology may provide systems and methods for training a handwriting model to generate digital ink that is stylistically and visually consistent with an original handwriting input, but which incorporates one or more changes to the text of the original handwriting input. In some examples, training may be performed using training examples that include an original handwriting sample and an original label representing the sequence of characters in the original handwriting sample. In such a case, the original handwriting sample may be processed to generate a style vector that is randomly masked, and the handwriting model may then be trained to generate a predicted handwriting sample that closely matches the original handwriting sample using the masked style vector and the original label as inputs.
Implementations disclosed herein are directed to unsupervised federated training of global machine learning (“ML”) model layers that, after the federated training, can be combined with additional layer(s), thereby resulting in a combined ML model. Processor(s) can: detect audio data that captures a spoken utterance of a user of a client device; process, using a local ML model, the audio data to generate predicted output(s); generate, using unsupervised learning locally at the client device, a gradient based on the predicted output(s); transmit the gradient to a remote system; update weight(s) of the global ML model layers based on the gradient; subsequent to updating the weight(s), train, using supervised learning remotely at the remote system, a combined ML model that includes the updated global ML model layers and additional layer(s); transmit the combined ML model to the client device; and use the combined ML model to make prediction(s) at the client device.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for jointly learning the architecture of a neural network during the training of the neural network. In particular, the architecture of the neural network is learned using differentiable parametric masks.
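A minimal sketch of a differentiable parametric mask: each unit is gated by m = sigmoid(theta), an L1 penalty on the mask pushes unneeded units toward zero, and theta is updated by gradient descent jointly with the weights. The scalar linear model and hand-derived gradients are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def loss(x, t, w, theta, lam=0.01):
    """Squared error of the masked model plus an L1 penalty on the mask."""
    m = sigmoid(theta)
    return (np.sum(w * m * x) - t) ** 2 + lam * np.sum(m)

def step(x, t, w, theta, lr=0.1, lam=0.01):
    """One joint gradient step on the weights w and mask parameters theta,
    so the architecture (which units survive) is learned during training."""
    m = sigmoid(theta)
    err = np.sum(w * m * x) - t
    grad_w = 2 * err * m * x                              # dL/dw through the mask
    grad_theta = (2 * err * w * x + lam) * m * (1 - m)    # chain rule through sigmoid
    return w - lr * grad_w, theta - lr * grad_theta
```

After training, units whose mask values have collapsed toward zero can be removed outright, yielding the learned architecture.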
Methods, systems, and apparatus for hybrid quantum-classical quantum Monte Carlo. In one aspect, a method includes receiving, by a classical computer, data generated by a quantum computer, the data representing results of measurements of a trial wavefunction, wherein the trial wavefunction approximates the target wavefunction and is prepared by the quantum computer, computing, by the classical computer, a classical shadow of the trial wavefunction using the data representing the results of the measurements of the trial wavefunction, and performing, by the classical computer, imaginary time propagation for a sequence of imaginary time steps of an initial wavefunction using a Hamiltonian that characterizes the fermionic quantum system, wherein: the imaginary time propagation is performed until predetermined convergence criteria are met; and performing each imaginary time step of the imaginary time propagation comprises updating the wavefunction for the previous imaginary time step using the classical shadow of the trial wavefunction to obtain a wavefunction for the current imaginary time step.
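The classical propagation loop can be sketched in isolation (the quantum trial wavefunction and classical-shadow machinery are omitted entirely here): imaginary-time evolution exponentially damps excited states, so the iterate converges to the ground state of the Hamiltonian.

```python
import numpy as np

def imaginary_time_ground_state(H, psi0, dt=0.1, tol=1e-10, max_steps=10000):
    """Iterate psi <- exp(-H*dt) psi (renormalized) until the energy
    converges; the propagator is built once by eigendecomposition."""
    w, v = np.linalg.eigh(H)
    prop = v @ np.diag(np.exp(-w * dt)) @ v.conj().T
    psi = psi0 / np.linalg.norm(psi0)
    energy = (psi.conj() @ H @ psi).real
    for _ in range(max_steps):
        psi = prop @ psi
        psi /= np.linalg.norm(psi)          # keep the state normalized
        new_energy = (psi.conj() @ H @ psi).real
        if abs(new_energy - energy) < tol:  # predetermined convergence criterion
            break
        energy = new_energy
    return energy, psi
```

In the hybrid algorithm, the update at each imaginary time step would additionally be constrained by the classical shadow of the quantum-prepared trial wavefunction rather than applied exactly as above.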
A joint auxiliary task and ASR model includes an encoder to receive a sequence of acoustic frames and generate, at each of a plurality of output steps, a higher-order feature representation for a corresponding acoustic frame. The model also includes a multi-output HAT decoder to generate at each of the plurality of output steps a probability distribution over possible speech recognition hypotheses, and an indication of whether the output step corresponds to an auxiliary token associated with a particular auxiliary task. The model is trained by a JEIT training process based on: a paired training data set including paired audio data and transcriptions, the transcriptions annotated with ground-truth auxiliary tokens associated with the particular auxiliary task; and an unpaired training data set including textual utterances not paired with any corresponding audio data, the textual utterances annotated with the ground-truth auxiliary tokens associated with the particular auxiliary task.
Systems, devices, and methods are described in which one or more tunable lens elements are incorporated within a lens structure communicatively coupled to a wearable display device operable to present augmented reality (AR) content to a user. The lens structure includes a display optics lens layer having a provided AR display, one or more eye-side lens layers disposed adjacent to the display optics lens layer and facing an eye of the user, and one or more world-side lens layers disposed adjacent to the display optics lens layer and facing away from the eye of the user. The world-side lens layers include a tunable lens component to selectively adjust a focal modulation of at least a portion of a real-world view of the user via the lens structure.
Optical systems may include MEMS mirrors having elliptical mirror plates. A laser scanning system may include a MEMS mirror that scans an incident light beam along a single scanning axis. The MEMS mirror may include an elliptical mirror plate having a semi-major axis that is aligned parallel or perpendicular to the rotational axis of the elliptical mirror plate. The incident light beam may have an elliptical cross-section, such that the incident light beam completely or substantially overlaps the reflecting surface of the elliptical mirror plate. After being reflected by the elliptical mirror plate, the light beam may be circularized via one or more shaping lenses disposed in the optical path of the reflected light beam, prior to projection of the light beam.
G02B 27/09 - Beam shaping, e.g. changing the cross-sectional area, not otherwise provided for
G02B 26/08 - Optical devices or arrangements for the control of light using movable or deformable optical elements for controlling the direction of light
Systems and methods for recommending media content to a user based on information associated with a referral source that referred the user to a media item provided by a source of the media content are presented. In one or more aspects, a system is provided that includes a presentation component that presents, via a user interface, a first media item associated with a media presentation source referred to a user through a referral source. The system further includes an analytics component that identifies a second media item based on media items associated with the media presentation source that are referred to other users through the referral source, and a recommendation component that recommends the second media item to the user through the user interface.
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
H04L 65/612 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
H04L 67/60 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
26.
MASK-CONFORMER AUGMENTING CONFORMER WITH MASK-PREDICT DECODER UNIFYING SPEECH RECOGNITION AND RESCORING
A method includes receiving a sequence of acoustic frames characterizing an utterance. During a first pass, the method includes generating first-pass audio encodings based on the sequence of acoustic frames using a stack of mask-conformer blocks of an acoustic encoder, generating a first-pass transcription of the utterance based on the first-pass audio encodings using a speech recognition decoder, and generating a first-pass masked output sequence using a mask-predict decoder of the acoustic encoder. During a second pass, the method includes generating second-pass audio encodings by performing cross-attention on the sequence of acoustic frames and the masked first-pass transcription using the stack of mask-conformer blocks of the acoustic encoder and generating a second-pass transcription of the utterance based on the second-pass audio encodings using the speech recognition decoder.
Systems and methods for identifying related videos based on elements tagged in the videos are presented. In an aspect, a system includes an identification component configured to identify tagged elements in a video, a matching component configured to identify other videos that include one or more of the tagged elements, and a recommendation component configured to recommend the other videos for viewing based on a current or past request to play the video.
G06F 16/783 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
Generally disclosed herein is an approach for a telemetry subsystem that enables telemetry data to be collected and processed without interrupting the processing jobs being executed by processing cores. The telemetry subsystem may include one or more telemetry cores dedicated to telemetry data collection. The telemetry cores are configured to receive telemetry data from telemetry agents, processing cores, and other components of a system on chip (SoC).
Methods, systems, and apparatus for solving quadratic optimization problems over orthogonal groups using quantum computing. In one aspect, a method includes receiving data representing a quadratic optimization problem, wherein decision variables of the quadratic optimization problem take values in an orthogonal group or a special orthogonal group; encoding the quadratic optimization problem as a quantum Hamiltonian, the encoding comprising using a Clifford algebra representation of the group to map orthogonal matrices or special orthogonal matrices in the group to respective quantum states in a Hilbert space; determining an approximate eigenstate of the quantum Hamiltonian; computing expectation values of Pauli operators with respect to the approximate eigenstate, wherein the Pauli operators comprise operators obtained by mapping multiplication operations of the Clifford algebra into the Hilbert space; and rounding the expectation values of the Pauli operators to elements of the orthogonal group to obtain a solution to the quadratic optimization problem.
Methods, systems, and apparatus, including a method for determining network measurements. In some aspects, a method includes receiving, by a first aggregation server and from each of multiple client devices, encrypted impression data. A second aggregation server receives, from each of at least a portion of the multiple client devices, conversion data that includes, for each conversion recorded by the client device, encrypted conversion value data. The first aggregation server and the second aggregation server perform a multi-party computation process to decrypt the encrypted impression data and the encrypted conversion data.
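The two-server computation can be illustrated with the simplest multi-party primitive, additive secret sharing: each client splits its value into two random shares, one per aggregation server, and only the combination of the two server totals reveals the aggregate. This is an illustrative stand-in, not the protocol claimed in the disclosure.

```python
import secrets

MODULUS = 2**61 - 1   # arithmetic is done modulo a large prime

def make_shares(value):
    """Split a client's conversion value into two additive shares;
    neither share alone reveals anything about the value."""
    r = secrets.randbelow(MODULUS)
    return r, (value - r) % MODULUS

def aggregate(shares_a, shares_b):
    """Each aggregation server sums the shares it received; combining
    the two server totals yields the aggregate value."""
    total_a = sum(shares_a) % MODULUS
    total_b = sum(shares_b) % MODULUS
    return (total_a + total_b) % MODULUS
```

Because each server sees only uniformly random shares, individual client values stay hidden even though the aggregate measurement is exact.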
A method for example-driven machine learning is disclosed herein. The method comprises maintaining a plurality of dialog system rules and a knowledge database including a plurality of intent objects and a plurality of entity objects. The plurality of intent objects and the plurality of entity objects are associated with at least one dialog system rule. An exemplary phrase is received and one or more linguistic elements are retrieved from the exemplary phrase. It is determined that at least one of the linguistic elements is directed to at least one of the plurality of intent objects or the plurality of entity objects, and at least one of the linguistic elements is added to the knowledge database in association with the at least one dialog system rule.
Systems and methods for generating augmented reality prerenderings can provide the benefit of an augmented reality rendering without requiring the use of user data. Template images can be used instead of user data to protect the user's privacy while enabling the user to see an object or product rendered onto a preferred template image or a variety of template images.
Methods, systems, and computer media provide attestation tokens that protect the integrity of communications transmitted from client devices, while at the same time avoiding the use of stable device identifiers that could be used to track client devices or their users. In one approach, client devices can receive anonymous certificates from a device integrity computing system signifying membership in a selected device trustworthiness group, and attestation tokens can be signed anonymously with the anonymous certificates using a group signature scheme. Client devices can include throttlers imposing limits on the quantity of attestation tokens created by the client device.
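The throttler mentioned above can be sketched as a sliding-window quota on token creation; the class name and the window semantics are assumptions for illustration, and the group-signature machinery is out of scope here.

```python
import time
from collections import deque

class TokenThrottler:
    """Limits how many attestation tokens a client device will mint
    within a sliding time window."""

    def __init__(self, max_tokens, window_seconds):
        self.max_tokens = max_tokens
        self.window = window_seconds
        self.issued = deque()              # timestamps of recent tokens

    def try_issue(self, now=None):
        now = time.monotonic() if now is None else now
        while self.issued and now - self.issued[0] >= self.window:
            self.issued.popleft()          # drop timestamps outside the window
        if len(self.issued) >= self.max_tokens:
            return False                   # over quota: refuse to sign a token
        self.issued.append(now)
        return True
```

Capping token volume per device limits the damage a compromised device can do while the anonymous certificates keep honest devices unlinkable.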
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
Generally disclosed herein is an approach for modifying use of segment routing multiprotocol label switching (SR-MPLS) allowing an arbitrary MPLS control plane and traditional MPLS data plane to utilize a single MPLS label to represent two or more edges in a path. MPLS labels may be divided into smaller sub-labels, which together uniquely represent a pair of edges along a route. In one example, a single MPLS label may be divided into two sub-labels, the first sub-label representing a first edge, and the second sub-label representing a second edge. In this regard, longer source routes may be supported in a packet header in network designs that implement strict source routing.
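The sub-label division can be sketched as simple bit packing: two 10-bit edge identifiers share the 20-bit MPLS label field. The even 10/10 split matches the two-sub-label example in the text; a real deployment could divide the field differently.

```python
EDGE_BITS = 10                     # two 10-bit sub-labels fill the 20-bit label field
EDGE_MASK = (1 << EDGE_BITS) - 1

def pack_label(first_edge, second_edge):
    """Pack a pair of edge identifiers into a single 20-bit MPLS label value."""
    assert 0 <= first_edge <= EDGE_MASK and 0 <= second_edge <= EDGE_MASK
    return (first_edge << EDGE_BITS) | second_edge

def unpack_label(label):
    """Recover the pair of edges represented by one label."""
    return (label >> EDGE_BITS) & EDGE_MASK, label & EDGE_MASK
```

Halving the labels needed per hop pair is what lets longer strict source routes fit in a fixed-size packet header.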
Example methods, apparatus, and systems for generating custom content responsive to a received search query are disclosed. An example method for generating custom content responsive to a received search query includes receiving, via a communication interface from a user computing device, a search query including one or more search terms; determining, responsive to the search query, a set of search results relevant to the search query; identifying, responsive to the search query, third-party content and/or a third party relevant to the search query; generating, based on (i) the search query and (ii) the third-party content or the third party, custom content relevant to the search query and related to a landing page associated with the third-party content or the third party, for presentation along with the set of search results; and transmitting, via the communication interface to the user computing device, the custom content.
Implementations described herein determine, for a given document generated by a given source, one or more portions of content (e.g., phrase(s), image(s), paragraph(s), etc.) of the given document that may be influenced by a source perspective of the given source. Further, implementations determine one or more additional resources that are related to the given source and that are related to the portion(s) of content of the given document. Yet further, implementations utilize the additional resource(s) to determine additional content that provides context for the portion(s) that may be influenced by a source perspective. A relationship, between the additional resource(s) and the portions of the given document, can be defined. Based on the relationship being defined, the additional content can be caused to be rendered at a client device in response to the client device accessing the given document.
A method includes, for each training sample of a plurality of training samples, processing, using an RNN-T model, a corresponding sequence of acoustic frames to obtain an n-best list of speech recognition hypotheses, and, for each speech recognition hypothesis of the n-best list, determining a corresponding number of word errors relative to a corresponding ground-truth transcription. For a top-ranked hypothesis from the n-best list, the method includes determining a first loss based on the corresponding ground-truth transcription. The method includes identifying, as an oracle hypothesis, the speech recognition hypothesis from the n-best list having the smallest corresponding number of word errors relative to the corresponding ground-truth transcription, and determining a second loss for the oracle hypothesis based on the corresponding ground-truth transcription. The method includes determining a corresponding self-training combined loss based on the first and second losses, and training the model based on the corresponding self-training combined loss.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text-to-image model so that the text-to-image model generates images that each depict a variable instance of an object class when the object class without the unique identifier is provided as a text input, and generates images that each depict the same subject instance of the object class when the unique identifier is provided as the text input.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
This disclosure relates to classification methods that can be implemented on quantum computing systems. According to a first aspect, this specification describes a method for training a classifier implemented on a quantum computer, the method comprising: preparing a plurality of qubits in an input state with a known classification, said plurality of qubits comprising one or more readout qubits; applying one or more parameterised quantum gates to the plurality of qubits to transform the input state to an output state; determining, using a readout state of the one or more readout qubits in the output state, a predicted classification of the input state; comparing the predicted classification with the known classification; and updating one or more parameters of the parameterised quantum gates in dependence on the comparison of the predicted classification with the known classification.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating descriptions of input images. One of the methods includes obtaining an input image; processing the input image using a first neural network to generate an alternative representation for the input image; and processing the alternative representation for the input image using a second neural network to generate a sequence of a plurality of words in a target natural language that describes the input image.
Implementations relate to automatic generation of speaker features for each of one or more particular text-dependent speaker verifications (TD-SVs) for a user. Implementations can generate speaker features for a particular TD-SV using instances of audio data that each capture a corresponding spoken utterance of the user during normal non-enrollment interactions with an automated assistant via one or more respective assistant devices. For example, a portion of an instance of audio data can be used in response to: (a) determining that recognized term(s) for the spoken utterance captured by that portion correspond to the particular TD-SV; and (b) determining that an authentication measure, for the user and for the spoken utterance, satisfies a threshold. Implementations additionally or alternatively relate to utilization of speaker features, for each of one or more particular TD-SVs for a user, in determining whether to authenticate a spoken utterance for the user.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L 17/04 - Training, enrolment or model building
G10L 17/10 - Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
G10L 17/14 - Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
The technology generally relates to systems and methods for performing in-field testing of processing cores within a system-on-chip (SoC), so as to identify faults, including those associated with silent data corruption. For example, an SoC may contain operational cores and spare cores. An operational core may be selected for testing while a spare core is used to replace the tested core. In addition, a spare core may be used to replace an operational core that has been determined to be corrupted.
Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; make a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determine that the decision was incorrect; and, in response to determining that the decision was incorrect, generate a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
A system and method for repartitioning data in a distributed network. The method may include executing, by one or more processors, a first pass of a data set from a plurality of first sources to a plurality of first sinks, each first sink collecting data from one or more of the first sources, and executing, by the one or more processors, a second pass of the data set from a plurality of second sources to a plurality of second sinks, each one of the plurality of first sinks corresponding to one of the plurality of second sources, and each second sink collecting data from one or more of the second sources. Executing the first and second passes causes the data set to be repartitioned such that one or more second sinks collect data that originated from two or more of the first sources.
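The two-pass structure above can be sketched with lists standing in for sources and sinks. The modulo routing rule is an illustrative choice, not specified by the abstract; the key property it demonstrates is that the first pass's sinks act as the second pass's sources.

```python
def repartition(sources, num_first_sinks, num_second_sinks):
    """Two-pass repartition of a data set of integer records.

    Records are routed by a simple modulo partition (an illustrative
    stand-in for whatever routing function the system actually uses).
    """
    # First pass: each record flows from a first source to a first sink.
    first_sinks = [[] for _ in range(num_first_sinks)]
    for src in sources:
        for record in src:
            first_sinks[record % num_first_sinks].append(record)

    # Second pass: each first sink acts as a second source.
    second_sinks = [[] for _ in range(num_second_sinks)]
    for src in first_sinks:
        for record in src:
            second_sinks[record % num_second_sinks].append(record)
    return second_sinks
```

With two first sources `[0, 1, 2]` and `[3, 4, 5]`, a second sink ends up holding records that originated from both first sources, matching the repartitioning property described above.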
G06F 12/0804 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
A method includes receiving, by a robotic device, an indication that the robotic device was elected to be a leader robotic device by a plurality of robotic devices and receiving an indication of a new task in a remotely stored list of tasks. The method further includes determining, based on a remotely stored list of robotic devices, an additional robotic device to assign the new task. The remotely stored list of robotic devices comprises an entry for each respective robotic device of the plurality of robotic devices associating the respective robotic device with an identifier and a heartbeat. The method additionally includes assigning the new task to the additional robotic device based on the additional robotic device having an active heartbeat. Assigning the new task to the additional robotic device comprises associating the new task with the additional robotic device in the remotely stored list of tasks to cause the additional robotic device to carry out the new task.
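The heartbeat-based assignment step might look like the following sketch. The timeout value, the dict-based list representations, and the first-active selection rule are all assumptions for illustration; the abstract only requires that the chosen robot have an active heartbeat.

```python
HEARTBEAT_TIMEOUT_S = 30.0  # illustrative timeout, not from the abstract

def has_active_heartbeat(entry, now):
    """An entry's heartbeat is active if it was refreshed recently enough."""
    return now - entry["heartbeat"] < HEARTBEAT_TIMEOUT_S

def assign_task(task, robot_list, task_list, now):
    """Assign `task` to a robot with an active heartbeat by recording the
    association in the remotely stored task list; return the robot's id."""
    for entry in robot_list:
        if has_active_heartbeat(entry, now):
            task_list[task] = entry["id"]
            return entry["id"]
    return None  # no robot currently has an active heartbeat
```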
(1) Providing online non-downloadable open source software for use in large language models and artificial intelligence; providing online non-downloadable open source software using artificial intelligence for the production of human speech and text; providing online non-downloadable open source software for natural language processing, generation, understanding and analysis; providing online non-downloadable open source software for artificial intelligence and machine-learning based language and speech processing software; providing online non-downloadable open source software for creating generative models; providing online non-downloadable open source software for multi-modal machine-learning based language, text, and speech processing software; providing online non-downloadable open source software for processing speech, text, sound, code, videos, images, and sound input; providing open source online non-downloadable software for generating speech, text, sound, code, videos, images, and sound output; research and development services in the field of artificial intelligence; research, development and evaluation of large language models and data sets; research, design and development of computer programs and software; providing online non-downloadable open source software for managing data sets and performing safety checks in the field of artificial intelligence; providing online non-downloadable open source software for multi-modal artificial intelligence and machine-learning based language, text, sound, code, video, image, speech, and sound processing software; providing online non-downloadable open source software for facilitating multi-modal natural language, speech, text, sound, code, videos, images, and sound input; research and development services in the field of multi-modal computer natural language processing, artificial intelligence, and machine learning; providing online non-downloadable open source software for an integrated development environment 
for large language models; providing online non-downloadable open source software for use in the fields of artificial intelligence, machine learning, natural language generation, statistical learning, mathematical learning, supervised learning, and unsupervised learning; providing information from searchable indexes and databases of information, including text, music, images, videos, software algorithms, mathematical equations, electronic documents, and databases; application service provider featuring application programming interface (API) software; providing online non-downloadable open source software for facilitating interaction and communication between humans and AI (artificial intelligence) chatbots; providing online non-downloadable open source chatbot software for simulating conversations.
48.
Display screen or portion thereof with transitional graphical user interface
Mitigating the reality gap through training and utilization of at least one difference model. The difference model can be utilized to generate, for each of a plurality of instances of simulated state data generated by a robotic simulator, a corresponding instance of modified simulated state data. The difference model is trained so that a generated modified instance of simulated state data is closer to “real world data” than is a corresponding initial instance of simulated state data. Accordingly, the difference model can be utilized to mitigate the reality gap through modification of initially generated simulated state data, to make it more accurately reflect what would occur in a real environment. Moreover, the difference representation from the difference model can be used as input to the control policy to adapt the control learned in the simulator to the real environment.
Methods and systems for displaying screens in extended reality are disclosed herein. The method can include receiving video data of a real-world environment of a user of an extended reality device and identifying a screen of a separate computing device in the video data. The method can also include pairing the extended reality device and the separate computing device, generating a graphical outline around the screen of the separate computing device, and displaying, by the processor, the screen of the separate computing device and the graphical outline around the screen of the separate computing device on a display of the extended reality device. The method can further include detecting a user interaction with the graphical outline around the screen of the separate computing device based on the received video data and in response to detecting the user interaction, performing an action associated with the user interaction.
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
G06V 20/40 - Scenes; Scene-specific elements in video content
G06K 7/14 - Methods or arrangements for sensing record carriers by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
H04L 67/131 - Protocols for games, networked simulations or virtual reality
51.
System and method for targeting information based on message content in a reply
A method of presenting information to a party through a messaging application is described. Responsive to receipt of a communication from a party (e.g., a first user), a reply is sent. The communication and the reply are presented in an interface to the sender. The messaging system determines matching content that is relevant to one or both of the communication and the reply and determines a quality of the match. Determining the quality of the match may include determining a score for an advertisement based on the advertisement's responsiveness to content identified in the reply message that was sent. Based on a determination that the quality is above a threshold, the matching content is presented along with the communication and the reply.
Systems and methods for proactive query and content suggestion can include obtaining web data, determining a change event occurred, and generating a query and content suggestion. Generating the query and content suggestion can include processing data descriptive of the change event with a generative model to generate one or more model-generated query suggestions. One or more web resources can be obtained and then processed to generate a change event summary. The one or more query suggestions and the change event summary can then be provided for display.
Systems and methods are disclosed herein for modifying suggestion metadata in an electronic document. A copy request is received to copy a portion of the electronic document. The portion of the electronic document includes a suggestion having metadata that indicates the suggestion was made by a first user. The copy request is received from a second user. A paste request to paste the copied portion is received from the second user. Responsive to determining to modify the metadata of the suggestion, the indication that the suggestion was made by the first user is replaced with an indication that the suggestion was made by the second user.
(1) Downloadable software for creating, developing, deploying, integrating, monitoring, and running applications powered by language and generative models, data sets, and artificial intelligence; downloadable software for an integrated development environment for large language models. (1) Research, design and development of computer programs and software; research, development and evaluation of large language models and data sets; providing online non-downloadable software for creating, developing, deploying, integrating, monitoring, and running applications powered by language and generative models, data sets, and artificial intelligence; providing online non-downloadable software for an integrated development environment for large language models.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Downloadable software for creating, developing, deploying, integrating, monitoring, and running applications powered by language and generative models, data sets, and artificial intelligence; Downloadable software for an integrated development environment for large language models. Research, design and development of computer programs and software; research, development and evaluation of large language models and data sets; providing online non-downloadable software for creating, developing, deploying, integrating, monitoring, and running applications powered by language and generative models, data sets, and artificial intelligence; providing online non-downloadable software for an integrated development environment for large language models.
Aspects of the disclosure are directed to a parallel recovery mode that applies log records while allowing read queries on read-replica databases. The parallel recovery mode can include applying log records in log sequence number (LSN) order for a block or for multiple blocks, and managing log records affecting multiple blocks. The parallel recovery mode can further manage dependency between different log records and maintain transactional consistency on read queries.
G06F 16/27 - Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating predictions about images. One of the systems includes a neural network comprising a sequence of one or more network blocks that are each configured to perform operations comprising: obtaining a block input that represents an intermediate representation of an input image; determining a plurality of patches of the block input or of an updated representation of the block input, wherein each patch comprises a different subset of elements of the block input or of the updated representation of the block input; assigning each patch to one or more respective expert modules of a plurality of expert modules of the network block; for each patch of the plurality of patches, processing the patch using the corresponding expert modules to generate respective module outputs; and generating a block output by combining the module outputs.
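The per-block routing described above can be illustrated with a toy sketch. The 1-D patches, the one-expert-per-patch router, and concatenation as the combining step are simplifying assumptions; the abstract allows assigning a patch to multiple expert modules and does not fix the combination rule.

```python
def moe_block(block_input, patch_size, experts, router):
    """Toy mixture-of-experts network block.

    Splits a 1-D block input into patches, routes each patch to one
    expert module, and combines the module outputs into a block output.
    `experts` is a list of per-patch functions and `router` maps a patch
    to an expert index (both stand-ins for learned modules).
    """
    patches = [block_input[i:i + patch_size]
               for i in range(0, len(block_input), patch_size)]
    module_outputs = [experts[router(p)](p) for p in patches]
    # Combine the module outputs into the block output.
    return [x for out in module_outputs for x in out]
```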
Methods, devices, systems, and means for intra-UECS communication by a coordinating user equipment, UE, in a user equipment-coordination set, UECS, are described herein. The coordinating UE allocates first air interface resources to a second UE and second air interface resources to a third UE for intra-UECS communication. The coordinating UE receives, using the allocated first air interface resources, an Internet Protocol, IP, data packet from the second UE in the UECS. The coordinating UE determines that a destination address included in the IP data packet is an address of the third UE and transmits, using the allocated second air interface resources, the IP data packet to the third UE.
The techniques disclosed herein provide a secure control plane (SCP), which in turn provides an isolated secure execution environment for a data plane (DP). Any arbitrary business logic can execute within the DP, and all sensitive data traversing the SCP and entering the DP is encrypted. Split keys generated outside the DP are assembled within, and only within, the DP, where they are used to decrypt sensitive data, enabling the business logic to perform computations using the sensitive data within the secure execution environment. The DP also provides attestation for the business logic executing within the DP, enabling outside parties to verify that the deployed business logic matches published logic. In the event of proprietary logic that is not published, techniques are also disclosed herein that enable verification that proprietary business logic deployed on the DP adheres to security policies.
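The split-key assembly step can be illustrated with XOR secret sharing, in which no proper subset of shares reveals anything about the key. This is a named stand-in for illustration; the disclosure does not specify the splitting scheme.

```python
import secrets

def xor_all(chunks):
    """XOR a list of equal-length byte strings together."""
    acc = bytes(len(chunks[0]))
    for c in chunks:
        acc = bytes(a ^ b for a, b in zip(acc, c))
    return acc

def split_key(key, n_shares):
    """Split a key into n XOR shares (generated outside the data plane)."""
    shares = [secrets.token_bytes(len(key)) for _ in range(n_shares - 1)]
    last = bytes(b ^ x for b, x in zip(key, xor_all(shares)))
    return shares + [last]

def assemble_key(shares):
    """Assemble the key inside, and only inside, the data plane."""
    return xor_all(shares)
```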
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
H04L 9/30 - Public key, i.e. encryption algorithm being computationally infeasible to invert and users' encryption keys not requiring secrecy
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
63.
Modeling Exponentially Large Classical Physical Systems using Quantum Computing
Systems and methods for simulating classical physical systems are provided. In one example, a method may include initializing one or more qubits with an initial quantum state encoding one or more physical properties of a classical physical system comprising an oscillator network, and simulating, by one or more quantum computing devices using the one or more qubits, the classical physical system.
G06N 10/80 - Quantum programming, e.g. interfaces, languages or software-development kits for creating or handling programs capable of running on quantum computers; Platforms for simulating or accessing quantum computers, e.g. cloud-based quantum computing
G06N 10/20 - Models of quantum computing, e.g. quantum circuits or universal quantum computers
G06N 10/40 - Physical realisations or architectures of quantum processors or components for manipulating qubits, e.g. qubit coupling or qubit control
G06N 10/60 - Quantum algorithms, e.g. based on quantum optimisation, or quantum Fourier or Hadamard transforms
64.
CLUSTERING AND MINING ACCENTED SPEECH FOR INCLUSIVE AND FAIR SPEECH RECOGNITION
A method of training an accent recognition model includes receiving a corpus of training utterances spoken across various accents, each training utterance in the corpus including training audio features characterizing the training utterance, and executing a training process to train the accent recognition model on the corpus of training utterances to teach the accent recognition model to predict accent representations from the training audio features. The accent recognition model includes one or more strided convolution layers, a stack of multi-headed attention layers, and a pooling layer configured to generate a corresponding accent representation.
A method comprises receiving a first sequence of images of a portion of a user, the first sequence of images being monocular images; generating an avatar based on the first sequence of images, the avatar being based on a model including a feature vector associated with a vertex; receiving a second sequence of images of the portion of the user; and based on the second sequence of images, modifying the avatar with a displacement of the vertex to represent a gesture of the avatar.
G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
G06T 7/90 - Determination of colour characteristics
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
66.
Managing Data Availability on Encryption Key Status Changes in Replicated Storage Systems
A method includes obtaining a key status for a first cryptographic key. The first cryptographic key is used to encrypt replicated data of a first replication instance. The method also includes determining, based on the key status, that the first cryptographic key is inaccessible which causes the first replication instance to be unavailable. In response to determining that the first cryptographic key is inaccessible, the method includes scheduling a second replication instance to be unavailable after a threshold amount of time has passed. The second replication instance includes replicated data encrypted by a second cryptographic key that is accessible. After the threshold amount of time has passed and when the first cryptographic key is still inaccessible, the method includes setting the second replication instance as unavailable.
Methods, systems and apparatus for performing quantum state preparation. In one aspect, a method includes the actions of defining a target quantum state of a quantum system, wherein time evolution of the quantum system is governed by a target Hamiltonian, and defining a total Hamiltonian that interpolates between an initial Hamiltonian and the target Hamiltonian, wherein the total Hamiltonian is equal to the initial Hamiltonian at an initial time and is equal to the target Hamiltonian at a final time; approximating the time evolution of the total Hamiltonian using a truncated linear combination of unitary simulations to generate a truncated time evolution operator; evolving a ground state of the initial Hamiltonian according to the truncated time evolution operator for a truncated number of time steps to generate an intermediate state; and variationally adjusting the intermediate state to determine a wavefunction that approximates the target quantum state of the quantum system.
G06N 10/00 - Quantum computing, i.e. information processing based on quantum-mechanical phenomena
G06F 15/16 - Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
68.
METHODS, SYSTEMS, AND MEDIA FOR SYNCHRONIZING VIDEO STREAMS
Methods, systems, and media for synchronizing video streams are provided. In some embodiments, the method comprises: identifying a target video stream and a reference video stream, wherein the target video stream and the reference video stream are two different broadcasts of a program; generating, for the target video stream, a sequence of fingerprints; determining a time shift at which the sequence of fingerprints appears within the reference video stream; determining whether the target video stream is synchronized with the reference video stream by determining whether the time shift exceeds a predetermined threshold; and, in response to determining that the target video stream is not synchronized with the reference video stream, causing an electronic programming guide that includes an indication of the target video stream to be modified based on the time shift.
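The time-shift determination above can be sketched as follows, with fingerprints reduced to a sequence of hashable values and exact subsequence matching standing in for whatever matching the system actually performs (an illustrative simplification).

```python
def find_time_shift(target_fps, reference_fps):
    """Offset at which the target's fingerprint sequence appears within
    the reference stream's fingerprint sequence (None if absent)."""
    n = len(target_fps)
    for shift in range(len(reference_fps) - n + 1):
        if reference_fps[shift:shift + n] == target_fps:
            return shift
    return None

def is_synchronized(shift, threshold):
    """Streams are synchronized when the time shift does not exceed the
    predetermined threshold."""
    return shift is not None and abs(shift) <= threshold
```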
Methods, systems, and media for identifying video objects linked to a source video are provided. In some embodiments, the method comprises: identifying demographic attributes corresponding to a first user participating in an online conversation; determining at least one keyword associated with the online conversation, wherein the keyword indicates a topic of the online conversation; identifying a video object based at least on the demographic attributes and the at least one keyword, wherein the video object comprises a portion of a video; causing the identified video object to be presented in a group of video objects on a first user device associated with the first user; receiving an indication that the identified video object has been selected on the first user device for inclusion in a message in the online conversation; and causing the identified video object to be presented on a second user device associated with a second user.
G06F 16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/735 - Filtering based on additional data, e.g. user or group profiles
G06F 16/9535 - Search customisation based on user profiles and personalisation
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
A cardiovascular monitoring system includes a first monitoring device configured to couple to a first body part of a subject and a second monitoring device configured to couple to a second body part of the subject. The first monitoring device is configured to measure a first cardiovascular signal and a first motion signal at the first body part, and the second monitoring device is configured to measure a second cardiovascular signal and a second motion signal at the second body part. A controller of the system receives the first and second cardiovascular signals and the first and second motion signals and filters the first and second cardiovascular signals by removing spectral components that correspond to spectral components of the first and second motion signals. Based on a correlated spectral component that is present in both the filtered first and second cardiovascular signals, the system determines cardiovascular information of the subject.
A61B 5/00 - Measuring for diagnostic purposes ; Identification of persons
A61B 5/021 - Measuring pressure in heart or blood vessels
A61B 5/0295 - Measuring blood flow using plethysmography, i.e. measuring the variations in the volume of a body part as modified by the circulation of blood therethrough, e.g. impedance plethysmography
71.
Language Model Prediction of API Call Invocations and Verbal Response
A method includes obtaining an utterance from a user including a user query directed toward a digital assistant. The method includes generating, using a language model, a first prediction string based on the utterance and determining whether the first prediction string includes an application programming interface (API) call to invoke a program via an API. When the first prediction string includes the API call to invoke the program, the method includes calling, using the API call, the program via the API to retrieve a program result; receiving, via the API, the program result; updating a conversational context with the program result that includes the utterance; and generating, using the language model, a second prediction string based on the updated conversational context. When the first prediction string does not include the API call, the method includes providing an utterance response to the utterance based on the first prediction string.
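One turn of the flow above can be sketched as follows. The `API_CALL(...)` marker, the callables `language_model` and `call_program`, and the string-based context update are all illustrative assumptions; the abstract does not specify how an API call is encoded in the prediction string.

```python
import re

def respond(utterance, language_model, call_program):
    """One assistant turn: predict, optionally invoke the program via the
    API, and predict again with the program result in the context."""
    context = utterance
    prediction = language_model(context)
    match = re.search(r"API_CALL\((.*?)\)", prediction)
    if match is None:
        return prediction  # verbal response; no program invocation
    program_result = call_program(match.group(1))
    # Update the conversational context with the program result.
    context = context + " | result: " + str(program_result)
    return language_model(context)
```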
A method includes receiving user queries directed toward a cloud-based assistant service. For each received user query directed toward the cloud-based assistant service, the method also includes extracting one or more attributes from the user query and logging the user query into one or more of a plurality of category buckets based on the one or more attributes extracted from the user query. The method also includes determining when at least one of the plurality of category buckets includes a threshold number of the user queries logged into the at least one category bucket, and when the at least one of the plurality of category buckets includes the threshold number of the user queries, generating a distilled model of the cloud-based assistant service. The distilled model of the cloud-based assistant service is configured to execute on one or more target client devices.
A method of training a language model for rare-word speech recognition includes obtaining a set of training text samples, and obtaining a set of training utterances used for training a speech recognition model. Each training utterance in the plurality of training utterances includes audio data corresponding to an utterance and a corresponding transcription of the utterance. The method also includes applying rare word filtering on the set of training text samples to identify a subset of rare-word training text samples that include words that do not appear in the transcriptions from the set of training utterances or appear in the transcriptions from the set of training utterances less than a threshold number of times. The method further includes training the external language model on the transcriptions from the set of training utterances and the identified subset of rare-word training text samples.
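The rare-word filtering step above reduces to counting word occurrences across the transcriptions and keeping text samples that contain an under-represented word. A minimal sketch, with whitespace tokenization as a simplifying assumption:

```python
from collections import Counter

def rare_word_samples(text_samples, transcriptions, threshold=2):
    """Keep training text samples containing at least one word that is
    absent from the transcriptions or appears fewer than `threshold`
    times in them."""
    counts = Counter(w for t in transcriptions for w in t.split())
    return [s for s in text_samples
            if any(counts[w] < threshold for w in s.split())]
```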
Aspects of the disclosure are directed to providing users more control over SSD storage recovery, such as providing capabilities and configuration options for a cloud platform to manage the SSD recovery. Aspects of the disclosure can include providing a restart-in-place maintenance mode, a configurable time-out option for SSD recovery, automatic snapshot triggering, automatic archiving, and/or extending stop/start virtual machine functionality to work with local SSD storage.
Aspects of the disclosure are directed to an optimization model for storing liquid hydrogen to power fuel cells in data centers. The optimization model can be based on hydrogen fuel consumption rates in the data center, refueling rates from vendors, refueling response time, storage tank area constraints in the data center, and/or logistical refueling constraints. The optimization model can allow for providing sufficient fuel within a constrained space for backup power in the data center, such as when an emergency arises.
F17C 5/00 - Methods or apparatus for filling pressure vessels with liquefied, solidified, or compressed gases
F17C 13/02 - Special adaptations of indicating, measuring, or monitoring equipment
G05B 19/4155 - Numerical control (NC), i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form characterised by programme execution, i.e. part programme or machine function execution, e.g. selection of a programme
76.
Parametric Amplifiers With Inductive Input Coupling For Quantum Computing Systems
The disclosure is directed to parametric amplifiers with inductive input coupling for quantum computing systems. One example aspect of the present disclosure is directed to a quantum computing system comprising a first qubit, a first measurement device, and a first amplifier. The first measurement device is configured to generate a first qubit signal corresponding to a first quantum state of the first qubit. The first amplifier is configured to amplify the first qubit signal. The first amplifier comprises a first transmission-line resonator. The first transmission-line resonator provides an inductive reactance for an electrical coupling between the first measurement device and the first amplifier. The inductive reactance for the electrical coupling enables a transmission of the first qubit signal.
This document describes techniques and apparatuses for automatic white-balance for a camera system. The techniques and apparatuses utilize a precursor image to detect one or more faces and determine a tone of a detected face. The camera system retrieves tonal data based on a group of images determined to contain the same face as the detected face. Based on this tonal data, a difference in white balance is determined from the difference between the tone of the detected face within the precursor image and the associated tonal data. Camera settings are adjusted based on the difference in white balance to enable capture of an image having an improved tone.
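One plausible reading of the adjustment step is a per-channel gain derived from the tonal difference; the (R, G, B) tone representation and the gain scheme below are illustrative assumptions, not the disclosed camera pipeline:

```python
def white_balance_gain(detected_tone, reference_tone):
    """Per-channel gains mapping the detected face tone toward the reference
    tone aggregated from prior images of the same face.
    Tones are (R, G, B) triples."""
    return tuple(ref / max(det, 1e-6)
                 for det, ref in zip(detected_tone, reference_tone))

def apply_gains(pixel, gains):
    """Apply the gains to one (R, G, B) pixel, clamping to 8-bit range."""
    return tuple(min(255, round(c * g)) for c, g in zip(pixel, gains))
```

A face rendered too red and too dim in blue relative to its stored tonal data yields gains below 1.0 on red and above 1.0 on blue, nudging the capture toward the reference tone.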
H04N 23/88 - Camera processing pipelines; Components thereof for processing colour signals for colour balance, e.g. white-balance circuits or colour temperature control
H04N 23/12 - Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths with one sensor only
H04N 23/57 - Mechanical or electrical details of cameras or camera modules specially adapted for being embedded in other devices
H04N 23/611 - Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
H04N 23/63 - Control of cameras or camera modules by using electronic viewfinders
This disclosure enables dynamic transmission power allocation for a physical downlink shared channel (PDSCH) (118). A network entity (104) can transmit a channel state information (CSI) report configuration (106) to a UE (102) that configures the UE (102) to provide a power backoff indicator (PBI) report (116). A CSI report (112) can be modified to include the PBI report (116). The UE may determine the one or more PBIs based on a measured signal-to-interference plus noise ratio (SINR) that is measured for one or more CSI reference signal (CSI-RS) resources. The network entity (104) receives the PBI report (116) from the UE (102) and uses the one or more PBIs in the PBI report (116) to set the PDSCH transmission power. The network entity (104) can transmit control signaling indicating the transmission power, power offset, or additional power backoff of the PDSCH.
H04W 52/36 - Transmission power control [TPC] using constraints in the total amount of available transmission power with a discrete range or set of values, e.g. step size, ramping or offsets
H04W 52/54 - Signalisation aspects of the TPC commands, e.g. frame structure
H04W 52/24 - TPC being performed according to specific parameters using SIR [Signal to Interference Ratio] or other wireless path parameters
H04W 52/14 - Separate analysis of uplink or downlink
A device includes a plurality of cores having a plurality of configurable self-repair pipelines, wherein each core of the plurality of cores comprises a plurality of pipeline flops for routing self-repair data to the plurality of cores in parallel, wherein a series of connected pipeline flops forms one of the plurality of configurable self-repair pipelines.
A device or method executes a virtual camera to generate modified frames based on physical camera frames. An indication is received that a browser is accessing a virtual camera in an in-browser camera list, the in-browser camera list being selectable by a user within the browser. A physical camera frame is received, at the browser, from a physical camera associated with the virtual camera. A modified frame is generated based on the physical camera frame and a local browser setting, the local browser setting being selectable by a user. The modified frame is sent to a display processing module.
Techniques are described herein for implementing collaborative poll elements within a document in a document editing application. A method includes: receiving first user interface input that indicates a first vote, associated with a first user, in an interactive poll element embedded in a first document; determining that the first user interface input is received via a first instance of the document editing application; and in response to determining that the first user interface input is received via the first instance of the document editing application: updating a list of votes associated with the interactive poll element by adding the first vote, corresponding to the first user, to the list of votes; updating a cryptographic digital signature associated with the interactive poll element; and providing the updated list of votes and the updated cryptographic digital signature to each of a plurality of users.
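The vote-then-re-sign cycle can be sketched as follows. An HMAC over the serialized vote list stands in for the cryptographic digital signature (a real deployment would use an asymmetric signature scheme); the key and class names are illustrative:

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key"  # stand-in: a real system would use an asymmetric key pair

def sign(votes):
    """Deterministic keyed digest over the canonicalized vote list."""
    payload = json.dumps(votes, sort_keys=True).encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

class PollElement:
    """Sketch of an embedded poll: each cast vote updates the vote list and
    re-signs it, so collaborators can verify the tally was not tampered with."""
    def __init__(self):
        self.votes = {}
        self.signature = sign(self.votes)

    def cast_vote(self, user, choice):
        self.votes[user] = choice
        self.signature = sign(self.votes)   # updated signature
        return self.votes, self.signature   # provided to the other users

    def verify(self):
        return hmac.compare_digest(self.signature, sign(self.votes))
```

Any edit to the vote list that bypasses `cast_vote` leaves the stored signature stale, so `verify()` exposes the tampering.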
G06F 21/64 - Protecting data integrity, e.g. using checksums, certificates or signatures
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
82.
GENERATION OF EXPLANATIONS WITH MULTISTEP REASONING FOR RANKING IN RECOMMENDER SYSTEMS
The technology employs large language models (LLMs) (1102) to simultaneously generate an explanation (1106) and a score (ranking) (1104) using a multistep reasoning process that evaluates both item metadata (1110) and user context (1108). The context may include information from an active conversation (1116) and stored profile information (804). This approach beneficially generates an explanation concurrently with the scoring/ranking. The explanation can be further distilled into a user-facing (external) explanation (1120) or inspected to debug the model via subsequent (internal) evaluation (1122). The technology utilizing such LLMs can be employed to directly reason about how well an item matches the context of a conversation within a ranking module and also generate an intuitive natural language explanation. Other use cases include dialogue management, incorporating natural language user profiles, and building realistic user simulators to generate synthetic data at scale for evaluation and tuning of system components.
Waveguides (202, 502, 600, 700) for displays (100) are constructed from a combination of flat (508, 608, 708) and curved (510, 610, 710) surfaces using plural incouplers (204, 504, 604, 704, 912). Additional incouplers are incorporated into a waveguide spaced from one another at precise angles (916, 1216), e.g., angles that match or correspond to grating angles associated with the waveguides (202, 502, 600, 700), allowing the injection of light (206, 605, 705, 908, 910) at multiple locations into the same grating structure while still maintaining k-space closure and thus preventing unintended refraction or distortion. Regions immediately surrounding incouplers (204, 504, 604, 704, 912) and outcouplers (210, 506, 606, 706, 914) in the waveguide (202, 502, 600, 700) include flat surfaces (508, 608, 708), while regions between the incouplers (204, 504, 604, 704, 912) and outcouplers (210, 506, 606, 706, 914) include one or more curved surfaces (510, 610, 710).
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling an agent interacting with an environment. In one aspect, a method comprises: receiving an observation image of an environment; receiving an input text sequence; generating an object localization input that includes the observation image; processing the object localization input using an object localization neural network to generate an object localization output that identifies respective locations of the one or more objects in the observation image; generating a policy input based on the observation image, the input text sequence, and the object localization output; processing the policy input using a policy neural network to generate a policy output that defines an action to be performed by the agent in response to the observation image; selecting an action to be performed by the agent using the policy output; and causing the agent to perform the selected action.
A browser-based tool is disclosed for providing context-based assistance during web browsing. An example method involves receiving a contextual search request pertaining to main content displayed in a browser's display area, extracting content from the main content, receiving a contextual suggestion based on the extracted content, and displaying the contextual suggestion in a designated contextual search area within the browser. This innovative approach streamlines the search process by providing users with relevant suggestions based on the content they are currently viewing, thereby improving efficiency in navigating online information.
A method (900) includes receiving training data including a corpus of multilingual unspoken textual utterances (320), a corpus of multilingual un-transcribed non-synthetic speech utterances (306), and a corpus of multilingual transcribed non-synthetic speech utterances (304). For each un-transcribed non-synthetic speech utterance, the method includes generating a target quantized vector token (221) and a target token index (222), generating contrastive context vectors (215) from corresponding masked audio features (211m), and deriving a contrastive loss term (316). The method also includes generating an alignment output (602) and generating a first probability distribution over possible speech recognition hypotheses (392) for the alignment output. The method also includes generating a second probability distribution over possible speech recognition hypotheses (394) and determining a non-synthetic speech loss term (344). The method also includes pre-training an audio encoder (210).
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
Goods & Services
Business networking; arranging and conducting special events for business purposes; association services, namely, promoting the interests of developers pursuing careers in technology; association services, namely, promoting public awareness of women excelling in technology; organizing business networking events in the field of technology. Providing classes, seminars, workshops, events, meet-ups (in-person and virtually) and training in the field of software and hardware development. Providing temporary use of on-line non-downloadable software and hardware development tools; providing a web site featuring non-downloadable software and hardware development tools and API's (application program interface) for developers; providing technical information in the field of computer software development.
42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing online non-downloadable software for use in large
language models and artificial intelligence; providing
online non-downloadable software using artificial
intelligence for the production of human speech and text;
providing online non-downloadable software for natural
language processing, generation, understanding and analysis
in the fields of health and medicine; providing online
non-downloadable software for artificial intelligence and
machine-learning based language and speech processing
software in the fields of health and medicine; providing
online non-downloadable computer software using artificial
intelligence for medical purposes, namely, asking and
answering medical questions, analyzing patient records and
medical history, generating summaries for medical history,
medical reports, patient charts, medical data, pathology
reports, lab results and medical examination notes;
providing online non-downloadable computer software using
artificial intelligence for analysis and interpretation of
medical data, identification of trends, patterns, and
potential medical research questions; providing online
non-downloadable software using artificial intelligence for
performing literature searches using natural language
queries, identifying relevant articles, and extracting key
findings; providing online non-downloadable software using
artificial intelligence for assisting healthcare
professionals in making informed decisions by providing
relevant medical information, potential treatment options,
and possible side effects; providing online non-downloadable
software using artificial intelligence for medical coding
and billing processes, namely assigning the correct codes
and generating accurate claims; providing online
non-downloadable software for creating generative models in
the fields of health and medicine; providing online
non-downloadable software for multi-modal machine-learning
based language, text, and speech processing software in the
fields of health and medicine; providing online
non-downloadable software for processing speech, text,
sound, videos, images, and sound input; providing online
non-downloadable software for generating speech, text,
sound, videos, images, and sound output; research and
development services in the fields of health and medical
artificial intelligence; research in the field of medical
artificial intelligence technology; research, development
and technological evaluation of large language models and
data sets; research, design and development of computer
programs and software for use in the fields of health and
medicine; providing online non-downloadable software for
managing data sets and performing safety checks in the field
of artificial intelligence; providing online
non-downloadable software for multi-modal artificial
intelligence and machine-learning based language, text,
sound, video, image, speech, and sound processing software;
providing temporary use of online non-downloadable software
for facilitating multi-modal natural language, speech, text,
sound, videos, images, and sound input; research and
development services in the field of multi-modal computer
natural language processing, artificial intelligence, and
machine learning; providing temporary use of online
non-downloadable software for an integrated development
environment for large language models; providing online
non-downloadable software for use in the fields of
artificial intelligence, machine learning, natural language
generation, deep learning, statistical learning,
mathematical learning, supervised learning, and unsupervised
learning; providing online non-downloadable software for
accessing information from searchable indexes and databases
of information, including text, music, images, videos,
software algorithms, mathematical equations, medical data,
medical information, medical literature, electronic
documents, and databases; application service provider
featuring application programming interface (API) software;
providing temporary use of online non-downloadable software
for facilitating interaction and communication between
humans and AI (artificial intelligence) chatbots; providing
online non-downloadable chatbot software for simulating
conversations.
Provided are computing systems, methods, and platforms that obtain local node embeddings for heterogeneous graphs. A heterogeneous graph comprising a plurality of nodes can be obtained. Weight values respectively associated with subgraphs of the heterogeneous graph can be determined. At least one node from among the plurality of nodes can be selected. An embedding for the at least one selected node can be learned using an embedding objective computed based on the weight values. The heterogeneous graph can be processed based on the embedding. Submodular hypergraphs can be used to represent heterogeneous graphs and their cuts. The ℓ1-regularized personalized PageRank can be applied to hypergraphs, where the optimal solution gives the node embedding for the given seed nodes. The resulting ℓ1-regularized personalized PageRank can be solved in a running time that does not depend on the size of the whole graph.
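The size-independent running time comes from local "push" computation of personalized PageRank, which touches only nodes near the seed. The sketch below shows the classical push algorithm on an ordinary adjacency-list graph, a simplifying assumption relative to the submodular hypergraphs of the disclosure:

```python
def approximate_ppr(graph, seed, alpha=0.15, eps=1e-4):
    """Local push algorithm for approximate personalized PageRank.
    Work is bounded by the pushed probability mass (roughly 1/(alpha*eps)),
    independent of the total graph size.
    `graph` maps each node to a list of neighbours."""
    p, r = {}, {seed: 1.0}   # p: estimate, r: residual mass
    queue = [seed]
    while queue:
        u = queue.pop()
        deg = len(graph[u])
        if r.get(u, 0.0) < eps * deg:
            continue                       # residual too small: skip
        ru = r.pop(u)
        p[u] = p.get(u, 0.0) + alpha * ru  # keep alpha fraction at u
        share = (1 - alpha) * ru / deg     # spread the rest to neighbours
        for v in graph[u]:
            old = r.get(v, 0.0)
            r[v] = old + share
            if old < eps * len(graph[v]) <= r[v]:
                queue.append(v)            # v just crossed the push threshold
    return p  # sparse embedding concentrated near the seed
```

The returned dictionary is nonzero only near the seed, which is exactly the local node embedding behavior described: mass concentrates on the seed and its close neighbourhood.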
Methods and apparatus for enhancing simulated annealing with quantum fluctuations. In one aspect, a method includes obtaining an input state; performing simulated annealing on the input state with a temperature reduction schedule until a decrease in energy is below a first minimum value; terminating the simulated annealing in response to determining that the decrease in energy is below the first minimum value; outputting a first evolved state and a first temperature value; reducing the temperature to a minimum temperature value; performing quantum annealing on the first evolved state with a transversal field increase schedule until a completion of a second event occurs; terminating the quantum annealing in response to determining that the completion of the second event has occurred; outputting a second evolved state as a subsequent input state for the simulated annealing; and determining that a completion of a first event has occurred.
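The classical phase alone can be sketched as follows: anneal under a geometric temperature reduction schedule and terminate once the per-sweep energy decrease falls below the first minimum value. The sweep size, schedule constants, and hand-off interface are illustrative assumptions; the quantum-annealing phase is not modeled:

```python
import math
import random

def simulated_annealing_phase(energy, neighbour, state, t0=10.0, cooling=0.95,
                              min_decrease=1e-3, seed=0):
    """Classical phase only: Metropolis sweeps under a geometric cooling
    schedule, stopping when the energy decrease over a sweep drops below
    `min_decrease`. The returned (state, temperature) would then seed the
    quantum-annealing phase."""
    rng = random.Random(seed)
    t, e = t0, energy(state)
    while True:
        e_start = e
        for _ in range(100):                           # one sweep
            cand = neighbour(state, rng)
            de = energy(cand) - e
            # Accept downhill moves always, uphill moves with Boltzmann probability.
            if de <= 0 or rng.random() < math.exp(-de / t):
                state, e = cand, e + de
        t *= cooling                                   # temperature reduction schedule
        if e_start - e < min_decrease:                 # decrease below first minimum value
            return state, t                            # hand off to quantum annealing
```

On a simple quadratic energy the loop cools until progress stalls, returning an evolved state and the temperature reached at hand-off.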
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the methods includes: obtaining data specifying an initial neural network configured to perform a machine learning task; determining a representativeness measure for each of a plurality of filters; determining a central tendency measure for the plurality of filters based on processing a batch of network inputs using the initial neural network; determining a cumulative importance score for each of the plurality of filters; selecting a proper subset of the plurality of filters; and generating a pruned neural network configured to perform the machine learning task.
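The subset-selection step reduces to ranking filters by their cumulative importance scores and keeping the top fraction. The keep-fraction parameter below is an illustrative assumption:

```python
def prune_filters(importance_scores, keep_fraction=0.5):
    """Select a proper subset of filters: rank by cumulative importance
    score and return the sorted indices of the filters that survive.
    `importance_scores` would be accumulated over a batch of network inputs."""
    k = max(1, int(len(importance_scores) * keep_fraction))
    ranked = sorted(range(len(importance_scores)),
                    key=lambda i: importance_scores[i], reverse=True)
    return sorted(ranked[:k])
```

The pruned network is then built by dropping the filters whose indices are absent from the returned subset.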
This document describes systems and techniques directed at enlarging active areas of displays in electronic devices. In aspects, a display includes a grid of transistors positioned within a display panel module to control an illumination of one or more electroluminescent layers. Routing lines extend from one or more transistors of the grid of transistors to at least one electroluminescent layer. In this way, the at least one electroluminescent layer can be positioned away from the grid of transistors and disposed above portions of display panel module driving circuitry. As a result, active areas of displays can be enlarged and information content can be maximized without a panel border area allotted to the display panel module driving circuitry surrounding transistors having to be reduced.
G09G 3/3258 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix using controlled light sources using electroluminescent panels semiconductive, e.g. using light-emitting diodes [LED] organic, e.g. using organic light-emitting diodes [OLED] using an active matrix with pixel circuitry controlling the voltage across the light-emitting element
93.
CONDITIONALLY ASSIGNING VARIOUS AUTOMATED ASSISTANT FUNCTION(S) TO INTERACTION WITH A PERIPHERAL ASSISTANT CONTROL DEVICE
In response to a user interacting with a tangible peripheral assistant control device (e.g., depressing a button of the device), causing an automated assistant to perform one or more actions. The action(s) performed can be based on input previously provided by the user in configuring the peripheral assistant control device. The action(s) performed in response to interaction with the peripheral assistant control device can vary based on one or more conditions, such as which user is currently active, where the peripheral assistant control device is currently located (which can optionally be inferred based on which of multiple assistant computing devices the button is paired with), and/or the current state of one or more smart devices and/or other devices (e.g., as determined based on a device topology). A utility of the peripheral assistant control device can be automatically extended beyond what was specifically requested by a user during configuration.
A computer-implemented method includes receiving, by a computing device, input activations and determining, by a controller of the computing device, whether each of the input activations has either a zero value or a non-zero value. The method further includes storing, in a memory bank of the computing device, at least one of the input activations. Storing the at least one input activation includes generating an index comprising one or more memory address locations that have input activation values that are non-zero values. The method still further includes providing, by the controller and from the memory bank, at least one input activation onto a data bus that is accessible by one or more units of a computational array. The activations are provided, at least in part, from a memory address location associated with the index.
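The storage and indexing scheme can be sketched as follows; the flat-list memory bank is a simplifying stand-in for the hardware memory bank and data bus:

```python
def store_activations(activations):
    """Store activations in a flat 'memory bank' and build an index of the
    memory address locations holding non-zero values, so downstream compute
    can skip zero operands entirely."""
    bank = list(activations)
    index = [addr for addr, val in enumerate(bank) if val != 0]
    return bank, index

def stream_nonzero(bank, index):
    """Provide activations onto the 'data bus' from indexed addresses only."""
    return [(addr, bank[addr]) for addr in index]
```

For a sparse activation vector, only the non-zero entries (with their addresses) reach the computational array, which is the source of the compute savings.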
Implementations relate to helping a large language model generate factual responses to prompts that request factual content. The large language model may receive a prompt context and a plurality of encoded context passages as input. The large language model is trained to determine whether or not to utilize the encoded context passages in generating the response. Implementations also relate to different methods of fine-tuning the responses generated by the large language model through query refinements, response re-writes, and evaluation of factual accuracy.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on an input sequence of characters that has a respective character at each of a plurality of character positions to generate a network output. One of the systems includes a neural network configured to perform the machine learning task, the neural network comprising a gradient-based sub-word tokenizer and an output neural network. The gradient-based sub-word tokenizer is configured to apply a learned, i.e., flexible, sub-word tokenization strategy to the input sequence of characters to generate a sequence of latent sub-word representations. The output neural network is configured to process the latent sub-word representations to generate the network output for the task.
This document describes systems and techniques directed at enlarging active areas of displays using variable pixel and/or transistor densities. In aspects, a display includes a cover layer positioned as a topmost layer and an array of pixels positioned thereunder. A plurality of transistors, positioned under the array of pixels, may control an electrical activation of one or more pixels within the array of pixels. In implementations, the plurality of transistors define a smaller area than the array of pixels such that at least one pixel of the array of pixels extends beyond the area defined by the plurality of transistors and above driving circuitry. Variable pixel and/or transistor densities can support the enlarged active area of displays and improve user experience.
A joint segmenting and ASR model includes an encoder to receive a sequence of acoustic frames and generate, at each of a plurality of output steps, a higher order feature representation for a corresponding acoustic frame. The model also includes a decoder to generate, based on the higher order feature representation at each of the plurality of output steps, a probability distribution over possible speech recognition hypotheses and an indication of whether the corresponding output step corresponds to an end of segment (EOS). The model is trained on a set of training samples, each training sample including audio data characterizing multiple segments of long-form speech and a corresponding transcription of the long-form speech, the corresponding transcription annotated with ground-truth EOS labels obtained via distillation from a language model teacher that receives the corresponding transcription as input and injects the ground-truth EOS labels into the corresponding transcription between semantically complete segments.
A method includes receiving training data including a corpus of multilingual unspoken textual utterances, a corpus of multilingual un-transcribed non-synthetic speech utterances, and a corpus of multilingual transcribed non-synthetic speech utterances. For each un-transcribed non-synthetic speech utterance, the method includes generating a target quantized vector token and a target token index, generating contrastive context vectors from corresponding masked audio features, and deriving a contrastive loss term. The method also includes generating an alignment output, generating a first probability distribution over possible speech recognition hypotheses for the alignment output, and determining an alignment output loss term. The method also includes generating a second probability distribution over possible speech recognition hypotheses and determining a non-synthetic speech loss term. The method also includes pre-training an audio encoder based on the contrastive loss term, the alignment output loss term, and the non-synthetic speech loss term.
Training and/or utilizing a Speech-To-Speech Translation (S2ST) system that can be used to generate, based on processing source audio data that captures a spoken utterance in a source language, target audio data that includes a synthetic spoken utterance that is spoken in a target language and that corresponds, both linguistically and para-linguistically, to the spoken utterance in the source language. Implementations that are directed to training the S2ST system utilize an unsupervised approach, with monolingual speech data, in training the S2ST system.
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band