A system, method, and computer-program product includes constructing a transcript adaptation training data corpus that includes a plurality of transcript normalization training data samples, wherein each of the plurality of transcript normalization training data samples includes: a predicted audio transcript that includes at least one numerical expression, an adapted audio transcript that includes an alphabetic representation of the at least one numerical expression, and a transcript normalization identifier that, when applied to a model input comprising a target audio transcript, defines a text-to-text transformation objective causing a numeric-to-alphabetic expression machine learning model to predict an alphabetic-equivalent audio transcript that represents each numerical expression included in the target audio transcript in one or more alphabetic tokens; configuring the numeric-to-alphabetic expression machine learning model based on a training of a machine learning text-to-text transformer model using the transcript adaptation training data corpus; and executing the numeric-to-alphabetic expression machine learning model.
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/04 - Segmentation; Word boundary detection
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 25/78 - Detection of presence or absence of voice signals
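The training-sample structure described in the transcript normalization abstract above (predicted transcript, adapted transcript, and a task identifier applied to the model input) can be illustrated with a minimal sketch. This is not the patented method: the prefix string `normalize numbers: `, the helper names, the dictionary layout, and the restriction to whole numbers from 0 to 999 are all assumptions made for the example.

```python
import re

# Hypothetical task identifier; the abstract does not give its literal form.
TASK_PREFIX = "normalize numbers: "

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 (assumed range for the sketch)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def adapt_transcript(text: str) -> str:
    """Replace each standalone 1-3 digit number with its alphabetic form."""
    return re.sub(r"\b\d{1,3}\b",
                  lambda m: number_to_words(int(m.group())), text)


def make_sample(predicted_transcript: str) -> dict:
    """Build one training sample: prefixed model input plus target text."""
    return {
        "input": TASK_PREFIX + predicted_transcript,
        "target": adapt_transcript(predicted_transcript),
    }
```

A corpus of such samples would then be used to fine-tune a text-to-text transformer; that training step is outside the scope of this sketch.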
2.
SYSTEMS AND METHODS FOR CONFIGURING AND USING AN AUDIO TRANSCRIPT CORRECTION MACHINE LEARNING MODEL
A system, method, and computer-program product includes constructing a transcript correction training data corpus that includes a plurality of labeled audio transcription training data samples, wherein each of the plurality of labeled audio transcription training data samples includes: an incorrect audio transcription of a target piece of audio data; a correct audio transcription of the target piece of audio data; and a transcript correction identifier that, when applied to a model input that includes a likely incorrect audio transcript, defines a text-to-text transformation objective causing an audio transcript correction machine learning model to predict a corrected audio transcript based on the likely incorrect audio transcript; configuring the audio transcript correction machine learning model based on a training of a machine learning text-to-text transformer model using the transcript correction training data corpus; and executing the audio transcript correction machine learning model within a speech-to-text post-processing sequence of a speech-to-text service.
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/04 - Segmentation; Word boundary detection
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 25/78 - Detection of presence or absence of voice signals
3.
BIAS MITIGATING MACHINE LEARNING TRAINING SYSTEM WITH MULTI-CLASS TARGET
A computing device trains a fair prediction model. A prediction model is trained and executed with observation vectors. A weight value is computed for each observation vector based on whether the predicted target variable value of a respective observation vector of the plurality of observation vectors has a predefined target event value. An observation vector is relabeled based on the computed weight value. The prediction model is retrained with each observation vector weighted by a respective computed weight value and with the target variable value of any observation vector that was relabeled. The retrained prediction model is executed. A conditional moments matrix is computed. A constraint violation matrix is computed. Computing the weight value through computing the constraint violation matrix is repeated until a stop criterion indicates retraining of the prediction model is complete. The retrained prediction model is output.
A computer-implemented system includes identifying a target hierarchical taxonomy comprising a plurality of distinct hierarchical taxonomy categories; extracting a plurality of distinct taxonomy tokens from the plurality of distinct hierarchical taxonomy categories; computing a taxonomy vector corpus based on the plurality of distinct taxonomy tokens; computing a plurality of distinct taxonomy clusters based on an input of the taxonomy vector corpus; constructing a hierarchical taxonomy classifier based on the plurality of distinct taxonomy clusters; converting a volume of unlabeled structured datasets to a plurality of distinct corpora of taxonomy-labeled structured datasets based on the hierarchical taxonomy classifier; and outputting at least one corpus of taxonomy-labeled structured datasets of the plurality of distinct corpora of taxonomy-labeled structured datasets based on an input of a data classification query.
A parallel processing technique can be used to expedite reconciliation of a hierarchy of forecasts on a computer system. As one example, the computer system can receive forecasts that have a hierarchical relationship with respect to one another. The computer system can distribute the forecasts among a group of computing nodes by time point, so that all data points corresponding to the same time point in the forecasts are assigned to the same computing node. The computing nodes can receive the datasets corresponding to the time points, organize the data points in each of the datasets by forecast to generate ordered datasets, and assign the ordered datasets to processing threads. The processing threads (across the computing nodes) can then execute a reconciliation process in parallel to one another to generate reconciled values, which can be output by the computing nodes.
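The distribution step in the abstract above (all data points for one time point go to the same node, are ordered by forecast, then reconciled in parallel) might be sketched as follows. The abstract does not specify the reconciliation rule, so this example assumes a two-level hierarchy with a parent named `total` and scales the children proportionally to match it. Nodes are simulated as plain dictionaries, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute_by_time_point(forecasts, num_nodes):
    """forecasts: {forecast_name: {time_point: value}}.

    Returns one dict per simulated node mapping time_point to a list of
    (forecast_name, value) pairs ordered by forecast name.
    """
    time_points = sorted({t for series in forecasts.values() for t in series})
    nodes = [dict() for _ in range(num_nodes)]
    for i, t in enumerate(time_points):
        # Round-robin assignment: every value for time point t lands on one node.
        nodes[i % num_nodes][t] = [
            (name, forecasts[name][t])
            for name in sorted(forecasts) if t in forecasts[name]
        ]
    return nodes


def reconcile(ordered, parent="total"):
    """Assumed rule: scale child forecasts so they sum to the parent."""
    values = dict(ordered)
    child_sum = sum(v for k, v in values.items() if k != parent)
    if child_sum == 0:
        return values
    scale = values[parent] / child_sum
    return {k: (v if k == parent else v * scale) for k, v in values.items()}


def reconcile_all(forecasts, num_nodes=2):
    """Run reconciliation for each node's time points on a thread pool."""
    results = {}
    for node in distribute_by_time_point(forecasts, num_nodes):
        with ThreadPoolExecutor() as pool:
            for t, rec in zip(node, pool.map(reconcile, node.values())):
                results[t] = rec
    return results
```

Because each time point is self-contained after distribution, no thread needs data held by another node, which is what makes the parallel step straightforward.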
A computing system detects a defective object. An image is received of a manufacturing line that includes objects in a process of being manufactured. Each pixel included in the image is classified as a background pixel class, a non-defective object class, or a defective object class using a trained neural network model. The pixels included in the image that were classified as the non-defective object class or the defective object class are grouped into polygons. Each polygon is defined by a contiguous group of pixels classified as the non-defective object class or the defective object class. Each polygon is classified in the non-defective object class or in the defective object class based on a number of pixels included in a respective polygon that are classified in the non-defective object class relative to a number of pixels included in the respective polygon that are classified in the defective object class.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/24 - Aligning, centring, orientation detection or correction of the image
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
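The polygon grouping and classification step in the defect detection abstract above can be sketched on a small grid of per-pixel class labels. The neural network that produces the labels is omitted. The use of 4-connectivity and a simple majority rule are assumptions; the abstract says only that the decision is based on the relative pixel counts.

```python
from collections import deque

BACKGROUND, GOOD, DEFECT = 0, 1, 2


def classify_polygons(mask):
    """Group contiguous non-background pixels and label each group.

    mask is a list of rows of class labels. Returns a list of
    (label, good_pixel_count, defect_pixel_count) tuples in scan order.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    polygons = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == BACKGROUND or seen[r][c]:
                continue
            counts = {GOOD: 0, DEFECT: 0}
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill over object pixels
                y, x = queue.popleft()
                counts[mask[y][x]] += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and mask[ny][nx] != BACKGROUND):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Assumed rule: defective when defect pixels outnumber good ones.
            label = DEFECT if counts[DEFECT] > counts[GOOD] else GOOD
            polygons.append((label, counts[GOOD], counts[DEFECT]))
    return polygons
```

Voting over the whole polygon means that a few misclassified pixels inside an otherwise good object do not flag the object as defective.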
A system, method, and computer-program product includes distributing a plurality of audio data files of a speech data corpus to a plurality of computing nodes that each implement a plurality of audio processing threads, executing the plurality of audio processing threads associated with each of the plurality of computing nodes to detect a plurality of tentative speakers participating in each of the plurality of audio data files, generating, via a clustering algorithm, a plurality of clusters of embedding signatures based on a plurality of embedding signatures associated with the plurality of tentative speakers in each of the plurality of audio data files, and detecting a plurality of global speakers associated with the speech data corpus based on the plurality of clusters of embedding signatures.
G10L 15/04 - Segmentation; Word boundary detection
G10L 25/78 - Detection of presence or absence of voice signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
8.
Computer system for automatically analyzing a video of a physical activity using a model and providing corresponding feedback
A computer system can automatically analyze a video of a physical activity and provide corresponding feedback. For example, the system can receive a video file including image frames showing an entity performing a physical activity that involves a sequence of movement phases. The system can generate coordinate sets by performing image analysis on the image frames. The system can provide the coordinate sets as input to a trained model, the trained model being configured to assign scores and movement phases to the image frames based on the coordinate sets. The system can then select a particular movement phase for which to provide feedback, based on the scores and movement phases assigned to the image frames. The system can generate the feedback for the entity about their performance of the particular movement phase, which may improve the entity's future performance of that particular movement phase.
G06V 10/84 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
An apparatus includes at least one node device to host a computing cluster, and at least one processor to generate a UI providing guidance through a set of configuration settings for the computing cluster, wherein, for each configuration setting that is received as an input during configuration, the at least one processor is caused to: perform a check of the set of configuration settings to determine whether the received configuration setting creates a conflict among the set of configuration settings; and in response to a determination that the received configuration setting creates a conflict among the set of configuration settings, perform operations including generating an indication of the conflict for presentation by the UI, and receiving a change to a configuration setting as an input from the input device.
An apparatus includes a processor to receive a request to provide a view of an object associated with a job flow, and in response to determining that the object is associated with a task type requiring access to a particular resource not accessible to a first interpretation routine: store, within a job queue, a job flow generation request message to cause generation of a job flow definition that defines another job flow for generating the requested view; within a task container in which a second interpretation routine that does have access to the particular resource is executed, generate the job flow definition; store, within a task queue, a job flow generation completion message that includes a copy of the job flow definition; use the job flow definition to perform the other job flow to generate the requested view; and transmit the requested view to the requesting device.
Groups of connected nodes in a network of nodes can be detected for evaluating and mitigating risks of the network of nodes. For example, a system can process one or more subnetworks of the network of nodes in parallel. For each subnetwork, the system can identify root nodes and their reachable nodes to create rooted groups of connected nodes. The system then can determine outdegrees of the remaining nodes in the network. The system can identify reachable nodes from a remaining node of the highest outdegree to create a nonrooted group of connected nodes. The system can estimate a risk value based on the number of rooted groups and nonrooted groups, the number of nodes in each rooted group and nonrooted group, and the attributes of the nodes in each group. The system can mitigate potential risks by reconfiguring the network of nodes.
H04L 41/12 - Discovery or management of network topologies
H04L 41/0604 - Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
H04L 41/22 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks comprising specially adapted graphical user interfaces [GUI]
H04L 41/0631 - Management of faults, events, alarms or notifications using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
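The grouping procedure in the network abstract above (rooted groups first, then nonrooted groups seeded from the remaining node of highest outdegree) can be sketched for a single subnetwork. Several details are assumptions: a root is taken to be a node with no incoming edge, ties in outdegree are broken alphabetically, and the nonrooted step is repeated until no nodes remain. Risk estimation and parallel processing of subnetworks are omitted.

```python
def reachable(start, succ, allowed):
    """Return start plus every node reachable from it within `allowed`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in succ[node]:
            if nxt in allowed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def find_groups(nodes, edges):
    """nodes: iterable of node ids; edges: iterable of (src, dst) pairs.

    Returns (rooted_groups, nonrooted_groups), each a list of sets.
    """
    succ = {n: set() for n in nodes}
    has_incoming = set()
    for src, dst in edges:
        succ[src].add(dst)
        has_incoming.add(dst)
    all_nodes = set(succ)

    # Rooted groups: one per root (assumed: a node with no incoming edge).
    roots = sorted(n for n in all_nodes if n not in has_incoming)
    rooted = [reachable(r, succ, all_nodes) for r in roots]

    # Nodes left over are those no root can reach, e.g. members of cycles.
    remaining = all_nodes - set().union(*rooted)
    nonrooted = []
    while remaining:
        start = max(sorted(remaining), key=lambda n: len(succ[n]))
        group = reachable(start, succ, remaining)
        nonrooted.append(group)
        remaining -= group
    return rooted, nonrooted
```

For example, with edges a→b, b→c, d→e, e→f, f→d, and d→g, node a is the only root, and the cycle d, e, f together with g forms a single nonrooted group seeded from d.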
Some examples can involve a system that can receive a first user selection of time series data and a second user selection of a type of forecasting model to apply to the time series data. The system can then obtain a first set of candidate values and a second set of candidate values for a first parameter and a second parameter, respectively, of the selected type of forecasting model. The candidate values may be determined based on statistical information derived from the time series data. The system can then provide the first set of candidate values and the second set of candidate values to the user, receive user selections of a first parameter value and a second parameter value, and determine whether a conflict exists between the first parameter value and the second parameter value. If so, the system can generate an output indicating that the conflict exists.
A data protection system is provided to detect data and execute security actions on the detected data using multiple tiers of parallel processing and incremental processing. For example, the data protection system can apply parallel job submission and parallel job execution to cataloging, scanning, searching, and other processes. Only source data that has not already been processed or that has been modified may be loaded to a cataloging data queue and a scanning data queue to reduce processing time. Scan results can include different data groups and can be used to search for specific data sets.
The computing device obtains a training data set related to a plurality of historic user inputs associated with preferences of one or more services or items from an entity. For each of the one or more services or items, the computing device executes operations to train a plurality of models using the training data set to generate a plurality of recommended models, apply a validation data set to generate a plurality of predictions from the plurality of recommended models, obtain a weight of each metric of a plurality of metrics from the entity, obtain user inputs associated with user preferences, and determine a relevancy score for each metric. The computing device selects a recommended model based on the relevancy score of the selected metric or a combination of selected metrics, generates one or more recommendations for the users, and outputs the one or more generated recommendations to the users.
An apparatus includes at least one node device to host a computing cluster, and at least one processor to: use at least one of a level of resource observed to be consumed by operation of the computing cluster or a level of performance observed to be provided by operation of the computing cluster as an input to a pre-existing cluster model to derive a predicted level; compare the predicted level to a corresponding observed level of resource consumed or performance provided; and in response to the predicted level not matching the observed level to within a pre-selected degree, derive a new cluster model from observations of the operation of the computing cluster, and generate a prompt to repeat the configuration of the computing cluster using the new cluster model in place of the pre-existing cluster model to generate a new set of configuration settings for the computing cluster.
A computer system can automatically generate a directed graph interface for use in detecting and mitigating anomalies in entity interactions. For example, the system can receive interaction data describing a set of interactions at two entities. The system can then generate a directed network graph based on the interaction data. To do so, the system can identify pairs of interactions associated with the two entities in the set of interactions. The system can classify the pairs of interactions as outbound and/or inbound interaction pairs. The system can then generate one or more directed links in the directed network graph to represent the outbound and/or inbound interaction pairs. The system can further determine a characteristic of the outbound and/or inbound interaction pairs, automatically detect an anomaly that may be suggestive of malicious activity by one or both entities based on the characteristic, and output an indicator of the detected anomaly.
A computer monitors a state of a system. A time branch is defined for each valid value of each discrete variable. A system model is executed with observed values to update each time branch and determine a probability associated with each time branch. A discrete variable is selected, and a sequence duration value is incremented. When the incremented sequence duration value is greater than a predefined minimum sequence duration value, a probability change value is computed for the discrete variable, and, when the computed probability change value is less than or equal to a synchronization probability change value, a continuous value for each continuous variable for each time branch of the discrete variable is synchronized, and the sequence duration value for the selected discrete variable is reinitialized. The continuous value for at least one non-observed continuous variable is output.
An apparatus includes a processor to: receive a request to perform a job flow; within a performance container, based on the data dependencies among a set of tasks of the job flow, derive an order of performance of the set of tasks that includes a subset able to be performed in parallel, and derive a quantity of task containers to enable the parallel performance of the subset; based on the derived quantity of task containers, derive a quantity of virtual machines (VMs) to enable the parallel performance of the subset; provide, to a VM allocation routine, an indication of a need for provision of the quantity of VMs; and store, within a task queue, multiple task routine execution request messages to enable parallel execution of task routines within the quantity of task containers to cause the parallel performance of the subset.
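The derivation in the job flow abstract above (task order from data dependencies, then task container count, then VM count) might look like the following sketch. It assumes that tasks are grouped into waves of mutually independent tasks, that the container count equals the widest wave, and that each VM hosts a fixed number of containers. The parameter `containers_per_vm` and the function name are hypothetical.

```python
import math


def plan_job_flow(deps, containers_per_vm):
    """deps maps each task to the set of tasks whose output it needs.

    Every task must appear as a key. Returns (waves, containers, vms),
    where each wave is a list of tasks that can run in parallel.
    """
    remaining = {task: set(needs) for task, needs in deps.items()}
    waves = []
    while remaining:
        ready = sorted(t for t, needs in remaining.items() if not needs)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        for t in ready:
            del remaining[t]
        for needs in remaining.values():
            needs.difference_update(ready)
    # Assumed sizing rule: enough containers for the widest parallel wave.
    containers = max((len(w) for w in waves), default=0)
    vms = math.ceil(containers / containers_per_vm)
    return waves, containers, vms
```

With one load task, three independent cleaning tasks, and one merge task, the plan has three waves, needs three task containers, and at two containers per VM needs two VMs.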
A flexible computer architecture for performing digital image analysis is described herein. In some examples, the computer architecture can include a distributed messaging platform (DMP) for receiving images from cameras and storing the images in a first queue. The computer architecture can also include a first container for receiving the images from the first queue, applying an image analysis model to the images, and transmitting the image analysis result to the DMP for storage in a second queue. Additionally, the computer architecture can include a second container for receiving the image analysis result from the second queue, performing a post-processing operation on the image analysis result, and transmitting the post-processing result to the DMP for storage in a third queue. The computer architecture can further include an output container for receiving the post-processing result from the third queue and generating an alert notification based on the post-processing result.
A computer trains a neural network. A neural network is executed with a weight vector to compute a gradient vector using a batch of observation vectors. Eigenvalues are computed from a Hessian approximation matrix, a regularization parameter value is computed using the gradient vector, the eigenvalues, and a step-size value, a search direction vector is computed using the eigenvalues, the gradient vector, the Hessian approximation matrix, and the regularization parameter value, a reduction ratio value is computed, an updated weight vector is computed from the weight vector, a learning rate value, and the search direction vector or the gradient vector based on the computed reduction ratio value, and an updated Hessian approximation matrix is computed from the Hessian approximation matrix, the predefined learning rate value, and the search direction vector or the gradient vector based on the reduction ratio value. The step-size value is updated using the search direction vector.
In one example, a system can receive a set of text samples and generate a set of summaries based on the set of text samples. The system can then generate a training dataset by iteratively executing a training-sample generation process. Each iteration can involve selecting multiple text samples from the set of text samples, combining the multiple text samples together into a training sample, determining a text category and a summary corresponding to a selected one of the multiple text samples, and including the text category and the summary in the training sample. After generating the training dataset, the system can use it to train a model. The trained model can then receive a target textual dataset and a target category as input, identify a portion of the target textual dataset corresponding to the target category, and generate a summarization of the portion of that target textual dataset.
An apparatus including a processor to: within a kill container, in response to a set of error messages indicative of errors in executing multiple instances of a task routine to perform a task of a job flow with multiple data object blocks of a data object, and in response to the quantity of error messages reaching a threshold, output a kill tasks request message that identifies the job flow; within a task container, in response to the kill tasks request message, cease execution of the task routine and output a task cancelation message that identifies the task and the job flow; and within a performance container, in response to the task cancelation message, output a job cancelation message to cause the transmission of an indication of cancelation of the job flow, via a network, and to a requesting device that requested the performance of the job flow.
A computing device trains a fair machine learning model. A predicted target variable is defined using a trained prediction model. The prediction model is trained with weighted observation vectors. The predicted target variable is updated using the prediction model trained with weighted observation vectors. A true conditional moments matrix and a false conditional moments matrix are computed. The training and updating with weighted observation vectors are repeated until a number of iterations is performed. When a computed conditional moments matrix indicates to adjust a bound value, the bound value is updated based on an upper bound value or a lower bound value, and the repeated training and updating with weighted observation vectors is repeated with the bound value replaced with the updated bound value until the conditional moments matrix indicates no further adjustment of the bound value is needed. A fair prediction model is trained with the updated bound value.
A computing device determines a disaggregated solution vector of a plurality of variables. A first value is computed for a known variable using a predefined density distribution function, and a second value is computed for an unknown variable using the computed first value, a predefined correlation value, and a predefined aggregate value. The predefined correlation value indicates a correlation between the known variable and the unknown variable. A predefined number of solution vectors is computed by repeating the first value and the second value computations. A solution vector is the computed first value and the computed second value. A centroid vector is computed from solution vectors computed by repeating the computations. A predefined number of closest solution vectors to the computed centroid vector are determined from the solution vectors. The determined closest solution vectors are output.
G06F 17/11 - Complex mathematical operations for solving equations
G06F 18/2413 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
G06Q 10/0635 - Risk analysis of enterprise or organisation activities
G06F 18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
G06F 18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
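The final selection step in the disaggregation abstract above (compute a centroid of the solution vectors, then keep a predefined number of the closest ones) can be sketched as below. How each solution vector is generated from the density function, correlation value, and aggregate value is not given in the abstract and is not attempted here. Euclidean distance and the function name are assumptions.

```python
import math


def closest_to_centroid(vectors, k):
    """Return (centroid, the k vectors nearest to it by Euclidean distance)."""
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    # sorted() is stable, so equidistant vectors keep their input order.
    ranked = sorted(vectors, key=lambda v: math.dist(v, centroid))
    return centroid, ranked[:k]
```

Selecting vectors near the centroid favors typical solutions over outliers produced by the repeated random draws.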
25.
Data object preparation for execution of multiple task routine instances in many task computing
An apparatus includes a processor to: output a request message to cause a first task to be performed; within a task container, in response to the request message and a data object not being divided, divide the data object into a set of data object blocks based on at least the sizes of the data object and the atomic unit of organization of data therein, as well as the storage resources allocated to task containers, and output a task completion message indicating that the first task has been performed, and including a set of data block identifiers indicating the location of the set of data object blocks within at least one federated area; and in response to the task completion message, output a set of request messages to cause a second task to be performed by executing multiple instances of a task routine within multiple task containers.
A computing device selects a piecewise linear regression model for multivariable data. A hyperplane is fit to observation vectors using a linear multivariable regression. A baseline fit quality measure is computed for the fit hyperplane. For each independent variable, the observation vectors are sorted, contiguous segments to evaluate are defined, for each contiguous segment, a segment hyperplane is fit to the sorted observation vectors using a multivariable linear regression, path distances are computed between a first observation and a last observation of the sorted observation vectors based on a predefined number of segments, a shortest path associated with a smallest value of the computed path distances is selected, and a fit quality measure is computed for the selected shortest path. A best independent variable is selected from the independent variables based on having an extremum value for the computed fit quality measure.
An apparatus including a processor to: output a first request message onto a group sub-queue shared by multiple task containers to request execution of a first task routine; within a task container, respond to the first request message, by outputting a first task in-progress message onto an individual sub-queue not shared with other task containers to accede to executing the first task routine, followed by a task completion message; and respond to the task completion message by allowing the task completion message to remain on the individual sub-queue to keep the task container from executing another task routine from another request message on the group sub-queue, outputting a second request message onto the individual sub-queue to cause execution of a second task routine within the same task container to perform a second task, and responding to the second task in-progress message by de-queuing the task completion message.
A computing device determines an optimal number of threads for a computer task. Execution of a computing task is controlled in a computing environment based on each task configuration included in a plurality of task configurations to determine an execution runtime value for each task configuration. An optimal number of threads value is determined for each set of task configurations having common values for a task parameter value, a dataset indicator, and a hardware indicator. The optimal number of threads value is an extremum value of an execution parameter value as a function of a number of threads value. A dataset parameter value is determined for a dataset. A hardware parameter value is determined as a characteristic of each distinct executing computing device in the computing environment. The optimal number of threads value for each set of task configurations is stored in a performance dataset in association with the common values.
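The selection step in the thread-count abstract above can be illustrated with a small sketch that groups measured runs by their common task parameter, dataset indicator, and hardware indicator, then keeps the thread count at the extremum of the measured value. Taking the extremum to be the minimum runtime is an assumption, as are the record field names.

```python
def optimal_threads(runs):
    """runs: list of dicts with keys task_param, dataset, hardware,
    threads, and runtime (seconds).

    Returns {(task_param, dataset, hardware): best_thread_count}.
    """
    best = {}
    for run in runs:
        key = (run["task_param"], run["dataset"], run["hardware"])
        # Assumed extremum: the smallest observed runtime for this group.
        if key not in best or run["runtime"] < best[key][1]:
            best[key] = (run["threads"], run["runtime"])
    return {key: threads for key, (threads, _) in best.items()}
```

The resulting mapping plays the role of the performance dataset in the abstract: a later task with matching parameters can look up its thread count instead of being re-benchmarked.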
An apparatus includes a processor to: receive, from a requesting device, a request to perform speech-to-text conversion of a speech data set; within a first thread of a thread pool, perform a first pause detection technique to identify a first set of likely sentence pauses; within a second thread of the thread pool, perform a second pause detection technique to identify a second set of likely sentence pauses; perform a speaker diarization technique to identify a set of likely speaker changes; divide the speech data set into data segments representing speech segments based on a combination of at least the first set of likely sentence pauses, the second set of likely sentence pauses, and the set of likely speaker changes; use at least an acoustic model with each data segment to identify likely speech sounds; and generate a transcript based, at least in part, on the identified likely speech sounds.
G10L 15/04 - Segmentation; Word boundary detection
G10L 25/78 - Detection of presence or absence of voice signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
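The segmentation step in the speech-to-text abstract above combines two sets of likely sentence pauses with a set of likely speaker changes. The abstract does not state the combination rule, so the sketch below assumes the simplest one: take the union of all candidate times and merge candidates that fall within a small tolerance of each other. The pause detectors, the thread pool, and the diarization itself are omitted, and the function name and `tolerance` parameter are hypothetical.

```python
def split_into_segments(duration, pauses_a, pauses_b, speaker_changes,
                        tolerance=0.2):
    """Return (start, end) segments in seconds covering [0, duration].

    A candidate boundary is dropped when it lies closer than `tolerance`
    to the previously kept boundary (assumed merge rule).
    """
    candidates = sorted(
        t for t in [*pauses_a, *pauses_b, *speaker_changes]
        if 0.0 < t < duration
    )
    boundaries = []
    for t in candidates:
        if not boundaries or t - boundaries[-1] >= tolerance:
            boundaries.append(t)
    edges = [0.0, *boundaries, duration]
    return list(zip(edges, edges[1:]))
```

Each resulting segment would then be passed to the acoustic model independently.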
An apparatus includes a processor to: receive a request to perform speech-to-text conversion of a speech data set; perform pause detection to identify a set of likely sentence pauses and/or a speaker diarization technique to identify a set of likely speaker changes; based on the set of likely sentence pauses and/or the set of likely speaker changes, divide the speech data set into data segments representing speech segments; use an acoustic model with the data segments to derive sets of probabilities of speech sounds uttered; store the sets of probabilities in temporal order within a buffer queue; distribute the sets of probabilities from the buffer queue in temporal order among threads of a thread pool; and within each thread, and based on set(s) of probabilities, derive one candidate word and select either the candidate word or an alternate candidate word derived from a language model as the next word most likely spoken.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/04 - Segmentation; Word boundary detection
G10L 25/78 - Detection of presence or absence of voice signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
31.
Process to Geographically Associate Potential Water Quality Stressors to Monitoring Stations
A computing device obtains data indicating a topography for an area comprising water and receives an indication of an identified data object representing a stressor to the area or a first monitoring station configurable to monitor the stressor. The computing device also determines a location for the identified data object in the topography and selects one or more related data objects to be related to the identified data object by determining a classification indicating whether the identified data object operates in water and selecting the one or more related data objects based on the location and the classification. The computing device also generates one or more controls for monitoring the area based on the selected one or more related data objects.
A computing device accesses a machine learning model trained on training data of first bonding operations (e.g., a ball and/or stitch bond). The first bonding operations comprise operations to bond a first set of multiple wires to a first set of surfaces. The machine learning model is trained by supervised learning. The device receives input data indicating process data generated from measurements of second bonding operations. The second bonding operations comprise operations to bond a second set of multiple wires to a second set of surfaces. The device weights the input data according to the machine learning model. The device generates an anomaly predictor indicating a risk for an anomaly occurrence in the second bonding operations based on weighting the input data according to the machine learning model. The device outputs the anomaly predictor to control the second bonding operations.
A computing device (2002) accesses a machine learning model (2050) trained on training data (2032) of first bonding operations (1308, 2040A) (e.g., a ball and/or stitch bond). The first bonding operations comprise operations to bond a first set of wires (1504) to a first set of surfaces (1506, 1508). The machine learning model is trained by supervised learning. The device receives input data (2070) indicating process data (2074) generated from measurements of second bonding operations (2040B). The second bonding operations comprise operations to bond a second set of wires to a second set of surfaces. The device weights the input data according to the machine learning model. The device generates an anomaly predictor (2052) indicating a risk for an anomaly occurrence in the second bonding operations based on weighting the input data according to the machine learning model. The device outputs the anomaly predictor to control the second bonding operations.
An event stream processing (ESP) model is read that describes computational processes. (A) An event block object is received. (B) A new measurement value, a timestamp value, and a sensor identifier are extracted. (C) An in-memory data store is updated with the new measurement value, the timestamp value, and the sensor identifier. (A) through (C) are repeated until an output update time is reached. When the output update time is reached, data stored in the in-memory data store is processed and updated using data enrichment windows to define enriched data values that are output. The data enrichment windows include a gate window before each window that uses values computed by more than one window. The gate window sends a trigger to a next window when each value of the more than one window has been computed. The enrichment windows are included in the ESP model.
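The gate-window behavior described above (hold until every upstream window has produced a value, then trigger the next window) can be sketched as follows. This is a generic illustration, not the API of any ESP product. The class name, method names, and callback signature are hypothetical.

```python
class GateWindow:
    """Fires a trigger once every upstream window has reported a value."""

    def __init__(self, upstream_names, on_trigger):
        self._expected = set(upstream_names)
        self._values = {}
        self._on_trigger = on_trigger

    def receive(self, window_name, value):
        """Record one upstream value and trigger when the set is complete."""
        if window_name not in self._expected:
            raise KeyError(f"unexpected upstream window: {window_name}")
        self._values[window_name] = value
        if self._expected.issubset(self._values):
            # Hand over the complete set and reset for the next update cycle.
            ready, self._values = self._values, {}
            self._on_trigger(ready)
```

For example, a gate constructed with the upstream names `["mean", "max"]` stays silent after receiving only `mean` and calls `on_trigger` once `max` also arrives.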
An apparatus includes processor(s) to: receive a request to test goodness-of-fit of a spatial process model; generate a KD tree from observed spatial point dataset including locations within a region at which instances of an event occurred; derive, from the observed spatial point dataset, multiple quadrats into which the region is divided; receive, from multiple processors, current levels of availability of processing resources including quantities of currently available execution threads; select, based on the quantity of currently available execution threads, a subset of the multiple processors to perform multiple iterations of a portion of the test in parallel; provide, to each processor of the subset, the KD tree, the spatial process model, and the multiple quadrats; receive, from each processor of the subset, per-quadrat data portions indicative of results of an iteration; derive a goodness-of-fit statistic from the per-quadrat data portions; and transmit an indication of goodness-of-fit to another device.
A computing system obtains a first preconfigured feature set. The first preconfigured feature set defines: a first feature definition defining an input variable, and first computer instructions for locating first data. The first data is available for retrieval because it is stored, or set up to arrive, in the feature storage according to the first preconfigured feature set. The computing system receives a requested data set for the input variable. The computing system generates an availability status indicating whether the requested data set is available for retrieval according to the first preconfigured feature set. Based on the availability status, the computing system generates the requested data set by: retrieving historical data for the first preconfigured feature set; retrieving a data definition associated with the historical data; and generating the requested data based on the historical data and the data definition.
A computing device creates a user interface application. A user interface (UI) tag is read in a UI application. The UI tag is executed to identify a UI template tag. The identified UI template tag is executed to define a top-level container initializer for the UI application and to define a plurality of widget initializers for inclusion in a top-level container rendered using the top-level container initializer. The top-level container is rendered in a display using the top-level container initializer. Each widget of a plurality of widgets in the rendered top-level container is rendered using the defined plurality of widget initializers to create a UI.
A computing device trains a fair machine learning model. A prediction model is trained to predict a target value. For a number of iterations, a weight vector is computed using the bound value based on fairness constraints defined for a fairness measure type; a weight value is assigned to each observation vector based on the target value and a sensitive attribute value; the prediction model is trained with each weighted observation vector to predict the target value; and a conditional moments vector is computed based on the fairness constraints and the target and sensitive attribute values. Conditional moments difference values are computed. When the conditional moments difference values indicate to adjust the bound value, the bound value is updated and the process is repeated with the bound value replaced with the updated bound value until the conditional moments difference values indicate no further adjustment of the bound value is needed.
Embodiments are directed to techniques for image content extraction. Some embodiments include extracting contextually structured data from document images, such as by automatically identifying document layout, document data, document metadata, and/or correlations therebetween in a document image, for instance. Some embodiments utilize breakpoints to enable the system to match different documents with internal variations to a common template. Several embodiments include extracting contextually structured data from table images, such as gridded and non-gridded tables. Many embodiments are directed to generating and utilizing a document template database for automatically extracting document image contents into a contextually structured format. Several embodiments are directed to automatically identifying and associating document metadata with corresponding document data in a document image to generate a machine-facilitated annotation of the document image. In some embodiments, the machine-facilitated annotation may be used to generate a template for the template database.
A computing device trains a machine state predictive model. A generative adversarial network with an autoencoder is trained using a first plurality of observation vectors. Each observation vector of the first plurality of observation vectors includes state variable values for state variables and an action variable value for an action variable. The state variables define a machine state, wherein the action variable defines a next action taken in response to the machine state. The first plurality of observation vectors successively defines sequential machine states to manufacture a product. A second plurality of observation vectors is generated using the trained generative adversarial network with the autoencoder. A machine state machine learning model is trained to predict a subsequent machine state using the first plurality of observation vectors and the generated second plurality of observation vectors. A description of the machine state machine learning model is output.
A computing device accesses a machine learning model trained on training data of first bonding operations (e.g., a ball and/or stitch bond). The first bonding operations comprise operations to bond a first set of multiple wires to a first set of surfaces. The machine learning model is trained by supervised learning. The device receives input data indicating process data generated from measurements of second bonding operations. The second bonding operations comprise operations to bond a second set of multiple wires to a second set of surfaces. The device weights the input data according to the machine learning model. The device generates an anomaly predictor indicating a risk for an anomaly occurrence in the second bonding operations based on weighting the input data according to the machine learning model. The device outputs the anomaly predictor to control the second bonding operations.
Text profiles can be leveraged to select and configure models according to some examples described herein. In one example, a system can analyze a reference textual dataset and a target textual dataset using text-mining techniques to generate a first text profile and a second text profile, respectively. The first text profile can contain first metrics characterizing the reference textual dataset and the second text profile can contain second metrics characterizing the target textual dataset. The system can determine a similarity value by comparing the first text profile to the second text profile. The system can also receive a user selection of a model that is to be applied to the target textual dataset. The system can then generate an insight relating to an anticipated accuracy of the model on the target textual dataset based on the similarity value. The system can output the insight to the user.
In one example, a system can execute a first machine-learning model to determine an overall classification for a textual dataset. The system can also determine classification scores indicating the level of influence that each token in the textual dataset had on the overall classification. The system can select a first subset of the tokens based on their classification scores. The system can also execute a second machine-learning model to determine probabilities that the textual dataset falls into various categories. The system can determine category scores indicating the level of influence that each token had on a most-likely category determination. The system can select a second subset of the tokens based on their category scores. The system can then generate a first visualization depicting the first subset of tokens color-coded to indicate their classification scores and a second visualization depicting the second subset of tokens color-coded to indicate their category scores.
A computing system establishes a hierarchy for monitoring model(s). The hierarchy comprises an association between each of multiple measures of a measure level of the hierarchy and intermediate level(s) of the hierarchy. An intermediate level comprises one or more of a measurement category or analysis type. The hierarchy comprises an association between the intermediate level(s) and at least one model. The system monitors the model(s) by generating health measurements. Each of the health measurements corresponds to one of the multiple measures. Each of the health measurements indicates a performance of a monitored model according to a measurement category or analysis type associated in the hierarchy with the respective measure of the multiple measures. The system generates a visualization in a graphical user interface. The visualization comprises a graphical representation of an indication of a health measurement for each of measure(s), and associations in the hierarchy.
One example described herein involves a system receiving task data and distribution criteria for a state space model from a client device. The task data can indicate a type of sequential Monte Carlo (SMC) task to be implemented. The distribution criteria can include an initial distribution, a transition distribution, and a measurement distribution for the state space model. The system can generate a set of program functions based on the task data and the distribution criteria. The system can then execute an SMC module to generate a distribution and a corresponding summary, where the SMC module is configured to call the set of program functions during execution of an SMC process and apply the results returned from the set of program functions in one or more subsequent steps of the SMC process. The system can then transmit an electronic communication to the client device indicating the distribution and its corresponding summary.
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
46.
Speech segmentation based on combination of pause detection and speaker diarization
An apparatus includes at least one processor to, in response to a request to perform speech-to-text conversion: perform a pause detection technique including analyzing speech audio to identify pauses, and analyzing lengths of the pauses to identify likely sentence pauses; perform a speaker diarization technique including dividing the speech audio into fragments, analyzing vocal characteristics of speech sounds of each fragment to identify a speaker of a set of speakers, and identifying instances of a change in speakers between each temporally consecutive pair of fragments to identify likely speaker changes; and perform speech-to-text operations including dividing the speech audio into segments based on at least the likely sentence pauses and likely speaker changes, using at least an acoustic model with each segment to identify likely speech sounds in the speech audio, and generating a transcript of the speech audio based at least on the likely speech sounds.
G10L 25/78 - Detection of presence or absence of voice signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
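The two detection steps named in the abstract above can be illustrated with a small sketch. It assumes pauses arrive as (start, end) silent intervals in seconds and that each fragment already carries a speaker label. The `min_length` threshold and both function names are hypothetical, and real systems would derive the labels from vocal characteristics rather than receive them.

```python
def likely_sentence_pauses(pauses, min_length=0.5):
    """Return midpoints of silent intervals at least `min_length` seconds long."""
    return [(start + end) / 2 for start, end in pauses if end - start >= min_length]


def likely_speaker_changes(fragments):
    """Return start times of fragments whose speaker differs from the previous one.

    `fragments` is a list of (start_time, speaker_label) in temporal order.
    """
    return [
        start
        for (_, previous), (start, current) in zip(fragments, fragments[1:])
        if current != previous
    ]
```

The abstract says only that pause lengths are analyzed. A fixed threshold is the simplest stand-in for that analysis.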
47.
Dynamic per-node pre-pulling in distributed computing
An apparatus includes a processor to: receive an indication of ability of a node device to provide a resource for executing application routines, at least one identifier of at least one image including an executable routine stored within a cache of the node device, and an indication of at least one revision level of the at least one image; analyze the ability to provide the resource; in response to being able to support execution of the application routine, identify a first image in a repository; compare identifiers to determine whether there is a second image including a matching executable routine; in response to a match, compare revision levels; and in response to the revision level of the most recent version of the first image being more recent, retrieve the most recent version of the first image from the repository, and store it within the node device.
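The decision logic in the abstract above reduces to a capacity check followed by identifier and revision comparisons. The sketch below assumes integer revision levels and dictionary inputs keyed by image identifier. The function name and data shapes are hypothetical.

```python
def images_to_prepull(node_has_capacity, cached, repository):
    """List cached images whose repository copy has a newer revision.

    `cached` and `repository` map image identifier -> revision level.
    """
    if not node_has_capacity:
        return []  # node cannot support execution, so nothing is pre-pulled
    stale = []
    for image_id, repo_revision in repository.items():
        cached_revision = cached.get(image_id)
        # Only images already present in the node cache are candidates.
        if cached_revision is not None and repo_revision > cached_revision:
            stale.append(image_id)
    return sorted(stale)
```

With `cached = {"scoring": 3, "etl": 7}` and `repository = {"scoring": 5, "etl": 7, "report": 1}`, only `scoring` would be retrieved and stored on the node.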
Tops of geological layers can be automatically identified using machine-learning techniques as described herein. In one example, a system can receive well log records associated with wellbores drilled through geological layers. The system can generate well clusters by applying a clustering process to the well log records. The system can then obtain a respective set of training data associated with a well cluster, train a machine-learning model based on the respective set of training data, select a target well-log record associated with a target wellbore of the well cluster, and provide the target well-log record as input to the trained machine-learning model. Based on an output from the trained machine-learning model, the system can determine the geological tops of the geological layers in a region surrounding the target wellbore. The system may then transmit an electronic signal indicating the geological tops of the geological layers associated with the target wellbore.
G06F 16/587 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
G06F 16/909 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
G01V 1/40 - Seismology; Seismic or acoustic prospecting or detecting specially adapted for well-logging
G06F 16/387 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
49.
User interfaces for converting node-link data into audio outputs
Node-link data can be converted into audio outputs. For example, a system can generate a graphical user interface (GUI) depicting a node-link diagram having nodes and links. The GUI can include a virtual reference point in the node-link diagram and a virtual control element that is rotatable around the virtual reference point by a user to contact one or more of the nodes in the node-link diagram. The system can receive user input for rotating the virtual control element around the virtual reference point, which can generate a contact between the virtual control element and a particular node of the node-link diagram. In response to detecting the contact, the system can determine a sound characteristic configured to indicate an attribute associated with the particular node. The system can then generate a sound having the sound characteristic, for example to assist the user in exploring the node-link diagram.
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
A computing device selects new test configurations for testing software. (A) First test configurations are generated using a random seed value. (B) Software under test is executed with the first test configurations to generate a test result for each. (C) Second test configurations are generated from the first test configurations and the test results generated for each. (D) The software under test is executed with the second test configurations to generate the test result for each. (E) When a restart is triggered based on a distance metric value computed between the second test configurations, a next random seed value is selected as the random seed value and (A) through (E) are repeated. (F) When the restart is not triggered, (C) through (F) are repeated until a stop criterion is satisfied. (G) When the stop criterion is satisfied, the test result is output for each test configuration.
An apparatus includes processor(s) to: generate a set of candidate n-grams based on probability distributions from an acoustic model for candidate graphemes of a next word most likely spoken following at least one preceding word spoken within speech audio; provide the set of candidate n-grams to multiple devices; provide, to each node device, an indication of which candidate n-grams are to be searched for within the n-gram corpus by each node device to enable searches for multiple candidate n-grams to be performed, independently and at least partially in parallel, across the node devices; receive, from each node device, an indication of a probability of occurrence of at least one candidate n-gram within the speech audio; based on the received probabilities of occurrence, identify the next word most likely spoken within the speech audio; and add the next word most likely spoken to a transcript of the speech audio.
A computing device learns a directed acyclic graph (DAG). An SSCP matrix is computed from variable values defined for observation vectors. A topological order vector is initialized that defines a topological order for the variables. A loss value is computed using the topological order vector and the SSCP matrix. (A) A neighbor determination method is selected. (B) A next topological order vector is determined relative to the initialized topological order vector using the neighbor determination method. (C) A loss value is computed using the next topological order vector and the SSCP matrix. (D) (B) and (C) are repeated until each topological order vector is determined in (B) based on the neighbor determination method. A best topological vector is determined from each next topological order vector based on having a minimum value for the computed loss value. An adjacency matrix is computed using the best topological vector and the SSCP matrix.
Tops of geological layers can be automatically identified using machine-learning techniques as described herein. In one example, a system can receive well log records associated with wellbores drilled through geological layers. The system can generate well clusters by applying a clustering process to the well log records. The system can then obtain a respective set of training data associated with a well cluster, train a machine-learning model based on the respective set of training data, select a target well-log record associated with a target wellbore of the well cluster, and provide the target well-log record as input to the trained machine-learning model. Based on an output from the trained machine-learning model, the system can determine the geological tops of the geological layers in a region surrounding the target wellbore. The system may then transmit an electronic signal indicating the geological tops of the geological layers associated with the target wellbore.
E21B 49/00 - Testing the nature of borehole walls; Formation testing; Methods or apparatus for obtaining samples of soil or well fluids, specially adapted to earth drilling or wells
G01V 99/00 - Subject matter not provided for in other groups of this subclass
(A) Conditional vectors are defined. (B) Latent observation vectors are generated using a predefined noise distribution function. (C) A forward propagation of a generator model is executed with the conditional vectors and the latent observation vectors as input to generate an output vector. (D) A forward propagation of a decoder model of a trained autoencoder model is executed with the generated output vector as input to generate a plurality of decoded vectors. (E) Transformed observation vectors are selected from transformed data based on the defined plurality of conditional vectors. (F) A forward propagation of a discriminator model is executed with the transformed observation vectors, the conditional vectors, and the decoded vectors as input to predict whether each transformed observation vector and each decoded vector is real or fake. (G) The discriminator and generator models are updated and (A) through (G) are repeated until training is complete.
A computing device trains a fair machine learning model. A prediction model is trained to predict a target value. For a number of iterations, a weight vector is computed using the bound value based on fairness constraints defined for a fairness measure type; a weight value is assigned to each observation vector based on the target value and a sensitive attribute value; the prediction model is trained with each weighted observation vector to predict the target value; and a conditional moments vector is computed based on the fairness constraints and the target and sensitive attribute values. Conditional moments difference values are computed. When the conditional moments difference values indicate to adjust the bound value, the bound value is updated and the process is repeated with the bound value replaced with the updated bound value until the conditional moments difference values indicate no further adjustment of the bound value is needed.
Text profiles can be leveraged to select and configure models according to some examples described herein. In one example, a system can analyze a reference textual dataset and a target textual dataset using text-mining techniques to generate a first text profile and a second text profile, respectively. The first text profile can contain first metrics characterizing the reference textual dataset and the second text profile can contain second metrics characterizing the target textual dataset. The system can determine a similarity value by comparing the first text profile to the second text profile. The system can also receive a user selection of a model that is to be applied to the target textual dataset. The system can then generate an insight relating to an anticipated accuracy of the model on the target textual dataset based on the similarity value. The system can output the insight to the user.
A computing device generates synthetic tabular data. Until a convergence parameter value indicates that training of an attention generator model is complete, conditional vectors are defined; latent vectors are generated using a predefined noise distribution function; a forward propagation of an attention generator model that includes an attention model integrated with a conditional generator model is executed to generate output vectors; transformed observation vectors are selected; a forward propagation of a discriminator model is executed with the transformed observation vectors, the conditional vectors, and the output vectors to predict whether each transformed observation vector and each output vector is real or fake; a discriminator model loss value is computed based on the predictions; the discriminator model is updated using the discriminator model loss value; an attention generator model loss value is computed based on the predictions; and the attention generator model is updated using the attention generator model loss value.
An apparatus to: analyze a data set to identify a candidate topic not in a set of topics; determine whether the prominence of the candidate topic within the data set meets a threshold; in response to meeting the threshold, retrieve a rate of increase in frequency of the candidate topic in online searches; in response to meeting a threshold rate of increase, retrieve the keyword most frequently used in online searches for the candidate topic, use the keyword to retrieve a supplemental data set, and analyze input data extracted from the supplemental data set to determine whether the candidate topic can change the accuracy of a forecast model; and in response to determining that the candidate topic can change the accuracy, add the candidate topic to the set of topics and replace the forecast model with a forecast model trained for the set of topics augmented with the candidate topic.
An apparatus includes processor(s) to: generate a set of candidate n-grams based on probability distributions from an acoustic model for candidate graphemes of a next word most likely spoken following at least one preceding word spoken within speech audio; provide the set of candidate n-grams to multiple devices; provide, to each node device, an indication of which candidate n-grams are to be searched for within the n-gram corpus by each node device to enable searches for multiple candidate n-grams to be performed, independently and at least partially in parallel, across the node devices; receive, from each node device, an indication of a probability of occurrence of at least one candidate n-gram within the speech audio; based on the received probabilities of occurrence, identify the next word most likely spoken within the speech audio; and add the next word most likely spoken to a transcript of the speech audio.
A computing device determines a recommendation. A confidence matrix is computed using a predefined weight value. (A) A first parameter matrix is updated using the confidence matrix, a predefined response matrix, a first step-size parameter value, and a first direction matrix. The predefined response matrix includes a predefined response value by each user to each item and at least one matrix value for which a user has not provided a response to an item. (B) A second parameter matrix is updated using the confidence matrix, the predefined response matrix, a second step-size parameter value, and a second direction matrix. (C) An objective function value is updated based on the first and second parameter matrices. (D) The first and second parameter matrices are trained by repeating (A) through (C). The first and second parameter matrices are output for use in predicting a recommended item for a requesting user.
An apparatus includes a processor to: derive an order of performance of a set of tasks of a job flow; based on the order of performance, store, within a task queue, a first task routine execution request message to cause a first task to be performed; within a first task container, and in response to storage of the first task routine execution request message, execute instructions of a first task routine of a set of task routines, store a mid-flow data set output of the first task within a federated area, and store a first task completion message within the task queue after completion of storage of the mid-flow data set; and in response to the storage of the first task completion message, and based on the order of performance, store, within the task queue, a second task routine execution request message to cause a second task to be performed.
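The message exchange in the abstract above (a request message causes a task to run, and its completion message causes the next request to be stored) can be sketched in a single process. Containers are replaced by plain function calls and the federated area by a dictionary. All names and the message tuples are hypothetical.

```python
from collections import deque


def run_job_flow(ordered_tasks, routines):
    """Drive tasks through a queue of request and completion messages.

    `routines` maps task name -> callable taking the federated area dict.
    Returns the federated area and the log of messages handled.
    """
    task_queue = deque()
    federated_area = {}
    log = []
    remaining = list(ordered_tasks)
    if remaining:
        task_queue.append(("request", remaining.pop(0)))
    while task_queue:
        kind, task = task_queue.popleft()
        log.append((kind, task))
        if kind == "request":
            # Execute the task routine and store its mid-flow data set first.
            federated_area[task] = routines[task](federated_area)
            task_queue.append(("completion", task))
        elif remaining:
            # A completion message releases the next task in the derived order.
            task_queue.append(("request", remaining.pop(0)))
    return federated_area, log
```

Storing the output before the completion message mirrors the ordering in the abstract, so a later task never starts before its input is available.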
A computing device selects new test configurations for testing software. Software under test is executed with first test configurations to generate a test result for each test configuration. Each test configuration includes a value for each test parameter where each test parameter is an input to the software under test. A predictive model is trained using each test configuration of the first test configurations in association with the test result generated for each test configuration based on an objective function value. The predictive model is executed with second test configurations to predict the test result for each test configuration of the second test configurations. Test configurations are selected from the second test configurations based on the predicted test results to define third test configurations. The software under test is executed with the defined third test configurations to generate the test result for each test configuration of the third test configurations.
Some examples described herein relate to handling bulk requests for resources. In one example, a system can determine a bulk request parameter-value associated with a bulk request. The system can then predict a baseline benefit value, which can be a benefit value when the bulk request parameter-value is used as a lower boundary for a unit parameter-value. The system can also determine a lower boundary constraint on the unit parameter-value independently of the bulk request parameter-value. The system can then execute an iterative process using the baseline benefit value and the lower boundary constraint. Based on a result of the iterative process, the system can determine whether and how much the bulk request parameter-value should be adjusted. The system may adjust the bulk request parameter-value accordingly or output a recommendation to do so.
G06N 3/04 - Architecture, e.g. interconnection topology
G16H 50/50 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
An apparatus includes a processor to: within a first container, and prior to its uninstantiation, execute a first instance of a routine to cause the processor to monitor for and detect a job performance request in a job queue; and within a second container, execute a second instance of the routine to cause the processor to search the job queue for a job performance request, and in response to a combination of the uninstantiation of the first container, the storage of the job performance request in the job queue and there being no indication of completion of the job flow in the job queue, perform a combination of store an indication of the job flow performance commencing in the job queue, derive an order of performance of the set of tasks of the job flow and store a first task execution request in a task queue.
A computing device selects a trained spatial regression model. A spatial weights matrix defined for observation vectors is selected, where each element of the spatial weights matrix indicates an amount of influence between respective pairs of observation vectors. Each observation vector is spatially referenced. A spatial regression model is selected from spatial regression models, initialized, and trained using the observation vectors and the spatial weights matrix to fit a response variable using regressor variables. Each observation vector includes a response value for the response variable and a regressor value for each regressor variable of the regressor variables. A fit criterion value is computed for the spatial regression model and the spatial regression model selection, initialization, and training are repeated until each spatial regression model is selected. A best spatial regression model is selected and output as the spatial regression model having an extremum value of the fit criterion value.
A computing device determines a recommendation. (A) A first parameter matrix is updated using a first direction matrix and a first step-size parameter value that is greater than one. The first parameter matrix includes a row dimension equal to a number of users of a plurality of users included in a ratings matrix and the ratings matrix includes a missing matrix value. (B) A second parameter matrix is updated using a second direction matrix and a second step-size parameter value that is greater than one. The second parameter matrix includes a column dimension equal to a number of items of a plurality of items included in the ratings matrix. (C) An objective function value is updated based on the first parameter matrix and the second parameter matrix. (D) (A) through (C) are repeated until the first parameter matrix and the second parameter matrix satisfy a convergence test.
Some examples herein describe time-series recognition and analysis techniques with computer vision. In one example, a system can access an image depicting data lines representing time series datasets. The system can execute a clustering process to assign pixels in the image to pixel clusters. The system can generate image masks based on attributes of the pixel clusters, and identify a respective set of line segments defining the respective data line associated with each image mask. The system can determine pixel sets associated with the time series datasets based on the respective set of line segments associated with each image mask, and provide one or more pixel sets as input for a computing operation that processes the pixel sets and returns a processing result. The system may then display the processing result on a display device or perform another task based on the processing result.
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/48 - Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation
G06V 10/72 - Data preparation, e.g. statistical preprocessing of image or video features
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
69.
Diagnostic techniques for monitoring physical devices and resolving operational events
Operational events associated with a target physical device can be detected for mitigation by implementing some aspects described herein. For example, a system can apply a sliding window to received sensor measurements at successive time intervals to generate a set of data windows. The system can determine a set of eigenvectors associated with the set of data windows by performing principal component analysis on a set of data points in the set of data windows. The system can determine a set of angle changes between pairs of eigenvectors. The system can generate a measurement profile by executing an integral transform on the set of angle changes. One or more trained machine-learning models are configured to detect an operational event associated with the target physical device based on the measurement profile and generate an output indicating the operational event.
H02J 13/00 - Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit-breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
G06N 3/04 - Architecture, e.g. interconnection topology
G06K 9/62 - Methods or arrangements for recognition using electronic means
70.
Incremental singular value decomposition in support of machine learning
A singular value decomposition (SVD) is computed of a first matrix to define a left matrix, a diagonal matrix, and a right matrix. The left matrix, the diagonal matrix, and the right matrix are updated using an arrowhead matrix structure defined from the diagonal matrix and by adding a next observation vector to a last row of the first matrix. The updated left matrix, the updated diagonal matrix, and the updated right matrix are updated using a diagonal-plus-rank-one (DPR1) matrix structure defined from the updated diagonal matrix and by removing an observation vector from a first row of the first matrix. Eigenpairs of the DPR1 matrix are computed based on whether a value computed from the updated left matrix is positive or negative. The left matrix updated in (C), the diagonal matrix updated in (C), and the right matrix updated in (C) are output.
Logical rules can be automatically generated for use with event detection systems according to some aspects of the present disclosure. For example, a computing device can extract a group of logical rules from trained decision trees and apply a test data set to the group of logical rules to determine count values corresponding to the logical rules. The computing device can then determine performance metric values based on the count values, select a subset of logical rules from among the group of logical rules based on the performance metric values, and provide at least one logical rule in the subset for use with an event detection system. The event detection system can be configured to detect an event in relation to a target data set that was not used to train the decision trees.
A computing device trains a neural network machine learning model. A forward propagation of a first neural network is executed. A backward propagation of the first neural network is executed from a last layer to a last convolution layer to compute a gradient vector. A discriminative localization map is computed for each observation vector with the computed gradient vector using a discriminative localization map function. An activation threshold value is selected for each observation vector from at least two different values based on a prediction error of the first neural network. A biased feature map is computed for each observation vector based on the activation threshold value selected for each observation vector. A masked observation vector is computed for each observation vector using the biased feature map. A forward and a backward propagation of a second neural network is executed a predefined number of iterations using the masked observation vector.
The computing device transforms lab data and field data into a first format suitable for execution with a supervised machine learning model to determine an input variable importance for a first set of input variables in predicting a field outcome. Based on the determination, the computing device generates one or more logical rules of decision metrics, selects the one or more input variables that yield a higher input variable importance, and generates one or more pass-fail indicators. The computing device combines the one or more pass-fail indicators and generates one or more prediction factor rules. The computing device transforms the field data and the one or more prediction factor rules into a second format suitable for execution with a model to determine a treatment effect for the one or more prediction factor rules. The computing device selects the prediction factor rule that maximizes the treatment effect.
An apparatus includes a processor to: within a performance container, execute a performance routine to derive an order of performance of tasks of a job flow based on dependencies, begin performing the tasks, and store, within a job queue, a job performance status indication including task performance statuses; identify a set of sub flows within the job flow based on branches in the job flow; correlate each of the task performance statuses to a corresponding sub flow performance status; reduce the job performance status indication size by, for each sub flow in which all tasks have been completed, replace the corresponding task performance statuses with the corresponding sub flow performance status of completed, and for each sub flow with no task performed, replace the corresponding task performance statuses with the corresponding sub flow performance status of not executed; and transmit the job performance status indication to the requesting device.
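The status-size reduction in the abstract above amounts to collapsing per-task statuses into a single status wherever a sub flow is uniformly completed or uniformly not started. A minimal sketch follows; the function name `reduce_status`, the status strings, and the dictionary layout are assumptions made for illustration.

```python
def reduce_status(task_statuses, sub_flows):
    """Collapse per-task statuses into per-sub-flow statuses where possible.

    task_statuses: {task: "completed" | "running" | "not executed"}
    sub_flows:     {sub_flow: [tasks belonging to that sub flow]}
    """
    reduced = {}
    for sub_flow, tasks in sub_flows.items():
        statuses = {task_statuses[task] for task in tasks}
        if statuses == {"completed"}:
            # Every task finished: one status replaces all task statuses.
            reduced[sub_flow] = "completed"
        elif statuses == {"not executed"}:
            # No task started: one status replaces all task statuses.
            reduced[sub_flow] = "not executed"
        else:
            # Mixed progress: keep the detailed task statuses.
            reduced[sub_flow] = {task: task_statuses[task] for task in tasks}
    return reduced
```

Only sub flows with mixed progress keep their task-level detail, so the indication shrinks as more of the job flow completes.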
A computing device responds to a membership overlap query. A list of unique member identifiers included in a plurality of datasets is created. A list of datasets of the plurality of datasets is defined for each unique member identifier. Each dataset included in the list of datasets includes a unique member associated with a respective unique member identifier. A unique list of datasets is defined from each list of datasets. A number of occurrences of each unique list of datasets is determined. A number of datasets included in each unique list of datasets is determined. Intersection data is created that includes a dataset list of each unique list of datasets in association with the number of occurrences of each respective, unique list of datasets and with the number of datasets included in each respective, unique list of datasets. An overlap response is determined using the created intersection data.
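The intersection data described above can be built with a single pass over the datasets. The sketch below is one possible reading, not the patented method itself; `intersection_data` and `overlap` are hypothetical names, and datasets are assumed to be given as a dictionary mapping a dataset name to its member identifiers.

```python
from collections import Counter, defaultdict


def intersection_data(datasets):
    """Map each unique list of datasets to (occurrences, number of datasets)."""
    membership = defaultdict(set)
    for name, members in datasets.items():
        for member in set(members):
            membership[member].add(name)
    # Count how many members share exactly the same list of datasets.
    counts = Counter(tuple(sorted(names)) for names in membership.values())
    return {names: (count, len(names)) for names, count in counts.items()}


def overlap(data, required):
    """Count members that belong to every dataset in `required`."""
    required = set(required)
    return sum(
        count for names, (count, _) in data.items() if required <= set(names)
    )
```

Once the intersection data exists, any overlap query is answered by summing occurrence counts, without rescanning the original datasets.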
A computing system creates interaction features from variable values in a transformed dataset that includes a variable value computed for each variable of transformed variables computed from a prior execution of a transformation flow applied to an input dataset. An interaction transformation flow definition indicates a subset of the transformed variables, a synthesis definition, and interaction transformation operations to apply to the transformed variables. The synthesis definition describes how the subset of the transformed variables are combined to compute a value input to the interaction transformation operations. A plurality of variable combinations of the subset is defined. A computation is defined for each combination and interaction transformation operation. An operation data value is computed for each computation from the transformed dataset. An observation vector is read from the transformed dataset and a current interaction variable value is synthesized for each combination. A result value is computed for each combination.
An apparatus includes processor(s) to: perform pre-processing operations including derive an audio noise level of speech audio of a speech data set, derive a first relative weighting for first and second segmentation techniques for identifying likely sentence pauses in the speech audio based on the audio noise level, and select likely sentence pauses for a converged set of likely sentence pauses from likely sentence pauses identified by the first and/or second segmentation techniques based on the first relative weighting; and perform speech-to-text processing operations including divide the speech data set into data segments representing speech segments of the speech audio based on the converged set of likely sentence pauses, and derive a second relative weighting based on the audio noise level for selecting words indicated by an acoustic model or by a language model as being most likely spoken in the speech audio for inclusion in a transcript.
G06N 3/04 - Architecture, e.g. interconnection topology
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
78.
Dual use of acoustic model in speech-to-text framework
An apparatus includes processor(s) to: perform preprocessing operations of a segmentation technique including divide a speech data set into data chunks representing chunks of speech audio, use an acoustic model with each data chunk to identify pauses in the speech audio, and analyze a length of time of each identified pause to identify a candidate set of likely sentence pauses in the speech audio; and perform speech-to-text operations including divide the speech data set into data segments that each represent segments of the speech audio based on the candidate set of likely sentence pauses, use the acoustic model with each data segment to identify likely speech sounds in the speech audio, analyze the identified likely speech sounds to identify candidate sets of words likely spoken in the speech audio, and generate a transcript of the speech data set based at least on the candidate sets of words likely spoken.
G06N 3/04 - Architecture, e.g. interconnection topology
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
A computing system trains a classification model using distributed training data. In response to receipt of a first request, a training data subset is accessed and sent to each higher index worker computing device, the training data subset sent by each lower index worker computing device is received, and a first kernel matrix block and a second kernel matrix block are computed using a kernel function and the accessed or received training data subsets. (A) In response to receipt of a second request from the controller device, a first vector is computed using the first and second kernel matrix blocks, a latent function vector and an objective function value are computed, and the objective function value is sent to the controller device. (A) is repeated until the controller device determines training of a classification model is complete. Model parameters for the trained classification model are output.
Testing for software applications can be implemented according to some aspects described herein. For example, a system can receive override data, including a location of a logical statement in source code and an override command, that is associated with a software application. The system can generate debugging data based on the override data, the debugging data including a breakpoint associated with the location and a debugger command corresponding to the override command. The system can then provide the debugging data as input to debugging software, the debugging software being configured to monitor execution of the software application during a software test. The debugging software can determine that the breakpoint has been reached and responsively execute the debugger command for testing a target portion of source code for the software application.
Unclassified observations are classified. Similarity values are computed for each unclassified observation and for each target variable value. A confidence value is computed for each unclassified observation using the similarity values. A high-confidence threshold value and a low-confidence threshold value are computed from the confidence values. For each observation, when the confidence value is greater than the high-confidence threshold value, the observation is added to a training dataset and, when the confidence value is greater than the low-confidence threshold value and less than the high-confidence threshold value, the observation is added to the training dataset based on a comparison between a random value drawn from a uniform distribution and an inclusion percentage value. A classification model is trained with the training dataset and classified observations. The trained classification model is executed with the unclassified observations to determine a label assignment.
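The two-threshold inclusion rule above can be illustrated as follows. The abstract does not say how the thresholds are derived from the confidence values, so this sketch assumes the lower and upper quartiles; the function name `select_for_training` and the parameter `inclusion_pct` are likewise hypothetical.

```python
import random
import statistics


def select_for_training(confidences, inclusion_pct=0.5, rng=None):
    """Return (low, high, indices of observations added to the training set)."""
    if rng is None:
        rng = random.Random(0)
    # Assumption: thresholds are the first and third quartiles.
    quartiles = statistics.quantiles(confidences, n=4)
    low, high = quartiles[0], quartiles[2]
    selected = []
    for index, confidence in enumerate(confidences):
        if confidence > high:
            # High confidence: always included.
            selected.append(index)
        elif low < confidence < high and rng.random() < inclusion_pct:
            # Middle band: included when the uniform draw falls below
            # the inclusion percentage.
            selected.append(index)
    return low, high, selected
```

Observations at or below the low threshold are never added, so the training set is weighted toward labels the similarity computation is more certain about.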
A computing device trains a neural network machine learning model. A forward propagation of a first neural network is executed. A backward propagation of the first neural network is executed from a last layer to a last convolution layer of a plurality of convolutional layers to compute a gradient vector for first weight values of the last convolution layer using observation vectors. A discriminative localization map is computed for each observation vector with the gradient vector using a discriminative localization map function. A forward and a backward propagation of a second neural network is executed to compute a second weight value for each neuron of the second neural network using the discriminative localization map computed for each observation vector. A predefined number of iterations of the forward and the backward propagation of the second neural network is repeated.
A computing device classifies unclassified observations. A first batch of unclassified observation vectors and a first batch of classified observation vectors are selected. A prior regularization error value and a decoder reconstruction error value are computed. A first batch of noise observation vectors is generated. An evidence lower bound (ELBO) value is computed. A gradient of an encoder neural network model is computed, and the ELBO value is updated. A decoder neural network model and an encoder neural network model are updated. The decoder neural network model is trained. The target variable value is determined for each observation vector of the unclassified observation vectors based on an output of the trained decoder neural network model. The target variable value is output.
A visualization is presented while tuning a machine learning model. A model tuning process writes tuning data to a history table. The model tuning process repeatedly trains and scores a model type with different sets of values of hyperparameters defined based on the model type. An objective function value is computed for each set of values of the hyperparameters. Data stored in the history table is accessed and used to identify the hyperparameters. (A) A page template is selected from page templates that describe graphical objects presented in the display. (B) The page template is updated with the accessed data. (C) The display is updated using the updated page template. (D) At the end of a refresh time period, new data stored in the history table by the model tuning process is accessed. (E) (B) through (D) are repeated with the accessed data replaced with the accessed new data.
Requests for computing resources and other resources can be predicted and managed. For example, a system can determine a baseline prediction indicating a number of requests for an object over a future time-period. The system can then execute a first model to generate a first set of values based on seasonality in the baseline prediction, a second model to generate a second set of values based on short-term trends in the baseline prediction, and a third model to generate a third set of values based on the baseline prediction. The system can select a most accurate model from among the three models and generate an output prediction by applying the set of values output by the most accurate model to the baseline prediction. Based on the output prediction, the system can cause an adjustment to be made to a provisioning process for the object.
A computing system determines a response to a query. A bin start value and a bin stop value are defined for each bin based on an input bin option. End nodes are split based on the bin start value and the bin stop value of each bin to define a second plurality of end nodes. Each start node of a plurality of start nodes that is connected to each end node of the second plurality of end nodes is identified based on the respective link attributes of a plurality of link attributes. Overlapping start nodes of the plurality of start nodes that overlap at an end node of the second plurality of end nodes are identified based on a predefined overlap query graph that defines a connectivity to identify between a start node and the end node. The identified overlapping start nodes are output as a response to the predefined overlap query graph.
A computing system displays an initial graph with icons. Each icon graphically represents data associated with a respective entity. The first icon is connected in the initial graph to other icon(s). The system receives an indication of a graphical network pattern. The graphical network pattern is defined by a user selection of a second icon in the initial graph and: a user selection of a third icon in the initial graph; or a user selection of a graphical representation in the initial graph of a relationship between the second icon and the third icon. The system sends computer instructions indicating a network pattern query for searching an electronic database for electronic record(s) corresponding to a queried network pattern. The system receives a dataset indicating located electronic record(s) corresponding to the queried network pattern. The system generates output data indicating an output graph for a graphical representation of the located record(s).
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
Geospatial data can be converted into audio outputs. For example, a system can receive a dataset indicating geospatial locations of objects within a region. Based on the dataset, the system can generate a virtual map representing the region and including virtual points representing the objects. The virtual points can be spatially positioned at locations in the virtual map corresponding to the geospatial locations of the objects in the region. The system can receive a user input via a user input device for interacting with a particular virtual point among the virtual points in the virtual map. The system can determine one or more sound characteristics for a sound based on receiving the user input. The system can then transmit an audio signal to an audio device for causing the audio device to generate the sound having the one or more sound characteristics, which may assist with exploring the virtual map.
An apparatus includes processor(s) to: use an acoustic model to generate a first set of probabilities of speech sounds uttered within speech audio; derive at least a first candidate word most likely spoken in the speech audio using the first set; analyze the first set to derive a degree of uncertainty therefor; compare the degree of uncertainty to a threshold; in response to at least the degree of uncertainty being less than the threshold, select the first candidate word as a next word most likely spoken in the speech audio; in response to at least the degree of uncertainty being greater than the threshold, select, as the next word most likely spoken in the speech audio, a second candidate word indicated as being most likely spoken based on a second set of probabilities generated by a language model; and add the next word most likely spoken to a transcript.
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/04 - Segmentation; Word boundary detection
G06N 3/04 - Architecture, e.g. interconnection topology
G10L 25/78 - Detection of presence or absence of voice signals
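The uncertainty-gated choice between the acoustic model and the language model in the preceding abstract can be sketched briefly. Two simplifications are assumed here: the degree of uncertainty is measured as Shannon entropy, which the abstract does not specify, and both models are represented as dictionaries of probabilities keyed directly by candidate word. The names `entropy` and `next_word` are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits of a {candidate: probability} mapping."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


def next_word(acoustic_probs, language_probs, threshold):
    """Pick the next word from the acoustic model unless it is too uncertain."""
    uncertainty = entropy(acoustic_probs)
    # The abstract leaves the equal case open; this sketch defers to the
    # language model when the uncertainty equals the threshold.
    source = acoustic_probs if uncertainty < threshold else language_probs
    return max(source, key=source.get)
```

A peaked acoustic distribution keeps the acoustic candidate, while a flat one hands the decision to the language model.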
An apparatus includes processor(s) to: divide a speech data set into multiple data chunks that each represent a chunk of speech audio; derive a threshold amplitude based on at least one peak amplitude of the speech audio; designate each data chunk with a peak amplitude below the threshold amplitude a pause data chunk; within a set of temporally consecutive data chunks of the multiple data chunks, identify a longest subset of temporally consecutive pause data chunks; within the set of temporally consecutive data chunks, designate the longest subset of temporally consecutive pause data chunks as a likely sentence pause of a candidate set of likely sentence pauses; based on at least the candidate set, divide the speech data set into multiple data segments that each represent a speech segment of the speech audio; and perform speech-to-text conversion, to identify a sentence spoken in each speech segment.
G06N 3/04 - Architecture, e.g. interconnection topology
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
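The amplitude-based pause detection in the preceding abstract can be sketched as below. The sketch assumes the threshold is a fixed fraction of the overall peak amplitude and treats the whole recording as one set of temporally consecutive chunks; `find_likely_pause`, `chunk_size`, and `fraction` are hypothetical names, and `samples` is assumed to be a non-empty list of numeric amplitudes.

```python
def find_likely_pause(samples, chunk_size, fraction=0.1):
    """Return (start chunk index, length in chunks) of the longest pause run.

    Returns (None, 0) when no chunk falls below the threshold.
    """
    chunks = [
        samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)
    ]
    peaks = [max(abs(sample) for sample in chunk) for chunk in chunks]
    # Assumption: threshold amplitude is a fraction of the overall peak.
    threshold = fraction * max(peaks)
    is_pause = [peak < threshold for peak in peaks]

    best_start, best_length, run_start = None, 0, None
    # A trailing False closes any run that reaches the end of the audio.
    for index, pause in enumerate(is_pause + [False]):
        if pause and run_start is None:
            run_start = index
        elif not pause and run_start is not None:
            if index - run_start > best_length:
                best_start, best_length = run_start, index - run_start
            run_start = None
    return best_start, best_length
```

The returned run marks a candidate sentence boundary; splitting the samples at that run yields the data segments that would then be passed to speech-to-text conversion.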
91.
Distributable event prediction and machine learning recognition system
Data is classified using semi-supervised data. Sparse coefficients are computed using a decomposition of a Laplacian matrix. (B) Updated parameter values are computed for a dimensionality reduction method using the sparse coefficients, the Laplacian matrix, and a plurality of observation vectors. The updated parameter values include a robust estimator of a decomposition matrix determined from the decomposition of the Laplacian matrix. (B) is repeated until a convergence parameter value indicates the updated parameter values for the dimensionality reduction method have converged. A classification matrix is defined using the sparse coefficients and the robust estimator of the decomposition of the Laplacian matrix. The target variable value is determined for each observation vector based on the classification matrix. The target variable value is output for each observation vector of the plurality of unclassified observation vectors and is defined to represent a label for a respective unclassified observation vector.
A computer transforms high-dimensional data into low-dimensional data. (A) A distance matrix is computed from observation vectors. (B) A kernel matrix is computed from the distance matrix using a bandwidth value. (C) The kernel matrix is decomposed using an eigen decomposition to define eigenvalues. (D) A predefined number of largest eigenvalues are selected from the eigenvalues. (E) The selected largest eigenvalues are summed. (F) A next bandwidth value is computed based on the summed eigenvalues. (A) through (F) are repeated with the next bandwidth value until a stop criterion is satisfied. Each observation vector of the observation vectors is transformed into a second space using a kernel principal component analysis with the next bandwidth value and the kernel matrix. The second space has a dimension defined by the predefined number of first eigenvalues. Each transformed observation vector is output.
Computing resources consumed in performing computerized sequence-mining can be reduced by implementing some examples of the present disclosure. In one example, a system can determine weights for data entries in a data set and then select a group of data entries from the data set based on the weights. Next, the system can determine a group of k-length sequences present in the selected group of data entries by applying a shuffling algorithm. The system can then determine frequencies corresponding to the group of k-length sequences and select candidate sequences from among the group of k-length sequences based on the frequencies thereof. Next, the system can determine support values corresponding to the candidate sequences and then select output sequences from among the candidate sequences based on the support values thereof. The system may then transmit an output signal indicating the selected output sequences to an electronic device.
A computing system receives geolocation information indicating aggregated locations of mobile devices configured to move in a geographic area. The geolocation information comprises measured location(s) for a given mobile device of the mobile devices. The system generates a time series representing mobility network graphs over a first time period. The time series is generated by, for each subperiod in the time series, generating data representing estimated movement of member(s) of a population between locations within the geographic area. The estimated movement is estimated based on the geolocation information and a total population for the geographic area. The system generates metric(s) derived from the time series. The system determines contamination information indicating a respective contamination status for locations for each subperiod of the time series. The system generates a computer model to predict changes in the contamination information in a second time period subsequent to the first time period.
Computerized pipelines can transform input data into data structures compatible with models in some examples. In one such example, a system can obtain a first table that includes first data referencing a set of subjects. The system can then execute a sequence of processing operations on the first data in a particular order defined by a data-processing pipeline to modify an analysis table to include features associated with the set of subjects. Executing each respective processing operation in the sequence to generate the modified analysis table may involve: deriving a respective set of features from the first data by executing a respective feature-extraction operation on the first data; and adding the respective set of features to the analysis table. The system may then execute a predictive model on the modified analysis table for generating a predicted value based on the modified analysis table.
An apparatus includes a processor to: based on data dependencies specified in a job flow definition, identify first and second tasks of the corresponding job flow to be performed sequentially, wherein the first task outputs a data object used as an input to the second; store, within a task queue, at least one message conveying at least an identifier of the first task, and an indication that the data object is to be exchanged through a shared memory space; within a task container, in response to storage of the at least one message within the task queue, sequentially execute first and second task routines to sequentially perform the first and second tasks, respectively, and instantiate the shared memory space to be accessible to the first and second task routines during their executions; and upon completion of the job flow, transmit an indication of completion to another device via a network.
A computing system computes a variable relevance using a trained tree model. (A) A next child node is selected. (B) A number of observations associated with the next child node is computed. (C) A population ratio value is computed. (D) A next leaf node is selected. (E) First observations are identified. (F) A first impurity value is computed for the first observations. (G) Second observations are identified when the first observations are associated with the descending child nodes. (H) A second impurity value is computed for the second observations. (I) A gain contribution is computed. (J) A node gain value is updated. (K) (D) through (J) are repeated. (L) A variable gain value is updated for a variable associated with the split test. (M) (A) through (L) are repeated. (N) A set of relevant variables is selected based on the variable gain value.
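As a rough illustration of accumulating a per-variable gain over a trained tree, the sketch below uses the standard impurity-decrease measure: each split contributes its population ratio times the drop in Gini impurity. This is a swapped-in textbook technique, not the exact lettered procedure in the abstract; the `Node` structure, the choice of Gini impurity, and the selection threshold are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    labels: List[int]                        # observations reaching this node
    variable: Optional[str] = None           # variable used by the split test
    children: Optional[List["Node"]] = None  # None for a leaf


def gini(labels: List[int]) -> float:
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def variable_gains(root: Node) -> Dict[str, float]:
    """Sum the weighted impurity decrease of every split, grouped by variable."""
    total = len(root.labels)
    gains: Dict[str, float] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.children:
            continue  # leaves contribute no gain
        n = len(node.labels)
        child_impurity = sum(len(c.labels) / n * gini(c.labels)
                             for c in node.children)
        population_ratio = n / total
        gain = population_ratio * (gini(node.labels) - child_impurity)
        gains[node.variable] = gains.get(node.variable, 0.0) + gain
        stack.extend(node.children)
    return gains


def select_relevant(gains: Dict[str, float], threshold: float) -> List[str]:
    """Keep variables whose accumulated gain meets the (assumed) threshold."""
    return sorted(v for v, g in gains.items() if g >= threshold)


left = Node([0, 0, 1], "y", [Node([0, 0]), Node([1])])
root = Node([0, 0, 1, 1], "x", [left, Node([1])])
gains = variable_gains(root)
relevant = select_relevant(gains, 0.2)
```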
Tuned hyperparameter values are determined for training a machine learning model. When a selected hyperparameter configuration does not satisfy a linear constraint, it is determined whether a projection of the selected hyperparameter configuration is included in a first cache that stores previously computed projections. When the projection is included in the first cache, the projection is extracted from the first cache using the selected hyperparameter configuration, and the selected hyperparameter configuration is replaced with the extracted projection in the plurality of hyperparameter configurations. When the projection is not included in the first cache, a projection computation for the selected hyperparameter configuration is assigned to a session. A computed projection is received from the session for the selected hyperparameter configuration. The computed projection and the selected hyperparameter configuration are stored to the first cache, and the selected hyperparameter configuration is replaced with the computed projection.
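The cache-then-project logic can be sketched as follows. The assumptions here are a single linear constraint of the form a·x ≤ b, a Euclidean projection onto that half-space, and a local function call standing in for the session that computes the projection; none of these specifics come from the abstract, and the names `satisfies`, `project`, and `repair` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Config = Tuple[float, ...]


def dot(a: Sequence[float], x: Sequence[float]) -> float:
    return sum(ai * xi for ai, xi in zip(a, x))


def satisfies(x: Config, a: Sequence[float], b: float) -> bool:
    """Check the assumed linear constraint a.x <= b."""
    return dot(a, x) <= b


def project(x: Config, a: Sequence[float], b: float) -> Config:
    """Euclidean projection of x onto the half-space {z : a.z <= b}."""
    excess = dot(a, x) - b
    if excess <= 0:
        return tuple(x)
    norm_sq = dot(a, a)
    return tuple(xi - excess * ai / norm_sq for xi, ai in zip(x, a))


def repair(configs: List[Config], a: Sequence[float], b: float,
           cache: Dict[Config, Config]) -> List[Config]:
    """Replace infeasible configurations with their (cached) projections."""
    repaired: List[Config] = []
    for x in configs:
        if satisfies(x, a, b):
            repaired.append(x)
            continue
        if x not in cache:
            # Cache miss: in the abstract this computation is assigned to a
            # session; here it is computed locally for illustration.
            cache[x] = project(x, a, b)
        repaired.append(cache[x])
    return repaired


cache: Dict[Config, Config] = {}
configs = [(0.25, 0.25), (1.0, 1.0), (1.0, 1.0)]
repaired = repair(configs, a=(1.0, 1.0), b=1.0, cache=cache)
```

The repeated configuration `(1.0, 1.0)` is projected once and served from the cache the second time, which is the saving the cache is meant to provide.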
A computing device classifies unclassified observations. A first batch of noise observations is generated. (A) A first batch of unclassified observations is selected. (B) A first batch of classified observations is selected. (C) A discriminator neural network model trained to classify unclassified observations and noise observations is updated with observations that include the first batch of unclassified observations, the first batch of classified observations, and the first batch of noise observations. (D) A discriminator loss value is computed that includes an adversarial loss term computed using a predefined transition matrix. (E) A second batch of unclassified observations is selected. (F) A second batch of noise observations is generated. (G) A generator neural network model trained to generate a fake observation vector for the second batch of noise observations is updated with the second batch of unclassified observations and the second batch of noise observations. (H) (A) through (G) are repeated.
A computing system trains a reinforcement learning model comprising multiple different attention model components. The reinforcement learning model trains on training data of a first environment (e.g., a first traffic intersection). The reinforcement learning model trains by training a state attention computer model on the training data that weighs each of the respective inputs of a respective state. The reinforcement learning model trains by training an action attention computer model that determines a probability of switching from a first action to a second action of the first set of the multiple candidate actions (e.g., changing the colors of traffic lights).
Alternatively, or additionally, a computing system generates an indication of a selected outcome according to the reinforcement learning model and sends a selection output to the second environment (e.g., a second traffic intersection with more lanes than the first traffic intersection) to implement the selected action in the second environment.