Systems and methods are disclosed that relate to freespace detection using machine learning models. First data that may include object labels may be obtained from a first sensor and freespace may be identified using the first data and the object labels. The first data may be annotated to include freespace labels that correspond to freespace within an operational environment. Freespace annotated data may be generated by combining the one or more freespace labels with second data obtained from a second sensor, with the freespace annotated data corresponding to a viewable area in the operational environment. The viewable area may be determined by tracing one or more rays from the second sensor within the field of view of the second sensor relative to the first data. The freespace annotated data may be input into a machine learning model to train the machine learning model to detect freespace using the second data.
Apparatuses, systems, and techniques to perform channel estimation on one or more signals. In at least one embodiment, channel estimation on one or more wireless signals is performed in parallel based on one or more frequencies of one or more signals.
Apparatuses, systems, and techniques to generate images. In at least one embodiment, one or more neural networks are used to generate a panoramic image from a segmentation mask.
According to various embodiments, a processing subsystem includes a first printed circuit board (PCB); a processor mounted directly on a first side of the first PCB; and one or more power components. The one or more power components are coupled to a second side of the first PCB and electrically coupled to the processor, where the first side of the first PCB is opposite to the second side of the first PCB.
A system (1100) for testing multiple devices (250) includes a connector holder (208) having a plurality of holes (908), wherein each hole (908) included in the plurality of holes (908) is configured to hold a respective cable connector (960) that connects to a cable (270); a device holder (240) that is configured to hold a first device (250) in a testing position; and an engagement mechanism (222) that supports the connector holder (208) and is operable to move the connector holder (208) to an engaged position. When the first device (250) is being held by the device holder (240) in the testing position, and a first hole (908) included in the plurality of holes (908) holds a first cable connector (960), a contact point (920) associated with the first cable connector (960) contacts a signal pad (804) associated with the first device (250).
G01R 31/26 - Testing of individual semiconductor devices
G01R 31/00 - Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
In various examples, a surface may be estimated using depth data for autonomous systems and applications. One or more software components or modules may use the depth data (e.g., 3D LiDAR point cloud data) in addition to ego-motion data (e.g., data representative of location, heading, speed, and/or pose of the ego-machine) to generate a non-parametric model of the ground or driving surface. In some embodiments, an iterative process may be used to generate and iteratively refine estimated surface values by minimizing (or approximating minimization of) a cost function that penalizes deviation between measured values and estimated values and/or deviations among adjacent measured values. The systems and applications described herein may include robust real-time or near real-time ground surface estimation relying on generated data, and may further include a large-scale offline ground surface estimation approach that is non-causal and uses (e.g., all) available data at once.
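The iterative refinement described in this entry can be illustrated with a toy one-dimensional sketch (an illustration only, not the disclosed implementation): Gauss-Seidel updates minimize a quadratic cost that penalizes deviation of estimates from measured values and deviation between adjacent estimates. The function name and the `smoothness` weight are hypothetical.

```python
def estimate_surface(measurements, smoothness=4.0, iterations=200):
    """Iteratively refine surface heights by minimizing a cost that
    penalizes (a) deviation from measured values and (b) differences
    between adjacent estimates (Gauss-Seidel coordinate descent).
    Toy 1D version of the non-parametric ground model."""
    z = list(measurements)          # initialize with raw measurements
    n = len(z)
    for _ in range(iterations):
        for i in range(n):
            # sum of neighbor estimates and neighbor count
            s, k = 0.0, 0
            if i > 0:
                s += z[i - 1]; k += 1
            if i < n - 1:
                s += z[i + 1]; k += 1
            # closed-form minimizer of the local quadratic cost in z[i]
            z[i] = (measurements[i] + smoothness * s) / (1.0 + smoothness * k)
    return z

# A single outlier return (e.g., a LiDAR hit on an object, not the ground)
# is pulled toward the smooth surface defined by its neighbors.
smooth = estimate_surface([0.0, 0.0, 0.0, 2.5, 0.0, 0.0, 0.0])
```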
Apparatuses, systems, and techniques to process neural networks. In at least one embodiment, neural network graph data is organized for processing. In at least one embodiment, for example, neural network graph data is organized based, at least in part, on one or more sparsity constraints.
Apparatuses, systems, and techniques to use one or more neural networks to generate one or more variations of an image based, at least in part, on one or more locations of textual information in the image. In at least one embodiment, a neural network is trained using variations of an image to identify text in an image.
In various examples, feature values corresponding to a plurality of views are transformed into feature values of a shared orientation or perspective to generate a feature map – such as a Bird's-Eye-View (BEV), top-down, orthogonally projected, and/or other shared perspective feature map type. Feature values corresponding to a region of a view may be transformed into feature values using a neural network. The feature values may be assigned to bins of a grid and values assigned to at least one same bin may be combined to generate one or more feature values for the feature map. To assign the transformed features to the bins, one or more portions of a view may be projected into one or more bins using polynomial curves. Radial and/or angular bins may be used to represent the environment for the feature map.
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
Apparatuses, systems, and techniques are presented to generate digital reconstructions of physical objects. In at least one embodiment, one or more first neural networks are used to generate a three-dimensional (3D) model having a first level of detail, and one or more second neural networks are used to modify the 3D model to have a second level of detail.
A deep neural network can be trained to output motion or deformation information for a character that is representative of the character uttering speech contained in audio input, which is accurate for an emotional state of the character. The character can have different facial components or regions (e.g., head, skin, eyes, tongue) modeled separately, such that the network can output motion or deformation information for each of these different facial components. During training, the network can be provided with emotion and/or style vectors that indicate information to be used in generating realistic animation for input speech, as may relate to one or more emotions to be exhibited by the character, a relative weighting of those emotions, and any style or adjustments to be made to how the character expresses that emotional state. The network output can be provided to a renderer to generate audio-driven facial animation that is emotion-accurate.
G10L 21/10 - Transforming into visible information
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
12.
INFERRING EMOTION FROM SPEECH IN AUDIO DATA USING DEEP LEARNING
A deep neural network can be trained to infer emotion data from input audio. The network can be a transformer-based network that can infer probability values for a set of emotions or emotion classes. The emotion probability values can be modified using one or more heuristics, such as to provide for smoothing of emotion determinations over time, or via a user interface, where a user can modify emotion determinations as appropriate. A user may also provide prior emotion values to be blended with these emotion determination values. Determined emotion values can be provided as input to an emotion-based operation, such as to provide audio-driven speech animation.
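The smoothing and prior-blending heuristics mentioned above can be sketched as follows (a minimal illustration with hypothetical function names; an exponential moving average stands in for whatever smoothing the implementation actually uses):

```python
def smooth_emotions(frames, alpha=0.3):
    """Temporal smoothing heuristic: exponential moving average over
    per-frame emotion probability vectors, so inferred emotions do not
    flicker from frame to frame."""
    smoothed, state = [], None
    for probs in frames:
        if state is None:
            state = dict(probs)
        else:
            state = {e: alpha * probs[e] + (1 - alpha) * state[e] for e in probs}
        smoothed.append(state)
    return smoothed

def blend_with_prior(probs, prior, weight=0.5):
    """Blend network-inferred probabilities with user-supplied prior
    emotion values, then renormalize to a valid distribution."""
    mixed = {e: (1 - weight) * probs[e] + weight * prior.get(e, 0.0) for e in probs}
    total = sum(mixed.values())
    return {e: v / total for e, v in mixed.items()}
```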
Apparatuses, systems, and techniques are presented to train neural networks and use those neural networks for inferencing tasks. In at least one embodiment, one or more neural networks are caused to be trained using weight parameters based, at least in part, on an amount of training data used to train the one or more neural networks.
Apparatuses, systems, and techniques to evaluate neural networks. In at least one embodiment, neural networks are evaluated using one or more other neural networks. In at least one embodiment, two or more neural networks are caused to generate consistent results from first input information and caused to generate inconsistent results from second input information.
A system comprises at least one circuit to detect whether a fault has occurred during performance of an operation by the at least one circuit. The at least one circuit generates error detecting values and determines a fault has occurred when the error detecting values do not match predetermined error detecting data.
Apparatuses, systems, and techniques to cause one or more portions of one or more neural networks to be trained. In at least one embodiment, one or more portions of one or more neural networks are caused to be trained by, for example, iteratively adjusting precision of weight parameters associated with the one or more portions based, at least in part, on one or more performance metrics of the one or more portions.
Apparatuses, systems, and techniques are presented to make determinations about objects in an environment. In at least one embodiment, a neural network can be used to determine one or more positions of one or more objects within a three-dimensional (3D) environment and to generate a segmented map of the 3D environment based, at least in part, on one or more two dimensional (2D) images of the one or more objects.
In various examples, cached sensor data captured by an ego-object and ego-motion of the ego-object are used to reconstruct the area under the vehicle in real time. For example, image data captured over time by a vehicle may be cached into a composite map that visualizes the ground or drivable area, and the vehicle's ego-motion may be used to retrieve a region of the composite map corresponding to the under vehicle area. For each time slice, a newly captured or generated image representing that time slice may be used to generate a local map of an observed portion of the ground, and the local map may be merged with a composite map that represents previously observed local maps. Accordingly, the under vehicle area for that time slice may be reconstructed by retrieving corresponding pixels from the composite map using the vehicle's ego-motion.
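The cache-and-retrieve idea in this entry can be sketched with a toy integer-cell map (class and method names are hypothetical): observed local patches are merged into a world-frame composite map, and the unobservable under-vehicle region is later read back using the tracked ego-position.

```python
class CompositeGroundMap:
    """Caches locally observed ground patches into a world-frame composite
    map, then retrieves the (currently unobservable) under-vehicle region
    using the vehicle's tracked ego-position."""

    def __init__(self):
        self.cells = {}                      # (x, y) -> cached pixel value

    def merge_local_map(self, origin, local):
        """Merge an observed local patch (2D list) whose top-left corner
        lies at `origin` in world coordinates; newest observation wins."""
        ox, oy = origin
        for dy, row in enumerate(local):
            for dx, value in enumerate(row):
                self.cells[(ox + dx, oy + dy)] = value

    def under_vehicle(self, ego_pos, half=1):
        """Reconstruct the area under the vehicle centered on `ego_pos`
        from previously cached observations (None where never observed)."""
        ex, ey = ego_pos
        return [[self.cells.get((ex + dx, ey + dy))
                 for dx in range(-half, half + 1)]
                for dy in range(-half, half + 1)]

# Ground observed ahead at time T-1; at time T the vehicle has driven onto it.
m = CompositeGroundMap()
m.merge_local_map((4, 4), [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
patch = m.under_vehicle((5, 5))
```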
B60R 1/27 - Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with a predetermined field of view providing all-round vision, e.g. using omnidirectional cameras
19.
IMAGE STITCHING WITH DYNAMIC SEAM PLACEMENT BASED ON OBJECT SALIENCY FOR SURROUND VIEW VISUALIZATION
In various examples, dynamic seam placement is used to position seams in regions of overlapping image data to avoid crossing salient objects or regions. Objects may be detected from image frames representing overlapping views of an environment surrounding an ego-object such as a vehicle. The images may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, bowl shaped surface) with regions of overlapping image data, and a representation of the detected objects and/or salient regions (e.g., a saliency mask) may be generated and projected onto the aligned composite image or surface. Seams may be positioned in the overlapping regions to avoid or minimize crossing salient pixels represented in the projected masks, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, stitched 360° image, stitched textured surface).
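As a simplified illustration of seam placement (a vertical-seam toy, not the disclosed method, with a hypothetical function name): given a projected saliency mask, choose the seam column inside the overlap region that crosses the fewest salient pixels.

```python
def place_seam(saliency, overlap_start, overlap_end):
    """Pick a seam column inside the overlap region of two aligned images
    that crosses the fewest salient pixels. `saliency` is a 2D 0/1 mask
    (rows x columns) projected onto the aligned composite image."""
    best_col, best_cost = overlap_start, float("inf")
    for col in range(overlap_start, overlap_end):
        cost = sum(row[col] for row in saliency)   # salient pixels crossed
        if cost < best_cost:
            best_col, best_cost = col, cost
    return best_col
```

A real stitcher would instead run a path-finding seam (e.g., dynamic programming) and then blend image data across the chosen seam.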
In various examples, a state machine is used to select between a default seam placement or dynamic seam placement that avoids salient regions, and to enable and disable dynamic seam placement based on speed of ego-motion, direction of ego-motion, proximity to salient objects, active viewport, driver gaze, and/or other factors. Images representing overlapping views of an environment may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, bowl shaped surface) with overlapping regions of image data, and a default or dynamic seam placement may be selected based on driving scenario (e.g., driving direction, speed, proximity to nearby objects). As such, seams may be positioned in the overlapping regions of image data, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, stitched 360° image, stitched textured surface).
In various examples, an environment surrounding an ego-object is visualized using an adaptive 3D bowl that models the environment with a shape that changes based on distance (and direction) to one or more representative point(s) on detected objects. Distance (and direction) to detected objects may be determined using 3D object detection or a top-down 2D or 3D occupancy grid, and used to adapt the shape of the adaptive 3D bowl in various ways (e.g., by sizing its ground plane to fit within the distance to the closest detected object, fitting a shape using an optimization algorithm). The adaptive 3D bowl may be enabled or disabled during each time slice (e.g., based on ego-speed), and the 3D bowl for each time slice may be used to render a visualization of the environment (e.g., a top-down projection image, a textured 3D bowl, and/or a rendered view thereof).
B60R 1/27 - Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with a predetermined field of view providing all-round vision, e.g. using omnidirectional cameras
22.
OPTIMIZED VISUALIZATION STREAMING FOR VEHICLE ENVIRONMENT VISUALIZATION
In various examples, sensor data may be captured by sensors of an ego-object, such as a vehicle traveling in a physical environment, and a representation of the sensor data may be streamed from the ego-object to a remote location to facilitate various remote experiences, such as streaming to a remote viewer (e.g., a friend or relative), streaming to a remote or fleet operator, streaming to a mobile app configured to self-park or summon an ego-object, rendering a 3D augmented reality (AR) or virtual reality (VR) representation of the physical environment, and/or others. In some embodiments, the stream includes one or more command channels used to control data collection, rendering, stream content, or even vehicle maneuvers, such as during an emergency, self-park, or summon scenario.
B60R 1/27 - Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with a predetermined field of view providing all-round vision, e.g. using omnidirectional cameras
G05D 1/00 - Control of position, course, altitude, or attitude of land, water, air, or space vehicles, e.g. automatic pilot
23.
APPLICATION PROGRAMMING INTERFACE TO IDENTIFY LOCATION OF PROGRAM PORTIONS
Apparatuses, systems, and techniques to selectively load data required to use one or more functions. In at least one embodiment, selective loading of one or more functions to be used is performed by one or more application programming interfaces for efficient use of memory on a system comprising a processor and a graphics processor.
Apparatuses, systems, and techniques to perform collective operations using parallel processing. In at least one embodiment, a non-blocking application programming interface allows programs to improve performance of one or more collective operations on a GPU.
Apparatuses, systems, and techniques to selectively load data required to use one or more functions. In at least one embodiment, selective loading of one or more functions to be used is performed by one or more application programming interfaces for efficient use of memory on a system comprising a processor and a graphics processor.
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to transfer information between a plurality of fifth generation new radio (5G-NR) computing devices using different transport protocols.
Apparatuses, systems, and techniques to generate a robust representation of an image. Input tokens (104) of an input image are received, and an inference (110) about the input image is generated based on a vision transformer (ViT) system comprising at least one self-attention (106) module to perform token mixing and a channel self-attention (108) module to perform channel processing.
In a method for encryption of sensitive data, an encrypted user private key is received in a Trusted Execution Environment (TEE) in a worker node in a container management system, the encrypted user private key being an encrypted version of a user private key for decrypting a message from a user in the container management system. The user private key is obtained in the TEE by decrypting the encrypted user private key with a provider private key that is received from an encryption manager for managing the container management system. The user private key may be transmitted to the worker node safely, such that the worker node may use the user private key to decrypt messages from the user. Therefore, the security level of the container management system may be increased.
Disclosed are apparatuses, systems, and techniques that may perform methods of pyramid optical flow processing with efficient identification and handling of object boundary pixels. In pyramid optical flow, motion vectors for pixels of image layers having a coarse resolution may be used as hints for identification of motion vectors for pixels of image layers having a higher resolution. Pixels that are located near apparent boundaries between foreground and background objects may receive multiple hints from lower-resolution image layers, for more accurate identification of matching pixels across different image levels of the pyramid.
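The multiple-hint idea can be sketched as follows (an illustrative toy, not the patented method; `refine_flow` and the `cost` callback are hypothetical names): each fine-level pixel gathers candidate motion vectors from its parent coarse cell and its neighbors, then keeps the candidate with the lowest matching cost, which helps pixels near object boundaries.

```python
def refine_flow(coarse_flow, cost, width, height):
    """One pyramid refinement step: each fine-level pixel gathers motion
    hints from its parent coarse cell and from neighboring coarse cells,
    then keeps the hint with the lowest matching cost.
    `cost(x, y, mv)` scores a candidate motion vector for a pixel."""
    fine = {}
    ch, cw = len(coarse_flow), len(coarse_flow[0])
    for y in range(height):
        for x in range(width):
            cy, cx = y // 2, x // 2              # parent cell at half resolution
            hints = set()
            for ny in (cy - 1, cy, cy + 1):      # neighbor hints help pixels
                for nx in (cx - 1, cx, cx + 1):  # near object boundaries
                    if 0 <= ny < ch and 0 <= nx < cw:
                        hints.add(coarse_flow[ny][nx])
            fine[(x, y)] = min(hints, key=lambda mv: cost(x, y, mv))
    return fine
```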
Apparatuses, systems, and techniques are presented to generate one or more images. One or more neural networks are used to generate one or more images of one or more objects based, at least in part, on a model of the one or more objects and texture information.
Apparatuses, systems, and techniques for pre-loading a software application in a cloud computing environment. A method can include sending a pre-load request to pre-load a first portion of data for an application hosted at an application hosting platform, the pre-load request being received before receiving user input identifying the application for execution. The method can include receiving a first indication that the first portion of data is pre-loaded and receiving a user request to execute the application. The method can further include sending a load request to load a second portion of data for the application, receiving a second indication that the second portion of data is loaded for the application, and causing the application to execute at the virtualized computing environment in response to receiving the second indication.
A system to track objects in an environment using projection images generated from LiDAR is disclosed. A deep neural network (DNN) computes a motion mask indicative of motion corresponding to points representing objects in the environment. The environment (200) includes a location (202A) for an ego-machine at T-1, a location (202B) for the ego-machine at T0, ego-trajectory (204), vehicle location (206A), vehicle location (206B), wall (208), and 3D points (210), (212), and (214). If the system determines that a depth value corresponding to a 3D point, such as 3D point (210), has changed over time, the system may infer that movement has occurred at that particular 3D point. If a first measured distance for a pixel from the location (202A) for the ego-machine at T-1 to the 3D point (214) on the wall (208) is 10 meters in a previous range image at time T-1, and a second measured distance for the pixel from the location (202B) for the ego-machine at T0 to the 3D point (210) is 5 meters in a current range image at time T0, then the system may determine that a vehicle has moved from vehicle location (206A) to vehicle location (206B) and now obstructs the line-of-sight of the sensor(s) of the ego-machine at location (202B) such that the ego-machine is unable to measure the distance to the 3D point (212) from the location (202B). Projection may be based on tracked ego-motion.
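The per-pixel depth-change check described above (10 meters to the wall at T-1, 5 meters to the vehicle at T0) reduces to a simple comparison of aligned range images; the sketch below assumes a hypothetical `motion_mask` helper and a fixed change threshold.

```python
def motion_mask(prev_range, curr_range, threshold=1.0):
    """Flag pixels whose measured depth changed between two ego-motion-
    compensated range images; a large decrease suggests an object has
    moved into the line of sight (e.g., 10 m to a wall at T-1, then
    5 m to a vehicle at T0)."""
    return [[abs(p - c) > threshold for p, c in zip(prow, crow)]
            for prow, crow in zip(prev_range, curr_range)]

# One-row range image: first pixel sees the wall, then the vehicle.
mask = motion_mask([[10.0, 10.0]], [[5.0, 10.0]])
```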
Apparatuses, systems, and techniques to generate an image. In at least one embodiment, one or more neural networks are to generate a second image based, at least in part, on a first image and information indicating zero or more differences between the first and second image.
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
34.
HIGH DEFINITION (HD) MAP CONTENT REPRESENTATION AND DISTRIBUTION FOR AUTONOMOUS VEHICLES
In various examples, a network of servers, such as a content delivery network, is used to provide a lightweight approach to hosting and serving HD map data to vehicles. The lightweight approach may allow for modifying various map components, such as tiles, layers, and/or segments. Modifying may include adding, removing, and/or updating the various components. A request to modify a first version of a high definition (HD) map may be received. Map data may be recorded that represents a second version of the HD map. A second request associated with the HD map may be received from a vehicle. Based on this second request, second map data representative of at least a portion of a layer may be identified on at least one server of the network of servers. The second map data may then be transmitted to the vehicle by the network of servers.
Disclosed are apparatuses, systems, and techniques to perform and facilitate fast and efficient modular computational operations, such as modular division and modular inversion, using shared platforms, including hardware accelerator engines.
G06F 7/38 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
H04L 9/28 - Arrangements for secret or secure communications; Network security protocols using particular encryption algorithm
36.
EFFICIENT MASKING OF SECURE DATA IN LADDER-TYPE CRYPTOGRAPHIC COMPUTATIONS
Disclosed are apparatuses, systems, and techniques to perform and facilitate secure ladder computational operations whose iterative execution depends on secret values associated with input data. Masking factors that re-blind secret data without exposing the unmasked secret data are used between iterations of the ladder computations. Montgomery multiplication techniques to facilitate secret data masking are used by efficiently avoiding modular division operations. The vulnerability of ladder computations to adversarial side-channel attacks can be significantly reduced.
Disclosed are apparatuses, systems, and techniques to perform and facilitate secure ladder computational operations whose iterative execution depends on secret values associated with input data. Disclosed embodiments balance execution of various iterations in a way that is balanced for different secret values, significantly reducing vulnerability of ladder computations to adversarial side-channel attacks.
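Both entries above concern ladder computations whose iterations are balanced across secret values. As a hedged illustration of the underlying structure only (the masking and re-blinding of intermediate values described above are omitted), a classic Montgomery ladder for modular exponentiation performs the same multiply-and-square sequence on every iteration regardless of the secret exponent bit:

```python
def ladder_pow(base, exponent, modulus):
    """Montgomery-ladder modular exponentiation: each iteration performs
    one multiplication and one squaring regardless of the secret bit,
    maintaining the invariant r1 == r0 * base (mod modulus)."""
    r0, r1 = 1, base % modulus
    for i in range(exponent.bit_length() - 1, -1, -1):
        if (exponent >> i) & 1 == 0:
            r1 = (r0 * r1) % modulus
            r0 = (r0 * r0) % modulus
        else:
            r0 = (r0 * r1) % modulus
            r1 = (r1 * r1) % modulus
    return r0
```

In a hardened implementation the two branches would also be replaced by a constant-time conditional swap, and intermediates would be re-masked between iterations as the abstracts describe.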
Disclosed are apparatuses, systems, and techniques to perform and facilitate fast and efficient modular computational operations, such as Montgomery multiplication with reduced interdependencies, using optimized processing resources.
G06F 7/72 - Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations using residue arithmetic
Apparatuses, systems, and techniques to modify tensors based on processor requirements. Input tensors and weight tensors are modified to meet processing resource requirements.
Apparatuses, systems, and methods for verifying fingerprints associated with components to be installed on printed circuit boards (PCBs). In at least one embodiment, one or more processors determine whether a component fingerprint associated with a component to be installed on the PCB corresponds to an expected fingerprint, the component fingerprint based, at least in part, on a firmware version associated with the component.
G06F 21/73 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information by creating or determining hardware identification, e.g. serial numbers
Apparatuses, systems, and methods for communication interfaces of a programmable part. In at least one embodiment, one or more communication interfaces are secured to a top side of a programmable part to provide programmable access to the programmable part after installation on a printed circuit board (PCB), the one or more communication interfaces to be selectively disabled based, at least in part, on a status of the programmable part.
G06F 21/76 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in application-specific integrated circuits [ASIC] or field-programmable devices, e.g. field-programmable gate arrays [FPGA] or programmable logic devices [PLD]
42.
IDENTIFYING OBJECTS USING NEURAL NETWORK-GENERATED DESCRIPTORS
Apparatuses, systems, and techniques are presented to identify one or more objects. In at least one embodiment, one or more neural networks can be used to identify one or more objects based, at least in part, on one or more descriptors of one or more segments of the one or more objects.
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a first three-way flow controller is associated with a single-phase fluid and a second three-way flow controller is associated with a two-phase fluid, with the first three-way flow controller to enable a first flow path of the single-phase fluid from a coolant distribution unit to a cold plate or to enable a second flow path to a heat exchanger to cool the two-phase fluid to be used in a cold plate, and with the second three-way flow controller to enable a third flow path of the two-phase fluid to a cold plate or to enable a fourth flow path to a heat exchanger.
Apparatuses, systems, and techniques to decompress data in parallel. In at least one embodiment, a variable-length-coded data stream is decompressed by speculatively decoding overlapping portions of said data stream to determine locations at which to begin correctly decoding said data stream.
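The idea can be illustrated with a toy base-128 varint code (an assumption for illustration; the patented technique is not limited to this code): each chunk speculatively resynchronizes by skipping continuation bytes at its start, decodes every value that begins inside it, and the per-chunk results concatenate into the full output.

```python
def encode_varint(n):
    """Toy variable-length code: 7 value bits per byte; the MSB is set on
    every byte of a value except the last."""
    out = []
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(0x80 | b)
        else:
            out.append(b)
            return bytes(out)

def decode_chunk(stream, start, end):
    """Speculatively decode one chunk: a valid value boundary is the byte
    after any byte with its MSB clear, so first skip the tail of a value
    that straddles the chunk start, then decode every value beginning
    before `end` (running past `end` to finish the last value)."""
    pos = start
    if pos > 0:
        while pos < end and stream[pos - 1] & 0x80:   # mid-value: resync
            pos += 1
    values = []
    while pos < end:
        value = shift = 0
        while True:
            b = stream[pos]; pos += 1
            value |= (b & 0x7F) << shift
            shift += 7
            if not b & 0x80:
                break
        values.append(value)
    return values

def parallel_decode(stream, num_chunks=4):
    """Each chunk is an independent unit of work; here they run serially,
    but nothing couples them, so they could decode concurrently."""
    size = max(1, len(stream) // num_chunks)
    results = []
    for s in range(0, len(stream), size):
        results.extend(decode_chunk(stream, s, min(s + size, len(stream))))
    return results
```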
Apparatuses, systems, and techniques to select cache policies. In at least one embodiment, a system causes one or more cache policies of one or more caches to be selected based, at least in part, on one or more neural networks to use data stored in the one or more caches.
G06F 12/0875 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
Apparatuses, systems, and techniques to enable a program to access data regardless of where said data is stored. In at least one embodiment, a system enables a program to access data regardless of where said data is stored, based on, for example, one or more locations encoding one or more addresses of said data.
Apparatuses, systems, and techniques to indicate contextual information to be used by available logical processors. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to indicate a first set of contextual information to be used by a first subset of available processors.
Apparatus, systems, and techniques to transform data in memory for deep learning operations. In at least one embodiment, a compiler inserts one or more data transforms into a software program to transform one or more data elements arbitrarily arranged in memory and improve performance of one or more deep learning operations.
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a plurality of in-rack coolant distribution units (IRCDUs) includes a first IRCDU and a second IRCDU that are interchangeable within a rack depending on a type of coolant to be provided to the rack from a coolant distribution unit (CDU), so that the first IRCDU, which is calibrated to a first coolant, can distribute the first coolant and the second IRCDU, which is calibrated to a second coolant, can distribute the second coolant to a rack manifold of the rack.
Apparatuses, systems, and techniques to collect compute performance information. In at least one embodiment, an API is performed to cause two or more portions of at least one software program to be concurrently performed a plurality of times in order to generate one or more performance metrics.
Apparatuses, systems, and techniques to manage memory arrays. In at least one embodiment, an application programming interface (API) is performed to disassociate a virtual address indicated by the API from a corresponding physical address.
Apparatuses, systems, and techniques to facilitate execution graph control. In at least one embodiment, an application programming interface comprising one or more parameters is used to control which of one or more portions of graph code are to be performed.
Apparatuses, systems, and techniques to facilitate execution graph control. In at least one embodiment, an application programming interface comprising one or more parameters is used to indicate which of one or more portions of graph code are to be performed.
Various embodiments include a memory device that recovers from write errors and read errors more quickly relative to prior memory devices. Certain patterns of write data and read data may result in poor signal quality on the memory interface between memory controllers and memory devices. The disclosed memory device, synchronously with the memory controller, scrambles read data before transmitting the data to the memory controller and descrambles write data received from the memory controller. The scrambling and descrambling result in a different pattern on the memory interface even for the same read data or write data. Therefore, when a write operation or a read operation fails and is replayed, the pattern transmitted on the memory interface is different. As a result, the memory device more easily recovers from data patterns that cause poor signal quality on the memory interface.
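The scramble/descramble behavior can be sketched as follows (a toy xorshift keystream stands in for the on-die scrambler; names and parameters are hypothetical). Because XOR scrambling is its own inverse, the same routine scrambles writes and descrambles reads whenever both sides derive the same keystream, and a different seed on replay puts a different pattern on the interface even for identical data.

```python
def keystream(seed, length):
    """Pseudo-random byte stream derived from a seed shared synchronously
    by the memory controller and the memory device (toy 32-bit xorshift)."""
    state, out = (seed & 0xFFFFFFFF) or 1, []
    for _ in range(length):
        state ^= (state << 13) & 0xFFFFFFFF
        state ^= state >> 17
        state ^= (state << 5) & 0xFFFFFFFF
        out.append(state & 0xFF)
    return out

def scramble(data, seed):
    """XOR with the keystream; applying it twice with the same seed
    recovers the original data, so this both scrambles and descrambles."""
    return bytes(b ^ k for b, k in zip(data, keystream(seed, len(data))))

# A pathological all-zeros burst looks different on each replay.
first_try = scramble(b"\x00" * 8, seed=1)
replay = scramble(b"\x00" * 8, seed=2)
```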
G06F 11/10 - Adding special bits or symbols to the coded information, e.g. parity check, casting out nines or elevens
G06F 11/14 - Error detection or correction of the data by redundancy in operation, e.g. by using different operation sequences leading to the same result
G06F 3/06 - Digital input from, or digital output to, record carriers
G11C 7/10 - Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
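The synchronized scrambling described in the entry above can be sketched with a shared-seed keystream: both sides generate the same pseudo-random sequence and XOR it with the data, so applying the operation twice restores the original, while a replay with a fresh seed places a different pattern on the bus. This is a minimal Python sketch; the LFSR width, tap polynomial, and seeds are illustrative assumptions, not the disclosed circuit.

```python
def lfsr_keystream(seed: int, nbytes: int) -> bytes:
    """Pseudo-random keystream from a 16-bit Fibonacci LFSR
    (taps for x^16 + x^14 + x^13 + x^11 + 1). Controller and
    device seed identically, so they stay in sync."""
    state = seed & 0xFFFF
    out = bytearray()
    for _ in range(nbytes):
        byte = 0
        for _ in range(8):
            bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = ((state >> 1) | (bit << 15)) & 0xFFFF
            byte = (byte << 1) | (state & 1)
        out.append(byte)
    return bytes(out)

def scramble(data: bytes, seed: int) -> bytes:
    """XOR data with the keystream; applying it twice restores the data."""
    ks = lfsr_keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

# A pathological all-zeros burst is transformed before hitting the wire...
wire = scramble(b"\x00" * 8, seed=0xACE1)
# ...and the receiver, using the same seed, recovers the original data.
assert scramble(wire, seed=0xACE1) == b"\x00" * 8
# A replayed transfer under a different seed yields a different bus pattern.
wire2 = scramble(b"\x00" * 8, seed=0xBEEF)
assert wire2 != wire
```

Because XOR with the same keystream is its own inverse, no extra state beyond the shared seed is needed to descramble.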
55.
APPLICATION PROGRAMMING INTERFACE TO CAUSE GRAPH CODE TO UPDATE A SEMAPHORE
Apparatuses, systems, and techniques to facilitate graph code synchronization between application programming interfaces. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause graph code to update a semaphore used by another API.
Apparatuses, systems, and techniques to facilitate graph code synchronization between application programming interfaces. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause graph code to wait on a semaphore used by another API.
G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
G06N 3/088 - Non-supervised learning, e.g. competitive learning
Apparatuses, systems, and techniques to enable image processing methods on a graphics processing unit (GPU). In at least one embodiment, seamless cubemapping is enabled with a flag contained within a function of an application programming interface (API).
Apparatuses, systems, and techniques to perform a matrix multiplication using parallel processing. In at least one embodiment, a matrix multiplication is divided into a set of tiles, with each tile processed with a prolog task, a calculation task, and an epilog task. The prolog tasks are performed by a dedicated set of threads, with the remaining tasks performed in an interleaved manner using two or more thread groups.
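The prolog/calculation/epilog decomposition described above can be sketched as follows. This single-threaded Python sketch only illustrates the tile structure; in the disclosed scheme the prolog tasks run on a dedicated set of threads while the remaining tasks are interleaved across thread groups.

```python
def tiled_matmul(a, b, tile=2):
    """Tile-decomposed matrix multiply: for each output tile, a prolog
    stages operands, a calculation accumulates partial products, and an
    epilog writes the tile back."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # prolog: stage the operand rows this tile needs
            rows = a[i0:i0 + tile]
            # calculation: accumulate the tile's partial products
            acc = [[sum(rows[di][p] * b[p][j] for p in range(k))
                    for j in range(j0, min(j0 + tile, n))]
                   for di in range(len(rows))]
            # epilog: write the finished tile into the result
            for di, row in enumerate(acc):
                for dj, v in enumerate(row):
                    c[i0 + di][j0 + dj] = v
    return c
```

The tile size is a tuning knob: on real hardware it is chosen to match shared-memory capacity and thread-group shape.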
In various examples, animations may be generated using audio-driven body animation synthesized with voice tempo. For example, full body animation may be driven from an audio input representative of recorded speech, where voice tempo (e.g., a number of phonemes per unit time) may be used to generate a 1D audio signal for comparing to datasets including data samples that each include an animation and a corresponding 1D audio signal. One or more loss functions may be used to compare the 1D audio signal from the input audio to the audio signals of the datasets, as well as to compare joint information of joints of an actor between animations of two or more data samples, in order to identify optimal transition points between the animations. The animations may then be stitched together - e.g., using interpolation and/or a neural network trained to seamlessly stitch sequences together - using the transition points.
G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
G10L 21/10 - Transforming into visible information
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for processing of video signals
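The 1D voice-tempo signal and loss comparison in the entry above can be sketched as below. The sliding-window parameters and the L2 loss are illustrative assumptions; the disclosed system may use different window sizes and loss functions.

```python
def tempo_signal(phoneme_times, duration, window=1.0, hop=0.25):
    """1D voice-tempo signal: count of phoneme onsets (in seconds) per
    sliding window over the recording."""
    sig, t = [], 0.0
    while t + window <= duration + 1e-9:
        sig.append(sum(1 for p in phoneme_times if t <= p < t + window))
        t += hop
    return sig

def tempo_loss(sig_a, sig_b):
    """L2 distance between two tempo signals; a lower value marks a
    better candidate transition point between animation clips."""
    n = min(len(sig_a), len(sig_b))
    return sum((a - b) ** 2 for a, b in zip(sig_a[:n], sig_b[:n])) ** 0.5
```

A dataset animation whose tempo signal minimizes this loss against the input audio's signal would be a candidate for stitching.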
In examples, locations of facial landmarks may be applied to one or more machine learning models (MLMs) to generate output data indicating profiles corresponding to facial expressions, such as facial action coding system (FACS) values. The output data may be used to determine geometry of a model. For example, video frames depicting one or more faces may be analyzed to determine the locations. The facial landmarks may be normalized, then applied to the MLM(s) to infer the profile(s), which may then be used to animate the model for expression retargeting from the video. The MLM(s) may include sub-networks that each analyze a set of input data corresponding to a region of the face to determine profiles that correspond to the region. The profiles from the sub-networks, along with global locations of facial landmarks, may be used by a subsequent network to infer the profiles for the overall face.
Configurations for communication interfaces are disclosed. In at least one embodiment, a processor includes one or more circuits to determine a firmware configuration for one or more server components and to transmit the firmware configuration at startup.
Systems and methods described relate to the generation of image content. In order to provide for smoothing between sequential images, but avoid introducing lag into lighting effects, light information can be compared for regions between consecutive rendered frames. Shading can be performed and the results compared for tiles of pixels to compute gradient values, such as by using a single light sample for each tile. A filtering pass can be performed with respect to these gradients, and this filtered, lower-resolution grid version can be upscaled into a full resolution, screen-sized image and the gradients transformed into confidence values. These confidence values can be used to determine an extent to which to keep lighting data from the previous frame with respect to the current frame. For example, less lighting information can be used from the prior frame for a given pixel location if the confidence for that location is lower.
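The confidence-driven history blend described above can be sketched per pixel as follows. The history-weight cap is an illustrative assumption; the key property is that lower confidence pulls the result toward the current frame, trading smoothing for responsiveness.

```python
def blend_lighting(prev_px, curr_px, confidence, max_history=0.9):
    """Temporal blend of per-pixel lighting: high confidence (stable
    gradients) keeps more history for smoothing; low confidence (lighting
    changed between frames) favors the current frame to avoid lag."""
    alpha = max(0.0, min(1.0, confidence)) * max_history
    return alpha * prev_px + (1.0 - alpha) * curr_px
```

At confidence 0 the previous frame contributes nothing, so a sudden lighting change appears immediately rather than fading in.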
One embodiment of the present invention sets forth a technique for generating a bounding volume hierarchy. The technique includes determining a first set of objects associated with a first node. The technique also includes generating a first plurality of child nodes that are associated with the first node. The technique further includes, for each object included in the first set of objects, storing within the object an identifier for a corresponding child node included in the first plurality of child nodes based on a first set of partitions associated with the first set of objects.
Various embodiments include a memory device that is capable of performing memory access operations with reduced power consumption relative to prior approaches. The memory device receives early indication as to whether a forthcoming memory access operation is a read operation or a write operation. The memory device enables various circuits and disables other circuits depending on whether this early indication identifies an upcoming memory access operation as a read operation or a write operation. As a result, circuits that are not needed for an upcoming memory access operation are disabled earlier during the memory access operation relative to prior approaches. Disabling such circuits earlier during the memory access operation reduces power consumption without reducing memory device performance.
Apparatuses, systems, and techniques to configure processor partitioning for a multi-process service. In at least one embodiment, a multi-process service configures a set of streaming multiprocessors of one or more parallel processing units to perform one or more threads based on one or more user-defined data values accessible to a parallel processing library, such as compute uniform device architecture (CUDA).
In examples, a device's native input interface (e.g., a soft keyboard) may be invoked using interaction areas associated with image frames from an application, such as a game. An area of an image frame(s) from a streamed game video may be designated (e.g., by the game and/or a game server) as an interaction area. When an input event associated with the interaction area is detected, an instruction may be issued to the client device to invoke a user interface (e.g., a soft keyboard) of the client device and may cause the client device to present a graphical input interface. Inputs made to the presented graphical input interface may be accessed by the game streaming client and provided to the game instance.
A63F 13/355 - Performing operations on behalf of clients with restricted processing capabilities, e.g. servers transform changing game scene into an MPEG-stream for transmitting to a mobile phone or a thin client
A63F 13/533 - Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game for prompting the player, e.g. by displaying a game menu
A63F 13/214 - Input arrangements for video game devices characterised by their sensors, purposes or types for locating contacts on a surface, e.g. floor mats or touch pads
Apparatuses, systems, and techniques to determine dimensions of one or more sets of data. In at least one embodiment, a processor causes one or more dimensions of one or more sets of data to be determined using one or more dimensional constraints of the one or more sets of data.
Apparatuses, systems, and techniques to generate code to be performed by one or more first processors based, at least in part, on one or more indications of data to be used by one or more second processors. In at least one embodiment, a CUDA program includes host code and device code, and a linker uses references for code elements in host code to link or prune code elements from device code.
Apparatuses, systems, and techniques to infer information from one or more sets of data. In at least one embodiment, a processor uses one or more neural networks to infer information from one or more sets of data based, at least in part, on one or more dynamically configurable dimensions of the one or more sets of data.
Apparatuses, systems, and techniques to perform parallel processing. In at least one embodiment, a parallel processing algorithm for performing an additive prefix scan is selected from a plurality of alternatives based on an arrangement of a group of threads provided to perform the scan.
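Selecting among prefix-scan algorithms based on the thread-group arrangement can be sketched as below. The specific heuristic (step-efficient sweep for warp-sized groups, serial fallback otherwise) is an assumption for illustration, not the disclosed selection logic.

```python
def scan_sequential(xs):
    """Inclusive additive prefix scan, serial baseline."""
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out

def scan_hillis_steele(xs):
    """Inclusive scan as a data-parallel Hillis-Steele sweep: log2(n)
    rounds in which each element adds the value 'stride' positions back.
    Work-inefficient but few steps, so it suits small thread groups."""
    out = list(xs)
    stride = 1
    while stride < len(out):
        out = [out[i] + (out[i - stride] if i >= stride else 0)
               for i in range(len(out))]
        stride *= 2
    return out

def select_scan(group_size):
    """Toy dispatch on the arrangement of the provided thread group."""
    return scan_hillis_steele if group_size <= 32 else scan_sequential
```

Both variants produce the same inclusive scan; only the step and work counts differ, which is what drives the selection.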
Apparatuses, systems, and techniques are presented to perform one or more operations. In at least one embodiment, one or more data values, to be used by one or more neural networks, are caused to be replaced by one or more invalid data values.
One or more embodiments of the present disclosure relate to executing, by a plurality of compute engines, a plurality of runnables of a computing application based at least on an execution schedule and a set of commands associated with the execution schedule. The execution schedule may be generated using a compiling system to include the set of commands. The set of commands may include one or more individual commands corresponding to one or more timing fences dictating a timing and order of execution of one or more individual runnables of the plurality of runnables.
Techniques applicable to a ray tracing hardware accelerator for traversing a hierarchical acceleration structure with reduced round-trip communications with a processor are disclosed. The reduction in round-trip communications during traversal is achieved by making a visibility mask, which defines visibility states for regions within a geometric primitive, available for access in the ray tracing hardware accelerator when a ray intersection is detected for that primitive.
Apparatuses, systems, and techniques to facilitate execution graph synchronization. In at least one embodiment, an application programming interface comprising one or more parameters is used to create dependencies between graph code nodes and one or more software routines.
Apparatuses, systems, and techniques to facilitate parallel processing. In at least one embodiment, an application programming interface allows a user to define a plurality of cooperative thread groups, and launch multiple cooperative thread groups in parallel provided sufficient processing resources are available.
Apparatuses, systems, and techniques to facilitate data retrieval. In at least one embodiment, an application programming interface is used to facilitate indication of a data location and to cause data to be retrieved from the location.
A Displaced Micro-mesh (DMM) primitive enables high complexity geometry for ray and path tracing while minimizing the associated builder costs and preserving high efficiency. A structured, hierarchical representation implicitly encodes vertex positions of a triangle micro-mesh based on a barycentric grid, and enables microvertex displacements to be encoded efficiently (e.g., as scalars linearly interpolated between minimum and maximum triangle surfaces). The resulting displaced micro-mesh primitive provides a highly compressed representation of a potentially vast number of displaced microtriangles that can be stored in a small amount of space. Improvements in ray tracing hardware permit automatic processing of such a primitive for ray-geometry intersection testing by ray tracing circuits without requiring intermediate reporting to a shader.
An algorithm and associated set of rules enable a given polygon micro-mesh type to always be able to represent a more compressed micro-mesh type. These rules, in conjunction with additional constraints on the order used to encode displaced micro-meshes, enable lossy compression techniques to efficiently store geometric displacements as a parallel algorithm, with little communication required among independently compressed displaced micro-meshes, while guaranteeing high quality watertight (crack-free) results for vector displacements, triangle textures, and ray and path tracing.
A µ-mesh ("micromesh") is provided: a structured representation of geometry that exploits coherence for compactness and exploits its structure for efficient rendering with intrinsic level of detail. The micromesh is a regular mesh having a power-of-two number of segments along its perimeters, and which can be overlaid on a surface of a geometric primitive. The micromesh is used for providing a visibility mask and/or a displacement map that is accessible using barycentric coordinates of a point of interest on the micromesh.
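The implicit barycentric grid underlying such a micromesh can be sketched as below: with n segments per edge (n a power of two), the (n+1)(n+2)/2 microvertex positions follow directly from grid indices, so no per-vertex coordinates need to be stored, and displacement or visibility entries can be looked up by index. This enumeration is an illustrative assumption about the layout, not the disclosed encoding.

```python
def micromesh_vertices(n_segments):
    """Enumerate barycentric (u, v, w) grid coordinates of a micromesh
    whose edges are split into a power-of-two number of segments; each
    coordinate triple sums to n_segments, and positions are implicit in
    the grid rather than stored per vertex."""
    assert n_segments > 0 and n_segments & (n_segments - 1) == 0
    return [(u, v, n_segments - u - v)
            for u in range(n_segments + 1)
            for v in range(n_segments + 1 - u)]
```

For n = 4 this yields 15 microvertices bounding 16 microtriangles; doubling n refines the mesh, which is the basis for intrinsic level of detail.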
Apparatuses, systems, and techniques are presented to upsample audio. In at least one embodiment, one or more neural networks are used to determine one or more second frequencies of one or more audio signals based, at least in part, on only one or more first frequencies of the one or more audio signals.
G10L 21/0388 - Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques - Details of processing therefor
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G06N 3/04 - Architecture, e.g. interconnection topology
Apparatuses, systems, and techniques to perform multi-architecture execution graphs. In at least one embodiment, a parallel processing platform, such as compute uniform device architecture (CUDA), generates multi-architecture execution graphs comprising a plurality of software kernels to be performed by one or more processor cores having one or more processor architectures.
G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
Disclosed are techniques for compressing data of an image using multiple processing cores. The techniques include obtaining, using a first (second, etc.) processing core, a first (second, etc.) plurality of reconstructed blocks approximating source pixels of a first (second, etc.) portion of an image and filtering, using the first processing core, the first plurality of reconstructed blocks. The filtering includes enabling application of one or more filters to a first plurality of regions that include pixels of the first plurality of reconstructed blocks but not pixels of the second plurality of reconstructed blocks. The filtering further includes disabling application of the one or more filters to a second plurality of regions that include pixels of the first plurality of reconstructed blocks and pixels of the second plurality of reconstructed blocks.
H04N 19/117 - Filters, e.g. for pre-processing or post-processing
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
84.
SOFTWARE INTERFACE USED TO PERFORM AND FACILITATE MULTI-USER AND/OR MULTI-CELL PHYSICAL LAYER (PHY) SIGNAL PROCESSING PIPELINES IN A FIFTH GENERATION (5G) NEW RADIO (NR) NETWORK
Apparatuses, systems, and techniques to perform and facilitate an interface for multi-user and/or multi-cell physical layer (PHY) signal processing pipelines in a fifth generation (5G) new radio (NR) network. In at least one embodiment, a software interface facilitates scalable execution of multi-user and/or multi-cell information by a 5G-NR PHY software library implementing one or more signal processing pipelines.
G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
H03M 13/00 - Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
Disclosed are techniques for compressing data of an image. Intermediate pixels may be determined. Each location of the image may be associated with a block of a plurality of blocks of a first size and a block of a plurality of blocks of a second size. For each block of the first size and of the second size, a first cost for a first mode and a second cost for a second mode may be determined in parallel using the intermediate pixels. A final mode and a final block size may be selected for each location of the image using the first cost and the second cost for each of a respective block of the first size and a respective block of the second size associated with a corresponding location. Final pixels may be determined, and a representation of the image may be obtained based on the final pixels.
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/103 - Selection of coding mode or of prediction mode
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
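The cost-based mode and block-size selection described in the entry above can be sketched as a minimum over candidate costs. The block sizes, mode names, and cost values here are illustrative assumptions; in the disclosed flow the per-candidate costs are computed in parallel from the intermediate pixels.

```python
def select_mode_and_size(costs):
    """Pick the (block_size, mode) pair with the lowest cost for one
    location of the image."""
    return min(costs, key=costs.get)

# Hypothetical costs for one image location, one entry per candidate:
candidates = {
    (8, "dc"): 41.0,       # 8x8 block, DC prediction mode
    (8, "planar"): 37.5,   # 8x8 block, planar prediction mode
    (16, "dc"): 52.0,
    (16, "planar"): 49.0,
}
best = select_mode_and_size(candidates)
```

Because every candidate cost is already available, the final selection per location is a cheap reduction rather than a sequential search.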
86.
HARDWARE CODEC ACCELERATORS FOR HIGH-PERFORMANCE VIDEO ENCODING
Disclosed are apparatuses, systems, and techniques for real-time codec encoding of video files using hardware-assisted accelerators that utilize a combination of parallel and sequential processing, in which at least a part of intra-frame block prediction is performed with parallel processing.
H04N 19/119 - Adaptive subdivision aspects e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
87.
OBJECT CHARACTERIZATION USING ONE OR MORE NEURAL NETWORKS
Apparatuses, systems, and techniques are presented to detect one or more objects in one or more images. In at least one embodiment, one or more neural networks can be used to detect one or more objects in one or more images based, at least in part, on textual descriptions of the one or more objects.
Apparatuses, systems, and techniques are presented to select neural networks. In at least one embodiment, one or more first neural networks can be used to select one or more second neural networks, as may be based at least in part upon an inference to be generated by the one or more second neural networks.
Apparatuses, systems, and techniques are presented to classify objects in images. In at least one embodiment, one or more neural networks are used to identify one or more objects in one or more full images based, at least in part, on the one or more neural networks having been trained using the one or more full images and one or more portions of the one or more full images.
Apparatuses, systems, and techniques are presented to simplify neural networks. In at least one embodiment, one or more portions of one or more neural networks are caused to be removed based, at least in part, on one or more performance metrics of the one or more neural networks.
An application management platform comprising at least a packaging and bundling component, a deployment management component, and an update component. The packaging and bundling component versions, packages, and bundles a plurality of infrastructure components for a remote data center. The deployment management component provisions one or more nodes of the remote data center with the plurality of infrastructure components for an application. The update component monitors available updates to one or more of the plurality of infrastructure components used by the remote data center and facilitates update of the one or more of the plurality of infrastructure components at the remote data center.
In various examples, a system includes a memory operating within a first risk level and circuitry operating within a second risk level that indicates more risk than the first risk level. The circuitry reads and/or writes data to a first memory address within the memory, and reads and/or writes an error detection code to a second memory address within the memory.
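The pairing of data with an error detection code at a second address can be sketched as below, using CRC-8 as an illustrative code; the actual code, address layout, and interface are not specified in the entry above.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """CRC-8 (polynomial x^8 + x^2 + x + 1), standing in for the error
    detection code written to the second memory address."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

memory = {}

def checked_write(addr_data, addr_code, payload: bytes):
    """Higher-risk circuitry writes the data and its check code to two
    addresses within the safer memory region."""
    memory[addr_data] = payload
    memory[addr_code] = crc8(payload)

def checked_read(addr_data, addr_code) -> bytes:
    """A read recomputes the code; a mismatch flags corruption."""
    payload = memory[addr_data]
    if crc8(payload) != memory[addr_code]:
        raise ValueError("error detection code mismatch")
    return payload
```

Keeping the code at a separate address lets the lower-risk memory hold both values while the higher-risk circuitry's writes remain checkable.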
Systems and methods are presented for reliable transmission of time-sensitive data. In particular, various embodiments provide for the generation of compressed sequential data, where individual instances of a sequence represent differentials from prior instances in that sequence. In order to reduce the amount of data that needs to be transmitted, instances of data (such as individual video frames) can be provided using a prior video frame as a reference, sending only data for those pixel locations where the pixel value differs from the reference frame. A reference frame can include a previously received and successfully decoded frame, in order to minimize the impact of dropped, incomplete, or corrupted frames. To further reduce data transmission requirements, a reference frame can be selected that is determined to be optimal for the current frame, such as one that represents the least amount of data to be transmitted for a given frame.
H04N 19/164 - Feedback from the receiver or from the transmission channel
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/61 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
H04N 19/89 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder
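The differential transmission and reference selection described in the entry above can be sketched as below, treating a frame as a flat list of pixel values; the pair encoding and selection criterion are illustrative assumptions.

```python
def encode_delta(frame, reference):
    """Send only (index, value) pairs where the frame differs from the
    reference (a previously received, successfully decoded frame)."""
    return [(i, v) for i, (v, r) in enumerate(zip(frame, reference)) if v != r]

def decode_delta(reference, delta):
    """Rebuild the frame by patching a copy of the reference at the
    transmitted pixel locations."""
    out = list(reference)
    for i, v in delta:
        out[i] = v
    return out

def pick_reference(frame, references):
    """Choose the reference requiring the fewest transmitted pairs,
    matching the least-data selection described above."""
    return min(references, key=lambda r: len(encode_delta(frame, r)))
```

Because decoding patches a frame the receiver is known to hold, a dropped intermediate frame does not corrupt later reconstructions.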
Apparatuses, systems, and techniques to use data obtained from an inferred object to determine whether to re-infer the same inferred object. In at least one embodiment, data obtained from inferencing a tracked object among a set of images is used to adjust a set of conditions.
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 10/96 - Management of image or video recognition tasks
95.
CONDITIONAL IMAGE GENERATION USING ONE OR MORE NEURAL NETWORKS
Apparatuses, systems, and techniques are presented to generate one or more images. In at least one embodiment, one or more neural networks are used to generate one or more images based, at least in part, upon one or more input types.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
96.
ISOLATING A REGION OF A SYSTEM ON A CHIP FOR SAFETY CRITICAL OPERATIONS
In various examples, an integrated circuit includes first and second portions operating within separate domains. The second portion has an interface that connects the first and second portions. The second portion selectively locks the interface to prevent communication with the first portion over the interface, and selectively unlocks the interface to allow communication with the first portion over the interface.
G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
G06F 11/16 - Error detection or correction of the data by redundancy in hardware
G06F 11/20 - Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
Configurations for rack connection systems are disclosed. In at least one embodiment, an adapter for a component plug includes a rotation component to enable rotation of at least a portion of the component plug about at least two axes.
Apparatuses, systems, and techniques are presented to generate media content. In at least one embodiment, a first neural network is used to generate first video information based, at least in part, upon voice information corresponding to one or more users, and a second neural network is used to generate second video information corresponding to the one or more users based, at least in part, upon the first video information and one or more images corresponding to the one or more users.
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
99.
IN-ROW COOLING UNIT WITH INTERCHANGEABLE HEAT EXCHANGERS
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, an in-row cooling unit is located within a row of racks and between racks so that it can use an interchangeable heat exchanger (IHE) to receive primary coolant and can use one or more flow controllers to provide a first part of the primary coolant to cool secondary coolant that is to be distributed to at least one cold plate and to provide a second part of the primary coolant to cool air to be circulated through at least one server tray or rack.
Apparatuses, systems, and techniques to generate one or more images of an object. In at least one embodiment, a technique includes training one or more neural networks to generate one or more images of an object from at least a first image of the object and a second lower-resolution image of the object, where the training includes a comparison of the one or more generated images of the object with the second lower-resolution image of the object.