A method, a system, and a computer program product for performing analysis of data to detect presence of malicious code are disclosed. Reduced dimensionality vectors are generated from a plurality of original dimensionality vectors representing features in a plurality of samples. The reduced dimensionality vectors have a lower dimensionality than an original dimensionality of the plurality of original dimensionality vectors. A first plurality of clusters is determined by applying a first clustering algorithm to the reduced dimensionality vectors. A second plurality of clusters is determined by applying a second clustering algorithm to one or more clusters in the first plurality of clusters using the original dimensionality. An exemplar for a cluster in the second plurality of clusters is added to a training set, which is used to train a machine learning model for identifying a file containing malicious code.
G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
Systems, methods, and software can be used for securing in-tunnel messages. One example of a method includes obtaining a parsed file that comprises two or more sub-feature trees, where each of the two or more sub-feature trees comprises at least one feature layer that comprises features. The method further includes generating a feature vector that identifies the features in the at least one feature layer for each of the two or more sub-feature trees. The method yet further includes mapping the features in the at least one feature layer for each of the two or more sub-feature trees to a corresponding position in the feature vector. By converting features in the parsed file into a feature vector, the method gives the parsed file a representation that is applicable across a wide range of applications.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06F 16/28 - Databases characterised by their database models, e.g. relational or object models
G06F 16/901 - Indexing; Data structures therefor; Storage structures
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
An artifact is received and features are extracted therefrom to form a feature vector. Thereafter, a determination is made to alter a malware processing workflow based on a distance of one or more features in the feature vector relative to one or more indicator centroids, each indicator centroid specifying a threshold distance to trigger an action. Based on such a determination, the malware processing workflow is altered.
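The distance test described above can be sketched as follows. This is a minimal illustration only: the centroid coordinates, threshold values, and action names are hypothetical, and Euclidean distance is an assumed metric that the abstract does not specify.

```python
import math

# Hypothetical indicator centroids: a position in feature space, a threshold
# distance, and the workflow action triggered inside that distance.
INDICATOR_CENTROIDS = [
    {"center": (0.0, 0.0), "threshold": 1.0, "action": "skip_deep_scan"},
    {"center": (5.0, 5.0), "threshold": 2.0, "action": "quarantine"},
]


def select_actions(feature_vector, centroids=INDICATOR_CENTROIDS):
    """Return the actions of every centroid whose threshold distance is met."""
    actions = []
    for centroid in centroids:
        distance = math.dist(feature_vector, centroid["center"])
        if distance <= centroid["threshold"]:
            actions.append(centroid["action"])
    return actions
```

An empty result would leave the workflow unchanged; any returned action would alter it.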
Systems, methods, and software can be used to cluster software codes in a scalable manner. In some aspects, a computer-implemented method comprises: obtaining a plurality of software samples; computing one or more first hash results for each of the plurality of software samples; computing one or more second hash results for each of the plurality of software samples based on the one or more first hash results, wherein an amount of the one or more second hash results is less than an amount of the one or more first hash results; determining a similarity output based on the one or more second hash results of two of the plurality of software samples; and clustering the plurality of software samples based on the similarity output to generate one or more software sample clusters.
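One way to picture the two hashing stages is the sketch below. It assumes byte shingles hashed with SHA-1 as the first hash results, a "keep the n smallest" reduction as the second hash results, set overlap as the similarity output, and a greedy single-pass grouping as the clustering step. None of these choices is stated in the abstract; all function names are hypothetical.

```python
import hashlib


def first_hashes(sample: bytes, k: int = 4) -> set:
    """First hash results: one 64-bit hash per k-byte shingle."""
    return {
        int.from_bytes(hashlib.sha1(sample[i:i + k]).digest()[:8], "big")
        for i in range(max(len(sample) - k + 1, 1))
    }


def second_hashes(first: set, n: int = 16) -> set:
    """Second hash results: keep only the n smallest first hashes."""
    return set(sorted(first)[:n])


def similarity(a: set, b: set) -> float:
    """Overlap of two sketches (intersection over union)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(samples, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    sketches = [second_hashes(first_hashes(s)) for s in samples]
    clusters = []  # each cluster is a list of sample indices
    for i, sketch in enumerate(sketches):
        for members in clusters:
            if similarity(sketch, sketches[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

Because each sample is compared through a small fixed-size sketch rather than its full hash set, the comparison cost stays bounded as samples grow.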
A method and computing device for statistical data fingerprinting and tracing data similarity of documents. The method comprises applying a statistical function to a subset of text in a first document thereby generating a first fingerprint; applying the statistical function to a subset of text in a second document thereby generating a second fingerprint; comparing the first fingerprint to the second fingerprint; and determining that the subset of text in the first document matches the subset of text in the second document based on the first fingerprint threshold matching the second fingerprint, wherein the statistical function is a measure of randomness of a count of each character in a subset of text against an expected distribution of said characters.
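As a rough illustration of such a statistical function, the sketch below uses a chi-square statistic of letter counts against a uniform expected distribution. Both the uniform distribution and the tolerance-based comparison are assumptions made for the example, and the function names are hypothetical.

```python
from collections import Counter

# Assumed expected distribution: uniform over lowercase ASCII letters.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def fingerprint(text: str) -> float:
    """Chi-square statistic of observed letter counts vs. the expected counts."""
    letters = [ch for ch in text.lower() if ch in ALPHABET]
    if not letters:
        return 0.0
    counts = Counter(letters)
    expected = len(letters) / len(ALPHABET)
    return sum((counts.get(ch, 0) - expected) ** 2 / expected for ch in ALPHABET)


def matches(text_a: str, text_b: str, tolerance: float = 1e-9) -> bool:
    """Threshold match: fingerprints within the tolerance count as equal."""
    return abs(fingerprint(text_a) - fingerprint(text_b)) <= tolerance
```

Since the statistic depends only on character counts, reordered copies of the same text produce the same fingerprint.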
Bayesian continuous user authentication can be obtained by receiving observed behavior data that collectively characterizes interaction of an active user with at least one computing device or software application. A sequence of events within the observed behavior data can be identified and scored using a universal background model that generates first scores that characterize an extent to which each event or history of events is anomalous for a particular population of users. Further, the events are scored using a user model that generates second scores that characterize an extent to which each event or history of events is anomalous for the particular user who owns the account. The first scores and the second scores are smoothed using a smoothing function. A probability that the active user is the account owner associated with the user model is determined based on the smoothed first scores and the smoothed second scores.
Systems and methods are described herein for computer user authentication using machine learning. Authentication for a user is initiated based on an identification confidence score of the user. The identification confidence score is based on one or more characteristics of the user. Using a machine learning model for the user, user activity of the user is monitored for anomalous activity to generate first data. Based on the monitoring, differences between the first data and historical utilization data for the user determine whether the user's utilization of one or more resources is anomalous. When the user's utilization of the one or more resources is anomalous, the user's access to the one or more resources is removed.
Features are extracted from an artifact so that a vector can be populated. The vector is then inputted into an anomaly detection model comprising a deep generative model to generate a first score. The first score can characterize the artifact as being malicious or benign to access, execute, or continue to execute. In addition, the vector is inputted into a machine learning-based classification model to generate a second score. The second score can also characterize the artifact as being malicious or benign to access, execute, or continue to execute. The second score is then modified based on the first score to result in a final score. The final score can then be provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
G06N 7/00 - Computing arrangements based on specific mathematical models
H04L 9/06 - Arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for blockwise coding, e.g. D.E.S. systems
9.
Projected vector modification as mitigation for machine learning model string stuffing
An artifact is received from which features are extracted so as to populate a vector. The features in the vector can be reduced using a feature reduction operation to result in a modified vector having a plurality of buckets. A presence of predetermined types of features is identified within buckets of the modified vector influencing a score above a pre-determined threshold. A contribution of the identified features within the high influence buckets of the modified vector is then attenuated. The modified vector is input into a classification model to generate a score which can be provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
G06F 21/52 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure
G06F 21/55 - Detecting local intrusion or implementing counter-measures
An artifact is received from which features are extracted and used to populate a vector. The features in the vector are then reduced using a feature reduction operation to result in a modified vector having a plurality of buckets. Features within the buckets of the modified vector above a pre-determined projected bucket clipping threshold are then identified. Using the identified features, an overflow vector is then generated. The modified vector is then input into a classification model to generate a score. This score is adjusted based on the overflow vector and can then be provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
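The bucket projection and overflow steps might look roughly like the sketch below. It assumes feature hashing (via CRC32) as the feature reduction operation and assumes that buckets are clipped at the threshold while the excess is recorded in the overflow vector; the abstract does not confirm either detail, and the names are hypothetical. The score adjustment itself is not shown.

```python
import zlib


def project(features: dict, num_buckets: int = 8) -> list:
    """Reduce a {feature_name: value} mapping into a fixed number of buckets."""
    buckets = [0.0] * num_buckets
    for name, value in features.items():
        buckets[zlib.crc32(name.encode()) % num_buckets] += value
    return buckets


def clip_with_overflow(buckets: list, clip: float = 3.0):
    """Clip each bucket at the threshold and record the excess separately."""
    clipped = [min(b, clip) for b in buckets]
    overflow = [max(b - clip, 0.0) for b in buckets]
    return clipped, overflow
```

A downstream step could then lower or raise the classifier score according to how much mass landed in the overflow vector.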
A system is provided for classifying an instruction sequence with a machine learning model. The system may include at least one processor and at least one memory. The memory may include program code that provides operations when executed by the at least one processor. The operations may include: processing an instruction sequence with a trained machine learning model configured to detect one or more interdependencies amongst a plurality of tokens in the instruction sequence and determine a classification for the instruction sequence based on the one or more interdependencies amongst the plurality of tokens; and providing, as an output, the classification of the instruction sequence. Related methods and articles of manufacture, including computer program products, are also provided.
An artefact is received. Features are extracted from this artefact which are, in turn, used to populate a vector. The vector is then input into a classification model to generate a score. The score is then modified using a step function so that the true score is not obfuscated. Thereafter, the modified score can be provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
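A step function applied to a score can be as simple as the sketch below, which snaps the score down to the nearest multiple of a fixed step. The step size and the function name are assumptions for illustration only.

```python
import math


def step_score(score: float, step: float = 0.25) -> float:
    """Map a raw score onto the lower edge of the step that contains it."""
    return math.floor(score / step) * step
```

With a step of 0.25, raw scores of 0.76 and 0.8 both become 0.75, so small changes in the raw score are not visible to the consumer.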
Hash-based application programming interface (API) importing can be prevented by allocating a name page and a guard page in memory. The name page and the guard page are associated with (i) an address of names array, (ii) an address of name ordinal array, and (iii) an address of functions array that are all generated by an operating system upon initiation of an application. The name page can then be filled with valid non-zero characters. Thereafter, protections on the guard page can be changed to no access. An entry is inserted into the address of names array pointing to a relative virtual address corresponding to anywhere within the name page. Access to the guard page causes the requesting application to terminate. Related apparatus, systems, techniques and articles are also described.
Hash-based application programming interface (API) importing can be prevented by allocating a name page and a guard page in memory. The name page and the guard page are associated with (i) an address of names array, (ii) an address of name ordinal array, and (iii) an address of functions array that are all generated by an operating system upon initiation of an application. The name page can then be filled with valid non-zero characters. Thereafter, protections on the guard page can be changed to no access. An entry is inserted into the address of names array pointing to a relative virtual address corresponding to anywhere within the name page. Access to the guard page causes the requesting application to terminate. Related apparatus, systems, techniques and articles are also described.
Executable memory space is protected by receiving, from a process, a request to configure a portion of memory with a memory protection attribute that allows the process to perform at least one memory operation on the portion of the memory. Thereafter, the request is responded to with a grant, configuring the portion of memory with a different memory protection attribute than the requested memory protection attribute. The different memory protection attribute restricts the at least one memory operation from being performed by the process on the portion of the memory. In addition, it is detected when the process attempts, in accordance with the grant, the at least one memory operation at the configured portion of memory. Related systems and articles of manufacture, including computer program products, are also disclosed.
G06F 12/14 - Protection against unauthorised use of memory
G06F 21/79 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
16.
Centroid for improving machine learning classification and info retrieval
Centroids are used for improving machine learning classification and information retrieval. A plurality of files are classified as malicious or not malicious based on a function dividing a coordinate space into at least a first portion and a second portion such that the first portion includes a first subset of the plurality of files classified as malicious. One or more first centroids are defined in the first portion that classify files from the first subset as not malicious. A file is determined to be malicious based on whether the file is located within the one or more first centroids.
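The interplay between the dividing function and the centroids can be illustrated with the sketch below. It assumes a linear dividing function and spherical centroids given as (center, radius) pairs; both are simplifications chosen for the example, and the names are hypothetical.

```python
import math


def classify(point, weights, bias, benign_centroids):
    """Linear split of the space, with centroid exceptions on the malicious side."""
    in_malicious_portion = sum(w * x for w, x in zip(weights, point)) + bias > 0
    if not in_malicious_portion:
        return "not malicious"
    # Inside the malicious portion, a centroid overrides the dividing function.
    for center, radius in benign_centroids:
        if math.dist(point, center) <= radius:
            return "not malicious"
    return "malicious"
```

For example, with `weights=(1.0, 0.0)`, `bias=0.0`, and one centroid at `(2.0, 2.0)` with radius `0.5`, a file at `(2.1, 2.0)` falls in the malicious portion but inside the centroid, so it is treated as not malicious.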
Each of a plurality of endpoint computer systems monitors data relating to a plurality of events occurring within an operating environment of the corresponding endpoint computer system. The monitoring can include receiving and/or inferring the data using one or more sensors executing on the endpoint computer systems. Thereafter, for each endpoint computer system, artifacts used in connection with the events are stored in a vault maintained on such endpoint computer system. A query is later received by at least a subset of the plurality of endpoint computer systems from a server. Such endpoint computer systems, in response, identify and retrieve artifacts within the corresponding vaults responsive to the query. Results responsive to the query including or characterizing the identified artifacts are then provided by the endpoint computer systems receiving the query to the server.
An artefact is received. Thereafter, features are extracted from the artefact and a vector is populated. Later, one of a plurality of available classification models is selected. The classification models use different scoring paradigms while providing the same or substantially similar classifications. The vector is input into the selected classification model to generate a score. The score is later provided to a consuming application or process. The classification model can characterize the artefact as being malicious or benign to access, execute, or continue to execute so that appropriate remedial action can be taken or initiated by the consuming application or process. Related apparatus, systems, techniques and articles are also described.
An artefact is received. Features are extracted from this artefact which are, in turn, used to populate a vector. The vector is then input into a classification model to generate a score. The score is then modified using a step function so that the true score is not obfuscated. Thereafter, the modified score can be provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
An artefact is received. Features are later extracted from the artefact and are used to populate a vector. The vector is input into a classification model to generate a score. This score is then modified using a time-based oscillation function and is provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
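A time-based oscillation of a score could be sketched as below. The sinusoidal form, the amplitude, the period, and the clamping to the range 0 to 1 are all assumptions; the abstract only states that the function is time-based and oscillating.

```python
import math


def oscillate(score: float, t: float, amplitude: float = 0.05, period: float = 60.0) -> float:
    """Shift the score by a sinusoid of time t (seconds), clamped to [0, 1]."""
    shifted = score + amplitude * math.sin(2 * math.pi * t / period)
    return min(1.0, max(0.0, shifted))
```

The same artefact queried at different times would then receive slightly different reported scores.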
An artefact is received. Features are extracted from this artefact which are, in turn, used to populate a vector. The vector is then input into a classification model to generate a score. The score is then modified to result in a modified score by interleaving the generated score or a mapping thereof into digits of a pseudo-score. Thereafter, the modified score can be provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
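Digit interleaving can be pictured with the sketch below, which places each digit of the generated score after the corresponding digit of a pseudo-score. The equal-length requirement and the choice of odd positions are assumptions for the example; the function names are hypothetical.

```python
def interleave(score_digits: str, pseudo_digits: str) -> str:
    """Alternate pseudo-score digits with score digits (score at odd positions)."""
    if len(score_digits) != len(pseudo_digits):
        raise ValueError("digit strings must have equal length")
    return "".join(p + s for p, s in zip(pseudo_digits, score_digits))


def recover(modified: str) -> str:
    """Read the original score digits back out of the modified score."""
    return modified[1::2]
```

A consumer that knows the position scheme can recover the score, while the modified value as a whole does not reveal it directly.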
An artefact is received. Features from such artefact are extracted and then populated in a vector. Subsequently, one of a plurality of available dimension reduction techniques is selected. Using the selected dimension reduction technique, the features in the vector are reduced. The vector is then input into a classification model to generate a score, and the score can be provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
Systems, methods, and devices are described herein for detecting abnormalities within a system based on signal fingerprinting. A plurality of electrical signals are concurrently received from a transceiver over a time period. The time period is partitioned into a plurality of sampling windows. An electrical signal of the plurality of electrical signals is sequentially selected. For the sequentially selected electrical signal, a temporal snapshot of said electrical signal is iteratively captured over a sampling window of the plurality of sampling windows. This iterative capturing is repeated for remaining sampling windows of the plurality of sampling windows. Each captured temporal snapshot is temporally concatenated over the time period according to its respective temporal position of the time period to generate the signal fingerprint.
Systems are provided herein for communications bus signal fingerprinting. A security module monitors a plurality of voltage lines of at least one electronic control unit (ECU) electrically coupled to a communications bus. A voltage differential across at least two of the plurality of voltage lines of the at least one ECU is measured. The voltage differential is compared to a plurality of predetermined signal fingerprints associated with the at least one ECU. A variance in the compared voltage differential is identified relative to one or more of the plurality of predetermined signal fingerprints. Data characterizing the identified variance is provided.
Under one aspect, a method is provided for protecting a device from a malicious file. The method can be implemented by one or more data processors forming part of at least one computing device and can include extracting from the file, by at least one data processor, sequential data comprising discrete tokens. The method also can include generating, by at least one data processor, n-grams of the discrete tokens. The method also can include generating, by at least one data processor, a vector of weights based on respective frequencies of the n-grams. The method also can include determining, by at least one data processor and based on a statistical analysis of the vector of weights, that the file is likely to be malicious. The method also can include initiating, by at least one data processor and responsive to determining that the file is likely to be malicious, a corrective action.
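The n-gram and weighting steps can be sketched as follows, assuming relative frequency as the weight and a fixed, caller-supplied n-gram vocabulary (both assumptions; the names are hypothetical). The statistical analysis and corrective action are outside this sketch.

```python
from collections import Counter


def ngrams(tokens, n=2):
    """All contiguous n-token windows, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def weight_vector(tokens, vocabulary, n=2):
    """Relative frequency of each vocabulary n-gram in the token sequence."""
    counts = Counter(ngrams(tokens, n))
    total = sum(counts.values()) or 1
    return [counts[gram] / total for gram in vocabulary]
```

For the token sequence `push, mov, push, mov, ret` and the vocabulary `(push, mov)`, `(mov, ret)`, `(ret, ret)`, the resulting weights are 0.5, 0.25, and 0.0.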
A plurality of events associated with each of a plurality of computing nodes that form part of a network topology are monitored. The network topology includes antivirus tools to detect malicious software prior to it accessing one of the computing nodes. Thereafter, it is determined, using at least one machine learning model, that at least one of the events is indicative of malicious activity that has circumvented or bypassed the antivirus tools. Data is then provided that characterizes the determination. Related apparatus, systems, techniques and articles are also described.
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
G06N 3/00 - Computing arrangements based on biological models
27.
Machine learning model for malware dynamic analysis
In some implementations there may be provided a system. The system may include a processor and a memory. The memory may include program code which causes operations when executed by the processor. The operations may include analyzing a series of events contained in received data. The series of events may include events that occur during the execution of a data object. The series of events may be analyzed to at least extract, from the series of events, subsequences of events. A machine learning model may determine a classification for the received data. The machine learning model may classify the received data based at least on whether the subsequences of events are malicious. The classification indicative of whether the received data is malicious may be provided. Related methods and articles of manufacture, including computer program products, are also disclosed.
Data is received as part of an authentication procedure to identify a user. Such data characterizes a user-generated biometric sequence that is generated by the user interacting with at least one input device according to a desired biometric sequence. Thereafter, using the received data and at least one machine learning model trained using empirically derived historical data generated by a plurality of user-generated biometric sequences (e.g., historical user-generated biometric sequences according to the desired biometric sequence, etc.), the user is authenticated if an output of the at least one machine learning model is above a threshold. Data can be provided that characterizes the authenticating. Related apparatus, systems, techniques and articles are also described.
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
A system is provided for training a machine learning model to detect malicious container files. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: processing a container file with a trained machine learning model, wherein the trained machine learning model is trained to determine a classification for the container file indicative of whether the container file includes at least one file rendering the container file malicious; and providing, as an output by the trained machine learning model, an indication of whether the container file includes the at least one file rendering the container file malicious. Related methods and articles of manufacture, including computer program products, are also disclosed.
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: receiving a disassembled binary file that includes a plurality of instructions; processing the disassembled binary file with a convolutional neural network configured to detect a presence of one or more sequences of instructions amongst the plurality of instructions and determine a classification for the disassembled binary file based at least in part on the presence of the one or more sequences of instructions; and providing, as an output, the classification of the disassembled binary file. Related computer-implemented methods are also disclosed.
A mismatch between model-based classifications produced by a first version of a machine learning threat discernment model and a second version of a machine learning threat discernment model for a file is detected. The mismatch is analyzed to determine appropriate handling for the file, and an action is taken based on the analyzing. The analyzing includes comparing a human-generated classification status for a file, a first model version status that reflects classification by the first version of the machine learning threat discernment model, and a second model version status that reflects classification by the second version of the machine learning threat discernment model. The analyzing can also include allowing the human-generated classification status to dominate when it is available.
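The comparison of the three statuses can be illustrated with the sketch below. The rule that a human-generated status dominates comes from the abstract; the fallback of flagging a disagreement for review, and all names and labels, are assumptions for the example.

```python
def resolve(human_status, model_v1_status, model_v2_status):
    """Return (status, reason) for a file given up to three classification statuses."""
    if human_status is not None:
        # A human-generated classification dominates whenever it is available.
        return human_status, "human"
    if model_v1_status == model_v2_status:
        return model_v2_status, "models_agree"
    # Hypothetical handling of a mismatch with no human status on record.
    return None, "needs_review"
```

The returned reason could drive the action taken, for example queueing the file for analyst review when the two model versions disagree.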
G06F 21/51 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
32.
Machine learning classification using Markov modeling
Systems, methods, and articles of manufacture, including computer program products, are provided for classification systems and methods using modeling. In some example embodiments, there is provided a system that includes at least one processor and at least one memory including program code which when executed by the at least one processor provides operations. The operations can include generating a representation of a sequence of sections of a file and/or determining, from a model including conditional probabilities, a probability for each transition between at least two sequential sections in the representation. The operations can further include classifying the file based on the probabilities for each transition.
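A first-order Markov model over section names gives a compact picture of the transition probabilities described above. The sketch assumes section names as the representation, maximum-likelihood estimates for the conditional probabilities, and a small floor probability for unseen transitions; these are illustrative choices, and the names are hypothetical.

```python
import math
from collections import defaultdict


def train(sequences):
    """Estimate P(next section | current section) from example sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for current, nexts in counts.items()
    }


def log_likelihood(model, seq, floor=1e-6):
    """Sum of log transition probabilities; unseen transitions get the floor."""
    return sum(
        math.log(model.get(current, {}).get(following, floor))
        for current, following in zip(seq, seq[1:])
    )
```

A file could then be classified by comparing its log-likelihood under a model trained on benign files with its log-likelihood under a model trained on malicious files.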
An artefact is received and parsed into a plurality of observations. A first subset of the observations is inputted into a machine learning model trained using historical data to classify the artefact. In addition, a second subset of the observations is inputted into a xenospace centroid configured to classify the artefact. Thereafter, the artefact is classified based on a combination of an output of the machine learning model and an output of the xenospace centroid. Related apparatus, systems, techniques and articles are also described.
Data is analyzed using feature hashing to detect malware. A plurality of features in a feature set is hashed. The feature set is generated from a sample. The sample includes at least a portion of a file. Based on the hashing, one or more hashed features are indexed to generate an index vector. Each hashed feature corresponds to an index in the index vector. Using the index vector, a training dataset is generated. Using the training dataset, a machine learning model for identifying at least one file having a malicious code is trained.
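Feature hashing into an index vector can be sketched as below. The use of MD5 as the hash, the vector size, and counting collisions additively are assumptions for the example; the function name is hypothetical.

```python
import hashlib


def index_vector(features, size=16):
    """Hash each feature string to an index and count hits at that index."""
    vector = [0] * size
    for feature in features:
        digest = hashlib.md5(feature.encode()).digest()
        index = int.from_bytes(digest[:4], "big") % size
        vector[index] += 1
    return vector
```

Because every sample maps to a vector of the same length regardless of how many distinct features exist, the vectors can be stacked directly into a training dataset.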
An identity of a user on a first computing node of a plurality of computing nodes within a computing environment is authenticated. A first authentication score for the user is calculated at the first computing node using at least one machine learning model. The first authentication score characterizes interactions of the user with the first computing node. Subsequent to such authentication, traversal of the user from the first computing node to other computing nodes among the plurality of computing nodes is monitored. An authentication score characterizing interactions of the user with the corresponding computing node is calculated at each of the nodes using respective machine learning models executing on such nodes. The respective machine learning models use, as an attribute, an authentication score calculated at a previously traversed computing node. Thereafter, an action is initiated at one of the computing nodes based on the calculated authentication scores.
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code that provides operations when executed by the at least one processor. The operations may include: reducing a dimensionality of a plurality of features representative of a file set; determining, based at least on a reduced dimensional representation of the file set, a distance between a file and the file set; and determining, based at least on the distance between the file and the file set, a classification for the file. Related methods and articles of manufacture, including computer program products, are also provided.
In one respect, there is provided a system for training a machine learning model to detect malicious container files. The system may include at least one processor and at least one memory. The at least one memory may include program code that provides operations when executed by the at least one processor. The operations may include: training, based on a training data, a machine learning model to enable the machine learning model to determine whether at least one container file includes at least one file rendering the at least one container file malicious; and providing the trained machine learning model to enable the determination of whether the at least one container file includes at least one file rendering the at least one container file malicious. Related methods and articles of manufacture, including computer program products, are also disclosed.
Centroids are used for improving machine learning classification and information retrieval. A plurality of files are classified as malicious or not malicious based on a function dividing a coordinate space into at least a first portion and a second portion such that the first portion includes a first subset of the plurality of files classified as malicious. One or more first centroids are defined in the first portion that classify files from the first subset as not malicious. A file is determined to be malicious based on whether the file is located within the one or more first centroids.
Identifying shellcode in a sequence of instructions by identifying a first instruction, the first instruction identifying a first bound of a sequence of instructions, identifying a second instruction, the second instruction identifying a second bound of the sequence of instructions, and generating a distribution for the sequence of instructions, bounded by the first instruction and the second instruction, the distribution indicative of whether the sequence of instructions is likely to include shellcode.
A nested file having a primary file and at least one secondary file embedded therein is parsed using at least one parser of a cell. The cell assigns a maliciousness score to each of the parsed primary file and each of the parsed at least one secondary file. Thereafter, the cell generates an overall maliciousness score for the nested file that indicates a level of confidence that the nested file contains malicious content. The overall maliciousness score is provided to a data consumer indicating whether to proceed with consuming the data contained within the nested file.
H04L 9/00 - Arrangements for secret or secure communications; Network security protocols
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
An endpoint computer system can harvest data relating to a plurality of events occurring within an operating environment of the endpoint computer system and can add the harvested data to a local data store maintained on the endpoint computer system. A query response can be generated, for example by identifying and retrieving responsive data from the local data store. The responsive data are related to an artifact on the endpoint computer system and/or to an event of the plurality of events. In some examples, the local data store can be an audit log and/or can include one or more tamper resistant features. Systems, methods, and computer program products are described.
A mismatch between model-based classifications produced by a first version of a machine learning threat discernment model and a second version of a machine learning threat discernment model for a file is detected. The mismatch is analyzed to determine appropriate handling for the file, and an action is taken based on the analysis. The analyzing includes comparing a human-generated classification status for a file, a first model version status that reflects classification by the first version of the machine learning threat discernment model, and a second model version status that reflects classification by the second version of the machine learning threat discernment model. The analyzing can also include allowing the human-generated classification status to dominate when it is available.
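The comparison described in the abstract above can be sketched as a small decision function. This is only an illustration: the function name `resolve_mismatch`, the status strings, and the fallback of flagging unresolved mismatches for review are assumptions, not details taken from the abstract.

```python
from typing import Optional


def resolve_mismatch(human: Optional[str], v1: str, v2: str) -> str:
    """Pick the status to act on for a file classified by two model versions.

    `human` is the human-generated classification status, or None if absent.
    `v1` and `v2` are the statuses produced by the two model versions.
    """
    if human is not None:
        # The human-generated status dominates whenever it is available.
        return human
    if v1 == v2:
        # No mismatch: both model versions agree.
        return v1
    # Assumption: with no human label, a mismatch is escalated for review.
    return "needs_review"


if __name__ == "__main__":
    print(resolve_mismatch("safe", "malicious", "safe"))
    print(resolve_mismatch(None, "safe", "malicious"))
```

A real system would likely map each outcome to a handling action (quarantine, allow, re-scan); that mapping is left out here because the abstract does not specify it.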
G06F 21/51 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
43.
Retention and accessibility of data characterizing events on an endpoint computer
An endpoint computer system can harvest data relating to a plurality of events occurring within an operating environment of the endpoint computer system and can add the harvested data to a local data store maintained on the endpoint computer system. In some examples, the local data store can be an audit log and/or can include one or more tamper resistant features. Systems, methods, and computer program products are described.
In one respect, there is provided a system for classifying malware. The system may include a data processor and a memory. The memory may include program code that provides operations when executed by the processor. The operations may include: providing, to a display, contextual information associated with a file to at least enable a classification of the file, when a malware classifier is unable to classify the file; receiving, in response to the providing of the contextual information, the classification of the file; and updating, based at least on the received classification of the file, the malware classifier to enable the malware classifier to classify the file. Methods and articles of manufacture, including computer program products, are also provided.
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code that provides operations when executed by the at least one processor. The operations may include: extracting, from an icon associated with a file, one or more features; assigning, based at least on the one or more features, the icon to one of a plurality of clusters; and generating, based at least on the cluster to which the icon is assigned, a classification for the file associated with the icon. Related methods and articles of manufacture, including computer program products, are also provided.
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
G06K 9/46 - Extraction of features or characteristics of the image
Data is received or accessed that includes a structured file encapsulating data required by an execution environment to manage executable code wrapped within the structured file. Thereafter, code and data regions are iteratively identified in the structured file. Such identification is analyzed so that at least one feature can be extracted from the structured file. Related apparatus, systems, techniques and articles are also described.
In some implementations there may be provided a system. The system may include a processor and a memory. The memory may include program code which causes operations when executed by the processor. The operations may include analyzing a series of events contained in received data. The series of events may include events that occur during the execution of a data object. The series of events may be analyzed to at least extract, from the series of events, subsequences of events. A machine learning model may determine a classification for the received data. The machine learning model may classify the received data based at least on whether the subsequences of events are malicious. The classification indicative of whether the received data is malicious may be provided. Related methods and articles of manufacture, including computer program products, are also disclosed.
Described are techniques to enable computers to efficiently determine if they should run a program based on an immediate (i.e., real-time, etc.) analysis of the program. Such an approach leverages highly trained ensemble machine learning algorithms to create a real-time discernment on a combination of static and dynamic features collected from the program, the computer's current environment, and external factors. Related apparatus, systems, techniques and articles are also described.
G06F 21/51 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06F 21/52 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure
49.
Communications bus data transmission using relative ground shifting
Methods are described herein for communications bus data transmission using relative ground shifting. A plurality of voltage lines of at least one electronic control unit (ECU) are monitored. The at least one ECU is electrically coupled to a communications bus. A voltage differential across at least two of the plurality of voltage lines of the at least one ECU is measured. A pulse or data stream is injected into the communications bus via one or two voltage lines based on the measured voltage differential having an amplitude lower than a predetermined voltage threshold.
G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
G06F 21/81 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer by operating on the power supply, e.g. enabling or disabling power-on, sleep or resume operations
G06F 21/75 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information by inhibiting the analysis of circuitry or operation, e.g. to counteract reverse engineering
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: receiving a disassembled binary file that includes a plurality of instructions; processing the disassembled binary file with a convolutional neural network configured to detect a presence of one or more sequences of instructions amongst the plurality of instructions and determine a classification for the disassembled binary file based at least in part on the presence of the one or more sequences of instructions; and providing, as an output, the classification of the disassembled binary file. Related computer-implemented methods are also disclosed.
In one aspect, a computer-implemented method is disclosed. The computer-implemented method may include determining a sketch matrix that approximates a matrix representative of a reference dataset. The reference dataset may include at least one computer program having a predetermined classification. A reduced dimension representation of the reference dataset may be generated based at least on the sketch matrix. The reduced dimension representation may have a fewer quantity of features than the reference dataset. A target computer program may be classified based on the reduced dimension representation. The target computer program may be classified to determine whether the target computer program is malicious. Related systems and articles of manufacture, including computer program products, are also disclosed.
G06F 21/51 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
G06F 21/55 - Detecting local intrusion or implementing counter-measures
G06K 9/62 - Methods or arrangements for recognition using electronic means
Data is received as part of an authentication procedure to identify a user. Such data characterizes a user-generated biometric sequence that is generated by the user interacting with at least one input device according to a desired biometric sequence. Thereafter, using the received data and at least one machine learning model trained using empirically derived historical data generated by a plurality of user-generated biometric sequences (e.g., historical user-generated biometric sequences according to the desired biometric sequence, etc.), the user is authenticated if an output of the at least one machine learning model is above a threshold. Data can be provided that characterizes the authenticating. Related apparatus, systems, techniques and articles are also described.
G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
An agent inserts one or more hooks into a sub-execution runtime environment that is configured to include a script and/or targeted to include the script. The agent including the one or more hooks monitors a behavior of the sub-execution runtime environment and/or the script. The agent subsequently obtains context information regarding the sub-execution runtime environment and/or the script so that it can control the runtime of at least the sub-execution runtime environment. Related systems, methods, and articles of manufacture are also disclosed.
G06F 11/34 - Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation
G06F 21/54 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by adding security routines or objects to programs
G06F 9/448 - Execution paradigms, e.g. implementations of programming paradigms
Each of a plurality of endpoint computer systems monitors data relating to a plurality of events occurring within an operating environment of the corresponding endpoint computer system. The monitoring can include receiving and/or inferring the data using one or more sensors executing on the endpoint computer systems. Thereafter, for each endpoint computer system, artifacts used in connection with the events are stored in a vault maintained on such endpoint computer system. A query is later received by at least a subset of the plurality of endpoint computer systems from a server. Such endpoint computer systems, in response, identify and retrieve artifacts within the corresponding vaults responsive to the query. Results responsive to the query including or characterizing the identified artifacts are then provided by the endpoint computer systems receiving the query to the server.
An endpoint computer system monitors data relating to a plurality of events occurring within an operating environment of the endpoint computer system. The monitoring can include receiving and/or inferring the data using one or more sensors executing on the endpoint computer system. The endpoint computer system can store artifacts used in connection with the plurality of events in a vault maintained on such endpoint computer system. The endpoint computer system, in response to a trigger, identifies and retrieves metadata characterizing artifacts associated with the trigger from the vault. Such identified and retrieved metadata is then provided by the endpoint computer system to a remote server.
Under one aspect, a method is provided for protecting a device from a malicious file. The method can be implemented by one or more data processors forming part of at least one computing device and can include extracting from the file, by at least one data processor, sequential data comprising discrete tokens. The method also can include generating, by at least one data processor, n-grams of the discrete tokens. The method also can include generating, by at least one data processor, a vector of weights based on respective frequencies of the n-grams. The method also can include determining, by at least one data processor and based on a statistical analysis of the vector of weights, that the file is likely to be malicious. The method also can include initiating, by at least one data processor and responsive to determining that the file is likely to be malicious, a corrective action.
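The n-gram step in the abstract above can be illustrated with a short sketch. It assumes that "weights based on respective frequencies" means relative frequencies, which is only one possible reading; the function name `ngram_weights` and the example tokens are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def ngram_weights(tokens: Sequence[str], n: int = 2) -> Dict[Tuple[str, ...], float]:
    """Map each n-gram of discrete tokens to its relative frequency."""
    # Slide a window of length n over the token sequence.
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    if total == 0:
        # Sequence shorter than n: there are no n-grams to weight.
        return {}
    # Assumption: the weight of an n-gram is its share of all n-grams.
    return {gram: count / total for gram, count in counts.items()}


if __name__ == "__main__":
    tokens = ["push", "mov", "push", "mov", "ret"]
    for gram, weight in ngram_weights(tokens).items():
        print(gram, weight)
```

The statistical analysis that decides whether the file is likely malicious is not shown; it would consume the weight vector produced here.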
Systems are provided herein for a hardware protection framework. A security module monitors a plurality of voltage lines of at least one electronic control unit (ECU) electrically coupled to a communications bus. A voltage differential across at least two of the plurality of voltage lines of the at least one ECU is measured. The voltage differential is compared to a plurality of predetermined signal fingerprints associated with the at least one ECU. A variance in the compared voltage differential is identified relative to one or more of the plurality of predetermined signal fingerprints. Data characterizing the identified variance is provided. In some aspects, a pulse or a data stream is injected based on the voltage differential having an amplitude lower than a predetermined voltage threshold.
Methods are provided herein for communications bus signal fingerprinting. A security module monitors a plurality of voltage lines of at least one electronic control unit (ECU) electrically coupled to a communications bus. A voltage differential across at least two of the plurality of voltage lines of the at least one ECU is measured. The voltage differential is compared to a plurality of predetermined signal fingerprints associated with the at least one ECU. A variance in the compared voltage differential is identified relative to one or more of the plurality of predetermined signal fingerprints. Data characterizing the identified variance is provided.
Methods are described herein for communications bus data transmission using relative ground shifting. A plurality of voltage lines of at least one electronic control unit (ECU) are monitored. The at least one ECU is electrically coupled to a communications bus. A voltage differential across at least two of the plurality of voltage lines of the at least one ECU is measured. A pulse or data stream is injected into the communications bus via one or two voltage lines based on the measured voltage differential having an amplitude lower than a predetermined voltage threshold.
G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
G06F 21/81 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer by operating on the power supply, e.g. enabling or disabling power-on, sleep or resume operations
G06F 21/75 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information by inhibiting the analysis of circuitry or operation, e.g. to counteract reverse engineering
Systems and methods are provided herein for redaction of artificial intelligence (AI) training documents. Data comprising an unredacted document is received. The unredacted document comprises a plurality of objects arranged according to a first topology. The unredacted document is parsed to identify objects either directly or relationally containing user sensitive information using a predetermined rule set based on the first topology. The user sensitive information within the unredacted document is substituted with placeholder information to generate a redacted document having a second topology. The second topology is substantially identical to the first topology. In some variations, the redacted document is provided to an AI model for training.
Presence of malicious code can be identified in one or more data samples. A feature set extracted from a sample is vectorized to generate a sparse vector. A reduced dimension vector representing the sparse vector can be generated. A binary representation vector of the reduced dimension vector can be created by converting each value of a plurality of values in the reduced dimension vector to a binary representation. The binary representation vector can be added as a new element in a dictionary structure if the binary representation is not equal to an existing element in the dictionary structure. A training set for use in training a machine learning model can be created to include one vector whose binary representation corresponds to each of a plurality of elements in the dictionary structure.
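A minimal sketch of the deduplication flow described above follows. The abstract does not say how the dimension is reduced or how values are converted to binary, so two stand-ins are assumed here: bucketing indices modulo `k` for the reduction, and a sign test (greater than zero) for the binary conversion. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def reduce_dimension(sparse: Dict[int, float], k: int) -> List[float]:
    """Assumed reduction: fold index -> value pairs into k buckets."""
    reduced = [0.0] * k
    for index, value in sparse.items():
        reduced[index % k] += value
    return reduced


def binarize(vector: List[float]) -> Tuple[int, ...]:
    """Assumed binary conversion: 1 where the value is positive, else 0."""
    return tuple(1 if value > 0 else 0 for value in vector)


def build_training_set(samples: List[Dict[int, float]], k: int = 4) -> List[List[float]]:
    """Keep one reduced vector per distinct binary representation."""
    seen: Dict[Tuple[int, ...], List[float]] = {}
    for sparse in samples:
        reduced = reduce_dimension(sparse, k)
        key = binarize(reduced)
        if key not in seen:
            # New dictionary element: this binary pattern was not seen before.
            seen[key] = reduced
    return list(seen.values())


if __name__ == "__main__":
    samples = [{0: 1.0, 5: 2.0}, {4: 3.0, 1: 0.5}, {2: 1.0}]
    print(build_training_set(samples))
```

In this example the first two samples collapse to the same binary pattern, so only the first of them reaches the training set.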
Data is analyzed using feature hashing to detect malware. A plurality of features in a feature set is hashed. The feature set is generated from a sample. The sample includes at least a portion of a file. Based on the hashing, one or more hashed features are indexed to generate an index vector. Each hashed feature corresponds to an index in the index vector. Using the index vector, a training dataset is generated. Using the training dataset, a machine learning model for identifying at least one file having a malicious code is trained.
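Feature hashing as described above can be sketched in a few lines. The hash function (SHA-256 truncated to 8 bytes), the vector size, and the example feature strings are assumptions chosen for illustration; the abstract does not name any of them.

```python
import hashlib
from typing import Iterable, List


def hash_feature(feature: str, size: int) -> int:
    """Map a feature string to an index in [0, size) with a stable hash."""
    digest = hashlib.sha256(feature.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then fold into the vector size.
    return int.from_bytes(digest[:8], "big") % size


def index_vector(features: Iterable[str], size: int = 16) -> List[int]:
    """Count how many features hash to each index of a fixed-size vector."""
    vector = [0] * size
    for feature in features:
        vector[hash_feature(feature, size)] += 1
    return vector


if __name__ == "__main__":
    # Hypothetical features extracted from a portion of a file.
    features = ["import:kernel32.dll", "section:.text", "import:kernel32.dll"]
    print(index_vector(features))
```

A stable hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would make index vectors from different runs incomparable.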
In one respect, there is provided a system for classifying malware. The system may include a data processor and a memory. The memory may include program code that provides operations when executed by the processor. The operations may include: providing, to a display, contextual information associated with a file to at least enable a classification of the file, when a malware classifier is unable to classify the file; receiving, in response to the providing of the contextual information, the classification of the file; and updating, based at least on the received classification of the file, the malware classifier to enable the malware classifier to classify the file. Methods and articles of manufacture, including computer program products, are also provided.
A plurality of events associated with each of a plurality of computing nodes that form part of a network topology are monitored. The network topology includes antivirus tools to detect malicious software prior to it accessing one of the computing nodes. Thereafter, it is determined, using at least one machine learning model, that at least one of the events is indicative of malicious activity that has circumvented or bypassed the antivirus tools. Data is then provided that characterizes the determination. Related apparatus, systems, techniques and articles are also described.
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
G06N 3/00 - Computing arrangements based on biological models
65.
Isolating data for analysis to avoid malicious attacks
Determining, by a machine learning model in an isolated operating environment, whether a file is safe for processing by a primary operating environment. The file is provided, when the determining indicates the file is safe for processing, to the primary operating environment for processing by the primary operating environment. When the determining indicates the file is unsafe for processing, the file is prevented from being processed by the primary operating environment. The isolated operating environment can be maintained on an isolated computing system remote from a primary computing system maintaining the primary operating system. The isolating computing system and the primary operating system can communicate over a cloud network.
G06F 21/51 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
Data is received or accessed that includes a structured file encapsulating data required by an execution environment to manage executable code wrapped within the structured file. Thereafter, code and data regions are iteratively identified in the structured file. Such identification is analyzed so that at least one feature can be extracted from the structured file. Related apparatus, systems, techniques and articles are also described.
A machine learning model is applied to at least determine whether a computer program includes vulnerable code. The machine learning model is trained to determine whether the computer program includes vulnerable code based at least on a presence and/or absence of a first trait. An indication that the computer program includes vulnerable code can be provided, via a user interface, when the computer program is determined to include vulnerable code. Related methods and articles of manufacture, including computer program products, are also provided.
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
68.
Clustering analysis for deduplication of training set samples for machine learning based computer threat analysis
A method, a system, and a computer program product for performing analysis of data to detect presence of malicious code are disclosed. Reduced dimensionality vectors are generated from a plurality of original dimensionality vectors representing features in a plurality of samples. The reduced dimensionality vectors have a lower dimensionality than an original dimensionality of the plurality of original dimensionality vectors. A first plurality of clusters is determined by applying a first clustering algorithm to the reduced dimensionality vectors. A second plurality of clusters is determined by applying a second clustering algorithm to one or more clusters in the first plurality of clusters using the original dimensionality. An exemplar for a cluster in the second plurality of clusters is added to a training set, which is used to train a machine learning model for identifying a file containing malicious code.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
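The two-stage clustering in the abstract of this entry (coarse clusters on reduced vectors, then refinement at the original dimensionality, then one exemplar per refined cluster) can be sketched as follows. The abstract does not name the clustering algorithms or the reduction, so this sketch assumes a simple greedy "leader" clustering for both stages and coordinate truncation as the reduction; the radii and function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def leader_cluster(indices: List[int], vectors: List[Vector], radius: float) -> List[List[int]]:
    """Greedy clustering: join the first cluster whose leader is within radius."""
    clusters: List[List[int]] = []
    for i in indices:
        for cluster in clusters:
            # The first member of each cluster acts as its leader.
            if math.dist(vectors[i], vectors[cluster[0]]) <= radius:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def select_exemplars(originals: List[Vector], reduced_dims: int = 2,
                     coarse_radius: float = 1.0, fine_radius: float = 0.5) -> List[int]:
    """Return indices of one exemplar per second-stage cluster."""
    # Assumed reduction: keep only the first `reduced_dims` coordinates.
    reduced = [v[:reduced_dims] for v in originals]
    exemplars: List[int] = []
    # Stage 1: cluster cheaply in the reduced space.
    for coarse in leader_cluster(list(range(len(originals))), reduced, coarse_radius):
        # Stage 2: re-cluster each coarse cluster at the original dimensionality.
        for fine in leader_cluster(coarse, originals, fine_radius):
            exemplars.append(fine[0])
    return exemplars


if __name__ == "__main__":
    samples = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.1, 0.0, 5.0), (10.0, 10.0, 0.0)]
    print(select_exemplars(samples))
```

The third sample looks identical to the second in the reduced space but differs in the dropped coordinate, so only the second stage separates it; this is the reason for refining at the original dimensionality.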
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code that provides operations when executed by the at least one processor. The operations may include: extracting, from an icon associated with a file, one or more features; assigning, based at least on the one or more features, the icon to one of a plurality of clusters; and generating, based at least on the cluster to which the icon is assigned, a classification for the file associated with the icon. Related methods and articles of manufacture, including computer program products, are also provided.
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
G06K 9/62 - Methods or arrangements for recognition using electronic means
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code that provides operations when executed by the at least one processor. The operations may include: reducing a dimensionality of a plurality of features representative of a file set; determining, based at least on a reduced dimensional representation of the file set, a distance between a file and the file set; and determining, based at least on the distance between the file and the file set, a classification for the file. Related methods and articles of manufacture, including computer program products, are also provided.
Identifying shellcode in a sequence of instructions by identifying a first instruction, the first instruction identifying a first bound of a sequence of instructions, identifying a second instruction, the second instruction identifying a second bound of the sequence of instructions, and generating a distribution for the sequence of instructions, bounded by the first instruction and the second instruction, the distribution indicative of whether the sequence of instructions is likely to include shellcode.
Using a recurrent neural network (RNN) that has been trained to a satisfactory level of performance, highly discriminative features can be extracted by running a sample through the RNN, and then extracting a final hidden state h_i, where i is the number of instructions of the sample. This resulting feature vector may then be concatenated with the other hand-engineered features, and a larger classifier may then be trained on hand-engineered as well as automatically determined features. Related apparatus, systems, techniques and articles are also described.
Systems, methods, and articles of manufacture, including computer program products, are provided for classification systems and methods using modeling. In some example embodiments, there is provided a system that includes at least one processor and at least one memory including program code which when executed by the at least one processor provides operations. The operations can include generating a representation of a sequence of sections of a file and/or determining, from a model including conditional probabilities, a probability for each transition between at least two sequential sections in the representation. The operations can further include classifying the file based on the probabilities for each transition.
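One way to picture a model of conditional transition probabilities over file sections is a first-order Markov chain estimated from example section sequences. The sketch below makes that assumption; the section names, the probability floor for unseen transitions, the threshold, and the "suspicious"/"typical" labels are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence

Model = Dict[str, Dict[str, float]]


def fit_transitions(sequences: Sequence[Sequence[str]]) -> Model:
    """Estimate P(next section | previous section) from example sequences."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for prev, nxts in counts.items()
    }


def transition_probabilities(model: Model, seq: Sequence[str], floor: float = 1e-6) -> List[float]:
    """Look up the probability of each transition; unseen ones get `floor`."""
    return [model.get(prev, {}).get(nxt, floor) for prev, nxt in zip(seq, seq[1:])]


def classify(model: Model, seq: Sequence[str], threshold: float = 0.1) -> str:
    """Assumed rule: flag the file if any transition is improbable."""
    probs = transition_probabilities(model, seq)
    return "suspicious" if any(p < threshold for p in probs) else "typical"


if __name__ == "__main__":
    training = [
        [".text", ".data", ".rsrc"],
        [".text", ".data", ".reloc"],
        [".text", ".rdata", ".data"],
    ]
    model = fit_transitions(training)
    print(classify(model, [".text", ".data", ".rsrc"]))
    print(classify(model, [".rsrc", ".text"]))
```

Other decision rules, such as thresholding the product or the mean log of the transition probabilities, would fit the same abstract equally well.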
Centroids are used for improving machine learning classification and information retrieval. A plurality of files are classified as malicious or not malicious based on a function dividing a coordinate space into at least a first portion and a second portion such that the first portion includes a first subset of the plurality of files classified as malicious. One or more first geometric regions are defined in the first portion that classify files from the first subset as not malicious. A file is determined to be malicious based on whether the file is located within the one or more first geometric regions.
In one respect, there is provided a system for classifying an instruction sequence with a machine learning model. The system may include at least one processor and at least one memory. The memory may include program code that provides operations when executed by the at least one processor. The operations may include: processing an instruction sequence with a trained machine learning model configured to detect one or more interdependencies amongst a plurality of tokens in the instruction sequence and determine a classification for the instruction sequence based on the one or more interdependencies amongst the plurality of tokens; and providing, as an output, the classification of the instruction sequence. Related methods and articles of manufacture, including computer program products, are also provided.
Executable memory space is protected by receiving, from a process, a request to configure a portion of memory with a memory protection attribute that allows the process to perform at least one memory operation on the portion of the memory. Thereafter, the request is responded to with a grant, configuring the portion of memory with a different memory protection attribute than the requested memory protection attribute. The different memory protection attribute restricts the at least one memory operation from being performed by the process on the portion of the memory. In addition, it is detected when the process attempts, in accordance with the grant, the at least one memory operation at the configured portion of memory. Related systems and articles of manufacture, including computer program products, are also disclosed.
G06F 12/14 - Protection against unauthorised use of memory
G06F 21/79 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
77.
Training a machine learning model for analysis of instruction sequences
In one respect, there is provided a system for training a neural network adapted for classifying one or more instruction sequences. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: training, based at least on training data, a machine learning model to detect one or more predetermined interdependencies amongst a plurality of tokens in the training data; and providing the trained machine learning model to enable classification of one or more instruction sequences. Related methods and articles of manufacture, including computer program products, are also provided.
Systems and methods are described herein for computer user authentication using machine learning. Authentication for a user is initiated based on an identification confidence score of the user. The identification confidence score is based on one or more characteristics of the user. Using a machine learning model for the user, user activity of the user is monitored for anomalous activity to generate first data. Based on the monitoring, differences between the first data and historical utilization data for the user determine whether the user's utilization of the one or more resources is anomalous. When the user's utilization of the one or more resources is anomalous, the user's access to the one or more resources is removed.
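The comparison of current activity against historical utilization data can be illustrated with a minimal sketch. The abstract does not say how the difference is measured; the z-score test, the `threshold` value, and the names `is_anomalous` and `update_access` are assumptions made here for illustration only.

```python
from statistics import mean, stdev
from typing import List, Set


def is_anomalous(history: List[float], observed: float, threshold: float = 3.0) -> bool:
    """Flag an observation that deviates strongly from the user's history.

    Assumption: deviation is measured as a z-score against the historical
    mean and sample standard deviation. The source does not name a metric.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly constant history: any change counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


def update_access(granted: Set[str], resource: str, anomalous: bool) -> Set[str]:
    """Remove access to a resource when its utilization is anomalous."""
    return granted - {resource} if anomalous else set(granted)


# Hypothetical data: daily file-server requests for one user.
history = [10, 12, 11, 9, 10, 11, 12, 10]
granted = update_access({"file-server", "mail"}, "file-server", is_anomalous(history, 40))
```

A production system would use the per-user machine learning model named in the abstract rather than a fixed statistical threshold.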
In one respect, there is provided a system for training a machine learning model to detect malicious container files. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: processing a container file with a trained machine learning model, wherein the trained machine learning model is trained to determine a classification for the container file indicative of whether the container file includes at least one file rendering the container file malicious; and providing, as an output by the trained machine learning model, an indication of whether the container file includes the at least one file rendering the container file malicious. Related methods and articles of manufacture, including computer program products, are also disclosed.
Under one aspect, a computer-implemented method includes receiving a query at a query interface about whether a computer file comprises malicious code. It is determined, using at least one machine learning sub model corresponding to a type of the computer file, whether the computer file comprises malicious code. Data characterizing the determination are provided to the query interface. Generating the sub model includes receiving computer files at a collection interface. Multiple sub populations of the computer files are generated based on respective types of the computer files, and random training and testing sets are generated from each of the sub populations. At least one sub model for each random training set is generated.
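The sub-population and random-split steps described above can be sketched as follows. This is only an illustration: the record layout (a dict with a `type` key), the `test_fraction` default, and the function name `split_by_type` are assumptions, and the actual training of a sub model on each training set is left out.

```python
import random
from collections import defaultdict
from typing import Dict, List


def split_by_type(files: List[dict], test_fraction: float = 0.25, seed: int = 0) -> Dict[str, dict]:
    """Group files by type, then draw a random train/test split per group."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    by_type = defaultdict(list)
    for f in files:
        by_type[f["type"]].append(f)

    splits = {}
    for ftype, group in by_type.items():
        shuffled = group[:]
        rng.shuffle(shuffled)
        # Keep at least one test sample per sub-population.
        n_test = max(1, int(len(shuffled) * test_fraction))
        splits[ftype] = {"train": shuffled[n_test:], "test": shuffled[:n_test]}
    return splits


# Hypothetical collection: eight PE files and four PDF files.
files = [{"name": f"a{i}.exe", "type": "pe"} for i in range(8)]
files += [{"name": f"b{i}.pdf", "type": "pdf"} for i in range(4)]
splits = split_by_type(files)
```

One sub model would then be fitted per entry in `splits`, and a query about a file would be routed to the sub model matching that file's type.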
In one respect, there is provided a system for training a machine learning model to detect malicious container files. The system may include at least one processor and at least one memory. The at least one memory may include program code that provides operations when executed by the at least one processor. The operations may include: training, based on training data, a machine learning model to enable the machine learning model to determine whether at least one container file includes at least one file rendering the at least one container file malicious; and providing the trained machine learning model to enable the determination of whether the at least one container file includes at least one file rendering the at least one container file malicious. Related methods and articles of manufacture, including computer program products, are also disclosed.
A nested file having a primary file and at least one secondary file embedded therein is parsed using at least one parser of a cell. The cell assigns a maliciousness score to each of the parsed primary file and each of the parsed at least one secondary file. Thereafter, the cell generates an overall maliciousness score for the nested file that indicates a level of confidence that the nested file contains malicious content. The overall maliciousness score is provided to a data consumer indicating whether to proceed with consuming the data contained within the nested file.
G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
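The nested-file scoring in the abstract above can be sketched in a few lines. The abstract does not specify how the per-file scores are combined into the overall score; taking the maximum is an assumption made here, and `ParsedFile` and `overall_score` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParsedFile:
    """A parsed file with a per-file maliciousness score in [0, 1]."""
    name: str
    score: float
    children: List["ParsedFile"] = field(default_factory=list)


def overall_score(primary: ParsedFile) -> float:
    """Combine the primary file's score with the scores of all embedded files.

    Assumption: the overall score is the maximum over all parts, so a single
    highly suspicious embedded file dominates. Other rules are possible.
    """
    return max([primary.score] + [overall_score(c) for c in primary.children])


# Hypothetical nested document with two embedded secondary files.
doc = ParsedFile("report.docx", 0.05, [
    ParsedFile("image.png", 0.01),
    ParsedFile("macro.bin", 0.92),
])
result = overall_score(doc)
```

A data consumer could compare `result` against its own threshold before deciding whether to consume the nested file.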
An agent inserts one or more hooks into a sub-execution runtime environment that is configured to include a script and/or targeted to include the script. The agent including the one or more hooks monitors a behavior of the sub-execution runtime environment and/or the script. The agent subsequently obtains context information regarding the sub-execution runtime environment and/or the script so that it can control the runtime of at least the sub-execution runtime environment. Related systems, methods, and articles of manufacture are also disclosed.
G06F 11/34 - Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation
G06F 21/54 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by adding security routines or objects to programs
G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
G06F 9/448 - Execution paradigms, e.g. implementations of programming paradigms
84.
Deployment of machine learning models for discernment of threats
A mismatch between model-based classifications produced by a first version of a machine learning threat discernment model and a second version of a machine learning threat discernment model for a file is detected. The mismatch is analyzed to determine appropriate handling for the file, and an action is taken based on the analyzing. The analyzing includes comparing a human-generated classification status for a file, a first model version status that reflects classification by the first version of the machine learning threat discernment model, and a second model version status that reflects classification by the second version of the machine learning threat discernment model. The analyzing can also include allowing the human-generated classification status to dominate when it is available.
G06F 21/51 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
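The three-way comparison in the abstract above (human status, first model version status, second model version status) can be sketched as a small decision function. The status strings and action names are hypothetical; only the rule that a human-generated classification dominates when available comes from the source.

```python
from typing import Optional


def resolve(human: Optional[str], v1: str, v2: str) -> dict:
    """Decide how to handle a file given three classification statuses."""
    mismatch = v1 != v2
    if human is not None:
        # Per the abstract, the human-generated status dominates when available.
        final, action = human, "use_human_label"
    elif mismatch:
        # Assumption: an unresolved mismatch is escalated for review.
        final, action = None, "queue_for_review"
    else:
        final, action = v2, "accept_model_label"
    return {"mismatch": mismatch, "final": final, "action": action}


decision = resolve(human="malicious", v1="benign", v2="malicious")
```

Here `decision["final"]` is the human label even though the two model versions disagree.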
85.
Man in the middle attack detection using active learning
Data is received that includes a plurality of samples that each characterize interception of data traffic to a computing device over a network. Thereafter, the plurality of samples characterizing the interception of data traffic are grouped into a plurality of clusters. At least a portion of the samples are labeled to characterize a likelihood of each such sample as relating to an unauthorized interception of data traffic. Each cluster is assigned with a label corresponding to a majority of samples within such cluster. At least one machine learning model is trained using the assigned labeled clusters such that, once trained, the at least one machine learning model determines a likelihood of future samples as relating to an unauthorized interception of data traffic to a corresponding computing device.
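The majority-label step described above can be sketched as follows, assuming the clustering has already been done and only some samples carry labels. The function name `label_clusters` and the label strings are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def label_clusters(cluster_ids: List[int], labels: List[Optional[str]]) -> Dict[int, str]:
    """Assign each cluster the label held by the majority of its labeled samples."""
    votes: Dict[int, Counter] = {}
    for cid, lab in zip(cluster_ids, labels):
        if lab is not None:  # unlabeled samples do not vote
            votes.setdefault(cid, Counter())[lab] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}


# Hypothetical data: six samples in two clusters, one sample unlabeled.
cluster_ids = [0, 0, 0, 1, 1, 1]
labels = ["mitm", None, "mitm", "benign", "benign", "mitm"]
cluster_labels = label_clusters(cluster_ids, labels)

# Propagate the cluster label to every sample to build a training set.
training_labels = [cluster_labels[c] for c in cluster_ids]
```

The propagated labels, including the one for the previously unlabeled sample, would then be used to train the model. Clusters with no labeled samples are absent from `cluster_labels` in this sketch and would need separate handling.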
A machine learning model in an isolated operating environment determines whether a file is safe for processing by a primary operating environment. The file is provided, when the determining indicates the file is safe for processing, to the primary operating environment for processing by the primary operating environment. When the determining indicates the file is unsafe for processing, the file is prevented from being processed by the primary operating environment. The isolated operating environment can be maintained on an isolated computing system remote from a primary computing system maintaining the primary operating system. The isolated computing system and the primary operating system can communicate over a cloud network.
G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
G06F 21/51 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
H04L 29/06 - Communication control; Communication processing characterised by a protocol
G06N 99/00 - Subject matter not provided for in other groups of this subclass
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
87.
Retention and accessibility of data characterizing events on an endpoint computer
An endpoint computer system can harvest data relating to a plurality of events occurring within an operating environment of the endpoint computer system and can add the harvested data to a local data store maintained on the endpoint computer system. In some examples, the local data store can be an audit log and/or can include one or more tamper resistant features. Systems, methods, and computer program products are described.
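One way to make a local audit log tamper resistant is to chain each entry to the previous one with a hash. The abstract does not say which tamper-resistant features are used, so the hash chain below, the `AuditLog` class, and the event fields are assumptions for illustration.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder hash preceding the first entry


class AuditLog:
    """Append-only event log in which each entry commits to its predecessor."""

    def __init__(self) -> None:
        self.entries: List[dict] = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        payload = prev_hash + json.dumps(event, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, event: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append({"event": event, "hash": self._digest(prev_hash, event)})

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append({"type": "process_start", "image": "example.exe"})
log.append({"type": "file_write", "path": "C:/tmp/out.bin"})
intact = log.verify()
```

Changing any stored event after the fact makes `verify()` return `False`, because every later hash depends on the earlier entries. A chain of this kind only detects modification; it does not prevent an attacker with write access from rebuilding the whole chain.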
In one aspect there is provided a method. The method may include: determining that an executable implements a sub-execution environment, the sub-execution environment being configured to receive an input, and the input triggering at least one event at the sub-execution environment; intercepting the event at the sub-execution environment; and applying a security policy to the intercepted event, the applying of the policy comprising blocking the event when the event is determined to be a prohibited event. Systems and articles of manufacture, including computer program products, are also provided.
G06F 21/54 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by adding security routines or objects to programs
G06F 21/50 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
89.
Retention and accessibility of data characterizing events on an endpoint computer
An endpoint computer system can harvest data relating to a plurality of events occurring within an operating environment of the endpoint computer system and can add the harvested data to a local data store maintained on the endpoint computer system. A query response can be generated, for example by identifying and retrieving responsive data from the local data store. The responsive data are related to an artifact on the endpoint computer system and/or to an event of the plurality of events. In some examples, the local data store can be an audit log and/or can include one or more tamper resistant features. Systems, methods, and computer program products are described.
Transaction terminal malicious software is detected by monitoring calls of a first process to identify attempts by the first process to read memory used by a second process. The first and second processes are different from each other and are executed by at least one data processor forming part of a transaction terminal system having at least one transaction terminal. Thereafter, it is determined that the memory used by the second process comprises patterns indicative of sensitive financial or identification information. In response, at least one corrective action is initiated to prevent use of the financial or identification information. Related apparatus, systems, techniques and articles are also described.
G06Q 20/40 - Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check of credit lines or negative lists
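The pattern check on memory contents in the abstract above can be illustrated with a scan for payment card numbers. This sketch covers only the pattern-matching step, not the monitoring of cross-process memory reads. The choice of a 13 to 19 digit candidate pattern combined with a Luhn checksum is an assumption; the source says only that patterns indicative of sensitive financial or identification information are detected.

```python
import re
from typing import List

# Candidate card numbers: runs of 13 to 19 ASCII digits.
CARD_CANDIDATE = re.compile(rb"\d{13,19}")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(buffer: bytes) -> List[str]:
    """Return digit runs in a memory buffer that look like card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(buffer):
        candidate = match.group().decode("ascii")
        if luhn_valid(candidate):
            hits.append(candidate)
    return hits


# Hypothetical buffer holding a well-known test card number and a decoy.
memory = b"track=4111111111111111;order=1234567890123456"
found = find_card_numbers(memory)
```

A non-empty result would be the trigger for the corrective action described in the abstract.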
A first node of a networked computing environment initiates each of a plurality of different types of man-in-the-middle (MITM) detection tests to determine whether communications between first and second nodes of a computing network are likely to have been subject to an interception or an attempted interception by a third node. Thereafter, it is determined, by the first node, that at least one of the tests indicates that the communications are likely to have been intercepted by a third node. Data that characterizes the determination is then provided by the first node. In some cases, one or more of the MITM detection tests utilizes a machine learning model. Related apparatus, systems, techniques and articles are also described.
G06F 12/14 - Protection against unauthorised use of memory
G06F 12/16 - Protection against loss of memory contents
G08B 23/00 - Alarms responsive to unspecified undesired or abnormal conditions
H04L 29/06 - Communication control; Communication processing characterised by a protocol
H04L 9/06 - Arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for blockwise coding, e.g. D.E.S. systems
G06N 99/00 - Subject matter not provided for in other groups of this subclass
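The decision rule in the abstract above, running several different tests and reporting interception when at least one test indicates it, can be sketched as follows. The two example tests and all names are hypothetical; the source does not list the specific tests.

```python
from typing import Callable, Dict


def run_mitm_tests(tests: Dict[str, Callable[[], bool]]) -> dict:
    """Run every detection test and summarize which ones flagged interception."""
    results = {name: bool(test()) for name, test in tests.items()}
    flagged = [name for name, hit in results.items() if hit]
    return {"intercepted": bool(flagged), "flagged_tests": flagged, "results": results}


# Hypothetical observations gathered by the first node.
pinned_fingerprint = "ab:cd:ef"
observed_fingerprint = "12:34:56"
baseline_rtt_ms, observed_rtt_ms = 20.0, 22.0

report = run_mitm_tests({
    # A changed certificate fingerprint suggests a third node in the path.
    "certificate_mismatch": lambda: observed_fingerprint != pinned_fingerprint,
    # Assumed rule: a round-trip time more than double the baseline is suspicious.
    "latency_spike": lambda: observed_rtt_ms > 2 * baseline_rtt_ms,
})
```

The `report` dictionary plays the role of the data characterizing the determination that the first node provides.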
As part of an analysis of the likelihood that a given input (e.g. a file, etc.) includes malicious code, a convolutional neural network can be used to review a sequence of chunks into which an input is divided to assess how best to navigate through the input and to classify parts of the input in an optimal manner. At least some of the sequence of chunks can be further examined using a recurrent neural network in series with the convolutional neural network to determine how to progress through the sequence of chunks. A state of the at least some of the chunks examined using the recurrent neural network is summarized to form an output indicative of the likelihood that the input includes malicious code. Methods, systems, and articles of manufacture are also described.
A first node of a networked computing environment initiates each of a plurality of different man-in-the-middle (MITM) detection tests to determine whether communications between first and second nodes of a computing network are likely to have been subject to an interception or an attempted interception by a third node. Thereafter, it is determined, by the first node, that at least one of the tests indicates that the communications are likely to have been intercepted by a third node. Data that characterizes the determination is then provided by the first node. Related apparatus, systems, techniques and articles are also described.
In one respect, there is provided a system for loading managed applications. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: generating a single process, the generating comprising running a native code executable, the running of the native code executable loading a loader manager as part of the single process; loading, by the loader manager running within the single process, a runtime environment corresponding to a non-native code application; and loading, by the loader manager, the non-native code application, the non-native code application being loaded to run as part of the single process.
A first node of a networked computing environment initiates each of a plurality of different man-in-the-middle (MITM) detection tests to determine whether communications between first and second nodes of a computing network are likely to have been subject to an interception or an attempted interception by a third node. Thereafter, it is determined, by the first node, that at least one of the tests indicates that the communications are likely to have been intercepted by a third node. Data that characterizes the determination is then provided by the first node. Related apparatus, systems, techniques and articles are also described.
G06F 7/04 - Identity comparison, i.e. for like or unlike values
G06F 15/16 - Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
G06F 17/30 - Information retrieval; Database structures therefor
H04L 29/06 - Communication control; Communication processing characterised by a protocol
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: receiving a disassembled binary file that includes a plurality of instructions; processing the disassembled binary file with a convolutional neural network configured to detect a presence of one or more sequences of instructions amongst the plurality of instructions and determine a classification for the disassembled binary file based at least in part on the presence of the one or more sequences of instructions; and providing, as an output, the classification of the disassembled binary file. Related computer-implemented methods are also disclosed.
A plurality of data files is received. Thereafter, each file is represented as an entropy time series that reflects an amount of entropy across locations in code for such file. A wavelet transform is applied, for each file, to the corresponding entropy time series to generate an energy spectrum characterizing, for the file, an amount of entropic energy at multiple scales of code resolution. It can then be determined, for each file, whether or not the file is likely to be malicious based on the energy spectrum. Related apparatus, systems, techniques and articles are also described.
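The two signal-processing steps above, an entropy time series followed by a wavelet energy spectrum, can be sketched in plain Python. The window size, the use of the Haar wavelet, and the function names are assumptions; the abstract does not name a wavelet family, and the final classification step is omitted.

```python
import math
from collections import Counter
from typing import List


def entropy_series(data: bytes, window: int = 256) -> List[float]:
    """Shannon entropy (bits per byte) of consecutive fixed-size windows."""
    series = []
    for start in range(0, len(data) - window + 1, window):
        counts = Counter(data[start:start + window])
        series.append(-sum((c / window) * math.log2(c / window) for c in counts.values()))
    return series


def haar_energy_spectrum(series: List[float]) -> List[float]:
    """Energy of the Haar detail coefficients at each scale, finest scale first.

    Assumption: the series length is a power of two, so it can be halved
    repeatedly until a single approximation coefficient remains.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = list(series)
    energies = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


# Hypothetical file: two constant windows followed by two windows in which
# every byte value occurs exactly once (maximal entropy for the window size).
data = bytes(256) * 2 + bytes(range(256)) * 2
series = entropy_series(data)
spectrum = haar_energy_spectrum(series)
```

For this input the entropy jumps from 0 to 8 bits halfway through the file, so the energy is concentrated at the coarsest scale. A classifier would take a spectrum like this as its feature vector.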
i, where i is the number of instructions of the sample. This resulting feature vector may then be concatenated with the other hand-engineered features, and a larger classifier may then be trained on hand-engineered as well as automatically determined features. Related apparatus, systems, techniques and articles are also described.
The present disclosure involves systems and computer-implemented methods for installing software hooks. One process includes identifying a target method and a hook code, where the hook code is to execute instead of at least a portion of the target method, and wherein the target method and the hook code are executed within a managed code environment. A compiled version of the target method and a compiled version of the hook code are located in memory, where the compiled versions of the target method and the hook code are compiled in native code. Then, the compiled version of the target method is modified to direct execution of at least a portion of the compiled version of the target method to the compiled version of the hook code. The non-compiled version of the target method may be originally stored as bytecode. The managed code environment may comprise a managed .NET environment.