The present disclosure is concerned with compositions and methods for reducing the steps used in the generation of monoclonal clusters by combining the enzymes used for linearization and removal of unused surface primers.
The invention relates to methods and kits for use in nucleic acid sequencing, particularly methods for use in concurrent sequencing, and more specifically concurrent sequencing of tandem insert libraries.
A method of base calling nucleobases of two or more polynucleotide sequence portions, wherein said polynucleotide sequence portions have been selectively processed such that an intensity of the signals obtained based upon the respective first nucleobase is greater than an intensity of the signals obtained based upon the respective second nucleobase.
A method of determining sequence information from two or more polynucleotide sequence portions, the method comprising: selecting one of a plurality of classifications based on first and second intensity data, wherein each classification represents one or more possible combinations of respective nucleobases of the two or more polynucleotide sequence portions, and wherein at least one classification represents more than one possible combination of respective nucleobases.
Systems and methods of identifying nucleobases in a template polynucleotide are disclosed. In one embodiment, such a method may include providing a substrate comprising a plurality of double stranded template polynucleotides in a cluster. Each double stranded template polynucleotide may comprise a first strand and a second strand. The method may further include contacting the plurality of double stranded template polynucleotides with first primers which bind to the first strand and second primers which bind to the second strand. The method may further include extending the first primers and the second primers by contacting the cluster with labeled nucleobases to form first labeled primers and second labeled primers. The method may further include stimulating light emissions from the first and second labeled primers, wherein an amplitude of the signal generated by the first labeled primers is greater than an amplitude of the signal generated by the second labeled primers. The method may further include identifying the labeled nucleobases added to the first primers and the second primers based on the amplitude of the signal generated by the labeled nucleobases.
Systems and methods of identifying nucleobases in a template polynucleotide are disclosed. In one embodiment, such a method may include providing a substrate comprising a plurality of the template polynucleotides in a cluster. The method may further include generating light to stimulate fluorescent emissions from the cluster. The method may further include receiving a first signal emitted at a first intensity from a first plurality of nucleotide analogs hybridized to the plurality of template polynucleotides at a first site. The method may further include receiving a second signal emitted at a second intensity from a second plurality of nucleotide analogs hybridized to the plurality of template polynucleotides at a second site. The method may further include identifying the nucleobases hybridized at the first and second sites of the template polynucleotide based on a combination of the first and second signals.
An example of a flow cell includes a substrate having depressions separated by interstitial regions. First and second primers are immobilized within the depressions. First transposome complexes are immobilized within the depressions, and the first transposome complexes include a first amplification domain. Second transposome complexes are also immobilized within the depressions, and the second transposome complexes include a second amplification domain. Some of the first transposome complexes, or some of the second transposome complexes, or some of both of the first and second transposome complexes include a modification to reduce tagmentation efficiency.
A variety of different types of targeted transposome complexes are described herein that may be used to mediate sequence-specific targeted transposition of nucleic acids. Also described herein is a method of characterizing desired samples in a mixed pool of samples comprising both desired samples and unwanted samples, the method comprising: initially sequencing a library comprising a plurality of nucleic acid samples from the mixed pool to produce sequencing data from double-stranded nucleic acid, wherein each nucleic acid library comprises nucleic acids from a single sample and a unique sample barcode to distinguish the nucleic acids from the single sample from the nucleic acids from other samples in the library; analyzing the sequencing data and identifying unique sample barcodes associated with sequencing data from desired samples; performing a selection step on the library comprising enriching nucleic acid samples from desired samples and/or depleting nucleic acid samples from unwanted samples; and resequencing the nucleic acid library.
A functionalized nanostructure includes a metal nanostructure; an un-cleavable first primer and a cleavable second primer attached to a first region of the metal nanostructure through i) a first thiol linkage attached to a first polymer chain having a first polarity or ii) respective first thiol linkages attached to respective first polymer chains having the first polarity; and a cleavable first primer and an un-cleavable second primer attached to a second region of the metal nanostructure through i) a second thiol linkage attached to a second polymer chain having a second polarity different from the first polarity or ii) respective second thiol linkages attached to respective second polymer chains having the second polarity.
An apparatus includes a chassis, a frame, a sample support member, an imaging assembly, an actuation assembly, and a vibration capture assembly. The frame is coupled with the chassis. The sample support member is supported by the frame. The actuation assembly is supported by the frame and is operable to drive movement of the imaging assembly relative to the sample support member. The vibration capture assembly is operable to selectively transition between a plurality of modes, including a damping mode and an isolation mode. In the damping mode, the vibration capture assembly is configured to resist movement of the frame relative to the chassis in response to operation of the actuation assembly. In the isolation mode, the vibration capture assembly is configured to prevent transmission of vibrational movement in the chassis to the frame.
F16F 15/03 - Suppression of vibrations of non-rotating, e.g. reciprocating, systems; Suppression of vibrations of rotating systems by use of members not moving with the rotating system using electromagnetic means
An example of a method includes providing a substrate with an exposed surface comprising a first chemical group, wherein the providing optionally comprises modifying the exposed surface of the substrate to incorporate the first chemical group; reacting the first chemical group with a first reactive group of a functionalized polymer molecule to form a functionalized polymer coating layer covalently bound to the exposed surface of the substrate; grafting a primer to the functionalized polymer coating layer by reacting the primer with a second reactive group of the functionalized polymer coating layer; and forming a water-soluble protective coating on the primer and the functionalized polymer coating layer. Examples of flow cells incorporating examples of the water-soluble protective coating are also disclosed herein.
This disclosure describes methods, non-transitory computer readable media, and systems that can use a machine-learning model to determine factors or scores indicating an error level with which a given methylation assay detects methylation of cytosine bases. For instance, the disclosed systems use a machine-learning model to generate a bias score indicating a degree to which a given methylation assay errs in detecting cytosine methylation when specific sequence contexts surround such cytosines compared to other sequence contexts. The machine-learning model may take various forms, including a decision-tree model, a neural network, or a combination of a decision-tree model and a neural network. In some cases, the disclosed system combines or uses bias scores from multiple machine-learning models to generate a consensus bias score.
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
20. CYCLOOCTATETRAENE CONTAINING DYES AND COMPOSITIONS
Embodiments of the present disclosure relate to cyclooctatetraene containing dyes and their uses as fluorescent labels. Also provided are compositions containing cyclooctatetraene. The dyes and compositions may be used in various biological applications, such as nucleic acid sequencing.
A polynucleotide sequencing method comprises (i) removing a label and a blocking moiety from a blocked, labeled nucleotide incorporated into a copy polynucleotide strand that is complementary to at least a portion of a template polynucleotide strand; and (ii) washing the removed label and blocking moiety away from the copy strand with a wash solution comprising a first buffer comprising a scavenger compound. Removing the label and blocking moieties may comprise chemically removing the moieties. The first buffer may also comprise an antioxidant and may be used in a scanning buffer used during a nucleotide detection step.
Described herein are technologies for classifying a protein structure (such as technologies for classifying the pathogenicity of a protein structure related to a nucleotide variant). Such a classification is based on two-dimensional images taken from a three-dimensional image of the protein structure. With respect to some implementations, described herein are multi-view convolutional neural networks (CNNs) for classifying a protein structure based on inputs of two-dimensional images taken from a three-dimensional image of the protein structure. In some implementations, a computer-implemented method of determining pathogenicity of variants includes accessing a structural rendition of amino acids, capturing images of those parts of the structural rendition that contain a target amino acid from the amino acids, and, based on the images, determining pathogenicity of a nucleotide variant that mutates the target amino acid into an alternate amino acid.
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
G16H 50/30 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for individual health risk assessment
25. METHODS OF DETECTING METHYLCYTOSINE AND HYDROXYMETHYLCYTOSINE BY SEQUENCING
Embodiments of the present disclosure relate to various bisulfite-free chemical methods for detecting methylation of cytosine in a DNA sample. These methods convert methylated and hydroxymethylated cytosine in the nucleic acid sequence to a modified or pseudo thymine or a uracil moiety, which can then be detected in sequencing.
The technology disclosed relates to inter-model prediction score recalibration. In one implementation, the technology disclosed relates to a system including a first model that generates, based on evolutionary conservation summary statistics of amino acids in a target protein sequence, a first pathogenicity score-to-rank mapping for a set of variants in the target protein sequence; and a second model that generates, based on epistasis expressed by amino acid patterns spanning the target protein sequence and a plurality of non-target protein sequences aligned in multiple sequence alignment, a second pathogenicity score-to-rank mapping for the set of variants. The system also includes a reassignment logic that reassigns pathogenicity scores from the first set of pathogenicity scores to the set of variants based on the first and second score-to-rank mappings, and an output logic to generate a ranking of the set of variants based on the reassigned scores.
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
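The abstract of the inter-model recalibration entry above does not spell out the reassignment rule. One possible reading, offered only as a minimal sketch, is a rank-matched swap: each variant receives the first model's score that sits at the rank the second model assigns to that variant. The function name `reassign_scores`, the dictionary inputs, and the example scores are all hypothetical.

```python
def reassign_scores(first_scores, second_scores):
    """Rank-matched reassignment (an assumed reading of the abstract).

    Each variant gets the first-model score found at the rank that the
    second model assigns to that variant. Higher score = higher rank.
    """
    # First model's scores, best first: position r holds the score for rank r.
    sorted_first = sorted(first_scores.values(), reverse=True)
    # Variants ordered by the second model, best first.
    second_order = sorted(first_scores, key=lambda v: second_scores[v], reverse=True)
    return {variant: sorted_first[rank] for rank, variant in enumerate(second_order)}


# Hypothetical scores for three variants from two models.
first = {"A": 0.9, "B": 0.5, "C": 0.1}
second = {"A": 0.2, "B": 0.8, "C": 0.5}

reassigned = reassign_scores(first, second)
# Final ranking of the variants by their reassigned scores.
ranking = sorted(reassigned, key=reassigned.get, reverse=True)
```

Under this reading the reassigned scores keep the first model's score distribution while the ordering follows the second model; ties are broken by input order here, which a real system would likely handle more carefully.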
27. METHODS OF NUCLEIC ACID SEQUENCING USING SURFACE-BOUND PRIMERS
Polynucleotide sequencing methods for sequencing one or more polynucleotide templates use primers bound to a surface as sequencing primers. The surface primers may include at least a portion of a surface oligonucleotide used during cluster formation. The sequencing methods may be used for single stranded sequencing or double stranded sequencing. Double stranded sequencing methods may employ an enzyme that has nick-translation activity. A kit that includes all the reagents needed for sequencing does not include sequencing primers. The kit may be used to accomplish the sequencing methods of the present disclosure.
Imaging systems and related methods are disclosed. In accordance with an implementation, a system includes a flow cell receptacle to receive a flow cell that receives a sample, and an imaging system having a light source assembly, an optical assembly, and an imaging device. The light source assembly is to form a substantially collimated beam. The optical assembly includes an asymmetric beam expander group that includes one or more asymmetric elements or anamorphic elements disposed along an optical axis. The optical assembly is to receive the substantially collimated beam from the light source assembly and transform it into a shaped sampling beam having an elongated cross section in a far field at or near a focal plane of the optical assembly to optically probe the sample. The imaging device is to obtain image data associated with the sample in response to the optical probing of the sample with the sampling beam.
This disclosure describes methods, non-transitory computer readable media, and systems that can facilitate execution of external workflows for diagnostic analysis of nucleotide sequencing data utilizing a container orchestration engine. For example, the disclosed systems can utilize a container orchestration engine to allow external systems (e.g., third-party systems) to generate and implement workflows for analyzing sequencing data. In executing individual workflow containers of a sequencing diagnostic workflow, the disclosed systems can isolate the workflow containers to prevent access to, or corruption of, other data while also orchestrating allocation of computing resources available at a genomic sequence processing device to execute the workflow containers.
The technology disclosed relates to accessing a multiple sequence alignment that aligns a query residue sequence to a plurality of non-query residue sequences, applying a set of periodically-spaced masks to a first set of residues at a first set of positions in the multiple sequence alignment, and cropping a portion of the multiple sequence alignment that includes the set of periodically-spaced masks at the first set of positions, and a second set of residues at a second set of positions in the multiple sequence alignment to which the set of periodically-spaced masks is not applied. The first set of residues includes a residue-of-interest at a position-of-interest in the query residue sequence.
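The masking and cropping steps in the entry above can be illustrated with a small sketch. It assumes, purely for illustration, that masks are placed only on the query row, at every `stride`-th column counted from the position of interest, and that the crop is a symmetric window around that position. The `MASK` symbol, the function name, and the toy alignment are hypothetical.

```python
MASK = "?"  # hypothetical mask symbol


def mask_and_crop(msa, poi, stride, half_width):
    """Mask periodically spaced query residues, then crop the alignment.

    msa        : list of equal-length aligned sequences; row 0 is the query
    poi        : column index of the position of interest
    stride     : spacing between masked columns (the mask always covers poi)
    half_width : number of columns kept on each side of poi
    """
    start = max(0, poi - half_width)
    end = min(len(msa[0]), poi + half_width + 1)
    # Columns inside the crop whose offset from poi is a multiple of stride.
    masked_cols = {c for c in range(start, end) if (c - poi) % stride == 0}
    query = "".join(MASK if c in masked_cols else ch for c, ch in enumerate(msa[0]))
    rows = [query] + list(msa[1:])
    cropped = [row[start:end] for row in rows]
    offsets = sorted(c - start for c in masked_cols)
    return cropped, offsets


# Toy alignment: one query row and one non-query row.
toy_msa = ["ACDEFGHIK", "ACDEYGHIK"]
cropped, offsets = mask_and_crop(toy_msa, poi=4, stride=2, half_width=3)
```

The crop keeps both the masked columns (the first set of positions) and the unmasked columns between them (the second set), which matches the two sets of residues the abstract distinguishes.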
The technology disclosed relates to generating species-differentiable evolutionary profiles using a weighting logic. In particular, the technology disclosed relates to determining a weighted summary statistic for a given residue category at a given position in a multiple sequence alignment based on one or more weights of one or more sequences in the multiple sequence alignment that have a residue of the given residue category at the given position.
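As a minimal sketch of the weighted summary statistic described above, the example below computes a weighted frequency: the weights of the sequences that carry a given residue at a given column, divided by the total weight. The abstract does not say how the weights are derived or normalized, so the function name, the per-sequence weights, and the normalization are assumptions.

```python
def weighted_residue_frequency(msa, weights, position, residue):
    """Weighted fraction of sequences that carry `residue` at `position`."""
    if len(msa) != len(weights):
        raise ValueError("one weight is required per aligned sequence")
    total = sum(weights)
    if total == 0:
        return 0.0
    # Sum the weights of only those sequences that match at this column.
    matched = sum(w for seq, w in zip(msa, weights) if seq[position] == residue)
    return matched / total


# Toy alignment with hypothetical per-sequence (e.g. per-species) weights.
toy_msa = ["MKV", "MRV", "MKL"]
toy_weights = [2.0, 1.0, 1.0]

freq_k = weighted_residue_frequency(toy_msa, toy_weights, position=1, residue="K")
```

With uniform weights this reduces to a plain column frequency; non-uniform weights let some sequences count more than others, which is what makes the resulting profile species-differentiable.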
The technology disclosed relates to determining feasibility of using a reference genome of a non-target species for variant calling a sample of a target species. In particular, the technology disclosed relates to mapping sequenced reads of a sample of a target species to a reference genome of a non-target species to detect a first set of variants in the sequenced reads of the sample of the target species, and mapping the sequenced reads of the sample of the target species to a reference genome of a pseudo-target species to detect a second set of variants in the sequenced reads of the sample of the target species.
The technology disclosed relates to a system for inter-model prediction score recalibration. The system includes a first model that generates, based on evolutionary conservation summary statistics of amino acids in a reference protein sequence, a first set of pathogenicity scores with rankings for variants that mutate the reference sequence to alternate protein sequences. The system further includes a second model that generates, based on epistasis expressed by amino acid patterns spanning a multiple sequence alignment aligning the reference sequence to non-target sequences, a second set of pathogenicity scores with rankings for the variants. The system further includes a rank loss determination logic that determines a rank loss parameter by comparing the two sets of rankings, a loss function reconfiguration logic that reconfigures a loss function based on the rank loss parameter, and a training logic that uses the reconfigured loss function to train the first model.
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
This disclosure describes methods, non-transitory computer readable media, and systems that can query the status of various stages in an end-to-end sequencing process and generate a graphical status summary for the sequencing process that depicts icons indicating statuses of the various stages. For instance, the disclosed systems can generate a graphical status summary for a nucleotide sequencing taskset that includes icons depicting statuses of a sequencing run, a data transfer of base-call data to a device for variant analysis, and the variant analysis—each part of the same nucleotide sequencing taskset. By exchanging data with a sequencing device for read data and one or more servers for variant analysis, the disclosed system can quickly provide a graphical status summary of an end-to-end sequencing process marked by various tasks within a nucleotide sequencing taskset.
Embodiments of the present disclosure relate to periodate salt compositions for use in the chemical linearization of double-stranded polynucleotides in preparation for sequencing applications, for example, sequencing-by-synthesis (SBS). Kits containing the periodate salt composition and methods of sequencing polynucleotides are also described.
The present disclosure relates to a method including exposing a composition comprising a wax-microsphere matrix to a first melt-condition, wherein said wax-microsphere matrix comprises a wax component and a plurality of lyophilised microspheres, wherein said plurality of lyophilised microspheres comprise one or more reagent, whereby exposing said composition comprising said wax-microsphere matrix to said first melt-condition melts the wax component; exposing said composition to a first release-condition to rehydrate at least one lyophilised microsphere; and exposing said at least one rehydrated lyophilised microsphere to a separation-condition to separate said wax component from said at least one rehydrated lyophilised microsphere. Also disclosed are methods of preparing a wax-microsphere matrix and releasing one or more reagent from a wax-microsphere matrix as well as compositions. Also disclosed are cartridges with a reagent reservoir including the compositions described herein. Also disclosed are systems for controlling release of one or more reagent including the compositions described herein.
C12Q 1/6848 - Nucleic acid amplification reactions characterised by the means for preventing contamination or increasing the specificity or sensitivity of an amplification reaction
41. QUALITY DETECTION OF VARIANT CALLING USING A MACHINE LEARNING CLASSIFIER
The technology disclosed relates to variant calling of sequenced reads of a sample of a target species against a reference genome of a pseudo-target species. Low-quality variants are identified as false positive variants that are present in the second set of variants but absent from the first set of variants.
A first reference genome is segmented into a plurality of bins and high-quality sequenced reads are mapped on a bin-by-bin basis to the plurality of bins in the first reference genome, and a second reference genome is segmented into a plurality of bins and high-quality sequenced reads are mapped on a bin-by-bin basis to the plurality of bins in the second reference genome. A best-mapped bin is identified in the second reference genome based on the greatest degree of match between the best-mapped bin in the second reference genome and a corresponding bin in the first reference genome.
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06N 3/12 - Computing arrangements based on biological models using genetic models
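The variant-quality entry above flags variants that appear in a second call set but not in a first. Read literally, that is a set difference, sketched below. The entry does not define how the two sets are produced, so the variant tuples `(chromosome, position, ref, alt)` and the function name are hypothetical.

```python
def flag_low_quality(first_set, second_set):
    """Return variants present in the second call set but absent from the first."""
    return set(second_set) - set(first_set)


# Hypothetical call sets; each variant is (chromosome, position, ref, alt).
first_calls = {("chr1", 100, "A", "G"), ("chr2", 50, "C", "T")}
second_calls = {("chr1", 100, "A", "G"), ("chr3", 7, "G", "A")}

low_quality = flag_low_quality(first_calls, second_calls)
```

Variants found only in the first set are left alone here, since the entry treats only second-set-only calls as false positives.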
A system comprises chunking logic that chunks (or splits) a multiple sequence alignment (MSA) into chunks, first attention logic that attends to a representation of the chunks and produces a first attention output, first aggregation logic that produces a first aggregated output that contains those features in the first attention output that correspond to masked residues in the plurality of masked residues, mask revelation logic that produces an informed output based on the first aggregated output and a Boolean mask, second attention logic that attends to the informed output and produces a second attention output based on masked residues revealed by the Boolean mask, second aggregation logic that produces a second aggregated output that contains those features in the second attention output that correspond to masked residues concealed by the Boolean mask, and output logic that produces identifications of the masked residues based on the second aggregated output.
Techniques are described for reducing the number of angles needed in structured illumination imaging of biological samples through the use of patterned flowcells, where nanowells of the patterned flowcells are arranged in, e.g., a square array, or an asymmetrical array. Accordingly, the number of images needed to resolve details of the biological samples is reduced. Techniques are also described for combining structured illumination imaging with line scanning using the patterned flowcells.
The invention relates to methods for pairwise sequencing of a double-stranded polynucleotide template, which permit the sequential determination of nucleotide sequences in two distinct and separate regions on complementary strands of the double-stranded polynucleotide template. The two regions for sequence determination may or may not be complementary to each other.
The present disclosure is directed to decoupling library capture (template seeding) from cluster generation to optimise both processes. This is achieved by introducing orthogonality between the seeding and clustering primers.
An iterative process may be implemented for incrementally aggregating available batches of sample data with previously available batches to perform sequencing analysis. Genomic variant call files associated with one or more samples may be received in batches from sequencing devices and aggregated for performing sequencing analysis. The aggregated genomic variant call files may be used to generate cohort files and census files that comprise summary information related to the genomic variant call files in each batch. The census data in census files may be aggregated into a global census file that includes summary genome variant data. Multi-sample variant call files may be generated based on the global census file, cohort files, and census files. The genomic variant call files may be processed using parallel processing at multiple compute nodes. The files may be further compressed and overlapping data may be efficiently stored in buffer positions.
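The incremental aggregation described above can be illustrated with a minimal sketch. It assumes each batch is a mapping from sample ID to a set of variant keys, and that a census is simply a count of carrier samples per variant. The names `batch_census`, `GlobalCensus`, and the variant key format are hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def batch_census(batch):
    """Summarise one batch: how many samples carry each variant.

    `batch` maps a sample ID to a set of variant keys (an assumed,
    simplified stand-in for per-sample genomic variant call files).
    """
    census = Counter()
    for variants in batch.values():
        census.update(set(variants))  # count each sample at most once per variant
    return census, len(batch)


class GlobalCensus:
    """Running summary that absorbs batches as they become available."""

    def __init__(self):
        self.counts = Counter()
        self.n_samples = 0

    def add_batch(self, batch):
        census, n = batch_census(batch)
        self.counts.update(census)  # merge the batch census into the global one
        self.n_samples += n
        return census  # the per-batch census can be stored separately

    def frequency(self, variant):
        """Fraction of all samples seen so far that carry `variant`."""
        return self.counts[variant] / self.n_samples if self.n_samples else 0.0


g = GlobalCensus()
g.add_batch({"s1": {"chr1:100:A>G"}, "s2": {"chr1:100:A>G", "chr2:5:C>T"}})
g.add_batch({"s3": {"chr2:5:C>T"}})
```

Because each batch is reduced to its own census before merging, earlier batches never need to be re-read when a new batch arrives, which is the property the iterative process relies on.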
Embodiments of the present disclosure relate to a method of chemical linearization of double-stranded polynucleotides for sequencing by synthesis. In particular, a heterogeneous cobalt catalyst is used to cleave one or more diol moieties at a predetermined cleavage site of one strand of the double-stranded polynucleotides.
This application describes methods of preparing an immobilized library of tagged RNA fragments. Also described herein are a number of methods of preparing DNA and RNA sequencing libraries from a single sample. These methods can include library preparation from single cells.
An iterative process may be implemented for incrementally aggregating available batches of sample data with previously available batches to perform sequencing analysis. Genomic variant call files associated with one or more samples may be received in batches from sequencing devices and aggregated for performing sequencing analysis. The aggregated genomic variant call files may be used to generate cohort files and census files that comprise summary information related to the genomic variant call files in each batch. The census data in census files may be aggregated into a global census file that includes summary genome variant data. Multi-sample variant call files may be generated based on the global census file, cohort files, and census files. The genomic variant call files may be processed using parallel processing at multiple compute nodes. The files may be further compressed and overlapping data may be efficiently stored in buffer positions.
The present disclosure relates to new compounds and their use as fluorescent labels. The compounds may be used as fluorescent labels for nucleotides in nucleic acid sequencing applications.
C12Q 1/6816 - Hybridisation assays characterised by the detection means
C09B 23/06 - Methine or polymethine dyes, e.g. cyanine dyes characterised by the methine chain containing an odd number of CH groups three CH groups, e.g. carbocyanines
C07D 209/08 - Indoles; Hydrogenated indoles with only hydrogen atoms or radicals containing only hydrogen and carbon atoms, directly attached to carbon atoms of the hetero ring
C07D 403/06 - Heterocyclic compounds containing two or more hetero rings, having nitrogen atoms as the only ring hetero atoms, not provided for by group containing two hetero rings linked by a carbon chain containing only aliphatic carbon atoms
C09B 23/08 - Methine or polymethine dyes, e.g. cyanine dyes characterised by the methine chain containing an odd number of CH groups more than three CH groups, e.g. polycarbocyanines
G01N 33/58 - Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving labelled substances
C09B 23/10 - Methine or polymethine dyes, e.g. cyanine dyes characterised by the methine chain containing an even number of CH groups
59.
SEQUENCING FROM MULTIPLE PRIMERS TO INCREASE DATA RATE AND DENSITY
The present invention relates to a sequencing method which allows for increased rates of sequencing and an increase in the density of sequencing data. The system may be based on next generation sequencing methods such as sequencing by synthesis (SBS) but uses multiple primers bound at different positions on the same nucleic acid strand.
An example primer set includes first and second nuclease resistant primers. The first nuclease resistant primer includes a first sequence; a first cleavage site attached at a 3' end of the first sequence; and a first nuclease resistant modification incorporated between the first sequence and the first cleavage site. The second nuclease resistant primer includes a second sequence that is different from the first sequence; a second nuclease resistant modification incorporated at a 3' end of the second sequence; and a second cleavage site attached between the second sequence and the second nuclease resistant modification. The second cleavage site is different from the first cleavage site.
The present disclosure is concerned with compositions and methods for the paired-end sequencing of target nucleic acids, and more particularly to obtaining nucleotide sequence information from two separate regions of target nucleic acids using amplification sites having a single type of surface primer.
The technology disclosed relates to determining pathogenicity of nucleotide variants. In particular, the technology disclosed relates to specifying a particular amino acid at a particular position in a protein as a gap amino acid, and specifying remaining amino acids at remaining positions in the protein as non-gap amino acids, generating a gapped spatial representation of the protein that includes spatial configurations of the non-gap amino acids, and excludes a spatial configuration of the gap amino acid, determining an evolutionary conservation at the particular position of respective amino acids of respective amino acid classes based at least in part on the gapped spatial representation, and based at least in part on the evolutionary conservation of the respective amino acids, determining a pathogenicity of respective nucleotide variants that respectively substitute the particular amino acid with the respective amino acids in alternate representations of the protein.
The technology disclosed relates to determining pathogenicity of nucleotide variants. In particular, the technology disclosed relates to specifying a particular amino acid at a particular position in a protein as a gap amino acid, and specifying remaining amino acids at remaining positions in the protein as non-gap amino acids. The technology disclosed further relates to generating a gapped spatial representation of the protein that includes spatial configurations of the non-gap amino acids, and excludes a spatial configuration of the gap amino acid, and determining a pathogenicity of a nucleotide variant based at least in part on the gapped spatial representation, and a representation of an alternate amino acid created by the nucleotide variant at the particular position.
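As a rough illustration of a gapped spatial representation, the sketch below voxelises the residues around a chosen position while leaving out the residue at that position. It assumes one 3D coordinate per amino acid and a small cubic grid centred on the gap residue. The function name `gapped_voxels`, the grid size, and the voxel size are hypothetical choices, not details from the disclosure.

```python
def gapped_voxels(coords, gap_index, grid=3, voxel_size=2.0):
    """Count non-gap residues per voxel in a cube centred on the gap residue.

    `coords` is a list of (x, y, z) tuples, one per residue (an assumed,
    simplified input). The residue at `gap_index` defines the cube centre
    but is itself excluded from the counts.
    """
    cx, cy, cz = coords[gap_index]
    half = grid * voxel_size / 2.0
    counts = [[[0] * grid for _ in range(grid)] for _ in range(grid)]
    for k, (x, y, z) in enumerate(coords):
        if k == gap_index:
            continue  # the gap amino acid's own configuration is left out
        # Floor division maps each offset to a voxel index along its axis.
        i = int((x - cx + half) // voxel_size)
        j = int((y - cy + half) // voxel_size)
        m = int((z - cz + half) // voxel_size)
        if 0 <= i < grid and 0 <= j < grid and 0 <= m < grid:
            counts[i][j][m] += 1
    return counts


coords = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
voxels = gapped_voxels(coords, gap_index=0)
```

In this toy input the residue at index 3 lies outside the cube and is ignored, and the gap residue contributes nothing even though it sits at the cube centre.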
The technology disclosed relates to training a pathogenicity predictor. In particular, the technology disclosed relates to accessing a gapped training set that includes respective gapped protein samples for respective positions in a proteome, accessing a non-gapped training set that includes non-gapped benign protein samples and non-gapped pathogenic protein samples, generating respective gapped spatial representations for the gapped protein samples, and generating respective non-gapped spatial representations for the non-gapped benign protein samples and the non-gapped pathogenic protein samples, training a pathogenicity predictor over one or more training cycles and generating a trained pathogenicity predictor, wherein each of the training cycles uses as training examples gapped spatial representations from the respective gapped spatial representations and non-gapped spatial representations from the respective non-gapped spatial representations, and using the trained pathogenicity predictor to determine pathogenicity of variants.
An apparatus and method for imaging includes an imaging system formed of a movable objective stage proximal to a sample and positioned for providing an excitation beam onto and for capturing an emission from the sample. The movable objective stage includes an optical lens apparatus and a turn reflector optically coupled to the imaging optics, where at least one of the optical lens apparatus and the turn reflector are movable relative to one another for scanning the sample, and wherein the movement is achieved while maintaining a substantially fixed optical path length between the optical lens apparatus and a fixed plane in a fixed imaging optics stage.
A polynucleotide sequencing method includes a wash step that employs a composition including a polymerase. The composition may also include a plurality of nucleotides. The composition may be configured to prevent the polymerase from incorporating one of the plurality of nucleotides into a copy polynucleotide strand. The composition may be substantially free of Mg2+.
C12Q 1/6848 - Nucleic acid amplification reactions characterised by the means for preventing contamination or increasing the specificity or sensitivity of an amplification reaction
C12Q 1/6874 - Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation [SBH]
68.
PREDICTING VARIANT PATHOGENICITY FROM EVOLUTIONARY CONSERVATION USING THREE-DIMENSIONAL (3D) PROTEIN STRUCTURE VOXELS
The technology disclosed relates to determining pathogenicity of nucleotide variants. In particular, the technology disclosed relates to specifying a particular amino acid at a particular position in a protein as a gap amino acid, and specifying remaining amino acids at remaining positions in the protein as non-gap amino acids, generating a gapped spatial representation of the protein that includes spatial configurations of the non-gap amino acids, and excludes a spatial configuration of the gap amino acid, determining an evolutionary conservation at the particular position of respective amino acids of respective amino acid classes based at least in part on the gapped spatial representation, and based at least in part on the evolutionary conservation of the respective amino acids, determining a pathogenicity of respective nucleotide variants that respectively substitute the particular amino acid with the respective amino acids in alternate representations of the protein.
G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
An example of an array includes a support, a cross-linked epoxy polyhedral oligomeric silsesquioxane (POSS) resin film on a surface of the support, and a patterned hydrophobic polymer layer on the cross-linked epoxy POSS resin film. The patterned hydrophobic polymer layer defines exposed discrete areas of the cross-linked epoxy POSS resin film, and a polymer coating is attached to the exposed discrete areas. Another example of an array includes a support, a modified epoxy POSS resin film on a surface of the support, and a patterned hydrophobic polymer layer on the modified epoxy POSS resin film. The modified epoxy POSS resin film includes a polymer growth initiation site, and the patterned hydrophobic polymer layer defines exposed discrete areas of the modified epoxy POSS resin film. A polymer brush is attached to the polymer growth initiation site in the exposed discrete areas.
An apparatus and method for imaging includes an imaging system formed of a movable objective stage proximal to a sample and positioned for providing an excitation beam onto and for capturing an emission from the sample. The movable objective stage includes an optical lens apparatus and a turn reflector optically coupled to the imaging optics, where at least one of the optical lens apparatus and the turn reflector are movable relative to one another for scanning the sample, and wherein the movement is achieved while maintaining a substantially fixed optical path length between the optical lens apparatus and a fixed plane in a fixed imaging optics stage.
G01N 21/63 - Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
G01N 21/17 - Systems in which incident light is modified in accordance with the properties of the material investigated
G02B 21/36 - Microscopes arranged for photographic purposes or projection purposes
An example flow cell includes a substrate having a surface. The flow cell also includes a polymeric hydrogel attached to at least a portion of the substrate surface, where the polymeric hydrogel includes a dark quencher. The flow cell further includes at least one primer set attached to the polymeric hydrogel.
The invention relates to methods of preventing renaturation of single-stranded nucleic acid libraries during storage, the method comprising using blocking oligonucleotides substantially complementary to adaptor sequences in the nucleic acid library.
A polynucleotide sequencing method includes a wash step that employs a composition including a polymerase. The composition may also include a plurality of nucleotides. The composition may be configured to prevent the polymerase from incorporating one of the plurality of nucleotides into a copy polynucleotide strand. The composition may be substantially free of Mg2+.
Some of the resin compositions are ultraviolet light or thermally curable, while others are ultraviolet light curable. One example of the ultraviolet light or thermally curable resin composition consists of a predetermined mass ratio of a (meth)acrylate cyclosiloxane monomer and a non-siloxane (meth)acrylate based monomer ranging from about >0:<100 to about 80:20; from 0 mass% to about 10 mass%, based on a total solids content of the resin composition, of an initiator selected from the group consisting of an azo-initiator, an acetophenone, a phosphine oxide, a brominated aromatic acrylate, and a dithiocarbamate; a surface additive; and a solvent.
Provided herein are various examples of a method of coupling oligonucleotides to a polymer. The method may include selectively irradiating first inactive moieties in one or more first regions of a polymer with light, while not irradiating second inactive moieties in one or more second regions of the polymer, to generate first active moieties in the one or more first regions of the polymer. The method may also include coupling the first active moieties to first oligonucleotides. The method may further include irradiating the second inactive moieties in the one or more second regions of the polymer with light to generate second active moieties in the one or more second regions of the polymer. The method may also include coupling the second active moieties to second oligonucleotides.
Some of the resin compositions are ultraviolet light or thermally curable, while others are ultraviolet light curable. One example of the ultraviolet light or thermally curable resin composition consists of a predetermined mass ratio of a (meth)acrylate cyclosiloxane monomer and a non-siloxane (meth)acrylate based monomer ranging from about >0:<100 to about 80:20; from 0 mass % to about 10 mass %, based on a total solids content of the resin composition, of an initiator selected from the group consisting of an azo-initiator, an acetophenone, a phosphine oxide, a brominated aromatic acrylate, and a dithiocarbamate; a surface additive; and a solvent.
An example of an ultraviolet light curable resin composition includes a predetermined mass ratio of a first epoxy substituted polyhedral oligomeric silsesquioxane monomer and a second substituted polyhedral oligomeric silsesquioxane monomer, wherein the first and second epoxy substituted polyhedral oligomeric silsesquioxane monomers are different, and wherein the predetermined mass ratio ranges from about 3:7 to about 7:3; bis-(4-methylphenyl)iodonium hexafluorophosphate as a first initiator; a second initiator selected from the group consisting of a free radical initiator and a cationic initiator other than bis-(4-methylphenyl)iodonium hexafluorophosphate; a surface additive; and a solvent.
An example flow cell includes a substrate having a surface. The flow cell also includes a polymeric hydrogel attached to at least a portion of the substrate surface, where the polymeric hydrogel includes a dark quencher. The flow cell further includes at least one primer set attached to the polymeric hydrogel.
The invention relates to methods for pairwise sequencing of a polynucleotide template which result in the sequential determination of nucleotide sequence in two distinct and separate regions of the polynucleotide template.
An example of an ultraviolet light curable resin composition includes a predetermined mass ratio of a first epoxy substituted polyhedral oligomeric silsesquioxane monomer and a second substituted polyhedral oligomeric silsesquioxane monomer, wherein the first and second epoxy substituted polyhedral oligomeric silsesquioxane monomers are different, and wherein the predetermined mass ratio ranges from about 3:7 to about 7:3; bis-(4-methylphenyl)iodonium hexafluorophosphate as a first initiator; a second initiator selected from the group consisting of a free radical initiator and a cationic initiator other than bis-(4-methylphenyl)iodonium hexafluorophosphate; a surface additive; and a solvent.
The present disclosure relates to compositions including a shell surrounding an interior compartment, wherein said interior compartment comprises one or more reagent and wherein said shell releases said interior compartment when said shell is exposed to a first release condition, wherein said interior compartment releases said one or more reagent when said interior compartment is exposed to a second release condition, and wherein said first release condition is different from said second release condition. Also disclosed are compositions including a dissolvable first shell, and a dissolvable second shell, the second shell comprising one or more reagent. Also disclosed are methods for controlling release of one or more reagent using the compositions described herein. The present disclosure further relates to cartridges that include a reagent reservoir including the compositions described herein. Also disclosed are systems for controlling release of one or more reagent including the compositions described herein.
Disclosed herein are methods, compositions, reaction mixtures, kits and systems for identification of methylated cytosines in nucleic acids using a bisulfite-free, one-step chemoenzymatic modification of methylated cytosines.
Embodiments of the present disclosure relate to nucleotides labeled with photoswitchable compounds. Also provided herein are methods and kits for using these labeled nucleotides in sequencing applications.
Presented herein are altered polymerase enzymes for improved incorporation of nucleotides and nucleotide analogues, in particular altered polymerases that maintain low pre-phasing rates when using ambiently stored polymerases, as well as methods and kits using the same.
The technology disclosed describes determination of which elements of a sequence are nearest to uniformly spaced cells in a grid, where the elements have element coordinates, and the cells have dimension-wise cell indices and cell coordinates. The determination includes generating an element-to-cells mapping that maps, to each of the elements, a subset of the cells. The subset of the cells mapped to a particular element in the sequence includes a nearest cell in the grid and one or more neighborhood cells in the grid, and the nearest cell is selected based on matching element coordinates of the particular element to the cell coordinates. The determination further includes generating a cell-to-elements mapping that maps, to each of the cells, a subset of the elements, and using the cell-to-elements mapping to determine, for each of the cells, a nearest element in the sequence.
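The two-stage mapping described above can be sketched as follows, as one possible reading and not the disclosed implementation. It assumes 2D coordinates, a grid whose cell `(i, j)` sits at `(i * spacing, j * spacing)`, and a square neighbourhood of a fixed radius. The function name `nearest_elements` and its parameters are hypothetical.

```python
from collections import defaultdict
from itertools import product
import math


def nearest_elements(elements, grid_shape, spacing=1.0, radius=1):
    """Return {cell_index: element_index} for every cell reached by some element.

    `elements` is a list of (x, y) coordinates. Cell (i, j) is assumed to
    have coordinates (i * spacing, j * spacing).
    """
    nx, ny = grid_shape
    cell_to_elements = defaultdict(list)

    # Element-to-cells mapping: the nearest cell (by rounding the element
    # coordinates to cell indices) plus its neighbourhood cells. Recording
    # it inverted gives the cell-to-elements mapping directly.
    for e, (x, y) in enumerate(elements):
        ci, cj = round(x / spacing), round(y / spacing)
        for di, dj in product(range(-radius, radius + 1), repeat=2):
            i, j = ci + di, cj + dj
            if 0 <= i < nx and 0 <= j < ny:
                cell_to_elements[(i, j)].append(e)

    # For each cell, pick the closest of its candidate elements.
    nearest = {}
    for (i, j), candidates in cell_to_elements.items():
        centre = (i * spacing, j * spacing)
        nearest[(i, j)] = min(candidates, key=lambda e: math.dist(elements[e], centre))
    return nearest


result = nearest_elements([(0.1, 0.2), (2.0, 1.9), (0.9, 1.1)], grid_shape=(3, 3))
```

Each cell only compares the few elements that were mapped to it, so the search avoids an all-pairs distance computation. The answer is exact only when the true nearest element falls within the chosen neighbourhood radius, and cells that no element reaches are absent from the result.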
An example of a kit includes a flow cell, a primer fluid, and a cleaving fluid. The flow cell includes at least one surface functionalized with a polymeric hydrogel including azide functional groups or amine functional groups. The primer fluid includes a plurality of alkyne-containing primers, each alkyne-containing primer having an amino cleavable group attaching a primer sequence of the alkyne-containing primer to an alkyne-containing moiety of the alkyne-containing primer. The cleaving fluid includes a substance that is reactive with the amino cleavable group.
A method includes positioning a first carrier assembly on a system stage. The first carrier assembly includes a support frame having an inner frame edge that defines a window of the support frame. The first carrier assembly includes a first substrate that is positioned within the window and surrounded by the inner frame edge. The first substrate has a sample thereon. The method includes detecting optical signals from the sample of the first substrate. The method also includes replacing the first carrier assembly on the system stage with a second carrier assembly on the system stage. The second carrier assembly includes the support frame and an adapter plate held by the support frame. The second carrier assembly has a second substrate held by the adapter plate that has a sample thereon. The method also includes detecting optical signals from the sample of the second substrate.
Provided herein are various examples of a method of coupling oligonucleotides to a polymer. The method may include selectively irradiating first inactive moieties in one or more first regions of a polymer with light, while not irradiating second inactive moieties in one or more second regions of the polymer, to generate first active moieties in the one or more first regions of the polymer. The method may also include coupling the first active moieties to first oligonucleotides. The method may further include irradiating the second inactive moieties in the one or more second regions of the polymer with light to generate second active moieties in the one or more second regions of the polymer. The method may also include coupling the second active moieties to second oligonucleotides.
A61K 47/50 - Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates
B01J 19/00 - Chemical, physical or physico-chemical processes in general; Their relevant apparatus
C08F 8/00 - Chemical modification by after-treatment
C08F 220/00 - Copolymers of compounds having one or more unsaturated aliphatic radicals, each having only one carbon-to-carbon double bond, and only one being terminated by only one carboxyl radical or a salt, anhydride, ester, amide, imide, or nitrile thereof
The present application relates to secondary amine-substituted coumarin compounds and their uses as fluorescent labels. The compounds may be used as fluorescent labels for nucleotides in nucleic acid sequencing applications.
C07D 417/04 - Heterocyclic compounds containing two or more hetero rings, at least one ring having nitrogen and sulfur atoms as the only ring hetero atoms, not provided for by group containing two hetero rings directly linked by a ring-member-to-ring-member bond
C07D 405/04 - Heterocyclic compounds containing both one or more hetero rings having oxygen atoms as the only ring hetero atoms, and one or more rings having nitrogen as the only ring hetero atom containing two hetero rings directly linked by a ring-member-to-ring-member bond
C07D 413/04 - Heterocyclic compounds containing two or more hetero rings, at least one ring having nitrogen and oxygen atoms as the only ring hetero atoms containing two hetero rings directly linked by a ring-member-to-ring-member bond
C07H 19/10 - Pyrimidine radicals with the saccharide radical being esterified by phosphoric or polyphosphoric acids
The present application relates to exocyclic amine-substituted coumarin derivatives and their uses as fluorescent labels. These compounds may be used as fluorescent labels for nucleotides in nucleic acid sequencing applications.
Embodiments of the present application relate to a substrate comprising a surface-bound azido functionalized organosilane, wherein the substrate is free or substantially free of a hydrogel or a hydrophilic polymer. Methods of preparing such a substrate surface for sequencing applications are also disclosed.
Novel rhodamine dye compounds and labelled conjugates comprising the dyes are described, together with methods for their use. The dyes and labelled conjugates are useful as molecular probes in a variety of applications, such as in assays involving staining of cells, protein binding, and analysis of nucleic acids, such as hybridization assays and nucleic acid sequencing.
An example of a kit includes a flow cell, a primer fluid, and a cleaving fluid. The flow cell includes at least one surface functionalized with a polymeric hydrogel including azide functional groups or amine functional groups. The primer fluid includes a plurality of alkyne-containing primers, each alkyne-containing primer having an amino cleavable group attaching a primer sequence of the alkyne-containing primer to an alkyne-containing moiety of the alkyne-containing primer. The cleaving fluid includes a substance that is reactive with the amino cleavable group.
C12Q 1/6874 - Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation [SBH]
C12Q 1/689 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
An example of a flow cell includes a substrate; a first primer set attached to a first region on the substrate, the first primer set including an un-cleavable first primer and a cleavable second primer; and a second primer set attached to a second region on the substrate, the second primer set including a cleavable first primer and an un-cleavable second primer.
G03F 7/00 - Photomechanical, e.g. photolithographic, production of textured or patterned surfaces, e.g. printed surfaces; Materials therefor, e.g. comprising photoresists; Apparatus specially adapted therefor
This disclosure describes methods, non-transitory computer readable media, and systems that can train a genome-location-classification model to classify or score genomic coordinates or regions by the degree to which nucleobases can be accurately identified at such genomic coordinates or regions. For instance, the disclosed systems can determine sequencing metrics for sample nucleic-acid sequences or contextual nucleic-acid subsequences surrounding particular nucleobase calls. By leveraging ground-truth classifications for genomic coordinates, the disclosed systems can train a genome-location-classification model to relate data from one or both of the sequencing metrics and contextual nucleic-acid subsequences to confidence classifications for such genomic coordinates or regions. After training, the disclosed systems can also apply the genome-location-classification model to sequencing metrics or contextual nucleic-acid subsequences to determine individual confidence classifications for individual genomic coordinates or regions and then generate at least one digital file comprising such confidence classifications for display on a computing device.
The invention relates to methods for pairwise sequencing of a polynucleotide template which result in the sequential determination of nucleotide sequence in two distinct and separate regions of the polynucleotide template.
The present disclosure relates to a composition including one or more modified nucleotide, wherein the modified nucleotide comprises a purine or pyrimidine base and a sugar moiety having a 3′-hydroxy blocking group, and a radical scavenger, wherein the composition is lyophilised. The present disclosure further relates to a composition including one or more functional protein; one or more functional protein activator; and one or more non-reducing sugar, wherein the composition is lyophilised. Also disclosed are methods of rehydration of one or more compositions described herein and kits including one or more compositions described herein. Further disclosed are cartridges including a flow cell comprising one or more reagent reservoirs, where the one or more reagent reservoirs include one or more compositions described herein.
This disclosure describes methods, non-transitory computer readable media, and systems that can train a genome-location-classification model to classify or score genomic coordinates or regions by the degree to which nucleobases can be accurately identified at such genomic coordinates or regions. For instance, the disclosed systems can determine sequencing metrics for sample nucleic-acid sequences or contextual nucleic-acid subsequences surrounding particular nucleobase calls. By leveraging ground-truth classifications for genomic coordinates, the disclosed systems can train a genome-location-classification model to relate data from one or both of the sequencing metrics and contextual nucleic-acid subsequences to confidence classifications for such genomic coordinates or regions. After training, the disclosed systems can also apply the genome-location-classification model to sequencing metrics or contextual nucleic-acid subsequences to determine individual confidence classifications for individual genomic coordinates or regions and then generate at least one digital file comprising such confidence classifications for display on a computing device.