CN104685064A - Highly multiplex PCR methods and compositions - Google Patents


Info

Publication number
CN104685064A
Authority
CN
China
Prior art keywords
primer
locus
target locus
target
dna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201280075224.8A
Other languages
Chinese (zh)
Inventor
B. Zimmermann
M. M. Hill
P. G. Lacroute
M. Dodd
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Natera Inc
Original Assignee
Gene Security Network Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gene Security Network Inc
Priority claimed from US13/683,604 (US20130123120A1)
Publication of CN104685064A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/686 Polymerase chain reaction [PCR]
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Abstract

The invention provides methods for simultaneously amplifying multiple nucleic acid regions of interest in one reaction volume, as well as methods for selecting a library of primers for use in such amplification methods. The invention also provides libraries of primers with desirable characteristics, such as minimal formation of amplified primer dimers or other non-target amplicons.

Description

Highly multiplex PCR methods and compositions
Cross-Reference to Related Applications
This application claims the benefit of and priority to U.S. Utility Application No. 13/683,604, filed November 21, 2012, and U.S. Provisional Application No. 61/675,020, filed July 24, 2012. U.S. Utility Application No. 13/683,604 is a continuation-in-part of U.S. Utility Application No. 13/300,235, filed November 18, 2011, which is a continuation-in-part of U.S. Utility Application No. 13/110,685, filed May 18, 2011, and claims the benefit of U.S. Provisional Application No. 61/675,020, filed July 24, 2012. U.S. Utility Application No. 13/110,685 claims the benefit of U.S. Provisional Application No. 61/395,850, filed May 18, 2010; U.S. Provisional Application No. 61/398,159, filed June 21, 2010; U.S. Provisional Application No. 61/462,972, filed February 9, 2011; U.S. Provisional Application No. 61/448,547, filed March 2, 2011; and U.S. Provisional Application No. 61/516,996, filed April 12, 2011. U.S. Utility Application No. 13/300,235 claims the benefit of U.S. Provisional Application No. 61/571,248, filed June 23, 2011. The entire contents of all of these applications are hereby incorporated by reference for the teachings therein.
Statement Regarding Federally Sponsored Research or Development
This work was supported by grant number 5R44HD60423-3 awarded by the NIH. The United States Government may have rights in any patent issuing from this application.
Technical Field
The present invention relates generally to methods and compositions for simultaneously amplifying multiple nucleic acid regions of interest in one reaction volume.
Background
To increase assay throughput and allow more efficient use of nucleic acid samples, multiple target nucleic acids in a sample of interest can be amplified simultaneously by combining multiple oligonucleotide primers with the sample and then subjecting the sample to polymerase chain reaction (PCR) conditions, a process known in the art as multiplex PCR. Multiplex PCR can significantly simplify experimental procedures and shorten the time required for nucleic acid analysis and detection. However, when multiple primer pairs are added to the same PCR reaction, non-target amplification products, such as amplified primer dimers, may be generated. The risk of generating such products increases as the number of primers increases. These non-target amplicons significantly limit the use of the amplified products for further analysis and/or detection. Improved methods are therefore needed to reduce the formation of non-target amplicons during multiplex PCR.
Improved multiplex PCR methods would be useful for a variety of applications, such as non-invasive prenatal genetic diagnosis (NPD). In particular, current methods of prenatal diagnosis can alert physicians and parents to abnormalities in a growing fetus. Without prenatal diagnosis, one in 50 babies is born with a serious physical or mental handicap, and as many as one in 30 will have some form of congenital malformation. Unfortunately, standard methods either have poor accuracy or involve an invasive procedure that carries a risk of miscarriage. Methods based on maternal blood hormone levels or ultrasound measurements are non-invasive, but their accuracy is also low. Methods such as amniocentesis, chorionic villus biopsy, and fetal blood sampling have high accuracy, but they are invasive and carry significant risks. In the United States, amniocentesis is performed in about 3% of all pregnancies, although its frequency of use has declined over the past 15 years.
Normal humans have two sets of 23 chromosomes in every healthy diploid cell, with one copy coming from each parent. Aneuploidy, a condition in which a nucleated cell contains too many and/or too few chromosomes, is believed to be responsible for a large percentage of failed implantations, miscarriages, and genetic diseases. Detection of chromosomal abnormalities can identify individuals or embryos with conditions such as Down syndrome, Klinefelter's syndrome, and Turner syndrome, among others, in addition to increasing the chances of a successful pregnancy. Testing for chromosomal abnormalities is especially important as the mother's age increases: it is estimated that at least 40% of embryos are abnormal when the mother is between 35 and 40 years old, and more than half of embryos are abnormal above the age of 40.
It has recently been discovered that cell-free fetal DNA and intact fetal cells can enter the maternal blood circulation. Analysis of this genetic material can therefore allow early NPD. Improved methods are needed to increase the sensitivity and specificity of NPD and to reduce the time and cost it requires.
Summary of the Invention
In one aspect, the invention features methods of amplifying target loci in a nucleic acid sample. In some embodiments, the method involves (i) contacting the nucleic acid sample with a library of test primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci to produce a reaction mixture; and (ii) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that include target amplicons. In some embodiments, the method also includes determining the presence or absence of at least one target amplicon (such as at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target amplicons). In some embodiments, the method also includes determining the sequence of at least one target amplicon (such as at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target amplicons).
In various embodiments of any of the aspects of the invention, at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified. In some embodiments, at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the amplified products are target amplicons. In some embodiments, at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target loci are amplified. In various embodiments, less than 60%, 50%, 40%, 30%, 20%, 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.25%, 0.1%, or 0.05% of the amplified products are primer dimers. In some embodiments, the library of test primers includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 test primer pairs, wherein each primer pair includes a forward test primer and a reverse test primer that hybridize to the same target locus. In some embodiments, the library of test primers includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 individual test primers that hybridize to different target loci, wherein the individual primers are not part of primer pairs.
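The product-composition limits in the preceding paragraph (share of target amplicons, share of primer dimers) reduce to simple fractions over classified amplification products. The following is a minimal sketch, assuming each sequenced product has already been labelled as `target`, `primer_dimer`, or `other`; the labels, the function name `product_fractions`, and the example counts are illustrative assumptions and are not taken from the specification.

```python
from collections import Counter


def product_fractions(labels):
    """Return the fraction of amplified products falling in each category."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no amplified products to summarise")
    return {kind: counts[kind] / total
            for kind in ("target", "primer_dimer", "other")}


# Hypothetical run: 960 target amplicons, 30 primer dimers, 10 other products.
labels = ["target"] * 960 + ["primer_dimer"] * 30 + ["other"] * 10
fractions = product_fractions(labels)

# Check one illustrative combination of limits from the text:
# at least 95% target amplicons and less than 5% primer dimers.
meets_limits = fractions["target"] >= 0.95 and fractions["primer_dimer"] < 0.05
```

How a product is classified (for example, by aligning reads to the expected amplicon sequences) is outside the scope of this sketch.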
In various embodiments of any of the aspects of the invention, the concentration of each test primer is less than 100, 75, 50, 25, 10, 5, 2, or 1 nM. In various embodiments, the GC content of the test primers is between 30% and 80%, such as between 40% and 70% or between 50% and 60%, inclusive. In some embodiments, the range of GC content of the test primers (e.g., the maximum GC content minus the minimum GC content, such as 80% - 60% = a range of 20%) is less than 30%, 20%, 10%, or 5%. In some embodiments, the melting temperature (Tm) of the test primers is between 40 °C and 80 °C, such as between 50 °C and 70 °C, 55 °C and 65 °C, or 57 °C and 60.5 °C, inclusive. In some embodiments, the range of melting temperatures of the test primers is less than 20 °C, 15 °C, 10 °C, 5 °C, 3 °C, or 1 °C. In some embodiments, the length of the test primers is between 15 and 100 nucleotides, such as between 15 and 75 nucleotides, 15 and 40 nucleotides, 17 and 35 nucleotides, 18 and 30 nucleotides, or 20 and 65 nucleotides, inclusive. In some embodiments, the test primers include a tag that is not target specific, such as a tag that forms an internal loop structure. In some embodiments, the tag is between two DNA binding regions. In various embodiments, the test primers include a 5' region that is specific for a target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In various embodiments, the length of the 3' region is at least 7 nucleotides. In some embodiments, the length of the 3' region is between 7 and 20 nucleotides, such as between 7 and 15 nucleotides or 7 and 10 nucleotides, inclusive. In various embodiments, the test primers include a 5' region that is not specific for a target locus (such as a tag or a universal primer binding site), followed by a region that is specific for a target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In some embodiments, the range of the length of the test primers is less than 50, 40, 30, 20, 10, or 5 nucleotides. In some embodiments, the length of the target amplicons is between 50 and 100 nucleotides, such as between 60 and 80 nucleotides or 60 and 75 nucleotides, inclusive. In some embodiments, the range of the length of the target amplicons is less than 50, 25, 15, 10, or 5 nucleotides.
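The GC-content, melting-temperature, and length limits described above can be checked for a primer library with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the melting temperature uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a rough rule of thumb for short oligonucleotides that is assumed here for simplicity and is not stated in the specification. A nearest-neighbor model would normally be preferred in practice.

```python
def gc_content(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq):
    """Rough Tm in degrees C by the Wallace rule (assumption, see lead-in)."""
    seq = seq.upper()
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2.0 * at + 4.0 * gc


def spread(values):
    """Range of a set of values: maximum minus minimum."""
    values = list(values)
    return max(values) - min(values)


def library_within_limits(primers,
                          gc_bounds=(30.0, 80.0), max_gc_spread=30.0,
                          tm_bounds=(40.0, 80.0), max_tm_spread=20.0,
                          length_bounds=(15, 100)):
    """True if every primer is inside the bounds (inclusive) and the
    library-wide GC and Tm ranges are below the given maxima."""
    gcs = [gc_content(p) for p in primers]
    tms = [wallace_tm(p) for p in primers]
    lengths = [len(p) for p in primers]
    return (all(gc_bounds[0] <= g <= gc_bounds[1] for g in gcs)
            and all(tm_bounds[0] <= t <= tm_bounds[1] for t in tms)
            and all(length_bounds[0] <= n <= length_bounds[1] for n in lengths)
            and spread(gcs) < max_gc_spread
            and spread(tms) < max_tm_spread)


# Two made-up 20-nucleotide primers: 50% and 60% GC.
example_library = ["ACGTACGTACGTACGTACGT", "GGCCAATTGGCCAATTGGCC"]
ok = library_within_limits(example_library)
```

The default bounds are the widest ranges named in the paragraph; narrower embodiments correspond to passing tighter arguments.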
In various embodiments of any of the aspects of the invention, the primer extension reaction conditions are polymerase chain reaction (PCR) conditions. In various embodiments, the length of the annealing step is greater than 3, 5, 8, 10, or 15 minutes. In various embodiments, the length of the extension step is greater than 3, 5, 8, 10, or 15 minutes.
In various embodiments of any of the aspects of the invention, the test primers are used to simultaneously amplify at least 1,000 different target loci in a sample that includes maternal DNA from the pregnant mother of a fetus and fetal DNA to determine the presence or absence of a fetal chromosome abnormality. In various embodiments, the method includes ligating a universal primer binding site to the DNA molecules in the sample; amplifying the ligated DNA molecules using at least 1,000 specific primers and a universal primer to produce a first set of amplified products; and amplifying the first set of amplified products using at least 1,000 primer pairs to produce a second set of amplified products. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different primer pairs are used.
In various embodiments of any of the aspects of the invention, the test primers are used to simultaneously amplify at least 1,000 different target loci in a sample that includes DNA from an alleged father of a fetus, and to simultaneously amplify the target loci in a sample that includes maternal DNA from the pregnant mother of the fetus and fetal DNA, to determine whether the alleged father is the biological father of the fetus.
In various embodiments of any of the aspects of the invention, the test primers are used to simultaneously amplify at least 1,000 different target loci from one cell or multiple cells from an embryo to determine the presence or absence of a chromosome abnormality. In various embodiments, cells from a set of two or more embryos are analyzed, and one embryo is selected for in vitro fertilization.
In various embodiments of any of the aspects of the invention, the test primers are used to simultaneously amplify at least 1,000 different target loci in a forensic nucleic acid sample. In various embodiments, the length of the annealing step is greater than 3, 5, 8, 10, or 15 minutes.
In various embodiments of any of the aspects of the invention, the method involves using the test primers to simultaneously amplify at least 1,000 different target loci in a control nucleic acid sample to produce a first set of target amplicons and to simultaneously amplify the target loci in a test nucleic acid sample to produce a second set of target amplicons; and comparing the first and second sets of target amplicons to determine whether a target locus is present in one sample but absent in the other, or whether a target locus is present at different levels in the control sample and the test sample. In various embodiments, the test sample is from an individual suspected of having a disease or phenotype of interest (such as cancer) or an increased risk for a disease or phenotype of interest; and one or more of the target loci include a sequence (e.g., a polymorphism or other mutation) associated with an increased risk for the disease or phenotype of interest or associated with the disease or phenotype of interest. In various embodiments, the method involves using the test primers to simultaneously amplify at least 1,000 different target loci in a control sample that includes RNA to produce a first set of target amplicons and to simultaneously amplify the target loci in a test sample that includes RNA to produce a second set of target amplicons; and comparing the first and second sets of target amplicons to determine the presence or absence of a difference in RNA expression levels between the control sample and the test sample. In various embodiments, the RNA is mRNA. In various embodiments, the test sample is from an individual suspected of having a disease or phenotype of interest (such as cancer) or an increased risk for a disease or phenotype of interest (such as cancer); and one or more of the target loci include a sequence (e.g., a polymorphism or other mutation) associated with an increased risk for the disease or phenotype of interest or associated with the disease or phenotype of interest. In some embodiments, the test sample is from an individual diagnosed with a disease or phenotype of interest (such as cancer); and a difference in RNA expression levels between the control sample and the test sample indicates that a target locus includes a sequence (e.g., a polymorphism or other mutation) associated with an increased or decreased risk for the disease or phenotype of interest.
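The control-versus-test comparison described above can be sketched as a per-locus comparison of amplicon counts. This is an illustration under stated assumptions: the counts are taken to be already normalized for sequencing depth, and the two-fold cutoff, the locus names, and the function name `compare_samples` are hypothetical choices that do not come from the specification.

```python
def compare_samples(control, test, min_fold=2.0):
    """Classify each locus by comparing amplicon counts in two samples.

    control, test: dicts mapping locus name -> normalized amplicon count.
    min_fold: assumed fold difference at which levels count as different.
    """
    report = {}
    for locus in sorted(set(control) | set(test)):
        c = control.get(locus, 0)
        t = test.get(locus, 0)
        if c == 0 and t == 0:
            continue  # locus not observed in either sample
        if c == 0:
            report[locus] = "test only"
        elif t == 0:
            report[locus] = "control only"
        elif max(c, t) / min(c, t) >= min_fold:
            report[locus] = "different level"
        else:
            report[locus] = "similar level"
    return report


# Hypothetical normalized counts for four loci.
control_counts = {"locus_A": 100, "locus_B": 50, "locus_C": 0, "locus_D": 80}
test_counts = {"locus_A": 110, "locus_B": 200, "locus_C": 30}
result = compare_samples(control_counts, test_counts)
```

The same comparison applies to the RNA embodiments if the counts are taken as expression-level estimates; a statistical test would normally replace the fixed fold cutoff.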
In some embodiments of any of the aspects of the invention, the test primers are selected from a library of candidate primers based on one or more parameters, such as by selecting the primers using any of the methods of the invention. In some embodiments, the test primers are selected from a library of candidate primers based at least in part on the ability of the candidate primers to form primer dimers.
In one aspect, the invention features methods of selecting test primers from a library of candidate primers. In various embodiments, the selection involves (i) calculating on a computer an undesirability score for most or all of the possible combinations of two candidate primers from the library, wherein each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers; (ii) removing from the library of candidate primers the candidate primer with the highest undesirability score; (iii) if the candidate primer removed in step (ii) is a member of a primer pair, removing the other member of the primer pair from the library of candidate primers; and (iv) optionally repeating steps (ii) and (iii), thereby selecting a library of test primers. In some embodiments, the selection method is performed until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below a minimum threshold. In some embodiments, the selection method is performed until the number of candidate primers remaining in the library is reduced to a desired number. In various embodiments, the undesirability scores are calculated for at least 80%, 90%, 95%, 98%, 99%, or 99.5% of the possible combinations of candidate primers in the library. In various embodiments, the candidate primers remaining in the library are capable of simultaneously amplifying at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci. In various embodiments, the method also includes (v) contacting a nucleic acid sample that includes target loci with the candidate primers remaining in the library to produce a reaction mixture; and (vi) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that include target amplicons.
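Steps (ii) to (iv) of the selection method above amount to a greedy pruning loop. The sketch below assumes the pairwise undesirability scores of step (i) are already available as a dictionary, and it makes two interpretive assumptions that the text does not spell out: a primer's own score is taken as the worst pairwise score it takes part in, and ties are broken by the sum of its pairwise scores. All names and scores are hypothetical.

```python
def select_by_highest_score(primers, pair_scores, partner, threshold):
    """Greedily remove the worst-scoring primer (and its pair partner)
    until every remaining pairwise score is at or below the threshold.

    pair_scores: dict mapping frozenset({a, b}) -> undesirability score.
    partner: dict mapping each primer to the other member of its pair.
    """
    remaining = set(primers)
    while True:
        worst, total = {}, {}
        for pair, score in pair_scores.items():
            if pair <= remaining:  # both primers are still in the library
                for primer in pair:
                    worst[primer] = max(worst.get(primer, 0.0), score)
                    total[primer] = total.get(primer, 0.0) + score
        if not worst or max(worst.values()) <= threshold:
            return remaining
        # Step (ii): remove the primer with the highest score.
        victim = max(sorted(worst), key=lambda p: (worst[p], total[p]))
        remaining.discard(victim)
        # Step (iii): remove the other member of its primer pair, if any.
        remaining.discard(partner.get(victim))


primers = ["F1", "R1", "F2", "R2", "F3", "R3"]
partner = {"F1": "R1", "R1": "F1", "F2": "R2", "R2": "F2", "F3": "R3", "R3": "F3"}
pair_scores = {
    frozenset(("F1", "F2")): 9.0,
    frozenset(("F1", "R3")): 7.0,
    frozenset(("F2", "R3")): 2.0,
    frozenset(("R1", "R2")): 1.0,
}
selected = select_by_highest_score(primers, pair_scores, partner, threshold=3.0)
```

In this toy library F1 takes part in the two worst interactions, so F1 and its partner R1 are dropped and the other two pairs survive. Stopping at a desired library size instead of a score threshold would only change the loop's exit condition.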
In one aspect, the invention features methods of selecting test primers from a library of candidate primers. In various embodiments, the selection of test primers from a library of candidate primers involves (i) calculating on a computer an undesirability score for most or all of the possible combinations of two candidate primers from the library, wherein each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers; (ii) removing from the library of candidate primers the candidate primer that is part of the greatest number of combinations of two candidate primers with an undesirability score above a first minimum threshold; (iii) if the candidate primer removed in step (ii) is a member of a primer pair, removing the other member of the primer pair from the library of candidate primers; and (iv) optionally repeating steps (ii) and (iii), thereby selecting a library of test primers. In some embodiments, the selection method is performed until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the first minimum threshold. In some embodiments, the selection method is performed until the number of candidate primers remaining in the library is reduced to a desired number. In various embodiments, the undesirability scores are calculated for at least 80%, 90%, 95%, 98%, 99%, or 99.5% of the possible combinations of candidate primers in the library. In various embodiments, the candidate primers remaining in the library are capable of simultaneously amplifying at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci. In various embodiments, the method also includes (v) contacting a nucleic acid sample that includes target loci with the candidate primers remaining in the library to produce a reaction mixture; and (vi) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that include target amplicons.
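This second variant differs from the first only in step (ii): instead of removing the single worst-scoring primer, it removes the primer that takes part in the greatest number of above-threshold combinations. A minimal sketch follows, assuming precomputed pairwise scores; the names, the scores, and the alphabetical tie-break are hypothetical and are not taken from the specification.

```python
from collections import Counter


def select_by_conflict_count(primers, pair_scores, partner, threshold):
    """Repeatedly remove the primer involved in the most pairwise
    combinations scoring above the threshold, plus its pair partner,
    until no above-threshold combination remains.

    pair_scores: dict mapping frozenset({a, b}) -> undesirability score.
    partner: dict mapping each primer to the other member of its pair.
    """
    remaining = set(primers)
    while True:
        conflicts = Counter()
        for pair, score in pair_scores.items():
            if score > threshold and pair <= remaining:
                conflicts.update(pair)  # one conflict for each primer in the pair
        if not conflicts:
            return remaining
        # Step (ii): the primer in the greatest number of bad combinations.
        victim = max(sorted(conflicts), key=conflicts.get)
        remaining.discard(victim)
        # Step (iii): remove the other member of its primer pair, if any.
        remaining.discard(partner.get(victim))


primers = ["F1", "R1", "F2", "R2", "F3", "R3"]
partner = {"F1": "R1", "R1": "F1", "F2": "R2", "R2": "F2", "F3": "R3", "R3": "F3"}
pair_scores = {
    frozenset(("F1", "F2")): 9.0,
    frozenset(("F1", "R2")): 8.0,
    frozenset(("F1", "R3")): 7.0,
    frozenset(("F2", "R3")): 2.0,
}
selected = select_by_conflict_count(primers, pair_scores, partner, threshold=5.0)
```

Viewed as a graph whose edges are above-threshold combinations, this is a greedy highest-degree vertex removal: one removal of F1 clears all three problem combinations here, whereas removing its interaction partners one at a time would cost three primer pairs.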
In various embodiments of any of the aspects of the invention, the selection method involves further reducing the number of candidate primers remaining in the library by decreasing the first minimum threshold used in step (ii) to a lower second minimum threshold and optionally repeating steps (ii) and (iii). In some embodiments, the selection method involves increasing the first minimum threshold used in step (ii) to a higher second minimum threshold and optionally repeating steps (ii) and (iii). In some embodiments, the selection method is performed until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the second minimum threshold, or until the number of candidate primers remaining in the library is reduced to a desired number.
In various embodiments of any of the aspects of the invention, the method involves, prior to step (i), identifying or selecting primers that hybridize to the target loci. In some embodiments, multiple primers (or primer pairs) hybridize to the same target locus, and the selection method is used to select one primer (or one primer pair) for this target locus based on one or more parameters. In various embodiments, the method involves, prior to step (ii), removing from the library a primer pair that produces a target amplicon that overlaps with a target amplicon produced by another primer pair. In various embodiments, the candidate primer to be removed from the library is selected from a group of two or more candidate primers with equal undesirability scores based on one or more other parameters. In some embodiments, the candidate primers remaining in the library are used as a library of test primers in any of the methods of the invention. In some embodiments, the resulting library of test primers includes any of the primer libraries of the invention.
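The overlap filter mentioned above (removing a primer pair whose target amplicon overlaps another pair's amplicon) can be sketched as an interval sweep. The text does not say which of two overlapping pairs is dropped; the sketch below assumes the pair whose amplicon ends first is kept, which is the classic interval-scheduling choice and retains as many pairs as possible. Coordinates are treated as half-open intervals, and all identifiers and positions are made up.

```python
def drop_overlapping(amplicons):
    """Return the ids of primer pairs whose amplicons do not overlap.

    amplicons: iterable of (pair_id, chromosome, start, end) with
    half-open coordinates [start, end). Within each chromosome the
    amplicon that ends first is kept and later overlapping ones dropped.
    """
    kept = []
    last_end = {}  # chromosome -> end coordinate of the last kept amplicon
    for pair_id, chrom, start, end in sorted(amplicons,
                                             key=lambda a: (a[1], a[3])):
        if start >= last_end.get(chrom, float("-inf")):
            kept.append(pair_id)
            last_end[chrom] = end
    return kept


# Hypothetical 70-nucleotide amplicons; p2 overlaps p1 on chr21.
candidates = [
    ("p1", "chr21", 100, 170),
    ("p2", "chr21", 150, 220),
    ("p3", "chr21", 220, 290),
    ("p4", "chr18", 100, 170),
]
non_overlapping = drop_overlapping(candidates)
```

Any other criterion for choosing between two overlapping pairs (for example, keeping the pair with the lower undesirability score) could be substituted in the sort key.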
In various embodiments of any of the aspects of the invention, the undesirability scores are based at least in part on one or more parameters selected from the group consisting of heterozygosity rate of the target locus, disease prevalence associated with a sequence (e.g., a polymorphism) at the target locus, disease penetrance associated with a sequence (e.g., a polymorphism) at the target locus, specificity of the candidate primer for the target locus, size of the candidate primer, melting temperature of the target amplicon, GC content of the target amplicon, amplification efficiency of the target amplicon, and size of the target amplicon.
In various embodiments of any of the aspects of the invention, the undesirability scores are based at least in part on one or more parameters selected from the group consisting of heterozygosity rate of the target locus, specificity of the candidate primer for the target locus, size of the candidate primer, melting temperature of the target amplicon, GC content of the target amplicon, amplification efficiency of the target amplicon, and size of the target amplicon; and the test primers are used to simultaneously amplify at least 1,000 different target loci in a sample that includes maternal DNA from the pregnant mother of a fetus and fetal DNA to determine the presence or absence of a fetal chromosome abnormality. In various embodiments, the method includes ligating a universal primer binding site to the DNA molecules in the sample; amplifying the ligated DNA molecules using at least 1,000 specific primers and a universal primer to produce a first set of amplified products; and amplifying the first set of amplified products using at least 1,000 primer pairs to produce a second set of amplified products. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different primer pairs are used. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified.
In various embodiments of any of the aspects of the invention, the undesirability scores are based at least in part on one or more parameters selected from the group consisting of heterozygosity rate of the target locus, specificity of the candidate primer for the target locus, size of the candidate primer, melting temperature of the target amplicon, GC content of the target amplicon, amplification efficiency of the target amplicon, and size of the target amplicon; and the test primers are used to simultaneously amplify at least 1,000 different target loci in a sample that includes DNA from an alleged father of a fetus, and to simultaneously amplify the target loci in a sample that includes maternal DNA from the pregnant mother of the fetus and fetal DNA, to determine whether the alleged father is the biological father of the fetus. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified.
In various embodiments of any of the aspects of the invention, the undesirability scores are based at least in part on one or more parameters selected from the group consisting of heterozygosity rate of the target locus, specificity of the candidate primer for the target locus, size of the candidate primer, melting temperature of the target amplicon, GC content of the target amplicon, amplification efficiency of the target amplicon, and size of the target amplicon; and the test primers are used to simultaneously amplify at least 1,000 different target loci from one cell or multiple cells from an embryo to determine the presence or absence of a chromosome abnormality. In various embodiments, cells from a set of two or more embryos are analyzed, and one embryo is selected for in vitro fertilization. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified.
In various embodiments of any of the aspects of the invention, the undesirability scores are based at least in part on one or more parameters selected from the group consisting of heterozygosity rate of the target locus, specificity of the candidate primer for the target locus, size of the candidate primer, melting temperature of the target amplicon, GC content of the target amplicon, amplification efficiency of the target amplicon, and size of the target amplicon; and the test primers are used to simultaneously amplify at least 1,000 different target loci in a forensic nucleic acid sample. In various embodiments, the length of the annealing step is greater than 3, 5, 8, 10, or 15 minutes. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified.
In various embodiments of any of the aspects of the invention, the undesirability scores are based at least in part on one or more parameters selected from the group consisting of heterozygosity rate of the target locus, disease prevalence associated with a sequence (e.g., a polymorphism) at the target locus, disease penetrance associated with a sequence (e.g., a polymorphism) at the target locus, specificity of the candidate primer for the target locus, size of the candidate primer, melting temperature of the target amplicon, GC content of the target amplicon, amplification efficiency of the target amplicon, and size of the target amplicon; and the method involves using the test primers to simultaneously amplify at least 1,000 different target loci in a control nucleic acid sample to produce a first set of target amplicons and to simultaneously amplify the target loci in a test nucleic acid sample to produce a second set of target amplicons; and comparing the first and second sets of target amplicons to determine whether a target locus is present in one sample but absent in the other, or whether a target locus is present at different levels in the control sample and the test sample. In various embodiments, the test sample is from an individual suspected of having a disease or phenotype of interest or an increased risk for a disease or phenotype of interest; and one or more of the target loci include a sequence (e.g., a polymorphism) at the target locus that is associated with an increased risk for the disease or phenotype of interest or associated with the disease or phenotype of interest. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified.
In various embodiments of any aspect of the invention, the undesirability score is based at least in part on one or more parameters selected from the group consisting of: heterozygosity rate of the target locus, disease incidence associated with a sequence (e.g., a polymorphism) at the target locus, disease penetrance associated with a sequence (e.g., a polymorphism) at the target locus, specificity of the candidate primer for the target locus, size of the candidate primer, melting temperature of the target amplicon, GC content of the target amplicon, amplification efficiency of the target amplicon, and size of the target amplicon; and the method involves using the test primers to simultaneously amplify 1,000 different target loci in a control sample comprising RNA to produce a first set of target amplicons, simultaneously amplifying the target loci in a test sample comprising RNA to produce a second set of target amplicons, and comparing the first and second sets of target amplicons to determine the presence or absence of a difference in RNA expression levels between the control sample and the test sample. In various embodiments, the RNA is mRNA. In various embodiments, the test sample is from an individual suspected of having a disease or phenotype of interest (such as cancer) or an increased risk for a disease or phenotype of interest (such as cancer); and one or more of the target loci comprise a sequence (e.g., a polymorphism or other mutation) associated with an increased risk for the disease or phenotype of interest or associated with the disease or phenotype of interest. In some embodiments, the test sample is from an individual diagnosed with a disease or phenotype of interest (such as cancer); and a difference in RNA expression level between the control sample and the test sample indicates that the target locus comprises a sequence (e.g., a polymorphism or other mutation) associated with an increased or decreased risk for the disease or phenotype of interest. In various embodiments, at least 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified.
In one aspect, the invention features a primer library. In some embodiments, the primers are selected from a library of candidate primers using any of the methods of the invention. In some embodiments, the library includes primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci. In some embodiments, the library includes primers that simultaneously amplify at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci. In some embodiments, the library includes primers that simultaneously amplify at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci such that less than 60%, 40%, 30%, 20%, 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.25%, 0.1%, or 0.05% of the amplified products are primer dimers. In some embodiments, the library includes primers that simultaneously amplify 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci such that at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the amplified products are target amplicons. In some embodiments, the library includes primers that simultaneously amplify the target loci such that at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified. In some embodiments, the primer library includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 primer pairs, wherein each primer pair includes a forward test primer and a reverse test primer, and wherein each pair of test primers hybridizes to a target locus. In some embodiments, the primer library includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 individual primers that each hybridize to a different target locus, wherein the individual primers are not part of primer pairs.
In various embodiments of any aspect of the invention, the concentration of each primer is less than 100, 75, 50, 25, 10, 5, 2, or 1 nM. In various embodiments, the GC content of the primers is between 30% and 80%, such as between 40% and 70% or between 50% and 60%, inclusive. In some embodiments, the range of GC content of the primers is less than 30%, 20%, 10%, or 5%. In some embodiments, the melting temperature of the primers is between 40 °C and 80 °C, such as between 50 °C and 70 °C, 55 °C and 65 °C, or 57 °C and 60.5 °C, inclusive. In some embodiments, the range of melting temperatures of the primers is less than 15 °C, 10 °C, 5 °C, 3 °C, or 1 °C. In some embodiments, the length of the primers is between 15 and 100 nucleotides, such as between 15 and 75 nucleotides, 15 and 40 nucleotides, 17 and 35 nucleotides, 18 and 30 nucleotides, or 20 and 65 nucleotides, inclusive. In some embodiments, the primers include a tag that is not target specific, such as a tag that forms an internal loop structure. In some embodiments, the tag is located between two DNA binding regions. In various embodiments, the primers include a 5' region that is specific for a target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In various embodiments, the length of the 3' region is at least 7 nucleotides. In some embodiments, the length of the 3' region is between 7 and 20 nucleotides, such as between 7 and 15 nucleotides or 7 and 10 nucleotides, inclusive. In various embodiments, the primers include a 5' region that is not specific for a target locus (such as another tag or a universal primer binding site), followed by a region that is specific for the target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In some embodiments, the range of the lengths of the primers is less than 50, 40, 30, 20, 10, or 5 nucleotides. In some embodiments, the length of the target amplicons is between 50 and 100 nucleotides, such as between 60 and 80 nucleotides or 60 and 75 nucleotides, inclusive. In some embodiments, the range of the lengths of the target amplicons is less than 50, 25, 15, 10, or 5 nucleotides.
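The GC-content and length limits recited above can be checked mechanically for a candidate library. The following is a minimal sketch, not the claimed selection method; the function names and default thresholds are illustrative assumptions taken from the broadest ranges stated in the text (30% to 80% GC, 15 to 100 nucleotides, length range under 50 nucleotides). Melting temperature is deliberately omitted because the text does not specify how it is computed.

```python
def gc_content(seq):
    """Return the GC content of a primer sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def library_within_limits(primers, gc_min=30.0, gc_max=80.0,
                          len_min=15, len_max=100, max_len_range=50):
    """Check a primer library against GC-content and length limits.

    All thresholds are hypothetical defaults chosen from the broadest
    ranges recited in the text; narrower embodiments can be passed in.
    """
    lengths = [len(p) for p in primers]
    gcs = [gc_content(p) for p in primers]
    gc_ok = all(gc_min <= g <= gc_max for g in gcs)
    len_ok = all(len_min <= n <= len_max for n in lengths)
    range_ok = (max(lengths) - min(lengths)) < max_len_range
    return gc_ok and len_ok and range_ok
```

A narrower embodiment, for example 40% to 70% GC, can be tested by passing `gc_min=40.0, gc_max=70.0`.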
In one aspect, the invention provides a kit comprising any of the primer libraries of the invention for amplifying target loci in a nucleic acid sample. In some embodiments, the kit includes instructions for using the library to amplify the target loci.
In one aspect, the invention features methods for determining the ploidy state of a chromosome in a gestating fetus. In some embodiments, the method involves contacting a nucleic acid sample with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci to produce a reaction mixture; wherein the nucleic acid sample includes maternal DNA from the mother of the fetus and fetal DNA from the fetus. In some embodiments, the reaction mixture is subjected to primer extension reaction conditions to produce amplified products; the amplified products are measured with a high-throughput sequencer to produce sequencing data; allele counts at the polymorphic loci are calculated on a computer based on the sequencing data; a plurality of ploidy hypotheses, each pertaining to a different possible ploidy state of the chromosome, are created on a computer; for each ploidy hypothesis, a joint distribution model for the expected allele counts at the polymorphic loci on the chromosome is built on a computer; the relative probability of each of the ploidy hypotheses is determined on a computer using the joint distribution model and the allele counts; and the ploidy state of the fetus is called by selecting the ploidy state corresponding to the hypothesis with the greatest probability.
In one aspect, the invention features methods for determining the ploidy state of a chromosome in a gestating fetus. In one embodiment, the method for determining the ploidy state of a chromosome in a gestating fetus includes obtaining a first DNA sample that comprises maternal DNA from the mother of the fetus and fetal DNA from the fetus; preparing the first sample by isolating the DNA so as to obtain a prepared sample; measuring the DNA in the prepared sample at a plurality of polymorphic loci on the chromosome; calculating, on a computer, allele counts at the plurality of polymorphic loci from the DNA measurements made on the prepared sample; creating, on a computer, a plurality of ploidy hypotheses each pertaining to a different possible ploidy state of the chromosome; for each ploidy hypothesis, building, on a computer, a joint distribution model for the expected allele counts at the plurality of polymorphic loci on the chromosome; determining, on a computer, the relative probability of each of the ploidy hypotheses using the joint distribution model and the allele counts measured on the prepared sample; and calling the ploidy state of the fetus by selecting the ploidy state corresponding to the hypothesis with the greatest probability.
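The final steps of the method (scoring each ploidy hypothesis against the observed allele counts and selecting the most probable one) can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that loci are independent and that each hypothesis supplies an expected reference-allele fraction per locus, so the joint distribution reduces to a product of binomials; the text itself contemplates richer joint models (for example, models that account for linkage). The function names and input layout are hypothetical.

```python
from math import lgamma, log


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1.0 - p))


def call_ploidy(allele_counts, expected_fraction):
    """Select the ploidy hypothesis with the greatest likelihood.

    allele_counts: list of (reference_allele_reads, total_reads) per locus.
    expected_fraction: dict mapping hypothesis name to a list of expected
        reference-allele fractions, one per locus (hypothetical inputs).
    Assumes independent loci, which is a simplification.
    """
    loglik = {}
    for hypothesis, fractions in expected_fraction.items():
        loglik[hypothesis] = sum(
            log_binom_pmf(a, n, p)
            for (a, n), p in zip(allele_counts, fractions)
        )
    best = max(loglik, key=loglik.get)
    return best, loglik
```

With equal priors on the hypotheses, selecting the largest log-likelihood corresponds to maximum likelihood estimation; adding a log-prior term to each entry would give a maximum a posteriori variant.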
In one aspect, the invention features methods of testing for an abnormal distribution of a chromosome in a sample comprising a mixture of maternal and fetal DNA. In some embodiments, the method involves (i) contacting the sample with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci to produce a reaction mixture; wherein the target loci are from a plurality of different chromosomes; and wherein the plurality of different chromosomes comprise at least one first chromosome suspected of having an abnormal distribution in the sample and at least one second chromosome presumed to be normally distributed in the sample; (ii) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products; (iii) sequencing the amplified products to obtain a plurality of sequence tags aligning to the target loci; wherein the sequence tags are of sufficient length to be assigned to a specific target locus; (iv) assigning, on a computer, the plurality of sequence tags to their corresponding target loci; (v) determining, on a computer, the number of sequence tags aligning to the target loci of the first chromosome and the number of sequence tags aligning to the target loci of the second chromosome; and (vi) comparing, on a computer, the numbers from step (v) to determine the presence or absence of an abnormal distribution of the first chromosome.
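Steps (iv) through (vi) amount to assigning tags to loci, tallying tags per chromosome, and comparing the two tallies. A minimal sketch follows; it assumes the tags have already been aligned and are represented by target locus identifiers, and the locus-to-chromosome table and function name are hypothetical. How the resulting ratio is judged abnormal (thresholds, reference distributions) is not specified here and is left out.

```python
from collections import Counter


def tag_count_ratio(tag_loci, locus_to_chrom, first_chrom, second_chrom):
    """Ratio of sequence tags on the first chromosome to the second.

    tag_loci: iterable of target locus identifiers, one per aligned tag.
    locus_to_chrom: dict mapping locus identifier to chromosome name.
    Tags whose locus is not in the table are ignored.
    """
    counts = Counter(
        locus_to_chrom[locus] for locus in tag_loci if locus in locus_to_chrom
    )
    if counts[second_chrom] == 0:
        raise ValueError("no tags aligned to the second chromosome")
    return counts[first_chrom] / counts[second_chrom]
```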
In one aspect, the invention provides methods for detecting the presence or absence of a fetal aneuploidy. In some embodiments, the method involves (i) contacting a sample comprising a mixture of maternal and fetal DNA with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different non-polymorphic target loci to produce a reaction mixture; wherein the target loci are from a plurality of different chromosomes; (ii) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products comprising target amplicons; (iii) quantifying, on a computer, the relative frequency of the target amplicons from a first and a second chromosome of interest; (iv) comparing, on a computer, the relative frequency of the target amplicons from the first and second chromosomes of interest; and (v) identifying the presence or absence of an aneuploidy based on the compared relative frequencies of the first and second chromosomes of interest. In some embodiments, the first chromosome is a chromosome suspected of being euploid. In some embodiments, the second chromosome is a chromosome suspected of being aneuploid.
In one aspect, a method is disclosed for determining the presence or absence of a fetal aneuploidy in a maternal tissue sample comprising fetal and maternal genomic DNA, the method comprising (a) obtaining a mixture of fetal and maternal genomic DNA from the maternal tissue sample; (b) conducting massively parallel DNA sequencing of DNA fragments randomly selected from the mixture of fetal and maternal genomic DNA of step (a) to determine the sequence of the DNA fragments; (c) identifying the chromosomes to which the sequences obtained in step (b) belong; (d) using the data of step (c) to determine the amount of at least one first chromosome in the mixture of maternal and fetal genomic DNA, wherein the at least one first chromosome is presumed to be euploid in the fetus; (e) using the data of step (c) to determine the amount of a second chromosome in the mixture of maternal and fetal genomic DNA, wherein the second chromosome is suspected to be aneuploid in the fetus; (f) calculating the fraction of fetal DNA in the mixture of fetal and maternal DNA; (g) calculating the expected distribution of the amount of the second target chromosome if the second target chromosome is euploid, using the amount from step (d); (h) calculating the expected distribution of the amount of the second target chromosome if the second target chromosome is aneuploid, using the first amount from step (d) and the fraction of fetal DNA in the mixture of fetal and maternal DNA calculated in step (f); and (i) using a maximum likelihood or maximum a posteriori approach to determine whether the amount of the second chromosome as determined in step (e) is more likely to be part of the distribution calculated in step (g) or the distribution calculated in step (h); thereby indicating the presence or absence of a fetal aneuploidy.
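Steps (g) through (i) can be sketched under stated assumptions. The sketch below assumes the aneuploidy in question is a fetal trisomy, so that with fetal fraction f the expected amount of the second chromosome is scaled by (2(1 - f) + 3f)/2 = 1 + f/2 relative to the euploid expectation. It further assumes Gaussian measurement noise with a user-supplied relative standard deviation, and a known euploid ratio between the second and first chromosome amounts. None of these modeling choices, nor the function names, come from the text.

```python
from math import log, pi


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * log(2.0 * pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)


def classify_second_chromosome(first_amount, second_amount, fetal_fraction,
                               euploid_ratio=1.0, rel_sd=0.01):
    """Maximum-likelihood choice between euploid and trisomic hypotheses.

    euploid_ratio: assumed ratio of second to first chromosome amount
        when the fetus is euploid (hypothetical calibration constant).
    rel_sd: assumed relative standard deviation of the measurement.
    """
    mean_euploid = first_amount * euploid_ratio              # step (g)
    mean_trisomy = mean_euploid * (1.0 + fetal_fraction / 2.0)  # step (h)
    sd = rel_sd * mean_euploid
    ll_euploid = normal_logpdf(second_amount, mean_euploid, sd)
    ll_trisomy = normal_logpdf(second_amount, mean_trisomy, sd)
    call = "aneuploid" if ll_trisomy > ll_euploid else "euploid"  # step (i)
    return call, ll_euploid, ll_trisomy
```

For example, with a fetal fraction of 10% the trisomic expectation is 5% above the euploid one, which is why the achievable measurement precision (`rel_sd` here) determines whether the two hypotheses can be separated.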
In various embodiments of any aspect of the invention, the method also includes obtaining genotypic data from one or both parents of the fetus. In some embodiments, obtaining genotypic data from one or both parents of the fetus includes preparing the DNA from the parents, where the preparing comprises preferentially enriching the DNA at the plurality of polymorphic loci to give prepared parental DNA, optionally amplifying the prepared parental DNA, and measuring the parental DNA in the prepared sample at the plurality of polymorphic loci.
In various embodiments of any aspect of the invention, building a joint distribution model for the expected allele count probabilities of the plurality of polymorphic loci on the chromosome is done using the genetic data obtained from the one or both parents. In some embodiments, the sample (e.g., the first sample) has been isolated from maternal plasma, and obtaining genotypic data from the mother is done by estimating the maternal genotypic data from the DNA measurements made on the prepared sample.
In one aspect, a diagnostic box is disclosed for helping to determine the ploidy state of a chromosome in a gestating fetus, where the diagnostic box is capable of executing the preparing and measuring steps of any of the methods of the invention.
In various embodiments of any aspect of the invention, the allele counts are probabilistic rather than binary. In some embodiments, measurements of the DNA in the prepared sample at the plurality of polymorphic loci are also used to determine whether or not the fetus has inherited one or more disease-linked haplotypes.
In various embodiments of any aspect of the invention, building a joint distribution model for allele count probabilities is done by using data about the probability of chromosomes crossing over at different locations in a chromosome to model dependence between polymorphic alleles on the chromosome. In some embodiments, building a joint distribution model for allele counts and the step of determining the relative probability of each hypothesis are done using a method that does not require the use of a reference chromosome.
In various embodiments of any aspect of the invention, determining the relative probability of each hypothesis makes use of an estimated fraction of fetal DNA in the prepared sample. In some embodiments, the DNA measurements from the prepared sample used in calculating allele count probabilities and determining the relative probability of each hypothesis comprise primary genetic data. In some embodiments, selecting the ploidy state corresponding to the hypothesis with the greatest probability is carried out using maximum likelihood estimation or maximum a posteriori estimation.
In various embodiments of any aspect of the invention, calling the ploidy state of the fetus also includes combining the relative probabilities of each of the ploidy hypotheses determined using the joint distribution model and the allele count probabilities with relative probabilities of each of the ploidy hypotheses that are calculated using statistical techniques taken from the group consisting of: a read count analysis, comparing heterozygosity rates, a statistic that is only available when parental genetic information is used, the probability of normalized genotype signals for certain parent contexts, a statistic that is calculated using an estimated fetal fraction of the sample (e.g., the first sample) or the prepared sample, and combinations thereof.
In various embodiments of any aspect of the invention, a confidence estimate is calculated for the called ploidy state. In some embodiments, the method also includes taking a clinical action based on the called ploidy state of the fetus, wherein the clinical action is selected from terminating the pregnancy or maintaining the pregnancy.
In various embodiments of any aspect of the invention, the method may be performed for fetuses at between 4 and 5 weeks gestation; between 5 and 6 weeks gestation; between 6 and 7 weeks gestation; between 7 and 8 weeks gestation; between 8 and 9 weeks gestation; between 9 and 10 weeks gestation; between 10 and 12 weeks gestation; between 12 and 14 weeks gestation; between 14 and 20 weeks gestation; between 20 and 40 weeks gestation; in the first trimester; in the second trimester; in the third trimester; or combinations thereof.
In various embodiments of any aspect of the invention, a report displaying the determined ploidy state of a chromosome in a gestating fetus is generated using the method. In some embodiments, a kit is disclosed for determining the ploidy state of a target chromosome in a gestating fetus, designed to be used with any of the methods of the invention, the kit including a plurality of inner forward primers and optionally a plurality of inner reverse primers, where each of the primers is designed to hybridize to the region of DNA immediately upstream and/or downstream from one of the polymorphic sites on the target chromosome, and optionally on additional chromosomes, where the region of hybridization is separated from the polymorphic site by a small number of bases, where the small number is selected from the group consisting of 1, 2, 3, 4, 5, 6 to 10, 11 to 15, 16 to 20, 21 to 25, 26 to 30, 31 to 60, and combinations thereof.
In one aspect, the invention features methods for establishing whether an alleged father is the biological father of a fetus that is gestating in a pregnant mother. In some embodiments, the method involves (i) simultaneously amplifying a plurality of polymorphic loci on genetic material from the alleged father, including at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci, to produce a first set of amplified products; (ii) simultaneously amplifying the corresponding plurality of polymorphic loci on a mixed sample of DNA originating from a blood sample from the pregnant mother to produce a second set of amplified products; wherein the mixed sample of DNA comprises fetal DNA and maternal DNA; (iii) determining, on a computer, the probability that the alleged father is the biological father of the fetus using genotypic measurements based on the first and second sets of amplified products; and (iv) establishing whether the alleged father is the biological father of the fetus using the determined probability that the alleged father is the biological father of the fetus. In various embodiments, the method further includes simultaneously amplifying the corresponding plurality of polymorphic loci on genetic material from the mother to produce a third set of amplified products; wherein the probability that the alleged father is the biological father of the fetus is determined using genotypic measurements based on the first, second, and third sets of amplified products.
On the one hand, the invention provides the method for relative likelihood that each embryo in estimation one group of embryo will grow on demand.In certain embodiments, described method relate to make from each embryo sample with hybridize at least 1,000,2 simultaneously, 000,5,000,7,500,10,000,20,000,25,000,30,000,40,000,50,000,75,000 or 100, the primer storehouse contact of 000 different target gene seat is to produce reaction mixture for each embryo, and wherein said sample derives from one or more cell from embryo separately.In certain embodiments, each reaction mixture is made to experience primer extension reaction condition to produce amplified production.In certain embodiments, described method comprises based on amplified production, measures one or more feature of at least one cell from each embryo on computers; And one or more feature of at least one cell based on each embryo, estimates the relative likelihood that each embryo will grow on demand on computers.
On the one hand, the invention is characterized in the method for the amount of two or more the target gene seats measured in nucleic acid samples.In certain embodiments, described method relates to (i) and uses pcr amplification to comprise the nucleic acid samples of the first standard gene seat, the second standard gene seat, first object locus and the second target gene seat to form amplified production; Wherein the first standard gene seat and first object locus have the Nucleotide of equal amts but its sequence is different at one or more Nucleotide place; And wherein the second standard gene seat and the second target gene seat have the Nucleotide of equal amts but its sequence is different at one or more Nucleotide place; (ii) check order to determine to compare the standard ratio of the first increased standard gene seat compared to the relative quantity of the second increased standard gene seat to amplified production; Wherein said standard ratio indicates the difference of amplification in PCR efficiency of the first standard gene seat and the second standard gene seat; (iii) mensuration compares the target rate of increased first object locus compared to the relative quantity of the second increased target gene seat; And (iv) based on the target rate of the standard ratio set-up procedure (iii) of step (ii) to determine the relative quantity of first object locus and the second target gene seat in sample.In different embodiments, described method relates to the absolute magnitude of first object locus and the second target gene seat in working sample.In different embodiments, described method comprises further and determines presence or absence target gene seat (such as at least 1,000,2 in sample, 000,5,000,7,500,10,000,20,000,25,000,30,000,40,000,50,000,75,000 or 100,000 different target gene seat).In different embodiments, described method relates to any one in use primer storehouse of the present invention.In different embodiments, described method 
relates to and increases 1,000,2,000,5,000,7,500,10,000,20,000,25,000,30,000,40,000,50,000,75,000 or 100 simultaneously, 000 different target gene seat.
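The adjustment in step (iv) is simple arithmetic once read counts are available. The sketch below assumes that each standard locus amplifies with the same efficiency as its matched target locus (they share length and differ at only a few nucleotides) and that the input ratio of the two standards is known; the function and parameter names are hypothetical.

```python
def corrected_target_ratio(std1_reads, std2_reads, tgt1_reads, tgt2_reads,
                           std_input_ratio=1.0):
    """Estimate the input ratio of target locus 1 to target locus 2.

    std_input_ratio: assumed known ratio of standard 1 to standard 2
        spiked into the sample (1.0 means equimolar).
    """
    standard_ratio = std1_reads / std2_reads  # step (ii): efficiency bias
    target_ratio = tgt1_reads / tgt2_reads    # step (iii): observed ratio
    # Step (iv): divide out the amplification bias seen in the standards.
    return target_ratio * std_input_ratio / standard_ratio
```

For instance, if equimolar standards yield 2,000 and 1,000 reads, locus 1 is amplified about twice as efficiently as locus 2, so an observed target ratio of 3.0 corresponds to an estimated input ratio of 1.5.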
On the one hand, the invention is characterized in the method for multiple genetic targets in Measurement and analysis sample quantitatively.In certain embodiments, described method comprises (i) and the genetic material deriving from sample for analysis and plurality of target specific amplification reagent and multiple standard sequences of corresponding to desired specificities amplifing reagent target is mixed; (ii) target area of amplification genetic material and standard sequence is to produce target amplicon and standard sequence amplicon; And (iii) measures the quantity of target amplicon and the standard sequence amplicon produced.In certain embodiments, genetic material is present in gene pool.In certain embodiments, genetic targets is polymorphic locus (such as SNP).In certain embodiments, the measurement of quantity realizes by counting sequence.In certain embodiments, described method comprises further and measures the chromosomal estimation copy number of at least one in the sample that comes from of gene pool, and wherein said mensuration relates to the quantity of the quantity of the sequence reads of comparison object amplicon and the sequence reads of standard amplification.In certain embodiments, standard sequence and gene pool comprise the general priming site that can be caused by same primers.In certain embodiments, mixing step comprises at least 10,100,500,1, and 000,2,000,5,000,7,500,10,000,20,000,25,000,30,000,40,000,50,000,75,000 or 100,000 part of different desired specificities amplifing reagent and at least 10,100,500,1,000,2,000,5,000,7,500,10,000,20,000,25,000,30,000,40,000,50,000,75,000 or 100,000 standard sequence.In different embodiments, described method relates to any one in use primer storehouse of the present invention.In different embodiments, described method relates to and increases 1,000,2,000,5,000,7,500,10,000,20,000,25,000,30,000,40,000,50,000,75,000 or 100 simultaneously, 000 different target area.In certain 
embodiments, the relative quantity of each in standard sequence is known.In certain embodiments, the relative quantity of each in described sequence is calibrated about with reference to genome.In certain embodiments, sample for analysis comprises fetus and maternal genomic mixture.In certain embodiments, sample for analysis derives from the blood of pregnant woman or derives from blood plasma.In certain embodiments, there is at least one dysploidy with reference to genome, such as, No. 13, No. 18, No. 21, the dysploidy at X or Y chromosome place.In certain embodiments, reference genome is diploid.
In one aspect, the invention features a mixture comprising a plurality of genetic standard sequences, wherein the relative amount of each genetic standard sequence in the mixture has been determined by calibration against a reference genome. In various embodiments, the mixture includes at least 10; 100; 500; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 genetic standard sequences. In various embodiments, the genetic standard sequences include a first universal priming site, a second universal priming site, a first target-specific priming site, a second target-specific priming site, and a marker sequence located between the first and second target-specific priming sites, wherein the first target-specific priming site and the second target-specific priming site are located between the first and second universal priming sites. In various embodiments, the calibration involves using any of the primer libraries of the invention. In various embodiments, the calibration involves simultaneously amplifying 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target regions. In some embodiments, the reference genome has at least one aneuploidy, such as an aneuploidy at chromosome 13, 18, 21, X, or Y. In some embodiments, the reference genome is diploid.
In one aspect, the invention features methods of producing a set of calibrated genetic standard sequences. In some embodiments, the method includes (i) forming an amplification reaction mixture comprising a genetic library prepared from a reference genome, a plurality of sets of target-specific amplification primer reagents, and a plurality of genetic standard sequences corresponding to the sets of target-specific amplification reagents; (ii) amplifying the genetic library and the genetic standard sequences to produce amplicons of the target sequences and amplicons of the genetic standard sequences; (iii) measuring the quantity of the amplicons of the target sequences and the amplicons of the genetic standard sequences; and (iv) determining the relative amount of each of the genetic standard sequences with respect to each other, thereby calibrating the plurality of genetic standard sequences. In various embodiments, at least 10; 100; 500; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 genetic standard sequences are used. In various embodiments, the method involves using any of the primer libraries of the invention. In various embodiments, the method involves simultaneously amplifying 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different sequences. In some embodiments, the reference genome has at least one aneuploidy, such as an aneuploidy at chromosome 13, 18, 21, X, or Y. In some embodiments, the reference genome is diploid.
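Step (iv) can be sketched as follows, under assumptions that are not stated in the text: every target sequence is present at the same copy number in the reference genome (for example, autosomal loci in a diploid reference), and each standard amplifies with the same efficiency as its matched target, so the read-count ratio of standard to target reflects the relative input amount of that standard. The function name and the choice to normalize to one anchor standard are illustrative.

```python
def calibrate_standards(target_reads, standard_reads, anchor=0):
    """Relative amount of each standard, normalized to an anchor standard.

    target_reads: read counts of the reference-genome target amplicons.
    standard_reads: read counts of the matched standard amplicons.
    anchor: index of the standard whose relative amount is set to 1.0.
    Assumes equal target copy number across loci in the reference genome.
    """
    ratios = [s / t for s, t in zip(standard_reads, target_reads)]
    return [r / ratios[anchor] for r in ratios]
```

If the reference genome carries a known aneuploidy, the per-locus target copy number would have to be divided out first; that correction is omitted here.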
In one aspect, the invention provides a set of genetic standard sequences calibrated according to any of the methods of the invention. In one aspect, the invention provides a set of genetic standard sequences that can be calibrated before, during, or after performing the method.
In one aspect, the invention features a method of measuring the copy number of a gene of interest that has at least one allele comprising a deletion. In some embodiments, the method comprises (i) mixing genetic material derived from a test sample with amplification reagents that are specific for the gene of interest and that do not significantly amplify alleles of the gene of interest comprising the deletion, a standard sequence corresponding to the gene of interest, amplification reagents specific for a reference sequence, and a standard sequence corresponding to the reference sequence; (ii) amplifying the sequence of the gene of interest, the standard sequence corresponding to the gene of interest, the reference sequence, and the standard sequence corresponding to the reference sequence to produce gene-of-interest amplicons, reference sequence amplicons, and standard sequence amplicons; and (iii) measuring the quantity of target amplicons and standard sequence amplicons produced. In some embodiments, the quantity is measured by counting sequence reads. In some embodiments, the method further comprises determining an estimated copy number of at least one chromosome in the sample from which the genetic library was derived, wherein the determination involves comparing the number of target amplicon sequences with the number of standard amplicon sequences. In some embodiments, the standard sequences and the genetic library comprise universal priming sites that can be primed by the same primer. In some embodiments, the relative amount of each of the sequences is calibrated against a reference genome. In various embodiments, at least 10; 100; 500; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 genetic standards are used. In various embodiments, the method involves using any of the primer libraries of the invention. In various embodiments, the method involves simultaneously amplifying 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target regions. In some embodiments, the reference genome is diploid. In some embodiments, the test sample is derived from blood.
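By way of illustration only, the comparison of target amplicon counts with standard amplicon counts may be sketched as follows. The function name, its parameters, and the assumptions that the two standards are added in equal amounts and amplify with equal efficiency are hypothetical choices made for this example; they are not taken from the disclosure, and other normalizations are possible.

```python
def estimate_copy_number(target_reads, target_std_reads,
                         ref_reads, ref_std_reads, ref_copies=2):
    """Estimate the copy number of a gene of interest from read counts.

    Each sequence is first normalized against its own spiked-in standard;
    the target is then scaled by a reference sequence assumed to be present
    at `ref_copies` copies (2 for a diploid reference genome).
    All names and the equal-spike-in assumption are illustrative only.
    """
    counts = (target_reads, target_std_reads, ref_reads, ref_std_reads)
    if any(c < 0 for c in counts):
        raise ValueError("read counts must be non-negative")
    if target_std_reads == 0 or ref_std_reads == 0 or ref_reads == 0:
        raise ValueError("standard and reference counts must be positive")
    target_ratio = target_reads / target_std_reads  # target vs. its standard
    ref_ratio = ref_reads / ref_std_reads            # reference vs. its standard
    return ref_copies * target_ratio / ref_ratio


# A sample in which one allele carries the deletion yields roughly half
# as many target reads per standard read as the diploid reference.
print(estimate_copy_number(500, 1000, 1000, 1000))
```

In this sketch a result near 1 would be consistent with one non-deleted copy of the gene of interest, and a result near 2 with two.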
In some embodiments of any aspect of the invention, preferentially enriching the DNA in the sample (e.g., the first sample) at the target loci (e.g., a plurality of polymorphic loci) comprises obtaining a plurality of pre-circularized probes, wherein each probe targets one of the loci (e.g., polymorphic loci), and wherein the 3' and 5' ends of the probes are preferably designed to hybridize to a region of DNA that is separated from the polymorphic site of the locus by a small number of bases, wherein the small number is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 to 25, 26 to 30, 31 to 60, or a combination thereof; hybridizing the pre-circularized probes to DNA from the sample (e.g., the first sample); filling the gap between the hybridized probe ends using DNA polymerase; circularizing the pre-circularized probes; and amplifying the circularized probes.
In some embodiments of any aspect of the invention, preferentially enriching the DNA at the target loci (e.g., a plurality of polymorphic loci) comprises obtaining a plurality of ligation-mediated PCR probes, wherein each PCR probe targets one of the target loci (e.g., polymorphic loci), and wherein the upstream and downstream PCR probes are designed to hybridize to a region of DNA, on one strand of DNA, that is preferably separated from the polymorphic site of the locus by a small number of bases, wherein the small number is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 to 25, 26 to 30, 31 to 60, or a combination thereof; hybridizing the ligation-mediated PCR probes to the DNA from the sample (e.g., the first sample); filling the gap between the ligation-mediated PCR probe ends using DNA polymerase; ligating the ligation-mediated PCR probes; and amplifying the ligated ligation-mediated PCR probes.
In some embodiments of any aspect of the invention, preferentially enriching the DNA at the target loci (e.g., a plurality of polymorphic loci) comprises obtaining a plurality of hybrid capture probes that target the loci (e.g., polymorphic loci); hybridizing the hybrid capture probes to the DNA in the sample (e.g., the first sample); and physically removing some or all of the unhybridized DNA from the DNA sample (e.g., the first sample).
In some embodiments of any aspect of the invention, the hybrid capture probes are designed to hybridize to a region that flanks but does not overlap the polymorphic site. In some embodiments, the hybrid capture probes are designed to hybridize to a region that flanks but does not overlap the polymorphic site, and the length of the flanking capture probe may be selected from the group consisting of: less than about 120 bases, less than about 110 bases, less than about 100 bases, less than about 90 bases, less than about 80 bases, less than about 70 bases, less than about 60 bases, less than about 50 bases, less than about 40 bases, less than about 30 bases, and less than about 25 bases. In some embodiments, the hybrid capture probes are designed to hybridize to a region that overlaps the polymorphic site, wherein the plurality of hybrid capture probes comprises at least two hybrid capture probes for each polymorphic locus, and wherein each hybrid capture probe is designed to be complementary to a different allele at that polymorphic locus.
In some embodiments of any aspect of the invention, preferentially enriching the DNA at a plurality of polymorphic loci comprises obtaining a plurality of inner forward primers, wherein each primer targets one of the polymorphic loci, and wherein the 3' end of the inner forward primers is designed to hybridize to a region of DNA upstream from the polymorphic site and separated from the polymorphic site by a small number of bases, wherein the small number is selected from the group consisting of 1, 2, 3, 4, 5, 6 to 10, 11 to 15, 16 to 20, 21 to 25, 26 to 30, or 31 to 60 base pairs; optionally obtaining a plurality of inner reverse primers, wherein each primer targets one of the polymorphic loci, and wherein the 3' end of the inner reverse primers is designed to hybridize to a region of DNA upstream from the polymorphic site and separated from the polymorphic site by a small number of bases, wherein the small number is selected from the group consisting of 1, 2, 3, 4, 5, 6 to 10, 11 to 15, 16 to 20, 21 to 25, 26 to 30, or 31 to 60 base pairs; hybridizing the inner primers to the DNA; and amplifying the DNA using the polymerase chain reaction (PCR) to form amplicons.
In some embodiments of any aspect of the invention, the method also comprises obtaining a plurality of outer forward primers, wherein each primer targets one of the target loci (e.g., polymorphic loci), and wherein the outer forward primers are designed to hybridize to the region of DNA upstream from the inner forward primer; optionally obtaining a plurality of outer reverse primers, wherein each primer targets one of the target loci (e.g., polymorphic loci), and wherein the outer reverse primers are designed to hybridize to the region of DNA immediately downstream from the inner reverse primer; hybridizing the first primers to the DNA; and amplifying the DNA using the polymerase chain reaction (PCR).
In some embodiments of any aspect of the invention, the method also comprises obtaining a plurality of outer reverse primers, wherein each primer targets one of the polymorphic loci, and wherein the outer reverse primers are designed to hybridize to the region of DNA immediately downstream from the inner reverse primer; optionally obtaining a plurality of outer forward primers, wherein each primer targets one of the target loci (e.g., polymorphic loci), and wherein the outer forward primers are designed to hybridize to the region of DNA upstream from the inner forward primer; hybridizing the first primers to the DNA; and amplifying the DNA using the polymerase chain reaction (PCR).
In some embodiments of any aspect of the invention, preparing the sample (e.g., the first sample) further comprises appending universal adapters to the DNA in the sample (e.g., the first sample) and amplifying the DNA in the sample (e.g., the first sample) using the polymerase chain reaction (PCR). In some embodiments, at least a fraction of the amplicons that are amplified are less than 100 bp, less than 90 bp, less than 80 bp, less than 70 bp, less than 65 bp, less than 60 bp, less than 55 bp, less than 50 bp, or less than 45 bp, and wherein the fraction is 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 99%.
In some embodiments of any aspect of the invention, amplifying the DNA is done in one or a plurality of individual reaction volumes, wherein each individual reaction volume contains more than 100 different forward and reverse primer pairs, more than 200 different forward and reverse primer pairs, more than 500 different forward and reverse primer pairs, more than 1,000 different forward and reverse primer pairs, more than 2,000 different forward and reverse primer pairs, more than 5,000 different forward and reverse primer pairs, more than 10,000 different forward and reverse primer pairs, more than 20,000 different forward and reverse primer pairs, more than 50,000 different forward and reverse primer pairs, or more than 100,000 different forward and reverse primer pairs.
In some embodiments of any aspect of the invention, preparing the sample (e.g., the first sample) further comprises dividing the sample (e.g., the first sample) into a plurality of portions, wherein the DNA in each portion is preferentially enriched at a subset of the target loci (e.g., a plurality of polymorphic loci). In some embodiments, the inner primers are selected by identifying primer pairs likely to form undesired primer duplexes and removing from the plurality of primers at least one of the pair of primers identified as likely to form an undesired primer duplex. In some embodiments, the inner primers contain a region designed to hybridize either upstream or downstream of the target locus (e.g., polymorphic locus), and optionally contain a universal priming sequence designed to allow PCR amplification. In some embodiments, at least some of the primers additionally contain a random region that differs for each individual primer molecule. In some embodiments, at least some of the primers additionally contain a molecular barcode.
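As a non-limiting sketch of the primer selection step described above, candidate primers may be screened pairwise and the primer involved in the most problematic pairs removed until no flagged pair remains. The duplex test used here (exact complementarity between the 3' tail of one primer and any stretch of the other), the tail length of five bases, and all function names are assumptions made for illustration; the disclosure does not specify this particular heuristic, and primer sequences are assumed to be unique.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def may_form_duplex(a, b, tail=5):
    """Hypothetical heuristic: flag a pair when the 3' tail of either
    primer is exactly complementary to some stretch of the other."""
    return (reverse_complement(a[-tail:]) in b
            or reverse_complement(b[-tail:]) in a)


def filter_primers(primers, tail=5):
    """Greedily drop the primer involved in the most flagged pairs
    until no flagged pair remains; returns the primers that are kept."""
    kept = list(primers)
    while True:
        bad = {p: 0 for p in kept}
        for a, b in combinations(kept, 2):
            if may_form_duplex(a, b, tail):
                bad[a] += 1
                bad[b] += 1
        worst = max(kept, key=lambda p: bad[p], default=None)
        if worst is None or bad[worst] == 0:
            return kept
        kept.remove(worst)


candidates = [
    "GATTACAGATTACACCGGA",   # 3' tail CCGGA
    "TTTTCCGGTTTTTTTTTT",    # contains TCCGG, complementary to that tail
    "ACACACACACACACACAC",
]
print(filter_primers(candidates))
```

A practical design would more likely score duplex stability thermodynamically; the exact-match test is used only to keep the sketch short.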
In some embodiments of any aspect of the invention, the preferential enrichment results in an average degree of allelic bias between the prepared sample and the sample (e.g., the first sample) of a factor selected from the group consisting of: no more than a factor of 2, no more than a factor of 1.5, no more than a factor of 1.2, no more than a factor of 1.1, no more than a factor of 1.05, no more than a factor of 1.02, no more than a factor of 1.01, no more than a factor of 1.005, no more than a factor of 1.002, no more than a factor of 1.001, and no more than a factor of 1.0001. In some embodiments, the plurality of polymorphic loci are SNPs. In some embodiments, measuring the DNA in the prepared sample is done by sequencing.
In some embodiments of any aspect of the invention, the target loci are present on the same nucleic acid of interest (such as the same chromosome or the same region of a chromosome). In some embodiments, at least some of the target loci are present on different nucleic acids of interest (such as different chromosomes). In some embodiments, the nucleic acid sample comprises fragmented or digested nucleic acids. In some embodiments, the nucleic acid sample comprises genomic DNA, cDNA, or mRNA. In some embodiments, the nucleic acid sample comprises DNA from a single cell. In some embodiments, the nucleic acid sample is a blood or plasma sample that is substantially free of cells. In some embodiments, the nucleic acid sample comprises or is derived from blood, plasma, saliva, semen, sperm, cell culture supernatant, mucus secretion, dental plaque, gastrointestinal tract tissue, stool, urine, hair, bone, body fluids, tears, tissue, skin, fingernails, blastomeres, embryos, amniotic fluid, chorionic villus samples, bile, lymph, cervical mucus, or a forensic sample. In some embodiments, the target loci are segments of human nucleic acids. In some embodiments, the target loci comprise or consist of single nucleotide polymorphisms (SNPs). In some embodiments, the primers are DNA molecules.
In some embodiments of any aspect of the invention, the DNA in the sample (e.g., the first sample) originates from maternal plasma. In some embodiments, preparing the sample (e.g., the first sample) further comprises amplifying the DNA. In some embodiments, preparing the sample (e.g., the first sample) further comprises preferentially enriching the DNA in the sample (e.g., the first sample) at the target loci (e.g., a plurality of polymorphic loci).
In various embodiments, the primer extension reaction or the polymerase chain reaction includes the addition of one or more nucleotides by a polymerase. In various embodiments, the primer extension reaction or the polymerase chain reaction does not include ligation-mediated PCR. In various embodiments, the primer extension reaction or the polymerase chain reaction does not include the joining of two primers by a ligase. In various embodiments, the primers do not include linked inverted probes (LIPs), which may also be called pre-circularized probes, pre-circularizing probes, circularizing probes, padlock probes, or molecular inversion probes (MIPs).
It should be understood that the aspects and embodiments of the invention described herein include "comprising," "consisting of," and "consisting essentially of" aspects and embodiments.
Definition
Single nucleotide polymorphism (SNP) refers to a single nucleotide that may differ between the genomes of two members of the same species. The use of the term should not imply any limit on the frequency with which each variant occurs.
Sequence refers to a DNA sequence or a genetic sequence. It may refer to the primary, physical structure of the DNA molecule or strand in an individual. It may refer to the sequence of nucleotides found in that DNA molecule, or in the complementary strand of that DNA molecule. It may refer to the information contained in the DNA molecule, such as the information used to represent the DNA molecule in silico.
Locus refers to a particular region of interest on the DNA of an individual, which may refer to a SNP, the site of a possible insertion or deletion, or the site of some other relevant genetic variation. Disease-linked SNPs may also refer to disease-linked loci.
Polymorphic allele, also "polymorphic locus," refers to an allele or locus where the genotype varies between individuals within a given species. Some examples of polymorphic alleles include single nucleotide polymorphisms, short tandem repeats, deletions, duplications, and inversions.
Polymorphic site refers to the specific nucleotides found in a polymorphic region that vary between individuals.
Allele refers to the genes that occupy a particular locus.
Genetic data, also "genotypic data," refers to the data describing aspects of the genome of one or more individuals. It may refer to one or a set of loci, partial or entire sequences, partial or entire chromosomes, or the entire genome. It may refer to the identity of one or more nucleotides; it may refer to a set of sequential nucleotides, or nucleotides from different locations in the genome, or a combination thereof. Genotypic data is typically in silico; however, it is also possible to consider the physical nucleotides in a sequence as chemically encoded genetic data. Genotypic data may be said to be "on," "of," "at," or "from" the individual(s). Genotypic data may refer to output measurements from a genotyping platform, where those measurements are made on genetic material.
Genetic material, also "genetic sample," refers to physical matter, such as tissue or blood, from one or more individuals comprising DNA or RNA.
Noisy genetic data refers to genetic data with any of the following: allele dropouts, uncertain base pair measurements, incorrect base pair measurements, missing base pair measurements, uncertain measurements of insertions or deletions, uncertain measurements of chromosome segment copy numbers, spurious signals, missing measurements, other errors, or combinations thereof.
Confidence refers to the statistical likelihood that the called SNP, allele, set of alleles, ploidy call, or determined number of chromosome segment copies correctly represents the real genetic state of the individual.
Ploidy calling, also "chromosome copy number calling" or "copy number calling" (CNC), may refer to the act of determining the quantity and/or chromosomal identity of one or more chromosomes present in a cell.
Aneuploidy refers to the state where the wrong number of chromosomes (e.g., the wrong number of full chromosomes or the wrong number of chromosome segments, such as the presence of deletions or duplications of a chromosome segment) is present in a cell. In the case of a somatic human cell, it may refer to the case where a cell does not contain 22 pairs of autosomal chromosomes and one pair of sex chromosomes. In the case of a human gamete, it may refer to the case where a cell does not contain one of each of the 23 chromosomes. In the case of a single chromosome type, it may refer to the case where more or fewer than two homologous but non-identical chromosome copies are present, or where two chromosome copies are present that originate from the same parent. In some embodiments, the deletion of a chromosome segment is a microdeletion.
Ploidy state refers to the quantity and/or chromosomal identity of one or more chromosome types in a cell.
Chromosome may refer to a single chromosome copy, meaning a single molecule of DNA of which there are 46 in a normal somatic cell; an example is "the maternally derived chromosome 18." Chromosome may also refer to a chromosome type, of which there are 23 in a normal human somatic cell; an example is "chromosome 18."
Chromosomal identity may refer to the referent chromosome number, i.e., the chromosome type. Normal humans have 22 types of numbered autosomal chromosomes and two types of sex chromosomes. It may also refer to the parental origin of the chromosome. It may also refer to a specific chromosome inherited from the parent. It may also refer to other identifying features of a chromosome.
State of the genetic material, or simply "genetic state," may refer to the identity of a set of SNPs on the DNA, to the phased haplotypes of the genetic material, and to the sequence of the DNA, including insertions, deletions, repeats, and mutations. It may also refer to the ploidy state of one or more chromosomes, chromosome segments, or sets of chromosome segments.
Allelic data refers to a set of genotypic data concerning a set of one or more alleles. It may refer to the phased, haplotypic data. It may refer to SNP identities, and it may refer to the sequence data of the DNA, including insertions, deletions, repeats, and mutations. It may include the parental origin of each allele.
Allelic state refers to the actual state of the genes in a set of one or more alleles. It may refer to the actual state of the genes described by the allelic data.
Allelic ratio or allele ratio refers to the ratio between the amounts of each allele present at a locus in a sample or in an individual. When the sample is measured by sequencing, the allelic ratio may refer to the ratio of sequence reads that map to each allele at the locus. When the sample is measured by an intensity-based measurement method, the allele ratio may refer to the ratio of the amounts of each allele present at that locus as estimated by the measurement method.
Allele count refers to the number of sequences that map to a particular locus, and if that locus is polymorphic, it refers to the number of sequences that map to each of the alleles. If each allele is counted in a binary fashion, then the allele count will be a whole number. If the alleles are counted probabilistically, then the allele count can be a fractional number.
Allele count probability refers to the number of sequences that are likely to map to a particular locus or to a set of alleles at a polymorphic locus, combined with the probability of the mapping. Note that allele counts are equivalent to allele count probabilities where the probability of the mapping for each counted sequence is binary (zero or one). In some embodiments, the allele count probabilities may be binary. In some embodiments, the allele count probabilities may be set equal to the DNA measurements.
Allelic distribution, or "allele count distribution," refers to the relative amount of each allele that is present at each locus in a set of loci. An allelic distribution can refer to an individual, to a sample, or to a set of measurements made on a sample. In the context of sequencing, the allelic distribution refers to the number or probable number of reads that map to a particular allele for each allele in a set of polymorphic loci. The allele measurements may be treated probabilistically, that is, the likelihood that a given allele is present for a given sequence read is a fraction between 0 and 1, or they may be treated in a binary fashion, that is, any given read is considered to be exactly zero or one copy of a particular allele.
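The distinction between binary and probabilistic allele counting may be illustrated with the following minimal sketch. The representation of each read as an (allele, mapping probability) pair and the function name are assumptions introduced for this example only.

```python
def allele_counts(reads, probabilistic=False):
    """Tally reads per allele at one locus.

    `reads` is an iterable of (allele, mapping_probability) pairs; this
    representation is an assumption made for illustration.
    Binary mode counts every read as exactly one copy, so counts are whole
    numbers; probabilistic mode adds each read's mapping probability, so
    counts may be fractional.
    """
    counts = {}
    for allele, prob in reads:
        if not 0.0 <= prob <= 1.0:
            raise ValueError("mapping probability must lie between 0 and 1")
        weight = prob if probabilistic else 1
        counts[allele] = counts.get(allele, 0) + weight
    return counts


reads = [("A", 0.75), ("A", 0.5), ("G", 1.0)]
print(allele_counts(reads))                      # whole-number counts
print(allele_counts(reads, probabilistic=True))  # fractional counts
```

Dividing the count for one allele by the count for the other then gives an allelic ratio in the sense defined above.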
Allelic distribution pattern refers to a set of different allele distributions for different parental contexts. Certain allelic distribution patterns may be indicative of certain ploidy states.
Allelic bias refers to the degree to which the measured ratio of alleles at a heterozygous locus differs from the ratio that was present in the original DNA sample. The degree of allelic bias at a particular locus is equal to the observed allelic ratio at that locus, as measured, divided by the ratio of alleles in the original DNA sample at that locus. Allelic bias may be defined to be greater than one, such that if the calculation of the degree of allelic bias returns a value x that is less than 1, then the degree of allelic bias may be restated as 1/x. Allelic bias may be due to amplification bias, purification bias, or some other phenomenon that affects different alleles differently.
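The calculation of the degree of allelic bias defined above may be sketched as follows. The function and parameter names are hypothetical, and all four allele amounts are assumed to be strictly positive.

```python
def allelic_bias(observed_a, observed_b, original_a, original_b):
    """Degree of allelic bias at one heterozygous locus.

    Computes x = (observed ratio) / (original ratio) and restates it as
    1/x when x is less than 1, so that the result is always >= 1.
    All four amounts are assumed to be strictly positive.
    """
    amounts = (observed_a, observed_b, original_a, original_b)
    if any(v <= 0 for v in amounts):
        raise ValueError("allele amounts must be positive")
    x = (observed_a / observed_b) / (original_a / original_b)
    return x if x >= 1 else 1 / x


# 60:40 reads measured at a locus that was 50:50 in the original sample.
print(allelic_bias(60, 40, 50, 50))
```

Because of the 1/x restatement, swapping the two alleles gives the same degree of bias.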
Primer, also "PCR probe," refers to a single DNA molecule (a DNA oligomer) or a collection of DNA molecules (DNA oligomers) where the DNA molecules are identical, or nearly so, and where the primer contains a region designed to hybridize to a target locus (e.g., a target polymorphic locus or a nonpolymorphic locus), and may contain a priming sequence designed to allow PCR amplification. A primer may also contain a molecular barcode. A primer may contain a random region that differs for each individual molecule. The terms "test primer" and "candidate primer" are not meant to be limiting and may refer to any of the primers disclosed herein.
Library of primers refers to a population of two or more primers. In various embodiments, the library includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different primers. In various embodiments, the library includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different primer pairs, wherein each pair of primers includes a forward test primer and a reverse test primer, and wherein each pair of test primers hybridizes to a target locus. In some embodiments, the library of primers includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different individual primers that each hybridize to a different target locus, wherein the individual primers are not part of primer pairs. In some embodiments, the library has both (i) primer pairs and (ii) individual primers (such as universal primers) that are not part of primer pairs.
Hybrid capture probe refers to any nucleic acid sequence, possibly modified, that is generated by various methods such as PCR or direct synthesis and is intended to be complementary to one strand of a specific target DNA sequence in a sample. Exogenous hybrid capture probes may be added to a prepared sample and hybridized through a denaturation-reannealing process to form duplexes of exogenous-endogenous fragments. These duplexes may then be physically separated from the sample by various means.
Sequence read refers to data representing a sequence of nucleotide bases that was measured using a clonal sequencing method. Clonal sequencing may produce sequence data representing a single molecule, or clones or clusters, of one original DNA molecule. A sequence read may also have an associated quality score at each base position of the sequence, indicating the probability that the nucleotide has been called correctly.
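The relationship between a per-base quality score and the probability of a correct call may be illustrated as follows. This sketch assumes the widely used Phred convention, in which a score Q corresponds to an error probability of 10^(-Q/10); the definition above does not state which scale is used, so the convention and the function names are assumptions.

```python
import math


def prob_base_correct(q):
    """Probability that a base call is correct, assuming a Phred-scaled
    quality score q (error probability = 10 ** (-q / 10))."""
    if q < 0:
        raise ValueError("quality score must be non-negative")
    return 1.0 - 10.0 ** (-q / 10.0)


def prob_read_correct(quals):
    """Probability that every base in a read is correct, treating base
    errors as independent (a simplifying assumption)."""
    return math.prod(prob_base_correct(q) for q in quals)


print(prob_base_correct(20))
print(prob_read_correct([30, 30, 20]))
```

Under this convention a score of 20 corresponds to a probability of about 0.99 that the base was called correctly, and a score of 30 to about 0.999.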
Mapping a sequence read is the process of determining the location of origin of a sequence read in the genome sequence of a particular organism. The location of origin of a sequence read is determined based on the similarity between the nucleotide sequence of the read and the genome sequence.
Matched copy error, also "matching chromosome aneuploidy" (MCA), refers to a state of aneuploidy where one cell contains two identical or nearly identical chromosomes. This type of aneuploidy may arise during the formation of the gametes in meiosis, and may be referred to as a meiotic non-disjunction error. This type of error may also arise in mitosis. Matching trisomy may refer to the case where three copies of a given chromosome are present in an individual and two of the copies are identical.
Unmatched copy error, also "unique chromosome aneuploidy" (UCA), refers to a state of aneuploidy where one cell contains two chromosomes that are from the same parent and that may be homologous but not identical. This type of aneuploidy may arise during meiosis, and may be referred to as a meiotic error. Unmatching trisomy may refer to the case where three copies of a given chromosome are present in an individual and two of the copies are from the same parent and are homologous, but not identical. Note that unmatching trisomy may refer to the case where two homologous chromosomes from one parent are present, and where some segments of those chromosomes are identical while other segments are merely homologous.
Homologous chromosomes refers to chromosome copies that contain the same set of genes and that normally pair up during meiosis.
Identical chromosomes refers to chromosome copies that contain the same set of genes and that, for each gene, have the same set of alleles that are identical or nearly identical.
Allele dropout (ADO) refers to the situation where at least one of the base pairs in a set of base pairs from homologous chromosomes at a given allele is not detected.
Locus dropout (LDO) refers to the situation where both base pairs in a set of base pairs from homologous chromosomes at a given allele are not detected.
Homozygous refers to having similar alleles at corresponding chromosomal loci.
Heterozygous refers to having dissimilar alleles at corresponding chromosomal loci.
Heterozygosity rate refers to the proportion of individuals in the population having heterozygous alleles at a given locus. The heterozygosity rate may also refer to the expected or measured ratio of alleles at a given locus in an individual or in a sample of DNA.
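The population sense of the heterozygosity rate may be illustrated with the following minimal sketch, in which each individual's genotype at the locus is represented as a pair of alleles. The representation and the function name are assumptions made for this example.

```python
def heterozygosity_rate(genotypes):
    """Fraction of individuals whose two alleles at a locus differ.

    `genotypes` is a sequence of (allele_1, allele_2) pairs, one pair
    per individual; this representation is illustrative only.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)


population = [("A", "G"), ("A", "A"), ("G", "A"), ("G", "G")]
print(heterozygosity_rate(population))
```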
Highly informative single nucleotide polymorphism (HISNP) refers to a SNP where the fetus has an allele that is not present in the mother's genotype.
Chromosomal region refers to a segment of a chromosome or to a full chromosome.
Segment of a chromosome refers to a section of a chromosome that can range in size from one base pair to the entire chromosome.
Chromosome may refer to a full chromosome or to a segment or section of a chromosome.
Copies refers to the number of copies of a chromosome segment. It may refer to identical copies or to non-identical, homologous copies of a chromosome segment, wherein the different copies of the chromosome segment contain a substantially similar set of loci and one or more of the alleles are different. Note that in some cases of aneuploidy, such as the M2 copy error, it is possible for some copies of a given chromosome segment to be identical while other copies of the same chromosome segment are not identical.
Haplotype refers to a combination of alleles at multiple loci that are typically inherited together on the same chromosome. Haplotype may refer to as few as two loci or to an entire chromosome, depending on the number of recombination events that have occurred between a given set of loci. Haplotype can also refer to a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated.
Haplotypic data, also "phased data" or "ordered genetic data," refers to data from a single chromosome in a diploid or polyploid genome, i.e., either the segregated maternal or paternal copy of a chromosome in a diploid genome.
Phasing refers to the act of determining the haplotypic genetic data of an individual given unordered, diploid (or polyploid) genetic data. It may refer to the act of determining which of the two genes at an allele, for a set of alleles found on one chromosome, is associated with each of the two homologous chromosomes in an individual.
Phased data refers to genetic data where one or more haplotypes have been determined.
Hypothesis refers to a possible ploidy state at a given set of chromosomes, or a set of possible allelic states at a given set of loci. The set of possibilities may comprise one or more elements.
Copy number hypothesis, also "ploidy state hypothesis," refers to a hypothesis concerning the number of copies of a chromosome in an individual. It may also refer to a hypothesis concerning the identity of each of the chromosomes, including the parent of origin of each chromosome and which of the parent's two chromosomes are present in the individual. It may also refer to a hypothesis concerning which chromosomes or chromosome segments, if any, from a related individual correspond genetically to a given chromosome of the individual.
Target individual refers to the individual whose genetic state is being determined. In some embodiments, only a limited amount of DNA is available from the target individual. In some embodiments, the target individual is a fetus. In some embodiments, there may be more than one target individual. In some embodiments, each fetus that originated from a pair of parents may be considered to be a target individual. In some embodiments, the genetic data that is being determined is one or a set of allele calls. In some embodiments, the genetic data that is being determined is a ploidy call.
Related individual refers to any individual who is genetically related to, and thus shares haplotype blocks with, the target individual. In one context, the related individual may be a genetic parent of the target individual, or any genetic material derived from a parent, such as a sperm, a polar body, an embryo, a fetus, or a child. It may also refer to a sibling, parent, or grandparent.
Sibling refers to any individual whose genetic parents are the same as those of the individual in question. In some embodiments, it may refer to a born child, an embryo, or a fetus, or to one or more cells originating from a born child, an embryo, or a fetus. A sibling may also refer to a haploid individual that originates from one of the parents, such as a sperm, a polar body, or any other set of haplotypic genetic matter. An individual may be considered to be a sibling of itself.
Fetal refers to "of the fetus" or "of the region of the placenta that is genetically similar to the fetus." In a pregnant woman, some portion of the placenta is genetically similar to the fetus, and the free-floating fetal DNA found in maternal blood may have originated from the portion of the placenta with a genotype that matches the fetus. Note that the genetic information in half of the chromosomes in a fetus is inherited from the mother of the fetus. In some embodiments, the DNA from these maternally inherited chromosomes that came from a fetal cell is considered to be "of fetal origin," not "of maternal origin."
DNA of fetal origin refers to DNA that was originally part of a cell whose genotype was essentially equivalent to that of the fetus.
DNA of maternal origin refers to DNA that was originally part of a cell whose genotype was essentially equivalent to that of the mother.
Child may refer to an embryo, a blastomere, or a fetus. Note that in the presently disclosed embodiments, the concepts described apply equally well to individuals who are a born child, a fetus, an embryo, or a set of cells therefrom. The use of the term child may simply be meant to connote that the individual referred to as the child is the genetic offspring of the parents.
Parent refers to the genetic mother or father of an individual. An individual typically has two parents, a mother and a father, though this may not necessarily be the case, such as in genetic or chromosomal chimerism. A parent may be considered to be an individual.
Parental context refers to the genetic state of a given SNP on each of the two relevant chromosomes for one or both of the two parents of the target.
Grow on demand and also claim " normal development ", refer to the Embryonic limb bud cell uterus that can survive and cause gestation and/or gestation continue and cause life birth and/or gone out to bear child not have chromosome abnormalty and/or gone out to bear child not have other undesirable hereditary patient's condition, such as disease linked gene.Any situation intending to contain father and mother and the promoter that keeps healthy may wish " grown " on demand in term.In some cases, " on demand grow " can refer to the embryo maybe can survived of can not surviving being applicable to medical research or other object.
Insertion into a uterus refers to the process of transferring an embryo into the uterine cavity in the context of in vitro fertilization.
Maternal plasma refers to the plasma portion of the blood from a female who is pregnant.
Clinical decision refers to any decision to take or not take an action that has an outcome affecting the health or survival of an individual. In the context of prenatal diagnosis, a clinical decision may refer to a decision to abort or not abort a fetus. A clinical decision may also refer to a decision to conduct further testing, to take action to mitigate an undesirable phenotype, or to take action to prepare for the birth of a child with abnormalities.
Diagnostic box refers to one machine or a combination of machines designed to perform one or more aspects of the methods disclosed herein. In one embodiment, the diagnostic box may be placed at a point of patient care. In one embodiment, the diagnostic box may perform targeted amplification followed by sequencing. In one embodiment, the diagnostic box may function alone or with the help of a technician.
Informatics based method refers to a method that relies heavily on statistics to make sense of a large amount of data. In the context of prenatal diagnosis, it refers to a method designed to determine the ploidy state at one or more chromosomes, or the allelic state at one or more alleles, by statistically inferring the most likely state rather than by directly physically measuring the state, given a large amount of genetic data, for example from a molecular array or sequencing. In one embodiment of the present disclosure, the informatics based technique may be one disclosed in this patent. In one embodiment of the present disclosure, it may be PARENTAL SUPPORT™.
Primary genetic data refers to the analog intensity signals output by a genotyping platform. In the context of SNP arrays, primary genetic data refers to the intensity signals before any genotype calling has been done. In the context of sequencing, primary genetic data refers to the analog measurements, analogous to a chromatogram, that come off the sequencer before the identity of any base pairs has been determined and before the sequence has been mapped to the genome.
Secondary genetic data refers to processed genetic data output by a genotyping platform. In the context of a SNP array, secondary genetic data refers to the allele calls made by software associated with the SNP array reader, wherein the software has made a call as to whether a given allele is present or absent in the sample. In the context of sequencing, secondary genetic data refers to the base pair identities that have been determined for the sequences, and possibly also to where the sequences have been mapped to the genome.
Non-invasive prenatal diagnosis (NPD), also called "non-invasive prenatal screening" (NPS), refers to a method of determining the genetic state of a fetus gestating in a mother using genetic material found in the mother's blood, where the genetic material is obtained by drawing the mother's intravenous blood.
Preferential enrichment of DNA that corresponds to a locus, or preferential enrichment of DNA at a locus, refers to any method that results in the percentage of DNA molecules in the post-enrichment DNA mixture that correspond to the locus being higher than the percentage of DNA molecules in the pre-enrichment DNA mixture that correspond to the locus. The method may involve selective amplification of DNA molecules that correspond to a locus. The method may involve removing DNA molecules that do not correspond to the locus. The method may involve a combination of methods. The degree of enrichment is defined as the percentage of DNA molecules in the post-enrichment mixture that correspond to the locus divided by the percentage of DNA molecules in the pre-enrichment mixture that correspond to the locus. Preferential enrichment may be carried out at a plurality of loci. In some embodiments of the present disclosure, the degree of enrichment is greater than 20. In some embodiments of the present disclosure, the degree of enrichment is greater than 200. In some embodiments of the present disclosure, the degree of enrichment is greater than 2,000. When preferential enrichment is carried out at a plurality of loci, the degree of enrichment may refer to the average degree of enrichment of all of the loci in the set of loci.
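As a worked illustration of the definition above, the degree of enrichment at a locus is a ratio of two percentages. The following minimal Python sketch computes it from molecule counts and averages it over several loci. The function names and the example counts are hypothetical and are not taken from this document.

```python
def enrichment_factor(on_locus_before, total_before, on_locus_after, total_after):
    """Degree of enrichment = (% on-locus after) / (% on-locus before)."""
    if total_before <= 0 or total_after <= 0:
        raise ValueError("totals must be positive")
    pct_before = 100.0 * on_locus_before / total_before
    pct_after = 100.0 * on_locus_after / total_after
    if pct_before == 0:
        raise ValueError("locus absent before enrichment; factor undefined")
    return pct_after / pct_before


def mean_enrichment(per_locus_factors):
    """Average degree of enrichment over a set of loci."""
    factors = list(per_locus_factors)
    return sum(factors) / len(factors)


# Hypothetical counts: 10 of 1,000,000 molecules correspond to the locus
# before enrichment, and 2,500 of 1,000,000 after enrichment.
factor = enrichment_factor(10, 1_000_000, 2_500, 1_000_000)
print(f"degree of enrichment: {factor:.0f}")
```

With these illustrative counts the locus goes from 0.001% to 0.25% of the mixture, which would fall in the "greater than 200" range mentioned above.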
Amplification refers to a method that increases the number of copies of a molecule of DNA.
Selective amplification may refer to a method that increases the number of copies of a particular molecule of DNA, or of molecules of DNA that correspond to a particular region of DNA. It may also refer to a method that increases the number of copies of a particular targeted molecule of DNA, or targeted region of DNA, more than it increases non-targeted molecules or regions of DNA. Selective amplification may be a method of preferential enrichment.
Universal priming sequence refers to a DNA sequence that may be appended to a population of target DNA molecules, for example by ligation, PCR, or ligation-mediated PCR. Once added to the population of target molecules, primers specific to the universal priming sequences can be used to amplify the target population using a single pair of amplification primers. Universal priming sequences are typically not related to the target sequences.
Universal adaptors, or "ligation adaptors" or "library tags," are DNA molecules containing a universal priming sequence that can be covalently linked to the 5' and 3' ends of a population of target double-stranded DNA molecules. The addition of the adaptors provides universal priming sequences at the 5' and 3' ends of the target population, from which PCR amplification can take place, amplifying all molecules from the target population using a single pair of amplification primers.
Targeting refers to a method used to selectively amplify or otherwise preferentially enrich those molecules of DNA that correspond to a set of loci in a mixture of DNA.
Joint distribution model refers to a model that defines the probability of events defined in terms of multiple random variables, given a plurality of random variables defined on the same probability space, where the probabilities of the variables are linked. In some embodiments, the degenerate case where the probabilities of the variables are not linked may be used.
Brief description of the drawings
The presently disclosed embodiments will be further explained with reference to the attached drawings, wherein like structures are referred to by like numerals throughout the several views. The drawings shown are not necessarily to scale, with emphasis instead generally being placed upon illustrating the principles of the presently disclosed embodiments.
Fig. 1: Graphical representation of the direct multiplexed mini-PCR method.
Fig. 2: Graphical representation of the semi-nested mini-PCR method.
Fig. 3: Graphical representation of the fully nested mini-PCR method.
Fig. 4: Graphical representation of the hemi-nested mini-PCR method.
Fig. 5: Graphical representation of the triply hemi-nested mini-PCR method.
Fig. 6: Graphical representation of the one-sided nested mini-PCR method.
Fig. 7: Graphical representation of the one-sided mini-PCR method.
Fig. 8: Graphical representation of the reverse semi-nested mini-PCR method.
Fig. 9: Some possible workflows for semi-nested methods.
Fig. 10: Graphical representation of looped ligation adaptors.
Fig. 11: Graphical representation of internally tagged primers.
Fig. 12: An example of some primers with internal tags.
Fig. 13: Graphical representation of a method using primers with a ligation adaptor binding region.
Fig. 14: Simulated ploidy call accuracies for the counting method with two different analysis techniques.
Fig. 15: Ratio of two alleles for a plurality of SNPs in a cell line in Experiment 4.
Fig. 16: Ratio of two alleles for a plurality of SNPs in a cell line in Experiment 4, sorted by chromosome.
Figs. 17A-D: Ratio of two alleles for a plurality of SNPs in four pregnant women's plasma samples, sorted by chromosome.
Fig. 18: Fraction of the data that can be explained by binomial variance before and after data correction.
Fig. 19: Graph showing the relative enrichment of fetal DNA in samples following a short library preparation protocol.
Fig. 20: Depth of read graph comparing direct PCR and semi-nested methods.
Fig. 21: Comparison of depth of read for direct PCR of three genomic samples.
Fig. 22: Comparison of depth of read for semi-nested mini-PCR of three samples.
Fig. 23: Comparison of depth of read for 1,200-plex and 9,600-plex reactions.
Fig. 24: Read count ratios for six cells at three chromosomes.
Fig. 25: Allele ratios at three chromosomes for two three-cell reactions and a third reaction run on 1 ng of genomic DNA.
Fig. 26: Allele ratios for two single-cell reactions at three chromosomes.
Fig. 27: Comparison of two primer libraries, showing the number of loci with a particular minor allele frequency that are targeted by each primer library.
Fig. 28A: Electropherogram of PCR products. Figs. 28B-28M are electropherograms of lanes 1-12, respectively, of Fig. 28A.
Figs. 29A-29E: Schematic depiction of the present method for determining fetal aneuploidy (Fig. 29A). Maternal and paternal genotype data (from blood or buccal swabs) and crossover frequency data from the HapMap database (Fig. 29B) are used to generate, in silico, multiple independent hypotheses for each possible fetal ploidy state (Fig. 29C). Each of these hypotheses is expanded into sub-hypotheses that take into account the different possible crossover points. The data model predicts what the sequencing data would look like (the expected allele distribution) given each hypothesized fetal genotype and different fetal cfDNA fractions, and the prediction is compared with the actual sequencing data; Bayesian statistics are used to determine the likelihood of each hypothesis. In this hypothetical example, the hypothesis with the highest likelihood (euploidy) is identified (Fig. 29D). For each copy number hypothesis family (monosomy, disomy, or trisomy), the individual likelihoods of Fig. 29C are summed. The hypothesis with the maximum likelihood is called as the ploidy state, reveals the fetal fraction, and represents a sample-specific calculated accuracy (Fig. 29E).
Figs. 30A-30H: Typical graphical representations of euploidy (Figs. 30A-30C), monosomy (Fig. 30D), and trisomy (Figs. 30E-30H). For all plots, the x-axis represents the linear position of the individual polymorphic loci along each chromosome (as indicated beneath the plot), and the y-axis represents the number of A allele reads as a fraction of the total (A+B) allele reads. Maternal and fetal genotypes, and the positions of the band centers on the y-axis, are indicated to the right of the plots. If desired, to facilitate visualization, the plots can be color-coded according to maternal genotype, such that red indicates a maternal genotype of AA, blue indicates a maternal genotype of BB, and green indicates a maternal genotype of AB. If desired, the maternal allele contribution can be indicated by color in the "fetal genotype" column. Allele contributions are written as maternal|fetal, such that alleles for which the mother is AA and the fetus is AB are written AA|AB.
Fig. 30A: Plot generated when two chromosomes are present and the fetal cfDNA fraction is 0%. This plot is therefore from a woman who is not pregnant, and represents the pattern when the genotype is entirely maternal. The allele clusters are thus centered around 1 (AA alleles), 0.5 (AB alleles), and 0 (BB alleles).
Fig. 30B: Plot generated when two chromosomes are present and the fetal fraction is 12%. The contribution of fetal alleles to the fraction of A allele reads shifts the position of some allele spots up or down along the y-axis, such that the bands are centered around 1 (AA|AA alleles), 0.94 (AA|AB alleles), 0.56 (AB|AA alleles), 0.50 (AB|AB alleles), 0.44 (AB|BB alleles), 0.06 (BB|AB alleles), and 0 (BB|BB alleles).
Fig. 30C: Plot generated when two chromosomes are present and the fetal fraction is 26%. The pattern of two red and two blue peripheral bands and a trio of central green bands is apparent (colors not shown in the figures). The bands are centered around 1 (AA|AA alleles), 0.87 (AA|AB alleles), 0.63 (AB|AA alleles), 0.50 (AB|AB alleles), 0.37 (AB|BB alleles), 0.13 (BB|AB alleles), and 0 (BB|BB alleles).
Fig. 30D: Plot generated when one chromosome is present and the fetal fraction is 26%. The hallmark pattern of one outer red and one outer blue peripheral band with two central green bands indicates maternally inherited monosomy (colors not shown in the figures). Because the fetus contributes only a single allele (A or B) to the allele reads, the internal red and blue peripheral bands are absent, and the central trio of bands condenses into two bands. The bands are centered around 1 (AA|A alleles), 0.57 (AB|A alleles), 0.43 (AB|B alleles), and 0 (BB|B alleles).
Fig. 30E: Plot generated when three chromosomes are present and the fetal fraction is 27%. This pattern, with two red and two blue peripheral bands and two central green bands, indicates maternally inherited meiotic trisomy (colors not shown in the figures). The bands are centered around 1 (AA|AAA alleles), 0.88 (AA|AAB alleles), 0.56 (AB|AAB alleles), 0.44 (AB|ABB alleles), 0.12 (BB|ABB alleles), and 0 (BB|BBB alleles).
Fig. 30F: Plot generated when three chromosomes are present and the fetal fraction is 14%. This pattern, with three red and three blue peripheral bands and two central green bands, indicates paternally inherited meiotic trisomy (colors not shown in the figures). The bands are centered around 1 (AA|AAA alleles), 0.93 (AA|AAB alleles), 0.87 (AA|ABB alleles), 0.60 (AB|AAA alleles), 0.53 (AB|AAB alleles), 0.47 (AB|ABB alleles), 0.40 (AB|BBB alleles), 0.13 (BB|AAB alleles), 0.07 (BB|ABB alleles), and 0 (BB|BBB alleles).
Fig. 30G: Plot generated when three chromosomes are present and the fetal fraction is 35%. This pattern, with two red and two blue peripheral bands and four central green bands, indicates maternally inherited mitotic trisomy (colors not shown in the figures). The bands are centered around 1 (AA|AAA alleles), 0.85 (AA|AAB alleles), 0.72 (AB|AAA alleles), 0.57 (AB|AAB alleles), 0.43 (AB|ABB alleles), 0.28 (AB|BBB alleles), 0.15 (BB|ABB alleles), and 0 (BB|BBB alleles).
Fig. 30H: Plot generated when three chromosomes are present and the fetal fraction is 25%. This pattern, with two red and two blue peripheral bands and four central green bands, indicates paternally inherited mitotic trisomy (colors not shown in the figures). It can be distinguished from maternally inherited mitotic trisomy (as in Fig. 30G) by the position of the internal peripheral bands. Specifically, the bands are centered around 1 (AA|AAA alleles), 0.78 (AA|ABB alleles), 0.67 (AB|AAA alleles), 0.56 (AB|AAB alleles), 0.44 (AB|ABB alleles), 0.33 (AB|BBB alleles), 0.22 (BB|AAB alleles), and 0 (BB|BBB alleles).
Fig. 31: Graphical representations of (Fig. 31A) euploid, (Fig. 31B) T13, (Fig. 31C) T18, (Fig. 31D) T21, (Fig. 31E) 45,X, and (Fig. 31F) 47,XXY test samples, as indicated. Chromosome identity is shown at the top of each plot, and fetal and maternal genotypes are shown to the right of the plot. The x-axis represents the linear position of the SNPs along each chromosome, and the y-axis represents the number of A allele reads as a fraction of the total reads. Note that the locations of the clusters vary with the fetal fraction, as described herein. Each spot represents a single SNP locus.
Fig. 32: The combined prevalence at birth of sex chromosome aneuploidies is greater than the combined prevalence at birth of autosomal aneuploidies.
While the above-identified drawings set forth presently disclosed embodiments, other embodiments are also contemplated, as noted in the discussion. This disclosure presents illustrative embodiments by way of representation and not limitation. Numerous other modifications and embodiments can be devised by those skilled in the art which fall within the scope and spirit of the principles of the presently disclosed embodiments.
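The band centers quoted in the description of Figs. 30A-30H can be reproduced with a simple mixture calculation. The sketch below is an assumption inferred from those listed values and is not a formula stated in this document: it weights each maternal chromosome copy by (1 - fetal fraction) and each fetal chromosome copy by the fetal fraction, then takes the share of A alleles. The function name is hypothetical.

```python
def expected_a_fraction(maternal, fetal, fetal_fraction):
    """Expected fraction of A reads at a SNP in a maternal/fetal cfDNA mixture.

    maternal, fetal: genotype strings such as "AB", "A", or "AAB".
    fetal_fraction:  fetal share of the mixture, weighted per chromosome copy
                     (an assumed convention that matches the listed centers).
    """
    if not 0.0 <= fetal_fraction < 1.0:
        raise ValueError("fetal_fraction must be in [0, 1)")
    maternal_weight = 1.0 - fetal_fraction
    a_copies = (maternal_weight * maternal.count("A")
                + fetal_fraction * fetal.count("A"))
    all_copies = maternal_weight * len(maternal) + fetal_fraction * len(fetal)
    return a_copies / all_copies


# Band centers quoted for Fig. 30B (two chromosomes, fetal fraction 12%).
fig_30b = {
    ("AA", "AA"): 1.00, ("AA", "AB"): 0.94, ("AB", "AA"): 0.56,
    ("AB", "AB"): 0.50, ("AB", "BB"): 0.44, ("BB", "AB"): 0.06,
    ("BB", "BB"): 0.00,
}
for (mother, fetus), quoted in fig_30b.items():
    model = expected_a_fraction(mother, fetus, 0.12)
    print(f"{mother}|{fetus}: model {model:.2f}, quoted {quoted:.2f}")
```

The same function covers the monosomy and trisomy panels by passing a one-letter or three-letter fetal genotype, for example `expected_a_fraction("AB", "A", 0.26)` for Fig. 30D.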
Detailed description
The present invention is based in part on the surprising discovery that often only a relatively small number of primers in a library of primers are responsible for a substantial amount of the amplified primer dimers that form during multiplex PCR reactions. Methods have been developed to select the most undesirable primers for removal from a library of candidate primers. By reducing the amount of primer dimers to a negligible amount (about 0.1% of the PCR products), these methods allow the resulting primer libraries to simultaneously amplify a large number of target loci in a single multiplex PCR reaction. Because the primers hybridize to the target loci and amplify them rather than hybridizing to other primers and forming amplified primer dimers, the number of different target loci that can be amplified is increased. It was also discovered that using lower primer concentrations and much longer annealing times than normal increases the likelihood that the primers hybridize to the target loci instead of hybridizing to each other and forming primer dimers.
During PCR amplification and sequencing of 19,488 target loci in genomic samples, 99.4-99.7% of the sequencing reads mapped to the genome, of which 99.99% mapped to target loci. For plasma samples with 10 million sequencing reads, typically at least 19,350 of the 19,488 target loci (99.3%) were amplified and sequenced. The ability to amplify such a large number of target loci simultaneously greatly reduces the amount of time and the amount of DNA required to analyze thousands of target loci. For example, DNA from a single cell is sufficient to analyze thousands of target loci simultaneously, which is important for applications in which the amount of DNA is low, such as genetic testing of a single cell from an embryo prior to in vitro fertilization or genetic testing of forensic samples that contain little DNA. Additionally, analyzing the target loci in one reaction volume (such as in one chamber or well), rather than splitting the sample into multiple different reactions, reduces the variability that can occur between reactions. Additionally, methods have been developed that use reference standards to correct for the amplification bias that can occur between different target loci. For example, differences in amplification efficiency between target loci due to factors such as GC content can cause target loci that are actually present in the same amount to produce different amounts of PCR products. The use of reference standards that are similar to the target loci allows such amplification bias to be detected so that it can be corrected for during quantitation of the target loci.
During sequencing of PCR products, artifacts such as primer dimers are detected and thus inhibit the detection of target amplicons. Due to this limitation, microarrays with hybridization probes are often used for detection, since microarrays are less sensitive to interference from primer dimers. The high level of multiplexing with minimal non-target amplicons that can now be achieved allows PCR followed by sequencing to be used as an alternative to microarrays.
The multiplex PCR methods of the invention can be used in a variety of applications, such as genotyping, detection of chromosomal abnormalities (such as fetal chromosomal aneuploidy), gene mutation and polymorphism (such as single nucleotide polymorphism, SNP) analysis, gene deletion analysis, determination of paternity, analysis of genetic differences among populations, forensic analysis, measuring predisposition to disease, quantitative analysis of mRNA, and detection and identification of infectious agents (such as bacteria, parasites, and viruses). The multiplex PCR methods can also be used for non-invasive prenatal testing, such as paternity testing or the detection of fetal chromosomal abnormalities.
Exemplary primer design methods
Highly multiplexed PCR can often result in a very high proportion of product DNA that comes from unproductive side reactions such as primer dimer formation. In one embodiment, the particular primers that are most likely to cause unproductive side reactions may be removed from the primer library to give a primer library that will result in a greater proportion of amplified DNA that maps to the genome. The step of removing problematic primers, that is, those primers that are particularly likely to form dimers, has unexpectedly enabled extremely high PCR multiplexing levels for subsequent analysis by sequencing. In systems such as sequencing, where performance degrades significantly due to primer dimers and/or other mischief products, multiplexing greater than 10 times, greater than 50 times, and greater than 100 times higher than other described multiplexing has been achieved. Note that this is in contrast to probe-based detection methods, such as microarrays, TAQMAN, PCR, etc., where an excess of primer dimers will not affect the outcome appreciably. Also note that the general belief in the art is that multiplex PCR for sequencing is limited to about 100 assays in the same well. Fluidigm and RainDance offer platforms to perform 48 or 1000 PCR assays in parallel reactions for one sample.
There are a number of ways to choose primers for a library such that the amount of non-mapping primer dimer or other primer mischief products is minimized. Empirical data indicate that a small number of 'bad' primers are responsible for a large amount of non-mapping primer dimer side reactions. Removing these 'bad' primers can increase the percentage of sequence reads that map to targeted loci. One way to identify the 'bad' primers is to look at the sequencing data of DNA that was amplified by targeted amplification; those primer dimers that are seen with the greatest frequency can be removed to give a primer library that is significantly less likely to result in side-product DNA that does not map to the genome. There are also publicly available programs that can calculate the binding energy of various primer combinations, and removing those combinations with the highest binding energy will also give a primer library that is significantly less likely to result in side-product DNA that does not map to the genome.
In some embodiments for selecting primers, an initial library of candidate primers is created by designing one or more primers or primer pairs to candidate target loci. A set of candidate target loci (such as SNPs) can be selected based on publicly available information about desired parameters for the target loci, such as the frequency of the SNPs within a target population or the heterozygosity rate of the SNPs. In one embodiment, the PCR primers may be designed using the Primer3 program (www.primer3.sourceforge.net; libprimer3 release 2.2.3, which is hereby incorporated by reference in its entirety). If desired, the primers can be designed to anneal within a particular annealing temperature range, have a particular range of GC contents, have a particular size range, produce target amplicons in a particular size range, and/or have other parameter characteristics. Starting with multiple primers or primer pairs per candidate target locus increases the likelihood that a primer or primer pair will remain in the library for most or all of the target loci. In one embodiment, the selection criteria may require that at least one primer pair per target locus remains in the library. That way, most or all of the target loci will be amplified when the final primer library is used. This is desirable for applications such as screening for deletions or duplications at a large number of locations in the genome, or screening for a large number of sequences (such as polymorphisms or other mutations) associated with a disease or an increased risk for a disease. If a primer pair from the library would produce a target amplicon that overlaps with a target amplicon produced by another primer pair, one of the primer pairs may be removed from the library to prevent interference.
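The last step above, removing one of two primer pairs whose target amplicons overlap, can be sketched as a simple interval check. This is a minimal illustration only: the `Amplicon` record, the coordinates, and the greedy keep-the-earliest-ending rule are assumptions made for the example and are not specified in this document.

```python
from typing import NamedTuple


class Amplicon(NamedTuple):
    pair_id: str   # hypothetical identifier of the primer pair
    chrom: str
    start: int     # inclusive
    end: int       # inclusive


def overlaps(a, b):
    """True if two amplicons share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def drop_overlapping(amplicons):
    """Greedy pass: keep an amplicon only if it overlaps none already kept."""
    kept = []
    for amp in sorted(amplicons, key=lambda x: (x.chrom, x.end)):
        if not any(overlaps(amp, k) for k in kept):
            kept.append(amp)
    return kept


example = [
    Amplicon("A", "chr1", 100, 180),
    Amplicon("B", "chr1", 170, 250),   # overlaps A
    Amplicon("C", "chr1", 260, 330),
    Amplicon("D", "chr2", 100, 180),   # same coordinates, other chromosome
]
print([amp.pair_id for amp in drop_overlapping(example)])
```

A production selection step would also need to honor the criterion that at least one primer pair per target locus remains, which this sketch does not track.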
In some embodiments, an "undesirability score" (a higher score representing lower desirability) is calculated (such as on a computer) for most or all of the possible combinations of two primers from the library of candidate primers. In various embodiments, an undesirability score is calculated for at least 80%, 90%, 95%, 98%, 99%, or 99.5% of the possible combinations of candidate primers in the library. Each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers. If desired, the undesirability score may also be based on one or more other parameters selected from the group consisting of the heterozygosity rate of the target locus, the disease prevalence associated with a sequence (e.g., a polymorphism) at the target locus, the disease penetrance associated with a sequence (e.g., a polymorphism) at the target locus, the specificity of the candidate primer for the target locus, the size of the candidate primer, the melting temperature of the target amplicon, the GC content of the target amplicon, the amplification efficiency of the target amplicon, and the size of the target amplicon. If multiple factors are considered, the undesirability score may be calculated based on a weighted average of the various parameters. The parameters may be assigned different weights based on their importance for the particular application in which the primers will be used. In some embodiments, the primer with the highest undesirability score is removed from the library. If the removed primer is a member of a primer pair that hybridizes to one target locus, then the other member of the primer pair may be removed from the library. The process of removing primers may be repeated as desired. In some embodiments, the selection method is performed until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below a minimum threshold. In some embodiments, the selection method is performed until the number of candidate primers remaining in the library is reduced to a desired number.
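The selection loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this document: pair scores are supplied in a dictionary, and when the worst pair is found, the primer removed is the one whose summed score against the rest of the library is larger (a tie-breaking rule chosen here only for the example). All names and numbers are hypothetical.

```python
from itertools import combinations


def weighted_score(params, weights):
    """Weighted average of the parameters that feed one pair's score."""
    total_weight = sum(weights[name] for name in params)
    return sum(params[name] * weights[name] for name in params) / total_weight


def prune_by_highest_score(primers, pair_score, partner, threshold):
    """Remove primers until every remaining pair scores <= threshold.

    primers:    iterable of primer ids
    pair_score: dict mapping frozenset({p, q}) to an undesirability score
    partner:    dict mapping a primer id to the other member of its pair
    """
    library = set(primers)

    def score(p, q):
        return pair_score.get(frozenset((p, q)), 0.0)

    def load(p):
        # Summed score of p against everything still in the library.
        return sum(score(p, q) for q in library if q != p)

    while True:
        bad = [(score(p, q), p, q)
               for p, q in combinations(sorted(library), 2)
               if score(p, q) > threshold]
        if not bad:
            return library
        _, p, q = max(bad)
        # Assumption: within the worst pair, drop the primer with the larger load.
        worst = p if load(p) >= load(q) else q
        library.discard(worst)
        library.discard(partner.get(worst))  # its pair partner goes too


partner = {"F1": "R1", "R1": "F1", "F2": "R2", "R2": "F2", "F3": "R3", "R3": "F3"}
pair_score = {
    frozenset(("F1", "R2")): 0.9,
    frozenset(("F1", "F3")): 0.8,
    frozenset(("F2", "R3")): 0.2,
}
remaining = prune_by_highest_score(list(partner), pair_score, partner, threshold=0.5)
print(sorted(remaining))
```

In this toy library, primer F1 is involved in both high-scoring pairs, so removing F1 and its partner R1 clears every interaction above the threshold while leaving two of the three target loci covered.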
In various embodiments, after the undesirability scores are calculated, the candidate primer that is part of the greatest number of combinations of two candidate primers with an undesirability score above a first minimum threshold is removed from the library. This step ignores interactions equal to or below the first minimum threshold, since these interactions are less significant. If the removed primer is a member of a primer pair that hybridizes to one target locus, then the other member of the primer pair may be removed from the library. The process of removing primers may be repeated as desired. In some embodiments, the selection method is performed until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the first minimum threshold. If the number of candidate primers remaining in the library is higher than desired, the number of primers may be reduced by decreasing the first minimum threshold to a lower second minimum threshold and repeating the process of removing primers. If the number of candidate primers remaining in the library is lower than desired, the method can be continued by increasing the first minimum threshold to a higher second minimum threshold and repeating the process of removing primers using the original candidate primer library, thereby allowing more of the candidate primers to remain in the library. In some embodiments, the selection method is performed until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the second minimum threshold, or until the number of candidate primers remaining in the library is reduced to a desired number.
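The count-based variant and the threshold adjustment described above can be sketched as follows. This is a simplified illustration: instead of lowering or raising the threshold adaptively, the hypothetical `fit_library_size` helper tries a supplied list of thresholds from strictest to most lenient, restarting from the original candidate library each time, and returns the first result that retains at least the desired number of primers. All names and scores are assumptions made for the example.

```python
def prune_by_interaction_count(primers, pair_score, partner, threshold):
    """Repeatedly remove the primer involved in the most above-threshold pairs."""
    library = set(primers)
    while True:
        counts = {p: 0 for p in library}
        for pair, score in pair_score.items():
            # Only pairs whose members are both still in the library count.
            if score > threshold and pair <= library:
                for p in pair:
                    counts[p] += 1
        worst = max(sorted(counts), key=counts.get, default=None)
        if worst is None or counts[worst] == 0:
            return library
        library.discard(worst)
        library.discard(partner.get(worst))  # remove the pair partner as well


def fit_library_size(primers, pair_score, partner, thresholds, target_size):
    """Return (threshold, library) for the strictest threshold that keeps
    at least target_size primers, or None if no threshold does."""
    primers = list(primers)
    for threshold in sorted(thresholds):
        library = prune_by_interaction_count(primers, pair_score, partner, threshold)
        if len(library) >= target_size:
            return threshold, library
    return None


partner = {"F1": "R1", "R1": "F1", "F2": "R2", "R2": "F2", "F3": "R3", "R3": "F3"}
pair_score = {
    frozenset(("F1", "R2")): 0.9,
    frozenset(("F1", "F3")): 0.8,
    frozenset(("F2", "R3")): 0.6,
    frozenset(("F2", "F3")): 0.3,
}
strict = prune_by_interaction_count(partner, pair_score, partner, threshold=0.5)
fitted = fit_library_size(partner, pair_score, partner, [0.5, 0.7, 0.95], target_size=4)
print(sorted(strict), fitted)
```

Counting interactions rather than taking the single highest score targets primers that interact with many others, which is consistent with the observation that a small number of primers account for most dimer formation.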
If desired, primer pairs that produce a target amplicon that overlaps with a target amplicon produced by another primer pair can be divided into separate amplification reactions. Multiple PCR amplification reactions may be desirable for applications in which all of the candidate target loci need to be analyzed (instead of omitting candidate target loci from the analysis due to overlapping target amplicons).
These selection methods minimize the number of candidate primers that have to be removed from the library to achieve the desired reduction in primer dimers. Because a smaller number of candidate primers is removed from the library, more (or all) of the target loci can be amplified using the resulting primer library.
Multiplexing large numbers of primers imposes considerable constraints on the assays that can be included. Assays that unintentionally interact result in spurious amplification products. The size constraints of mini-PCR may result in further constraints. In one embodiment, it is possible to begin with a very large number of potential SNP targets (between about 500 and greater than 1 million) and attempt to design primers to amplify each SNP. Where primers can be designed, it is possible to attempt to identify primer pairs likely to form spurious products by evaluating the likelihood of spurious primer duplex formation between all possible pairs of primers, using published thermodynamic parameters for DNA duplex formation. Primer interactions may be ranked by a scoring function related to the interaction, and primers with the worst interaction scores are eliminated until the desired number of primers is met. In cases where SNPs likely to be heterozygous are most useful, it is also possible to rank the list of assays and select the most heterozygous compatible assays. Experiments have validated that primers with high interaction scores are the most likely to form primer dimers. At high multiplexing it is not possible to eliminate all spurious interactions, but it is essential to remove the primers or pairs of primers with the highest interaction scores in silico, as they can dominate an entire reaction and greatly limit amplification from the intended targets. We have performed this procedure to create multiplex primer sets of up to, and in some cases more than, 10,000 primers. The improvement due to this procedure is substantial, enabling amplification of more than 80%, more than 90%, more than 95%, more than 98%, and even more than 99% on-target products, as determined by sequencing of all PCR products, compared to 10% from a reaction in which the worst primers were not removed. When combined with a partial semi-nested approach as previously described, more than 90%, and even more than 95%, of amplicons may map to the targeted sequences.
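The ranking step above can be illustrated with a deliberately crude stand-in for the interaction score. The sketch below does not use thermodynamic parameters; as a labeled simplification, it scores a pair of primers by the longest 3'-terminal stretch of one primer that is perfectly complementary to some part of the other, and then ranks all pairs. The function names and the sequences are hypothetical.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_overlap(a, b):
    """Length of the longest 3'-terminal stretch of `a` that can pair with `b`."""
    target = reverse_complement(b)
    a = a.upper()
    best = 0
    for k in range(1, len(a) + 1):
        if a[-k:] in target:
            best = k
        else:
            break  # a longer suffix contains this one, so it cannot match either
    return best


def interaction_score(a, b):
    """Crude, symmetric proxy for dimer propensity (not a free-energy value)."""
    return max(three_prime_overlap(a, b), three_prime_overlap(b, a))


def rank_pairs(primers):
    """Return (score, id1, id2) for every pair, worst interactions first."""
    scored = [(interaction_score(primers[p], primers[q]), p, q)
              for p, q in combinations(sorted(primers), 2)]
    return sorted(scored, reverse=True)


primers = {
    "a": "AAAAAAGGCC",
    "b": "TTTTTTGGCC",
    "c": "AAAAAAAAAA",
}
for score, p, q in rank_pairs(primers):
    print(p, q, score)
```

A score based on nearest-neighbor free energies, as the text describes, would replace `interaction_score`; the ranking and elimination logic around it would stay the same.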
Note that there are other methods for determining which PCR probes are likely to form dimers. In one embodiment, analysis of a pool of DNA that has been amplified using a non-optimized set of primers may be sufficient to determine the problematic primers. For example, the analysis may be done using sequencing, and the primers whose dimers are present in the greatest number are determined to be those most likely to form dimers and may be removed.
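The empirical approach above amounts to counting which primers appear most often in sequenced dimer products. A minimal sketch follows, assuming the reads have already been classified as primer dimers and annotated with the two primers involved; that input format, the function name, and the primer identifiers are hypothetical.

```python
from collections import Counter


def most_frequent_dimer_primers(dimer_reads, top_n):
    """Return the top_n primers ranked by how many dimer reads involve them.

    dimer_reads: iterable of (primer_a, primer_b) tuples, one per dimer read.
    """
    # Treat (a, b) and (b, a) as the same dimer.
    pair_counts = Counter(frozenset(pair) for pair in dimer_reads)
    primer_counts = Counter()
    for pair, n in pair_counts.items():
        for primer in pair:
            primer_counts[primer] += n
    return [primer for primer, _ in primer_counts.most_common(top_n)]


# Hypothetical annotated dimer reads from a non-optimized primer pool.
reads = ([("P7", "P19")] * 5 + [("P19", "P7")] * 3
         + [("P7", "P42")] * 2 + [("P3", "P8")])
print(most_frequent_dimer_primers(reads, top_n=2))
```

The primers returned would be candidates for removal before the library is re-tested.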
This method has a number of potential applications, for example SNP genotyping, heterozygosity rate determination, copy number measurement, and other targeted sequencing applications. In one embodiment, the primer design method may be used in combination with the mini-PCR method described elsewhere in this document. In some embodiments, the primer design method may be used as part of a massively multiplexed PCR method.
The use of tags on the primers may reduce amplification and sequencing of primer dimer products. In some embodiments, the primer contains an internal region that forms a loop structure with a tag. In particular embodiments, the primers include a 5' region that is specific for a target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In some embodiments, the loop region may lie between two binding regions, where the two binding regions are designed to bind to contiguous or neighboring regions of the template DNA. In various embodiments, the length of the 3' region is at least 7 nucleotides. In some embodiments, the length of the 3' region is between 7 and 20 nucleotides, such as between 7 and 15 nucleotides or 7 and 10 nucleotides, inclusive. In various embodiments, the primers include a 5' region that is not specific for a target locus (such as a tag or a universal primer binding site), followed by a region that is specific for the target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. Tag-primers can be used to shorten the necessary target-specific sequences to below 20, below 15, below 12, and even below 10 base pairs. This can occur serendipitously with standard primer design when the target sequence is fragmented within the primer binding site, or it can be designed into the primers. Advantages of this method include that it increases the number of assays that can be designed for a given maximum amplicon length, and that it shortens the "non-informative" sequencing of primer sequence. It may also be used in combination with internal tagging (see elsewhere in this document).
In one embodiment, the relative amount of nonproductive products in multiplexed targeted PCR amplification can be reduced by raising the annealing temperature. When amplifying libraries that carry the same tag as the target-specific primers, the annealing temperature can be increased relative to genomic DNA, because the tags will contribute to primer binding. In some embodiments, we use considerably lower primer concentrations than previously reported, together with longer annealing times than reported elsewhere. In some embodiments, the annealing time may be longer than 3 minutes, longer than 5 minutes, longer than 8 minutes, longer than 10 minutes, longer than 15 minutes, longer than 20 minutes, longer than 30 minutes, longer than 60 minutes, longer than 120 minutes, longer than 240 minutes, longer than 480 minutes, and even longer than 960 minutes. In one embodiment, annealing times longer than those in previous reports are used, which allows lower primer concentrations. In various embodiments, longer than normal extension times are used, such as more than 3, 5, 8, 10, or 15 minutes. In some embodiments, the primer concentrations are as low as 50 nM, 20 nM, 10 nM, 5 nM, 1 nM, and lower than 1 μM. This surprisingly results in robust performance for highly multiplexed reactions, for example 1,000-plex reactions, 2,000-plex reactions, 5,000-plex reactions, 10,000-plex reactions, 20,000-plex reactions, 50,000-plex reactions, and even 100,000-plex reactions. In one embodiment, the amplification uses one, two, three, four, or five cycles run with long annealing times, followed by PCR cycles with more usual annealing times using tagged primers.
To select target locations, one may start with a pool of candidate primer pair designs, create a thermodynamic model of potentially adverse interactions between primer pairs, and then use the model to eliminate designs that are incompatible with the other designs in the pool.
After the selection process, the primers remaining in the library may be used in any of the methods of the invention.
Exemplary primer libraries
In one aspect, the invention features libraries of primers, such as primers selected from a library of candidate primers using any of the methods of the invention. In some embodiments, the library includes primers that simultaneously hybridize (or are capable of simultaneously hybridizing) to, or that simultaneously amplify (or are capable of simultaneously amplifying), at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci in one reaction volume. In various embodiments, the library includes primers that simultaneously amplify (or are capable of simultaneously amplifying) between 1,000 to 2,000; 2,000 to 5,000; 5,000 to 7,500; 7,500 to 10,000; 10,000 to 20,000; 20,000 to 25,000; 25,000 to 30,000; 30,000 to 40,000; 40,000 to 50,000; 50,000 to 75,000; or 75,000 to 100,000 different target loci in one reaction volume, inclusive. In various embodiments, the library includes primers that simultaneously amplify (or are capable of simultaneously amplifying) between 1,000 to 100,000 different target loci in one reaction volume, such as between 1,000 to 50,000; 1,000 to 30,000; 1,000 to 20,000; 1,000 to 10,000; 2,000 to 30,000; 2,000 to 20,000; 2,000 to 10,000; 5,000 to 30,000; 5,000 to 20,000; or 5,000 to 10,000 different target loci, inclusive. In some embodiments, the library includes primers that simultaneously amplify (or are capable of simultaneously amplifying) the target loci in one reaction volume such that less than 60%, 40%, 30%, 20%, 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.25%, 0.1%, or 0.5% of the amplified products are primer dimers. In various embodiments, the amount of amplified products that are primer dimers is between 0.5% to 60%, such as between 0.1% to 40%, 0.1% to 20%, 0.25% to 20%, 0.25% to 10%, 0.5% to 20%, 0.5% to 10%, 1% to 20%, or 1% to 10%, inclusive. In some embodiments, the primers simultaneously amplify (or are capable of simultaneously amplifying) the target loci in one reaction volume such that at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the amplified products are target amplicons. In various embodiments, the amount of amplified products that are target amplicons is between 50% to 99.5%, such as between 60% to 99%, 70% to 98%, 80% to 98%, 90% to 99.5%, or 95% to 99.5%, inclusive. In some embodiments, the primers simultaneously amplify (or are capable of simultaneously amplifying) the target loci in one reaction volume such that at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target loci are amplified. In various embodiments, the amount of target loci that are amplified is between 50% to 99.5%, such as between 60% to 99%, 70% to 98%, 80% to 99%, 90% to 99.5%, 95% to 99.9%, or 98% to 99.99%, inclusive. In some embodiments, the library of primers includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 primer pairs, wherein each pair of primers includes a forward test primer and a reverse test primer, and wherein each pair of test primers hybridizes to one target locus. In some embodiments, the library of primers includes at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 individual primers that each hybridize to a different target locus, wherein the individual primers are not part of primer pairs.
In various embodiments, the concentration of each primer is less than 100, 75, 50, 25, 20, 10, 5, 2, or 1 nM, or less than 500, 100, 10, or 1 μM. In various embodiments, the concentration of each primer is between 1 μM to 100 nM, such as between 1 μM to 1 nM, 1 to 75 nM, 2 to 50 nM, or 5 to 50 nM, inclusive. In various embodiments, the GC content of the primers is between 30% to 80%, such as between 40% to 70% or 50% to 60%, inclusive. In some embodiments, the range of GC content of the primers (the spread across the library) is less than 30%, 20%, 10%, or 5%. In some embodiments, the range of GC content of the primers is between 5% to 30%, such as between 5% to 20% or 5% to 10%, inclusive. In some embodiments, the melting temperature (Tm) of the test primers is between 40 °C to 80 °C, such as between 50 °C to 70 °C, 55 °C to 65 °C, or 57 °C to 60.5 °C, inclusive. In some embodiments, the Tm is calculated using the Primer3 program (libprimer3 release 2.2.3) with the built-in SantaLucia parameters (www.primer3.sourceforge.net). In some embodiments, the range of melting temperatures of the primers is less than 15 °C, 10 °C, 5 °C, 3 °C, or 1 °C. In some embodiments, the range of melting temperatures of the primers is between 1 °C to 15 °C, such as between 1 °C to 10 °C, 1 °C to 5 °C, or 1 °C to 3 °C, inclusive. In some embodiments, the length of the primers is between 15 to 100 nucleotides, such as between 15 to 75 nucleotides, 15 to 40 nucleotides, 17 to 35 nucleotides, 18 to 30 nucleotides, or 20 to 65 nucleotides, inclusive. In some embodiments, the range of the length of the primers is less than 50, 40, 30, 20, 10, or 5 nucleotides. In some embodiments, the range of the length of the primers is between 5 to 50 nucleotides, such as between 5 to 40 nucleotides, 5 to 20 nucleotides, or 5 to 10 nucleotides, inclusive. In some embodiments, the length of the target amplicons is between 50 and 100 nucleotides, such as between 60 and 80 nucleotides or 60 to 75 nucleotides, inclusive. In some embodiments, the range of the length of the target amplicons is less than 50, 25, 15, 10, or 5 nucleotides. In some embodiments, the range of the length of the target amplicons is between 5 to 50 nucleotides, such as between 5 to 25 nucleotides, 5 to 15 nucleotides, or 5 to 10 nucleotides, inclusive.
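The per-primer windows and library-wide ranges listed above lend themselves to a simple screening step, sketched below under stated assumptions. The function names are hypothetical, and the melting temperature uses the Wallace rule (2 °C per A/T, 4 °C per G/C) purely as a dependency-free stand-in; it is not the Primer3/SantaLucia calculation named in the text and will give different values.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Crude Tm estimate in degrees C (Wallace rule); an assumption used only
    to keep the sketch self-contained."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def passes_filters(seq, gc_range=(0.30, 0.80), tm_range=(40, 80),
                   len_range=(15, 100)):
    """True if one primer falls inside the broadest windows from the text."""
    return (len_range[0] <= len(seq) <= len_range[1]
            and gc_range[0] <= gc_fraction(seq) <= gc_range[1]
            and tm_range[0] <= wallace_tm(seq) <= tm_range[1])


def library_spread(primers):
    """Max minus min of Tm, GC fraction and length across a primer library."""
    tms = [wallace_tm(p) for p in primers]
    gcs = [gc_fraction(p) for p in primers]
    lens = [len(p) for p in primers]
    return {"tm": max(tms) - min(tms),
            "gc": max(gcs) - min(gcs),
            "length": max(lens) - min(lens)}
```

`passes_filters` checks the individual windows, while `library_spread` reports the "range" quantities (for example, a Tm spread of less than 5 °C) that the embodiments place on the library as a whole.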
These primer libraries may be used in any of the methods of the invention.
Exemplary primer kits
In one aspect, the invention features a kit (such as a kit for amplifying target loci in a nucleic acid sample) that includes any of the primer libraries of the invention. In some embodiments, a kit may be formulated that comprises a plurality of primers designed to carry out the methods described in this disclosure. The primers may be outer forward and reverse primers or inner forward and reverse primers as disclosed herein; they may be primers designed to have low binding affinity to the other primers in the kit, as disclosed in the section on primer design; they may be hybrid capture probes or pre-circularized probes as described in the relevant sections; or some combination thereof. In one embodiment, a kit may be formulated for use with the methods disclosed herein to determine the ploidy status of a target chromosome in a gestating fetus, the kit comprising a plurality of inner forward primers and optionally a plurality of inner reverse primers, and optionally outer forward primers and outer reverse primers, where each of the primers is designed to hybridize to the region of DNA immediately upstream and/or downstream of one of the target sites (e.g., polymorphic sites) on the target chromosome, and optionally on additional chromosomes. In one embodiment, the primer kit may be used in combination with the diagnostic box described elsewhere in this document. In some embodiments, the kit includes instructions for using the library to amplify the target loci.
Exemplary multiplex PCR methods
In one aspect, the invention features methods of amplifying target loci in a nucleic acid sample that involve (i) contacting the nucleic acid sample with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci to produce a reaction mixture; and (ii) subjecting the reaction mixture to primer extension reaction conditions (such as PCR conditions) to produce amplified products that include target amplicons. In some embodiments, the method also includes determining the presence or absence of at least one target amplicon (such as at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target amplicons). In some embodiments, the method also includes determining the sequence of at least one target amplicon (such as at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target amplicons). In some embodiments, at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target loci are amplified. In various embodiments, less than 60%, 50%, 40%, 30%, 20%, 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.25%, 0.1%, or 0.05% of the amplified products are primer dimers.
In one embodiment, a method disclosed herein uses highly efficient, highly multiplexed targeted PCR to amplify DNA, followed by high-throughput sequencing to determine the allele frequencies at each target locus. The ability to multiplex more than about 50 or 100 PCR primers in one reaction volume in such a way that most of the resulting sequence reads map to targeted loci is novel and non-obvious. One technique that allows highly multiplexed targeted PCR to perform in a highly efficient manner involves designing primers that are unlikely to hybridize with one another. The PCR probes, typically referred to as primers, are selected by creating a thermodynamic model of potentially adverse interactions between at least 500; at least 1,000; at least 2,000; at least 5,000; at least 7,500; at least 10,000; at least 20,000; at least 25,000; at least 30,000; at least 40,000; at least 50,000; at least 75,000; or at least 100,000 potential primer pairs, or of unintended interactions between primers and sample DNA, and then using the model to eliminate designs that are incompatible with the other designs in the pool. Another technique that allows highly multiplexed targeted PCR to perform in a highly efficient manner is the use of a partial or full nesting approach to the targeted PCR. Using one or a combination of these approaches allows multiplexing of at least 300, at least 800, at least 1,200, at least 4,000, or at least 10,000 primers in a single pool, with the resulting amplified DNA comprising a majority of DNA molecules that, when sequenced, will map to targeted loci. Using one or a combination of these approaches allows multiplexing of a large number of primers in a single pool, with the resulting amplified DNA comprising more than 50%, more than 60%, more than 67%, more than 80%, more than 90%, more than 95%, more than 96%, more than 97%, more than 98%, more than 99%, or more than 99.5% DNA molecules that map to targeted loci.
In some embodiments, the detection of the target genetic material may be done in a multiplexed fashion. The number of genetic target sequences that may be run in parallel can range from 1 to 10; 10 to 100; 100 to 1,000; 1,000 to 10,000; 10,000 to 100,000; 100,000 to 1,000,000; or 1,000,000 to 10,000,000. Prior attempts to multiplex more than 100 primers per pool have resulted in significant problems with unwanted side reactions, such as primer-dimer formation.
Targeted PCR
In some embodiments, PCR can be used to target specific locations in the genome. In plasma samples, the original DNA is highly fragmented (typically less than 500 bp, with an average length of less than 200 bp). In PCR, both the forward and reverse primers must anneal to the same fragment to enable amplification. Therefore, if the fragments are short, the PCR assays must amplify relatively short regions as well. As with MIPs, if the polymorphic positions are too close to the polymerase binding site, this can bias the amplification of different alleles. Currently, PCR primers that target polymorphic regions, such as those containing SNPs, are typically designed such that the 3' end of the primer hybridizes to the base immediately adjacent to the polymorphic base or bases. In one embodiment of the present disclosure, the 3' ends of both the forward and reverse PCR primers are designed to hybridize to bases that are one or a few positions away from the variant positions (polymorphic sites) of the targeted allele. The number of bases between the polymorphic site (SNP or otherwise) and the base to which the 3' end of the primer is designed to hybridize may be one base, two bases, three bases, four bases, five bases, six bases, seven to ten bases, eleven to fifteen bases, or sixteen to twenty bases. The forward and reverse primers may be designed to hybridize a different number of bases away from the polymorphic site.
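A small helper can make the spacing rule above concrete. The sketch assumes 0-based inclusive reference coordinates and a hypothetical function name; it simply counts the bases lying between a primer's 3' end and the polymorphic site, so that a design can be checked against the "one to twenty bases" offsets described in the text.

```python
def bases_between(primer_start, primer_end, snp_pos, strand):
    """Number of reference bases between the primer's 3' end and the SNP.

    Coordinates are 0-based and inclusive (an assumption of this sketch).
    strand "+": forward primer, 3' end at primer_end, SNP lies downstream.
    strand "-": reverse primer, 3' end at primer_start, SNP lies upstream.
    A result of 0 means the 3' end is immediately adjacent to the SNP.
    """
    if strand == "+":
        gap = snp_pos - primer_end - 1
    elif strand == "-":
        gap = primer_start - snp_pos - 1
    else:
        raise ValueError("strand must be '+' or '-'")
    if gap < 0:
        raise ValueError("primer overlaps or lies past the polymorphic site")
    return gap


def offset_ok(gap, min_gap=1, max_gap=20):
    """True if the spacing falls in the one-to-twenty-base window."""
    return min_gap <= gap <= max_gap
```

For example, a forward primer covering positions 100 to 119 and a reverse primer covering 130 to 149 around a SNP at position 122 leave 2 and 7 intervening bases respectively, illustrating that the two primers need not sit at the same distance from the site.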
PCR assays can be generated in large numbers; however, interactions between different PCR assays make it difficult to multiplex them beyond about one hundred assays. Various complex molecular approaches can be used to increase the level of multiplexing, but it may still be limited to fewer than 100, perhaps 200, or possibly 500 assays per reaction. Samples with large quantities of DNA can be split among multiple sub-reactions and then recombined before sequencing. For samples in which either the overall sample or some subpopulation of DNA molecules is limited, splitting the sample would introduce statistical noise. In one embodiment, a small or limited quantity of DNA may refer to an amount below 10 pg, between 10 and 100 pg, between 100 pg and 1 ng, between 1 and 10 ng, or between 10 and 100 ng. Note that although this method is particularly useful for small amounts of DNA, where other methods that involve splitting into multiple pools can cause significant problems related to the stochastic noise they introduce, it still provides the benefit of minimizing bias when run on samples containing any quantity of DNA. In these situations, a universal pre-amplification step may be used to increase the overall sample quantity. Ideally, this pre-amplification step should not appreciably alter the allelic distributions.
In one embodiment, a method of the present disclosure can generate, from limited samples such as single cells or DNA from body fluids, PCR products that are specific to a large number of targeted loci, specifically 1,000 to 5,000 loci, 5,000 to 10,000 loci, or more than 10,000 loci, for genotyping by sequencing or by some other genotyping method. Currently, performing multiplex PCR reactions with more than 5 to 10 targets presents a major challenge and is often hindered by primer side products, such as primer dimers, and other artifacts. When target sequences are detected using microarrays with hybridization probes, primer dimers and other artifacts may be ignored because they are not detected. However, when sequencing is used as the detection method, the vast majority of the sequencing reads would sequence such artifacts rather than the desired target sequences in the sample. Methods described in the prior art that multiplex more than 50 or 100 reactions in one reaction volume and then sequence the products typically yield more than 20%, often more than 50%, in many cases more than 80%, and in some cases more than 90% off-target sequence reads.
In general, to perform targeted sequencing of multiple targets of a sample (more than 50, more than 100, more than 500, or more than 1,000), one can split the sample into a number of parallel reactions that each amplify one individual target. This has been performed in PCR multiwell plates, or can be done on commercial platforms such as the FLUIDIGM ACCESS ARRAY (48 reactions per sample in microfluidic chips) or DROPLET PCR by RAIN DANCE TECHNOLOGY (hundreds to a few thousand targets). Unfortunately, these split-and-pool methods are problematic for samples containing a limited amount of DNA, because there are often not enough copies of the genome to ensure that each well contains one copy of each region of the genome. This is an especially severe problem when polymorphic loci are targeted and the relative proportions of the alleles at those loci are needed, because the stochastic noise introduced by splitting and pooling makes measurements of the allele proportions present in the original DNA sample very inaccurate. Described here is a method for effectively and efficiently amplifying many PCR reactions that is applicable to cases where only a limited amount of DNA is available. In one embodiment, the method may be applied to the analysis of single cells, body fluids, mixtures of DNA (such as the free-floating DNA found in maternal plasma), biopsies, and environmental and/or forensic samples.
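The copy-number problem with split-and-pool designs can be illustrated with a short calculation. This is an idealized sketch, not a model taken from the disclosure: it assumes each copy of a locus lands independently and uniformly in one of the wells, and the example figure of 100 copies is hypothetical (48 wells is borrowed from the platform mentioned above).

```python
def prob_well_lacks_region(n_copies, n_wells):
    """Probability that a given well receives no copy of a given locus when
    n_copies are distributed independently and uniformly over n_wells."""
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    return (1 - 1 / n_wells) ** n_copies


def expected_empty_wells(n_copies, n_wells):
    """Expected number of wells with no template for that locus."""
    return n_wells * prob_well_lacks_region(n_copies, n_wells)


# Hypothetical example: 100 copies of a locus split across 48 wells.
p_missing = prob_well_lacks_region(100, 48)
```

Under these assumptions roughly one well in eight would contain no template for the locus, and the wells that do contain it hold only a few copies each, which is the source of the allele-ratio noise described in the text.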
In one embodiment, target order-checking can relate in following steps one, multiple or all.A) to produce and amplification has the storehouse of adapter sequence on DNA fragmentation two ends.B) after the amplification of storehouse, multiple reaction is divided into.C) produce and optionally increase there is the storehouse of adapter sequence on DNA fragmentation two ends.D) each target uses 1000 to 10,000 of target selected by desired specificities " forward " primer and the execution of a mark Auele Specific Primer heavily to increase.E) use " oppositely " desired specificities primer and one (or more) to have specific primer to the common tags introduced with the form of a part for the desired specificities forward primer in the first round, increase from this product execution second.F) the 1000 heavy pre-amplifications performing selected target continue a limited quantity cycle.G) product is divided into multiple aliquots containig and in the independent reaction (such as, 50 to 500 weights) amplification target subpool, but this can use until substance always.H) product of parallel subpool reaction is merged.I) during these amplifications, primer can carry out compatibility mark (part or total length) order-checking to make it possible to check order to product.
Highly multiplexed PCR
Disclosed herein are methods that permit the targeted amplification of more than a hundred to tens of thousands of target sequences (e.g., SNP loci) from a nucleic acid sample, such as genomic DNA obtained from plasma. The amplified sample may be relatively free of primer dimer products and have low allelic bias at the target loci. If sequencing-compatible adaptors are appended to the products during or after amplification, the products can be analyzed by sequencing.
Performing highly multiplexed PCR amplification using methods known in the art generates primer dimer products in excess of the desired amplification products, which makes the result unsuitable for sequencing. These products can be reduced empirically by eliminating the primers that form them, or by performing in silico selection of primers. However, the larger the number of assays, the more difficult this problem becomes.
One solution is to split a 5,000-plex reaction into several lower-plex amplifications, for example one hundred 50-plex or fifty 100-plex reactions, or to use microfluidics, or even to split the sample into individual PCR reactions. However, if the sample DNA is limited, as in non-invasive prenatal diagnostics from pregnancy plasma, dividing the sample among multiple reactions should be avoided because it will create a bottleneck.
Described herein are methods that first globally amplify the plasma DNA of a sample and then divide the sample into multiple multiplexed target enrichment reactions, each with a more moderate number of target sequences. In one embodiment, a method of the present disclosure can be used for preferentially enriching a DNA mixture at a plurality of loci, the method comprising one or more of the following steps: generating and amplifying a library from a mixture of DNA, where the molecules in the library have adaptor sequences ligated to both ends of the DNA fragments; dividing the amplified library into multiple reactions; and performing a first round of multiplex amplification of selected targets using one target-specific "forward" primer per target and one or a plurality of adaptor-specific universal "reverse" primers. In one embodiment, a method of the present disclosure further includes performing a second amplification using "reverse" target-specific primers and one or a plurality of primers specific to a universal tag that was introduced as part of the target-specific forward primers in the first round. In one embodiment, the method may involve a fully nested, hemi-nested, semi-nested, one-sided fully nested, one-sided hemi-nested, or one-sided semi-nested PCR approach. In one embodiment, a method of the present disclosure is used for preferentially enriching a DNA mixture at a plurality of loci, the method comprising performing a multiplex pre-amplification of selected targets for a limited number of cycles, dividing the product into multiple aliquots and amplifying subpools of targets in individual reactions, and pooling the products of the parallel subpool reactions. Note that this approach could be used to perform targeted amplification in a manner that results in low levels of allelic bias for 50 to 500 loci, for 500 to 5,000 loci, for 5,000 to 50,000 loci, or even for 50,000 to 500,000 loci. In one embodiment, the primers carry partial or full-length sequencing-compatible tags.
The workflow may entail (1) extracting DNA, such as plasma DNA; (2) preparing a fragment library with universal adaptors on both ends of the fragments; (3) amplifying the library using universal primers specific to the adaptors; (4) dividing the amplified sample "library" into multiple aliquots; (5) performing multiplex amplifications on the aliquots (e.g., about 100-plex, 1,000-plex, or 10,000-plex, with one target-specific primer per target and a tag-specific primer); (6) pooling the aliquots of one sample; (7) barcoding the sample; (8) mixing the samples and adjusting the concentration; and (9) sequencing the sample. The workflow may comprise multiple sub-steps within one of the listed steps (e.g., step (2), preparing the library, could entail three enzymatic steps (blunt ending, dA tailing, and adaptor ligation) and three purification steps). Steps of the workflow may be combined, divided, or performed in a different order (e.g., barcoding and pooling of samples).
It is important to note that library amplification can be performed in such a way that it is biased toward amplifying short fragments more efficiently. In this manner it is possible to preferentially amplify shorter sequences, for example mono-nucleosomal DNA fragments such as the cell-free fetal DNA (of placental origin) found in the circulation of pregnant women. Note that PCR assays can carry tags, for example sequencing tags (usually a truncated form of 15 to 25 bases). After multiplexing, the PCR multiplexes of a sample are pooled and the tags are then completed (including barcoding) by a tag-specific PCR (this could also be done by ligation). Alternatively, the full sequencing tags can be added in the same reaction as the multiplexing. In the first cycles, targets may be amplified with the target-specific primers; subsequently the tag-specific primers take over to complete the SQ-adaptor sequence. The PCR primers may also carry no tags, in which case the sequencing tags may be appended to the amplification products by ligation.
In one embodiment, highly multiplexed PCR followed by evaluation of the amplified material by clonal sequencing may be used for various applications, such as the detection of fetal aneuploidy. Whereas traditional multiplex PCRs evaluate up to fifty loci simultaneously, the approach described herein may be used to enable simultaneous evaluation of more than 50 loci, more than 100 loci, more than 500 loci, more than 1,000 loci, more than 5,000 loci, more than 10,000 loci, more than 50,000 loci, and more than 100,000 loci. Experiments have shown that up to, including, and more than 10,000 distinct loci can be evaluated simultaneously in a single reaction with sufficiently good efficiency and specificity to make non-invasive prenatal aneuploidy diagnoses and/or copy number calls with high accuracy. Assays may be combined in a single reaction with the entirety of a sample, such as a cfDNA sample isolated from maternal plasma, a fraction thereof, or a further processed derivative of the cfDNA sample. The sample (e.g., cfDNA or a derivative) may also be split into multiple parallel multiplex reactions. The optimum sample splitting and level of multiplexing are determined by trading off various performance specifications. Because the amount of material is limited, splitting the sample into multiple fractions can introduce sampling noise, add handling time, and increase the possibility of error. Conversely, higher multiplexing can result in greater amounts of spurious amplification and greater inequalities in amplification, both of which can reduce test performance.
Two crucial, related considerations in applying the methods described herein are the limited amount of original sample (e.g., plasma) and the number of original molecules in that material from which allele frequency or other measurements are obtained. If the number of original molecules falls below a certain level, random sampling noise becomes significant and can affect the accuracy of the test. Typically, data of sufficient quality for making non-invasive prenatal aneuploidy diagnoses can be obtained if measurements are made on a sample comprising the equivalent of 500 to 1,000 original molecules per target locus. There are a number of ways of increasing the number of distinct measurements, for example increasing the sample volume. Each manipulation applied to the sample also potentially results in loss of material. It is essential to characterize the losses incurred by the various manipulations and to avoid them, or where necessary to improve the yield of certain manipulations, so that losses do not degrade the performance of the test.
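The dependence of sampling noise on the number of original molecules can be made concrete with a binomial approximation. This is a textbook idealization offered as a sketch, not a noise model stated in the disclosure: it treats each original molecule at a locus as an independent draw of one of two alleles, and the function name is hypothetical.

```python
import math


def allele_fraction_sd(p, n_molecules):
    """Standard deviation of an observed allele fraction when n_molecules
    are sampled independently and the true fraction is p (binomial model)."""
    if n_molecules <= 0:
        raise ValueError("n_molecules must be positive")
    return math.sqrt(p * (1 - p) / n_molecules)


# Spread expected at a heterozygous locus (p = 0.5) for the molecule counts
# mentioned in the text, plus a much smaller count for comparison.
for n in (50, 500, 1000):
    print(n, round(allele_fraction_sd(0.5, n), 4))
```

Under this model the noise shrinks with the square root of the molecule count, so going from 500 to 1,000 molecules per locus reduces the standard deviation from about 0.022 to about 0.016, while a tenfold loss of material roughly triples it.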
In one embodiment, it is possible to mitigate potential losses in subsequent steps by amplifying all or a fraction of the original sample (e.g., the cfDNA sample). Various methods are available for amplifying all of the genetic material in a sample, thereby increasing the amount available for downstream procedures. In one embodiment, using ligation-mediated PCR (LM-PCR), DNA fragments are amplified by PCR after ligation of one distinct adaptor, two distinct adaptors, or many distinct adaptors. In one embodiment, multiple displacement amplification (MDA) with phi-29 polymerase is used to amplify all of the DNA isothermally. In DOP-PCR and its variations, random priming is used to amplify the original DNA. Each method has certain characteristics, such as the uniformity of amplification across all represented regions of the genome, the efficiency of capture and amplification of the original DNA, and amplification performance as a function of fragment length.
In one embodiment, LM-PCR may be used with a single heteroduplexed adaptor having a 3' thymidine. The heteroduplexed adaptor enables the use of a single adaptor molecule that may be converted into two distinct sequences on the 5' and 3' ends of the original DNA fragment during the first round of PCR. In one embodiment, it is possible to fractionate the amplified library by size separation, by products such as AMPURE or TASS, or by other similar methods. Prior to ligation, the sample DNA may be blunt ended, and a single adenosine base is then added to the 3' end. Prior to ligation, the DNA may be cleaved using a restriction enzyme or some other cleavage method. During ligation, the 3' adenosine of the sample fragments and the complementary 3' thymidine overhang of the adaptor can enhance ligation efficiency. The extension step of the PCR amplification may be limited in time to reduce amplification of fragments longer than about 200 bp, about 300 bp, about 400 bp, about 500 bp, or about 1,000 bp. Because the longer DNA found in maternal plasma is almost exclusively maternal, this may enrich fetal DNA by 10% to 50% and improve test performance. A number of reactions were run using the conditions specified by commercially available kits; these resulted in successful ligation of fewer than 10% of the sample DNA molecules. A series of optimizations of the reaction conditions improved ligation to approximately 70%.
Mini-PCR
The following mini-PCR method is desirable for samples containing short nucleic acids, digested nucleic acids, or fragmented nucleic acids, such as cfDNA. Traditional PCR assay design results in significant losses of distinct fetal molecules, but these losses can be greatly reduced by designing very short PCR assays, termed mini-PCR assays. Fetal cfDNA in maternal serum is highly fragmented, and the fragment sizes are distributed in an approximately Gaussian fashion with a mean of 160 bp, a standard deviation of 15 bp, a minimum size of about 100 bp, and a maximum size of about 220 bp. The distribution of fragment start and end positions with respect to the targeted polymorphisms, while not necessarily random, varies widely among individual targets and among all targets collectively, and the polymorphic site of a particular target locus may occupy any position from the start to the end of the various fragments originating from that locus. Note that the term mini-PCR may equally well refer to normal PCR with no additional restrictions or limitations.
During PCR, amplification will occur only from template DNA fragments that contain both the forward and reverse primer sites. Because fetal cfDNA fragments are short, the likelihood that a fetal fragment of length L contains both the forward and reverse primer sites depends on the ratio of the amplicon length to the fragment length. Under ideal conditions, assays in which the amplicon is 45, 50, 55, 60, 65, or 70 bp will successfully amplify from 72%, 69%, 66%, 63%, 59%, or 56%, respectively, of the available template fragment molecules. The amplicon length is the distance between the 5′ ends of the forward and reverse priming sites. An amplicon length shorter than that typically used by those skilled in the art can yield more efficient measurements of the desired polymorphic loci, because only short sequence reads are required. In one embodiment, a substantial fraction of the amplicons should be less than 100 bp, less than 90 bp, less than 80 bp, less than 70 bp, less than 65 bp, less than 60 bp, less than 55 bp, less than 50 bp, or less than 45 bp.
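The percentages quoted above can be reproduced with a simple model. The following is a minimal sketch, assuming that fragment start positions are uniformly distributed relative to the target and that every fragment has the mean length of 160 bp; under those assumptions, the fraction of fragments spanning the whole amplicon is approximately (L − A)/L for fragment length L and amplicon length A. The function name is illustrative only.

```python
def fraction_amplifiable(amplicon_bp: int, fragment_bp: int) -> float:
    """Approximate fraction of fragments of one length that span the amplicon.

    Assumption (not stated in the text): fragment start positions are
    uniform relative to the target, so an amplicon of length A fits in
    roughly (L - A) of the L possible offsets of a fragment of length L.
    """
    if amplicon_bp >= fragment_bp:
        return 0.0  # fragment too short to contain both primer sites
    return (fragment_bp - amplicon_bp) / fragment_bp


MEAN_FRAGMENT_BP = 160  # mean cfDNA fragment size quoted in the text

for amplicon in (45, 50, 55, 60, 65, 70):
    frac = fraction_amplifiable(amplicon, MEAN_FRAGMENT_BP)
    print(f"{amplicon} bp amplicon: {frac:.3f} of {MEAN_FRAGMENT_BP} bp fragments")
```

A fuller treatment would average this fraction over the fragment-size distribution rather than using the mean length alone.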
Note that in methods known in the prior art, short assays such as those described herein are usually avoided because they are not required and because they impose considerable constraints on primer design by limiting primer length, annealing characteristics, and the distance between the forward and reverse primers.
Also note that there is a potential for biased amplification if the 3′ end of either primer is within about 1 to 6 bases of the polymorphic site. This single-base difference at the site of initial polymerase binding can result in preferential amplification of one allele, which can alter the observed allele frequencies and degrade performance. All of these constraints make it very challenging to identify primers that will successfully amplify a particular locus and, furthermore, to design large sets of primers that are compatible in the same multiplex reaction. In one embodiment, the 3′ ends of the inner forward and reverse primers are designed to hybridize to a region of DNA upstream of the polymorphic site and separated from the polymorphic site by a small number of bases. Ideally, the number of bases is between 6 and 10, but it may equally well be between 4 and 15 bases, between 3 and 20 bases, between 2 and 30 bases, or between 1 and 60 bases and achieve substantially the same end.
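The spacing rule above can be expressed as a small check during primer design. The following is a minimal sketch, assuming that positions are integer coordinates on a common reference and that the gap is counted as the number of bases strictly between the primer 3′ end and the polymorphic site; the function names and coordinate convention are assumptions made for illustration.

```python
# Gap ranges quoted in the text, listed from narrowest (ideal) to widest.
GAP_TIERS = [(6, 10), (4, 15), (3, 20), (2, 30), (1, 60)]


def bases_between(primer_3prime_pos: int, site_pos: int) -> int:
    """Number of bases strictly between the primer 3' end and the site."""
    return abs(site_pos - primer_3prime_pos) - 1


def narrowest_tier(gap: int):
    """Return the narrowest quoted range containing the gap, or None."""
    for low, high in GAP_TIERS:
        if low <= gap <= high:
            return (low, high)
    return None


# Hypothetical coordinates: primer 3' end at 100, polymorphic site at 108.
gap = bases_between(100, 108)
print(gap, narrowest_tier(gap))
```

A design pipeline could, for example, prefer candidate primers whose gap falls in the first tier and fall back to wider tiers only when no such candidate exists.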
Multiplex PCR may involve a single round of PCR in which all targets are amplified, or it may involve one round of PCR followed by one or more rounds of nested PCR or some variant of nested PCR. Nested PCR consists of one or more subsequent rounds of PCR amplification using one or more new primers that bind internally, by at least one base pair, to the primers used in a previous round. Nested PCR reduces the number of spurious amplification targets by amplifying, in subsequent reactions, only those amplification products from the previous reaction that have the correct internal sequence. Reducing spurious amplification targets improves the number of useful measurements that can be obtained, especially in sequencing. Nested PCR typically entails designing primers completely internal to the previous primer binding sites, which necessarily increases the minimum DNA segment size required for amplification. For samples such as maternal plasma cfDNA, in which the DNA is highly fragmented, the larger assay size reduces the number of distinct cfDNA molecules from which a measurement can be obtained. In one embodiment, to offset this effect, a partial nesting approach can be used, in which one or both of the second-round primers overlap the first binding sites and extend internally by some number of bases, achieving additional specificity while minimally increasing the total assay size.
In one embodiment, a multiplex pool of PCR assays is designed to amplify potentially heterozygous SNPs or other polymorphic or non-polymorphic loci on one or more chromosomes, and these assays are used in a single reaction to amplify DNA. The number of PCR assays may be between 50 and 200, between 200 and 1,000, between 1,000 and 5,000, or between 5,000 and 20,000 (50- to 200-plex, 200- to 1,000-plex, 1,000- to 5,000-plex, and 5,000- to 20,000-plex, respectively), or more than 20,000 (more than 20,000-plex). In one embodiment, a multiplex pool of about 10,000 PCR assays (10,000-plex) is designed to amplify potentially heterozygous SNP loci on chromosomes X, Y, 13, 18, and 21 and 1 or 2, and these assays are used in a single reaction to amplify cfDNA obtained from plasma samples, chorionic villus samples, amniocentesis samples, single cells or a small number of cells, other bodily fluids or tissues, cancers, or other genetic material. The SNP frequencies for each locus can be determined by clonal sequencing or some other method of sequencing the amplicons. Statistical analysis of the allele frequency distributions or ratios of all assays can be used to determine whether the sample contains a trisomy of one or more of the chromosomes included in the test. In another embodiment, the original cfDNA sample is split into two samples and parallel 5,000-plex assays are performed. In another embodiment, the original cfDNA sample is split into n samples and parallel (about 10,000/n)-plex assays are performed, where n is between 2 and 12, between 12 and 24, between 24 and 48, or between 48 and 96. Data are collected and analyzed in a manner similar to that already described. Note that this method is equally applicable to detecting translocations, deletions, duplications, and other chromosomal abnormalities.
In one embodiment, tails with no homology to the target genome can also be added to the 3′ or 5′ end of any of the primers. These tails facilitate subsequent manipulations, procedures, or measurements. In one embodiment, the tail sequence can be the same for the forward and reverse target-specific primers. In one embodiment, different tails can be used for the forward and reverse target-specific primers. In one embodiment, a plurality of different tails can be used for different loci or sets of loci. Certain tails may be shared among all loci or among subsets of loci. For example, using forward and reverse tails corresponding to the forward and reverse sequences required by any of the current sequencing platforms can enable direct sequencing following amplification. In one embodiment, the tails can be used as common priming sites among all amplified targets, which can be used to add other useful sequences. In certain embodiments, the inner primers may contain a region designed to hybridize either upstream or downstream of the targeted locus (e.g., a polymorphic locus). In certain embodiments, the primers may contain a molecular barcode. In certain embodiments, the primers may contain a universal priming sequence designed to allow PCR amplification.
In one embodiment, a 10,000-plex PCR assay pool is created such that the forward and reverse primers have tails corresponding to the forward and reverse sequences required by a high-throughput sequencing instrument, such as the HISEQ, GAIIX, or MYSEQ available from ILLUMINA. In addition, included 5′ to the sequencing tails is an additional sequence that can be used as a priming site in a subsequent PCR to add nucleotide barcode sequences to the amplicons, enabling multiplex sequencing of multiple samples in a single lane of the high-throughput sequencing instrument.
In one embodiment, a 10,000-plex PCR assay pool is created such that the reverse primers have tails corresponding to the reverse sequences required by a high-throughput sequencing instrument. After amplification with the first 10,000-plex assay, a subsequent PCR amplification can be performed using another 10,000-plex pool having partly nested forward primers (e.g., nested by 6 bases) for all targets and a reverse primer corresponding to the reverse sequencing tail included in the first round. This subsequent round of partly nested amplification, with just one target-specific primer and a universal primer, limits the required assay size, reducing sampling noise, while greatly reducing the number of spurious amplicons. The sequencing tags can be added to the appended ligation adapters and/or as part of the PCR probes, such that the tag is part of the final amplicon.
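The layout of a tailed primer described above (barcoding priming site, then sequencing tail, then target-specific region, read 5′ to 3′) can be sketched as a simple data structure. This is an illustrative sketch only: the class name, field names, and every sequence shown are hypothetical placeholders, not sequences required by any real sequencing platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TailedPrimer:
    """A primer written 5' -> 3' as three concatenated parts."""

    barcode_priming_site: str  # 5'-most; priming site for a later barcoding PCR
    sequencing_tail: str       # tail matching the sequencer's required sequence
    target_specific: str       # 3'-most; hybridizes near the target locus

    @property
    def sequence(self) -> str:
        return self.barcode_priming_site + self.sequencing_tail + self.target_specific


# Placeholder sequences for illustration only.
forward = TailedPrimer(
    barcode_priming_site="AAGGTTCC",
    sequencing_tail="GATTACAGATTACA",
    target_specific="CCTGAGGTCAAGTGTTCA",
)
print(forward.sequence, len(forward.sequence))
```

Keeping the three parts separate makes it straightforward to share one tail across all loci, or across a subset of loci, while varying only the target-specific region.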
Fetal fraction affects the performance of the test. There are a number of ways to enrich the fetal fraction of the DNA found in maternal plasma. Fetal fraction can be increased by the LM-PCR method discussed previously, as well as by targeted removal of long maternal fragments. In one embodiment, prior to multiplex PCR amplification of the target loci, an additional multiplex PCR reaction can be carried out to selectively remove long, largely maternal fragments corresponding to the loci targeted in the subsequent multiplex PCR. Additional primers are designed to anneal at a site farther from the polymorphism than is expected to be present among cell-free fetal DNA fragments. These primers can be used in a one-cycle multiplex PCR reaction prior to the multiplex PCR of the target polymorphic loci. These distal primers are tagged with a molecule or moiety that allows selective recognition of the tagged pieces of DNA. In one embodiment, these DNA molecules can be covalently modified with a biotin molecule, which allows removal of the newly formed double-stranded DNA comprising these primers after one cycle of PCR. Double-stranded DNA formed during that first round is likely maternal in origin. Removal of the hybrid material can be accomplished using magnetic streptavidin beads. Other tagging methods may work equally well. In one embodiment, size selection methods can be used to enrich the sample for shorter strands of DNA, for example those less than about 800 bp, less than about 500 bp, or less than about 300 bp. Amplification of the short fragments can then proceed as usual.
The mini-PCR method described in this disclosure enables highly multiplexed amplification and analysis of hundreds to thousands or even millions of loci in a single reaction from a single sample. At the same time, the detection of the amplified DNA can be multiplexed; tens to hundreds of samples can be multiplexed in one sequencing lane by using barcoding PCR. This multiplexed detection has been successfully tested up to 49-plex, and a much higher degree of multiplexing is possible. In effect, this allows hundreds of samples to be genotyped at thousands of SNPs in a single sequencing run. For these samples, the method allows determination of genotype and heterozygosity rate and, simultaneously, determination of copy number, both of which can be used for aneuploidy detection. This method is particularly useful for detecting aneuploidy of a gestating fetus from the free-floating DNA found in maternal plasma. This method can be used as part of a method for determining the sex of a fetus and/or predicting the paternity of the fetus. It can be used as part of a method for determining mutation dosage. This method can be used with any amount of DNA or RNA, and the targeted regions may be SNPs, other polymorphic regions, non-polymorphic regions, and combinations thereof.
In certain embodiments, ligation-mediated universal PCR amplification of fragmented DNA can be used. Ligation-mediated universal PCR amplification can be used to amplify plasma DNA, which can then be divided into multiple parallel reactions. It can also be used to preferentially amplify short fragments, thereby enriching the fetal fraction. In certain embodiments, the addition of tags to the fragments by ligation enables detection of shorter fragments, the use of shorter target-sequence-specific portions of the primers, and/or annealing at higher temperatures, which reduces nonspecific reactions.
The methods described herein can be used for a number of purposes in which a target set of DNA is mixed with an amount of contaminating DNA. In certain embodiments, the target DNA and the contaminating DNA may be from genetically related individuals. For example, genetic abnormalities in a fetus (target) can be detected from maternal plasma, which contains fetal (target) DNA and maternal (contaminating) DNA; the abnormalities include whole-chromosome abnormalities (e.g., aneuploidy), partial chromosome abnormalities (e.g., deletions, duplications, inversions, translocations), polynucleotide polymorphisms (e.g., STRs), single nucleotide polymorphisms, and/or other genetic abnormalities or differences. In certain embodiments, the target and contaminating DNA may be from the same individual but differ by one or more mutations, for example in the case of cancer (see, e.g., H. Mamon et al., Preferential Amplification of Apoptotic DNA from Plasma: Potential for Enhancing Detection of Minor DNA Alterations in Circulating DNA. Clinical Chemistry 54:9 (2008)). In certain embodiments, the DNA may be found in cell culture (apoptotic) supernatant. In certain embodiments, it is possible to induce apoptosis in biological samples (e.g., blood) for subsequent library preparation, amplification, and/or sequencing. A number of workflows and protocols for achieving this end are presented elsewhere in this disclosure.
In certain embodiments, the target DNA may originate from single cells, from DNA samples consisting of less than one copy of the target genome, from low amounts of DNA, from DNA of mixed origin (e.g., pregnancy plasma: placental and maternal DNA; cancer patient plasma and tumors: a mix of healthy and cancer DNA; transplantation; etc.), from other body fluids, from cell cultures, from culture supernatants, from forensic DNA samples, from ancient DNA samples (e.g., insects trapped in amber), from other DNA samples, and from combinations thereof.
In certain embodiments, a short amplicon size can be used. Short amplicon sizes are especially suited to fragmented DNA (see, e.g., A. Sikora et al., Detection of increased amounts of cell-free fetal DNA with short PCR amplicons. Clin Chem. 2010 January; 56(1):136-8).
The use of short amplicon sizes can yield some significant benefits. Short amplicon sizes can result in optimized amplification efficiency. Short amplicon sizes typically produce shorter products, so there is less chance of nonspecific priming. Shorter products can be clustered more densely on the sequencing flow cell, because the clusters will be smaller. Note that the methods described herein may work equally well for longer PCR amplicons. The amplicon length can be increased if necessary, for example when sequencing larger sequence stretches. Experiments with 146-plex targeted amplification, using assays of 100 bp to 200 bp in length as the first step in a nested PCR protocol, were run on single cells and on genomic DNA with positive results.
In certain embodiments, the methods described herein can be used to amplify and/or detect SNPs, copy number, nucleotide methylation, mRNA levels, other types of RNA expression levels, and other genetic and/or epigenetic features. The mini-PCR methods described herein can be used with next-generation sequencing; they can also be used with other downstream methods, such as microarrays, counting by digital PCR, real-time PCR, mass spectrometry analysis, and so on.
In certain embodiments, the mini-PCR amplification methods described herein can be used as part of a method for accurate quantification of minority populations. They can be used for absolute quantification using spike-in calibrators. They can be used for mutation or minor-allele quantification through very deep sequencing, and they can be run in a highly multiplexed fashion. They can be used for standard paternity and identity testing of relatives or ancestors in humans, animals, plants, or other organisms. They can be used for forensic testing. They can be used for rapid genotyping and copy number (CN) analysis on any kind of material, e.g., amniotic fluid and CVS, sperm, or products of conception (POC). They can be used for single-cell analysis, such as genotyping of samples biopsied from embryos. They can be used for rapid embryo analysis (within less than one day, one day, or two days of biopsy) by targeted sequencing using mini-PCR.
In certain embodiments, the methods can be used for tumor analysis: tumor biopsies are often a mixture of healthy and tumor cells. Targeted PCR allows deep sequencing of SNPs and loci with close to no background sequences. It can be used for copy number and loss-of-heterozygosity analysis of tumor DNA. The tumor DNA may be present in many different body fluids or tissues of tumor patients. It can be used for detection of tumor recurrence and/or for tumor screening. It can be used for quality-control testing of seeds. It can be used for breeding or fishing purposes. Note that any of these methods could equally well be used to target non-polymorphic loci for the purpose of ploidy calling.
Literature describing some of the fundamental methods that underlie the methods disclosed herein includes: (1) Wang HY, Luo M, Tereshchenko IV, Frikker DM, Cui X, Li JY, Hu G, Chu Y, Azaro MA, Lin Y, Shen L, Yang Q, Kambouris ME, Gao R, Shih W, Li H. Genome Res. 2005 February; 15(2):276-83. Department of Molecular Genetics, Microbiology and Immunology/The Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey 08903, USA. (2) High-throughput genotyping of single nucleotide polymorphisms with high sensitivity. Li H, Wang HY, Cui X, Luo M, Hu G, Greenawalt DM, Tereshchenko IV, Li JY, Chu Y, Gao R. Methods Mol Biol. 2007; 396. PubMed PMID: 18025699. (3) A method comprising multiplexing of an average of 9 assays for sequencing is described in: Nested Patch PCR enables highly multiplexed mutation discovery in candidate genes. Varley KE, Mitra RD. Genome Res. 2008 November; 18(11):1844-50. Epub 2008 Oct. 10. Note that the methods disclosed herein allow multiplexing at orders of magnitude beyond the above references.
Targeted PCR Variants: Nesting
Many workflows are possible when conducting PCR; some workflows typical of the methods disclosed herein are described. The steps outlined herein are not meant to exclude other possible steps, nor do they imply that any of the steps described herein is required for the method to work properly. A large number of parameter variations and other modifications are known in the literature and can be made without affecting the essence of the invention. One particular generalized workflow is given below, followed by a number of possible variants. The variants typically refer to possible secondary PCR reactions, for example the different types of nesting that can be done (step 3). It is important to note that the variants can be performed at different times, or in a different order, than explicitly described herein. If desired, the examples illustrating the use of polymorphic loci can readily be adapted to amplify non-polymorphic loci.
1. The DNA in the sample can have ligation adapters appended, often referred to as library tags or ligation adapter tags (LTs), where the ligation adapters contain a universal priming sequence; this is followed by a universal amplification. In one embodiment, this can be done using a standard protocol designed to create sequencing libraries after fragmentation. In one embodiment, the DNA sample can be blunt-ended, and an A can then be added at the 3′ end. A Y-adapter with a T overhang can be added and ligated. In certain embodiments, sticky ends other than an A or T overhang can be used. In certain embodiments, other adapters can be added, for example looped ligation adapters. In certain embodiments, the adapters can have a tag designed for PCR amplification.
2. Specific target amplification (STA): Pre-amplification of hundreds to thousands to tens of thousands and even hundreds of thousands of targets can be multiplexed in one reaction volume. STA is typically run for 10 to 30 cycles, although it can be run for 5 to 40 cycles, 2 to 50 cycles, and even 1 to 100 cycles. The primers can be tailed, for example for a simpler workflow or to avoid sequencing a large proportion of dimers. Note that dimers of two primers carrying the same tag typically will not be amplified or sequenced efficiently. In certain embodiments, between 1 and 10 cycles of PCR can be performed; in certain embodiments, between 10 and 20 cycles; in certain embodiments, between 20 and 30 cycles; in certain embodiments, between 30 and 40 cycles; in certain embodiments, more than 40 cycles. The amplification can be a linear amplification. The number of PCR cycles can be optimized to produce an optimal depth of read (DOR) profile. Different DOR profiles may be desirable for different purposes. In certain embodiments, a more even distribution of reads among all assays is desirable; if the DOR is too small for some assays, the stochastic noise can be too high for the data to be useful, whereas if the depth of read is too high, the marginal usefulness of each additional read is relatively small.
Primer tails can improve the detection of fragmented DNA from universally tagged libraries. If the library tag and the primer tails contain a homologous sequence, hybridization can be improved (for example, the melting temperature (Tm) is lowered), and primers can be extended even if only a portion of the primer target sequence is present in the sample DNA fragment. In certain embodiments, 13 or more target-specific base pairs can be used. In certain embodiments, 10 to 12 target-specific base pairs can be used. In certain embodiments, 8 to 9 target-specific base pairs can be used. In certain embodiments, 6 to 7 target-specific base pairs can be used. In certain embodiments, STA can be performed on pre-amplified DNA, e.g., DNA amplified by MDA, RCA, other whole-genome amplification, or adapter-mediated universal PCR. In certain embodiments, STA can be performed on samples that have been enriched for or depleted of certain sequences and populations, e.g., by size selection, target capture, or directed degradation.
3. In certain embodiments, it is possible to perform secondary multiplex PCRs or primer extension reactions to increase specificity and reduce undesirable products. For example, full nesting, semi-nesting, hemi-nesting, and/or subdividing into parallel reactions of smaller assay pools are all techniques that can be used to increase specificity. Experiments have shown that splitting a sample into three 400-plex reactions produced product DNA with greater specificity than one 1,200-plex reaction with exactly the same primers. Similarly, experiments have shown that splitting a sample into four 2,400-plex reactions produced product DNA with greater specificity than one 9,600-plex reaction with exactly the same primers. In one embodiment, it is possible to use target-specific and tag-specific primers of the same and of opposing directionality.
4. In certain embodiments, it is possible to amplify a DNA sample (diluted, purified, or otherwise) produced by an STA reaction using tag-specific primers and "universal amplification", i.e., to amplify many or all of the pre-amplified and tagged targets. The primers can contain additional functional sequences, e.g., barcodes, or the full adapter sequence necessary for sequencing on a high-throughput sequencing platform.
These methods can be used to analyze any DNA sample, and they are especially useful when the DNA sample is particularly small or when the DNA in the sample originates from more than one individual, as in the case of maternal plasma. These methods can be used on DNA samples such as single cells or a small number of cells, genomic DNA, plasma DNA, amplified plasma libraries, amplified apoptotic supernatant libraries, or other samples of mixed DNA. In one embodiment, these methods can be used where cells of different genetic constitution are present in a single individual, such as with cancer or transplants.
Protocol Variants (Variants of and/or Additions to the Workflow Above)
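Step 2 of the workflow above refers to optimizing the depth of read (DOR) profile across assays. As a rough illustration of how such a profile might be summarized, the sketch below computes the mean depth, the coefficient of variation, and the list of assays falling below a minimum depth. The function name, the choice of summary statistics, and the threshold value are all assumptions made for this example, not part of the described method.

```python
from statistics import mean, pstdev


def dor_profile(reads_per_assay: dict, min_depth: int) -> dict:
    """Summarize how evenly reads are spread across assays."""
    if not reads_per_assay:
        raise ValueError("no assays supplied")
    depths = list(reads_per_assay.values())
    avg = mean(depths)
    return {
        "mean_depth": avg,
        # Coefficient of variation: lower means a more even profile.
        "cv": pstdev(depths) / avg if avg else float("nan"),
        # Assays whose depth is likely too low for useful measurements.
        "low_depth_assays": sorted(
            name for name, depth in reads_per_assay.items() if depth < min_depth
        ),
    }


# Hypothetical read counts for three assays.
profile = dor_profile({"snp_a": 100, "snp_b": 200, "snp_c": 300}, min_depth=150)
print(profile)
```

Comparing such summaries across runs with different cycle numbers is one possible way to choose a cycle number, under whatever evenness criterion suits the purpose at hand.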
Direct multiplexed mini-PCR: Specific target amplification (STA) of a plurality of target sequences with tagged primers is shown in FIG. 1. 101 denotes double-stranded DNA with a polymorphic locus of interest at X. 102 denotes the double-stranded DNA with ligation adapters added for universal amplification. 103 denotes the single-stranded DNA that has been universally amplified, with PCR primers hybridized. 104 denotes the final PCR product. In certain embodiments, STA can be performed on more than 100, more than 200, more than 500, more than 1,000, more than 2,000, more than 5,000, more than 10,000, more than 20,000, more than 50,000, more than 100,000, or more than 200,000 targets. In a subsequent reaction, tag-specific primers amplify all target sequences and lengthen the tags to include all sequences necessary for sequencing, including sample indexes. In one embodiment, the primers may be untagged, or only certain primers may be tagged. Sequencing adapters can be added by conventional adapter ligation. In one embodiment, the initial primers can carry the tags.
In one embodiment, the primers are designed so that the length of DNA amplified is unexpectedly short. The prior art shows that those of ordinary skill in the art typically design amplicons of 100 bp or more. In one embodiment, the amplicons can be designed to be less than 80 bp. In one embodiment, the amplicons can be designed to be less than 70 bp. In one embodiment, the amplicons can be designed to be less than 60 bp. In one embodiment, the amplicons can be designed to be less than 50 bp. In one embodiment, the amplicons can be designed to be less than 45 bp. In one embodiment, the amplicons can be designed to be less than 40 bp. In one embodiment, the amplicons can be designed to be less than 35 bp. In one embodiment, the amplicons can be designed to be between 40 and 65 bp.
An experiment was performed with this protocol using 1,200-plex amplification. Both genomic DNA and pregnancy plasma were used; about 70% of sequence reads mapped to targeted sequences. Details are given elsewhere in this document. Sequencing of a 1,042-plex without design and selection of assays resulted in more than 99% of sequences being primer dimer products.
Sequential PCR: After STA 1, multiple aliquots of the product can be amplified in parallel with pools of reduced complexity containing the same primers. The first amplification can yield enough material to split. This method is especially good for small samples, for example samples of about 6 to 100 pg, about 100 pg to 1 ng, about 1 ng to 10 ng, or about 10 ng to 100 ng. The protocol was performed by splitting a 1,200-plex into three 400-plexes. Mapping of sequencing reads increased from about 60% to 70% with the 1,200-plex alone to more than 95%.
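The splitting of one large assay pool into several pools of reduced complexity, as in the 1,200-plex to three 400-plex example above, can be sketched as follows. This is a minimal illustration that assigns assays round-robin; the text does not say how assays were assigned to sub-pools, so the assignment rule and the assay names are assumptions.

```python
def split_pool(assays: list, n_pools: int) -> list:
    """Divide an assay list into n_pools sub-pools of near-equal size.

    Round-robin assignment is an assumption; a real design might instead
    group assays to minimize primer-primer interactions within each pool.
    """
    if n_pools < 1:
        raise ValueError("n_pools must be at least 1")
    return [assays[i::n_pools] for i in range(n_pools)]


# Hypothetical assay identifiers for a 1,200-plex pool.
assays = [f"assay_{i:04d}" for i in range(1200)]
pools = split_pool(assays, 3)
print([len(pool) for pool in pools])
```

Every assay appears in exactly one sub-pool, so the combined sub-pools cover the same targets as the original pool.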
Semi-nested mini-PCR (see FIG. 2): After STA 1, a second STA is performed, comprising a multiplex set of internal nested forward primers (103 B, 105 b) and one (or a few) tag-specific reverse primers (103 A). 101 denotes double-stranded DNA with a polymorphic locus of interest at X. 102 denotes the double-stranded DNA with ligation adapters added for universal amplification. 103 denotes the single-stranded DNA that has been universally amplified, with forward primer B and reverse primer A hybridized. 104 denotes the PCR product from 103. 105 denotes the product from 104 with nested forward primer b hybridized and with reverse tag A already part of the molecule from the PCR that occurred between 103 and 104. 106 denotes the final PCR product. With this workflow, usually more than 95% of sequences map to the intended targets. The nested primer may overlap the outer forward primer sequence but introduces additional 3′-end bases. In certain embodiments, it is possible to use between 1 and 20 extra 3′ bases. Experiments have shown that using 9 or more extra 3′ bases works well in a 1,200-plex design.
Fully nested mini-PCR (see FIG. 3): After STA step 1, it is possible to perform a second multiplex PCR (or parallel multiplex PCRs of reduced complexity) with two nested primers carrying tags (A, a, B, b). 101 denotes double-stranded DNA with a polymorphic locus of interest at X. 102 denotes the double-stranded DNA with ligation adapters added for universal amplification. 103 denotes the single-stranded DNA that has been universally amplified, with forward primer B and reverse primer A hybridized. 104 denotes the PCR product from 103. 105 denotes the product from 104 with nested forward primer b and nested reverse primer a hybridized. 106 denotes the final PCR product. In certain embodiments, it is possible to use two full sets of primers. Experiments using a fully nested mini-PCR protocol performed 146-plex amplification on single cells and on three cells, without step 102 of appending universal ligation adapters and amplifying.
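The relationship between an outer forward primer and a nested primer that overlaps it but adds extra 3′ bases can be illustrated with string slicing on a template. This is a minimal sketch, assuming the nested primer keeps the same length as the outer primer and is simply shifted toward the target by the number of extra 3′ bases; the template sequence, coordinates, and function name are hypothetical.

```python
def nested_forward_primer(template: str, outer_start: int, length: int,
                          extra_3prime: int) -> str:
    """Return a primer shifted extra_3prime bases 3' of the outer primer.

    Assumption: the nested primer has the same length as the outer primer,
    so it shares (length - extra_3prime) bases with it.
    """
    if not 0 < extra_3prime < length:
        raise ValueError("extra_3prime must be between 1 and length - 1")
    start = outer_start + extra_3prime
    if start + length > len(template):
        raise ValueError("nested primer runs past the end of the template")
    return template[start:start + length]


# Hypothetical template strand, written 5' -> 3'.
template = "ACGTACGTTAGCCGATAGGCTTAACCGGATCGATCGTTAGC"
outer = template[2:12]
inner = nested_forward_primer(template, outer_start=2, length=10, extra_3prime=6)
print(outer, inner)
```

Because only the extra 3′ bases extend beyond the outer primer, the minimum fragment length needed for amplification grows by that number of bases rather than by a full primer length.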
The miniature PCR:(of half side nested type is referring to Fig. 4) be likely used in the target dna that fragment ends has adapter.Perform STA, comprise forward primer (B) and (or a small amount of) mark specific reverse primers (A) of one group of compound.Common tags specific forward primer and desired specificities reverse primer can be used to perform second STA.101 represent the double-stranded DNA at X place with relevant polymorphic locus.102 expressions with the addition of the double-stranded DNA of the joint adapter for universal amplification.103 represent the single stranded DNA carrying out universal amplification with the reverse primer A of hybridization.104 represent the PCR primer from 103, and it uses reverse primer A and engages adapter labeled primer LT amplification.105 represent the product from 104, and it contains the forward primer B of hybridization.106 represent final PCR primer.In this workflow, desired specificities forward and reverse primer are used in reaction separately, thus reduce the complicacy of reaction and prevent from being formed the dimer of forward and reverse primer.Should notice that in this example, primer A and B can think the first primer, and primer ' a ' and ' b ' can think internal primer.This method is the vast improvement of Direct PCR, because it is equally good with Direct PCR, but is that it avoids primer dimer.After the first round half side nested type scheme, usually see about 99% non-targeted DNA, but after second takes turns, usually have huge improvement.
Triply hemi-nested mini-PCR (see Fig. 5): It is possible to use target DNA that has adapters at the fragment ends. STA is performed with a multiplex set of forward primers (B) and one (or a few) tag-specific reverse primers (A) and (a). A second STA can be performed using a universal tag-specific forward primer and a target-specific reverse primer. 101 denotes double-stranded DNA with a polymorphic locus of interest at X. 102 denotes the double-stranded DNA with ligation adapters added for universal amplification. 103 denotes the universally amplified single-stranded DNA with reverse primer A hybridized. 104 denotes the PCR product from 103, amplified using reverse primer A and ligation adapter tag primer LT. 105 denotes the product from 104 with forward primer B hybridized. 106 denotes the PCR product from 105, amplified using reverse primer A and forward primer B. 107 denotes the product from 106 with reverse primer 'a' hybridized. 108 denotes the final PCR product. Note that in this example, primers 'a' and B may be considered inner primers, and A may be considered a first primer. Alternatively, both A and B may be considered first primers, and 'a' may be considered an inner primer. The designations of the reverse and forward primers may be interchanged. In this workflow, target-specific forward and reverse primers are used in separate reactions, which reduces the complexity of the reaction and prevents the forward and reverse primers from forming dimers. This method is a large improvement over direct PCR: it performs as well as direct PCR but avoids primer dimers. After the first round of the hemi-nested protocol, one typically sees about 99% non-targeted DNA; after the second round, there is typically a large improvement.
One-sided nested mini-PCR (see Fig. 6): It is possible to use target DNA that has adapters at the fragment ends. STA may also be performed with a multiplex set of nested forward primers, using the ligation adapter tag as the reverse primer. A second STA may then be performed using a set of nested forward primers and a universal reverse primer. 101 denotes double-stranded DNA with a polymorphic locus of interest at X. 102 denotes the double-stranded DNA with ligation adapters added for universal amplification. 103 denotes the universally amplified single-stranded DNA with forward primer A hybridized. 104 denotes the PCR product from 103, amplified using forward primer A and ligation adapter tag reverse primer LT. 105 denotes the product from 104 with nested forward primer a hybridized. 106 denotes the final PCR product. By using overlapping primers in the first and second STAs, this method can detect shorter target sequences than standard PCR. The method is typically performed on a DNA sample that has already undergone STA step 1 above (attachment of universal tags and amplification); the two nested primers are on one side only, and the other side uses the library tag. The method was performed on libraries of apoptotic supernatants and pregnancy plasma. With this workflow, about 60% of sequences mapped to the intended targets. Note that reads containing the reverse adapter sequence were not mapped, so this number is expected to be higher if those reads are mapped as well.
One-sided mini-PCR: It is possible to use target DNA that has adapters at the fragment ends (see Fig. 7). STA may be performed with a multiplex set of forward primers and one (or a few) tag-specific reverse primers. 101 denotes double-stranded DNA with a polymorphic locus of interest at X. 102 denotes the double-stranded DNA with ligation adapters added for universal amplification. 103 denotes the single-stranded DNA with forward primer A hybridized. 104 denotes the PCR product from 103, amplified using forward primer A and ligation adapter tag reverse primer LT; it is the final PCR product. This method can detect shorter target sequences than standard PCR. It may, however, be relatively non-specific, because only one target-specific primer is used. This protocol is effectively half of the one-sided nested mini-PCR.
Reverse semi-nested mini-PCR: It is possible to use target DNA that has adapters at the fragment ends (see Fig. 8). STA may be performed with a multiplex set of forward primers and one (or a few) tag-specific reverse primers. 101 denotes double-stranded DNA with a polymorphic locus of interest at X. 102 denotes the double-stranded DNA with ligation adapters added for universal amplification. 103 denotes the single-stranded DNA with reverse primer B hybridized. 104 denotes the PCR product from 103, amplified using reverse primer B and ligation adapter tag forward primer LT. 105 denotes the PCR product from 104 with forward primer A and inner reverse primer 'b' hybridized. 106 denotes the PCR product amplified from 105 using forward primer A and reverse primer 'b'; it is the final PCR product. This method can detect shorter target sequences than standard PCR.
There may be further variants that are simply iterations or combinations of the above methods, such as doubly nested PCR, in which three sets of primers are used. Another variant is one-and-a-half-sided nested mini-PCR, in which STA may also be performed with a multiplex set of nested forward primers and one (or a few) tag-specific reverse primers.
Note that in all of these variants, the identities of the forward primer and the reverse primer may be interchanged. Note that in some embodiments, the nested variants can equally well be run without the initial library preparation, which comprises attaching the adapter tags and a universal amplification step. Note that in some embodiments, additional rounds of PCR may be included, with additional forward and/or reverse primers and amplification steps; these additional steps may be particularly useful when it is desirable to further increase the percentage of DNA molecules that correspond to the target loci.
Nesting workflows
There are many ways to perform the amplification, with different degrees of nesting and different degrees of multiplexing. Fig. 9 gives a flow chart with some of the possible workflows. Note that the use of 10,000-plex PCR is meant only as an example; these flow charts apply equally well to other degrees of multiplexing.
Looped ligation adapters
When adding universal tagged adapters, for example to make a library for sequencing, there are a number of ways to ligate the adapters. One way is to blunt-end the sample DNA, perform A-tailing, and ligate adapters that have a T overhang. There are a number of other ways to ligate adapters, and there are also a number of adapters that can be ligated. For example, a Y adapter can be used, in which the adapter consists of two strands of DNA: one strand has a double-strand region and a region specified by a forward primer region, and the other strand is specified by a double-strand region complementary to the double-strand region on the first strand and a region containing a reverse primer. When annealed, the double-stranded region may contain a T overhang for ligation to double-stranded DNA with an A overhang.
In one embodiment, the adapter can be a loop of DNA in which the terminal regions are complementary, and in which the loop region contains a forward primer tag region (LFT), a reverse primer tag region (LRT), and a cleavage site between the two (see Fig. 10). 101 refers to the double-stranded, blunt-ended target DNA. 102 refers to the A-tailed target DNA. 103 refers to the looped ligation adapter with T overhang 'T' and cleavage site 'Z'. 104 refers to the target DNA with looped ligation adapters attached. 105 refers to the target DNA with attached ligation adapters that have been cleaved at the cleavage site. LFT refers to the ligation adapter forward tag, and LRT refers to the ligation adapter reverse tag. The complementary region may end in a T overhang or in another feature that can be used for ligation to the target DNA. The cleavage site may be a series of uracils for cleavage by UNG, or a sequence that can be recognized and cleaved by a restriction enzyme or other cleavage method, or just a basic amplification. These adapters can be used for any library preparation, for example for sequencing. These adapters can be used in combination with any of the other methods described herein, for example the mini-PCR amplification methods.
Internally tagged primers
When sequencing is used to determine the allele present at a given polymorphic locus, the sequence read typically begins upstream of the primer binding site (a) and then proceeds to the polymorphic site (X). Tags are typically configured as shown in Fig. 11, left. 101 refers to the single-stranded target DNA with the polymorphic locus of interest 'X' and primer 'a' with attached tag 'b'. To avoid non-specific hybridization, the primer binding site (the region of the target DNA complementary to 'a') is typically 18 to 30 bp long. Sequence tag 'b' is typically about 20 bp; in theory it can be any length longer than about 15 bp, though many people use the primer sequences sold by the sequencing platform company. The distance 'd' between 'a' and 'X' may be at least 2 bp to avoid allele bias. When performing multiplex PCR amplification using the methods disclosed herein or other methods, where careful primer design is needed to avoid excessive primer-primer interaction, the window of allowable distance 'd' between 'a' and 'X' may vary considerably: from 2 bp to 10 bp, from 2 bp to 20 bp, from 2 bp to 30 bp, or even from 2 bp to more than 30 bp. Therefore, with the primer configuration shown in Fig. 11, left, sequence reads must be at least 40 bp to be long enough to measure the polymorphic locus, and depending on the lengths of 'a' and 'd', the sequence reads may need to be up to 60 or 75 bp. Generally, the longer the sequence reads, the higher the cost and time of sequencing a given number of reads, so minimizing the required read length saves both time and money. In addition, because bases read earlier in a read are, on average, read more accurately than those read later, decreasing the required sequence read length can also increase the accuracy of the measurements of the polymorphic region.
In one embodiment, termed internally tagged primers, the primer binding site (a) is split into a plurality of segments (a', a'', a''', ...), and the sequence tag (b) is on a segment of DNA located between two of the primer binding sites, as shown in Fig. 11, 103. This configuration allows the sequencer to make shorter sequence reads. In one embodiment, a' + a'' should be at least about 18 bp and can be as long as 30, 40, 50, 60, 80, 100, or more than 100 bp. In one embodiment, a'' should be at least about 6 bp, and in one embodiment it is between about 8 and 16 bp. All other factors being equal, using internally tagged primers can cut the required sequence read length by at least 6 bp, by as much as 8 bp, 10 bp, 12 bp, or 15 bp, and even by as much as 20 or 30 bp. This can yield significant advantages in money, time, and accuracy. An example of internally tagged primers is given in Fig. 12.
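The read-length saving described above can be illustrated with simple arithmetic. The sketch below is only an illustration under stated assumptions: it assumes the read starts at the beginning of tag 'b' and must cover every base up to the polymorphic site, which is consistent with the 40 bp minimum quoted above (20 bp tag + 18 bp binding site + 2 bp distance). The function names and the chosen lengths are hypothetical and are not part of the disclosed method.

```python
def read_length_standard(tag_b: int, primer_a: int, gap_d: int) -> int:
    """Bases read before reaching site X when tag b sits upstream of the whole primer a.

    Assumption: the read begins at the start of tag b.
    """
    return tag_b + primer_a + gap_d


def read_length_internal(tag_b: int, a_double_prime: int, gap_d: int) -> int:
    """Bases read before reaching site X when tag b sits between a' and a''.

    Assumption: segment a' lies upstream of the tag, so it is not part of the read.
    """
    return tag_b + a_double_prime + gap_d


# Hypothetical lengths: 20 bp tag, 18 bp total binding site split as a' = 10, a'' = 8, d = 2.
standard = read_length_standard(tag_b=20, primer_a=18, gap_d=2)
internal = read_length_internal(tag_b=20, a_double_prime=8, gap_d=2)
saving = standard - internal

print(standard, internal, saving)  # 40 30 10
```

Under these assumptions the saving equals the length of a', so a longer a' (with a'' kept within the 8 to 16 bp range mentioned above) gives a larger reduction in required read length.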
Primers with a ligation adapter binding region
One issue with fragmented DNA is that, because it is short, the chance that a polymorphism lies close to the end of a DNA strand is higher than for a long strand (e.g., 101, Fig. 10). Because PCR capture of a polymorphism requires a primer binding site of suitable length on both sides of the polymorphism, a significant number of DNA strands carrying the targeted polymorphism will be missed due to insufficient overlap between the primer and the targeted binding site. In one embodiment, the target DNA 101 can have ligation adapters 102 attached, and the target primer 103 can have a region (cr), attached upstream of the designed binding region (a), that is complementary to the ligation adapter tag (lt) (see Fig. 13). Thus, in cases where the binding region (the region of 101 complementary to a) is shorter than the 18 bp typically required for hybridization, the region (cr) on the primer that is complementary to the library tag can increase the binding energy to the point where PCR can proceed. Note that any specificity lost due to a shorter binding region can be made up for by other PCR primers with suitably long target binding regions. Note that this embodiment can be used in combination with direct PCR or with any of the other methods described herein, such as nested PCR, semi-nested PCR, hemi-nested PCR, one-sided nested or one-and-a-half-sided nested PCR, or other PCR protocols.
When sequencing data is used to determine ploidy in combination with an analytical method that compares the observed allele data with the expected allele distributions for various hypotheses, each additional read from an allele with a low depth of read yields more information than a read from an allele with a high depth of read. Ideally, therefore, one would wish to see a uniform depth of read (DOR), in which each locus has a similar number of representative sequence reads. It is therefore desirable to minimize the DOR variance. In one embodiment, it is possible to decrease the coefficient of variation of the DOR (which may be defined as the standard deviation of the DOR divided by the average DOR) by increasing the annealing time. In some embodiments, the annealing time may be longer than 2 minutes, longer than 4 minutes, longer than ten minutes, longer than 30 minutes, longer than one hour, or even longer. Because annealing is an equilibrium process, there is no limit to the improvement in DOR variance obtainable by increasing the annealing time. In one embodiment, increasing the primer concentration may decrease the DOR variance.
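The coefficient of variation defined above (standard deviation of the DOR divided by the average DOR) can be computed as in the following minimal sketch. The function name and the example depths are illustrative assumptions; the choice of the population standard deviation rather than the sample standard deviation is also an assumption, since the text does not specify which is meant.

```python
from statistics import mean, pstdev


def dor_coefficient_of_variation(depths):
    """Return std(DOR) / mean(DOR) for a list of per-locus read depths.

    Assumption: population standard deviation (pstdev) is used.
    """
    average = mean(depths)
    if average == 0:
        raise ValueError("average depth of read is zero")
    return pstdev(depths) / average


# Hypothetical per-locus read depths for two runs of the same assay.
uneven = [50, 150, 50, 150]
uniform = [95, 105, 100, 100]

print(dor_coefficient_of_variation(uneven))   # 0.5
print(dor_coefficient_of_variation(uniform) < dor_coefficient_of_variation(uneven))  # True
```

A lower value indicates a more uniform depth of read across loci, which is the property the longer annealing times are intended to improve.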
Exemplary whole genome amplification methods
In some embodiments, a method of the present invention may involve amplifying DNA, for example using whole genome amplification to amplify a nucleic acid sample before amplifying only the target loci. Amplification of DNA, a process that converts a small amount of genetic material into a larger amount of genetic material comprising a similar set of genetic data, can be done by a variety of methods, including, but not limited to, polymerase chain reaction (PCR). One method of amplifying DNA is whole genome amplification (WGA). A number of methods are available for WGA: ligation-mediated PCR (LM-PCR), degenerate oligonucleotide primer PCR (DOP-PCR), and multiple displacement amplification (MDA). In LM-PCR, short DNA sequences called adapters are ligated to blunt ends of DNA. These adapters contain universal amplification sequences, which are used to amplify the DNA by PCR. In DOP-PCR, random primers that also contain universal amplification sequences are used in a first round of annealing and PCR. A second round of PCR is then used to amplify the sequences further with the universal primer sequences. MDA uses the phi-29 polymerase, a highly processive and non-specific enzyme that replicates DNA and has been used for single-cell analysis. The major limitations on amplifying material from a single cell are (1) the necessity of using extremely dilute DNA concentrations or extremely small reaction volumes, and (2) the difficulty of reliably dissociating DNA from proteins across the whole genome. Regardless, single-cell whole genome amplification has been used successfully in a variety of applications for a number of years. There are other methods of amplifying DNA from a DNA sample. DNA amplification converts the initial DNA sample into a DNA sample that is similar in its set of sequences but much greater in quantity. In some cases, amplification may not be required.
In some embodiments, DNA may be amplified using a universal amplification such as WGA or MDA. In some embodiments, DNA may be amplified by targeted amplification, for example using targeted PCR or circularizing probes. In some embodiments, DNA may be preferentially enriched using a targeted amplification method, or a method that results in full or partial separation of desired from undesired DNA, such as capture by hybridization. In some embodiments, DNA may be amplified using a combination of a universal amplification method and a preferential enrichment method. A fuller description of some of these methods can be found elsewhere in this document.
Exemplary enrichment and sequencing methods
In one embodiment, a method disclosed herein uses selective enrichment techniques that preserve the relative allele frequencies present in the original DNA sample at each target locus (e.g., each polymorphic locus) from a set of target loci (e.g., polymorphic loci). Although enrichment is particularly advantageous for methods that analyze polymorphic loci, these enrichment methods can readily be adapted for non-polymorphic loci if desired. In some embodiments, the amplification and/or selective enrichment technique may involve PCR (such as ligation-mediated PCR), fragment capture by hybridization, molecular inversion probes, or other circularizing probes. In some embodiments, methods for amplification or selective enrichment may involve using probes where, upon correct hybridization to the target sequence, the 3' end or 5' end of the nucleotide probe is separated from the polymorphic site of the allele by a small number of nucleotides. This separation reduces preferential amplification of one allele, termed allele bias. This is an improvement over methods that use probes in which the 3' end or 5' end of a correctly hybridized probe is directly adjacent to, or very near, the polymorphic site of an allele. In one embodiment, probes whose hybridizing region may contain, or certainly contains, a polymorphic site are excluded. Polymorphic sites within the hybridization site can cause unequal hybridization of some alleles or inhibit hybridization altogether, resulting in preferential amplification of certain alleles. These embodiments improve on other methods involving targeted amplification and/or selective enrichment in that they better preserve the original allele frequencies of the sample at each polymorphic locus, whether the sample is a pure genomic sample from a single individual or a mixture of individuals.
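The two probe design rules described above (keep the probe end a small number of nucleotides away from the polymorphic site, and exclude probes whose hybridizing region contains a polymorphic site) could be checked as in the following sketch. This is an illustrative assumption of how such a filter might look, not the design procedure used in the disclosure; the function name, the coordinate convention (inclusive integer positions on one strand), and the default minimum gap of 2 nucleotides are all hypothetical.

```python
def probe_is_acceptable(hyb_start, hyb_end, target_site, known_sites, min_gap=2):
    """Check a candidate probe against the two rules described in the text.

    hyb_start, hyb_end: inclusive coordinates of the hybridizing region.
    target_site: coordinate of the polymorphic site to be measured.
    known_sites: iterable of coordinates of known polymorphic sites.
    min_gap: hypothetical minimum number of bases between probe end and target site.
    """
    # Rule 1: no known polymorphic site may lie inside the hybridizing region.
    if any(hyb_start <= site <= hyb_end for site in known_sites):
        return False
    if hyb_start <= target_site <= hyb_end:
        return False
    # Rule 2: the nearest probe end must be separated from the target site.
    if target_site > hyb_end:
        gap = target_site - hyb_end - 1
    else:
        gap = hyb_start - target_site - 1
    return gap >= min_gap


print(probe_is_acceptable(100, 119, 122, {122}))       # True: 2 bases separate probe and site
print(probe_is_acceptable(100, 119, 120, {120}))       # False: probe end directly adjacent
print(probe_is_acceptable(100, 119, 125, {110, 125}))  # False: polymorphism at 110 inside probe
```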
Using a technique that enriches a DNA sample at a set of target loci, followed by sequencing, as part of a method for non-invasive prenatal allele calling or ploidy calling may confer a number of unexpected advantages. In some embodiments of the present invention, the method involves measuring genetic data for use with an informatics-based method, such as PARENTAL SUPPORT™ (PS). The ultimate outcome of some of these embodiments is actionable genetic data of an embryo or fetus. Many methods may be used to measure the genetic data of the individual and/or related individuals as part of the embodied methods. In one embodiment, a method for enriching the concentration of a set of targeted alleles is disclosed herein, the method comprising one or more of the following steps: targeted amplification of genetic material, addition of locus-specific oligonucleotide probes, ligation of specified DNA strands, isolation of sets of desired DNA, removal of unwanted reaction components, detection of certain DNA sequences by hybridization, and detection of the sequence of one or more strands of DNA by DNA sequencing methods. In some cases the DNA strands may refer to target genetic material; in some cases they may refer to primers; in some cases they may refer to synthesized sequences, or combinations thereof. These steps may be carried out in a number of different orders.
For example, a universal amplification step of the DNA prior to targeted amplification may confer several advantages, such as removing the risk of bottlenecking and reducing allelic bias. The DNA may be mixed with an oligonucleotide probe that can hybridize with two neighboring regions of the target sequence, one on either side. After hybridization, the ends of the probe may be connected by adding a polymerase, a means for ligation, and any reagents necessary to allow circularization of the probe. After circularization, an exonuclease may be added to digest non-circularized genetic material, followed by detection of the circularized probe. The DNA may be mixed with PCR primers that can hybridize with two neighboring regions of the target sequence, one on either side. After hybridization, the ends of the probe may be connected by adding a polymerase, a means for ligation, and any reagents necessary to complete PCR amplification. Amplified or unamplified DNA may be targeted by hybrid capture probes that target a set of loci; after hybridization, the probes may be localized and separated from the mixture to provide a DNA mixture enriched in target sequences.
Using a method that targets certain loci, followed by sequencing, as part of a method for allele calling or ploidy calling may confer a number of unexpected advantages. Some methods by which DNA may be targeted or preferentially enriched include circularizing probes, linked inverted probes (LIPs, MIPs), capture by hybridization methods such as SURESELECT, and targeted PCR or ligation-mediated PCR amplification strategies.
In some embodiments, a method of the present invention involves measuring genetic data for use with an informatics-based method described further herein, such as PARENTAL SUPPORT™ (PS). PARENTAL SUPPORT™ is an informatics-based approach to manipulating genetic data, aspects of which are described herein. The ultimate outcome of some of these embodiments is actionable genetic data of an embryo or fetus, followed by a clinical decision based on that actionable data. The algorithms behind the PS method take the measured genetic data of the target individual, often an embryo or fetus, together with the measured genetic data of related individuals, and are able to increase the accuracy with which the genetic state of the target individual is known. In one embodiment, the measured genetic data is used to make ploidy determinations during prenatal genetic diagnosis. In one embodiment, the measured genetic data is used to make ploidy determinations or allele calls on embryos during in vitro fertilization. Many methods may be used to measure the genetic data of the individual and/or related individuals in these contexts. The different methods comprise a number of steps, which often involve amplification of genetic material, addition of oligonucleotide probes, ligation of specified DNA strands, isolation of sets of desired DNA, removal of unwanted reaction components, detection of certain DNA sequences by hybridization, and detection of the sequence of one or more strands of DNA by DNA sequencing methods. In some cases the DNA strands may refer to target genetic material; in some cases they may refer to primers; in some cases they may refer to synthesized sequences, or combinations thereof. These steps may be carried out in a number of different orders.
Note that in theory it is possible to target any number of loci in the genome, from one locus to well over one million loci. If a DNA sample is subjected to targeting and then sequenced, the percentage of the alleles read by the sequencer will be enriched relative to their natural abundance in the sample. The degree of enrichment can be anywhere from one percent (or even less) to ten-fold, a hundred-fold, a thousand-fold, or even many million-fold. The human genome has roughly 3 billion base pairs, and nucleotides, comprising approximately 75 million polymorphic loci. The more loci that are targeted, the smaller the achievable degree of enrichment. The fewer loci that are targeted, the greater the achievable degree of enrichment, and the greater the depth of read that can be achieved at those loci for a given number of sequence reads.
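The trade-off between the number of targeted loci and the depth of read can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that on-target reads are spread evenly across the targeted loci; the function name, the number of reads, and the on-target fraction are hypothetical values and do not come from the disclosure.

```python
def mean_depth_per_locus(total_reads, on_target_fraction, n_loci):
    """Average reads per targeted locus, assuming on-target reads are spread evenly."""
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return total_reads * on_target_fraction / n_loci


# Hypothetical run: 10 million reads, 60% of which map to the targeted loci.
for n_loci in (100, 1_000, 10_000):
    depth = mean_depth_per_locus(10_000_000, 0.6, n_loci)
    print(n_loci, round(depth))
```

For a fixed number of sequence reads, a tenfold reduction in the number of targeted loci gives a tenfold increase in average depth of read under these assumptions.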
In one embodiment of the present invention, the targeting or preferential enrichment may focus entirely on SNPs. In one embodiment, the targeting or preferential enrichment may focus on any polymorphic site. A number of commercial targeting products are available for enriching exons. Surprisingly, targeting exclusively SNPs, or exclusively polymorphic loci, is particularly advantageous when using a method for NPD that relies on allele distributions. There are also published methods for NPD using sequencing, for example U.S. Patent 7,888,017, that involve read count analysis, where the read counting focuses on counting the number of reads that map to a given chromosome and where the analyzed sequence reads do not focus on polymorphic regions of the genome. Methods of that type, which do not focus on polymorphic alleles, would not benefit as much from targeting or preferential enrichment of a set of alleles.
In one embodiment of the present invention, it is possible to use a targeting method that focuses on SNPs to enrich a genetic sample in polymorphic regions of the genome. In one embodiment, it is possible to focus on a small number of SNPs, for example between 1 and 100 SNPs, or on a larger number, for example between 100 and 1,000, between 1,000 and 10,000, between 10,000 and 100,000, or more than 100,000 SNPs. In one embodiment, it is possible to focus on one or a small number of chromosomes that are correlated with live trisomic births, for example chromosomes 13, 18, 21, X and Y, or some combination thereof. In one embodiment, it is possible to enrich the targeted SNPs by a small factor, for example between 1.01-fold and 100-fold, or by a larger factor, for example between 100-fold and 1,000,000-fold, or even by more than 1,000,000-fold. In one embodiment of the present invention, it is possible to use a targeting method to create a DNA sample that is preferentially enriched in polymorphic regions of the genome. In one embodiment, it is possible to use this method to create a DNA mixture with any of these characteristics, where the DNA mixture contains maternal DNA and also free-floating fetal DNA. In one embodiment, it is possible to use this method to create a DNA mixture with any combination of these factors. For example, the method described herein may be used to produce a DNA mixture that comprises maternal DNA and fetal DNA and that is preferentially enriched in DNA corresponding to 200 SNPs, all of which are located on chromosome 18 or 21, and which are enriched on average 1000-fold. In another example, it is possible to use the method to create a DNA mixture that is preferentially enriched at 10,000 SNPs, all or most of which are located on chromosomes 13, 18, 21, X and Y, with an average enrichment of more than 500-fold per locus. Any of the targeting methods described herein can be used to create DNA mixtures that are preferentially enriched at certain loci.
In some embodiments, a method of the present invention further comprises measuring the DNA in the mixed fraction using a high-throughput DNA sequencer, where the DNA in the mixed fraction contains a disproportionate number of sequences from one or more chromosomes, wherein the one or more chromosomes are taken from the group comprising chromosome 13, chromosome 18, chromosome 21, chromosome X, chromosome Y, and combinations thereof.
Three methods are described herein: multiplex PCR, targeted capture by hybridization, and linked inverted probes (LIPs). They may be used to obtain and analyze measurements from a sufficient number of polymorphic loci in a maternal plasma sample in order to detect fetal aneuploidy; this is not meant to exclude other methods of selectively enriching target loci. Other methods may equally well be used without changing the essence of the method. In each case, the polymorphism assayed may include single nucleotide polymorphisms (SNPs), small indels, or STRs. A preferred method involves the use of SNPs. Each approach produces allele frequency data; the allele frequency data for each target locus and/or the joint allele frequency distributions from these loci may be analyzed to determine the ploidy of the fetus. Each approach has its own considerations, owing to the limited source material and the fact that maternal plasma consists of a mixture of maternal and fetal DNA. This method may be combined with other approaches to provide a more accurate determination. In one embodiment, this method may be combined with a sequence counting approach such as that described in U.S. Patent 7,888,017. The approach described could also be used to detect fetal paternity non-invasively from maternal plasma samples. In addition, each approach may be applied to other DNA mixtures or to pure DNA samples to detect the presence or absence of aneuploid chromosomes, to genotype a large number of SNPs from degraded DNA samples, to detect segmental copy number variations (CNVs), to detect other genotypic states of interest, or some combination thereof.
Accurately measuring the allelic distributions in a sample
Current sequencing approaches can be used to estimate the distribution of alleles in a sample. One such method involves randomly sampling sequences from a pool of DNA, termed shotgun sequencing. The proportion of any particular allele in the sequencing data is typically very low and can be determined by simple statistics. The human genome contains approximately 3 billion base pairs. So, if the sequencing method used produces 100 bp reads, a particular allele will be measured about once in every 30 million sequence reads.
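The figure of about one measurement per 30 million reads follows directly from the genome size and the read length. The short sketch below reproduces that arithmetic; it assumes reads are sampled uniformly at random from a single genome copy and ignores edge effects, and the function name is illustrative.

```python
GENOME_SIZE_BP = 3_000_000_000  # approximate human genome size stated in the text


def reads_per_observation(genome_size_bp, read_length_bp):
    """Expected number of random reads needed for one read to cover a given position.

    Assumption: each read covers read_length_bp of genome_size_bp positions,
    so a given position is covered with probability read_length_bp / genome_size_bp.
    """
    return genome_size_bp / read_length_bp


print(reads_per_observation(GENOME_SIZE_BP, 100))  # 30000000.0
```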
In one embodiment, a method of the present invention is used to determine, from the measured allele distributions of loci on a chromosome, the presence or absence in a DNA sample of two or more different haplotypes that contain the same set of loci. The different haplotypes could represent two different homologous chromosomes from one individual; three different homologous chromosomes from a trisomic individual; three different homologous haplotypes from a mother and a fetus, where one of the haplotypes is shared between the mother and the fetus; three or four haplotypes from a mother and a fetus, where one or two of the haplotypes are shared between the mother and the fetus; or other combinations. Alleles that are polymorphic between the haplotypes tend to be more informative; however, any alleles for which the mother and father are not both homozygous for the same allele will yield useful information through the measured allele distributions, beyond the information available from simple read count analysis.
Shotgun sequencing of such a sample, however, is extremely inefficient, because it produces many sequences from regions that are not polymorphic between the different haplotypes in the sample, or from chromosomes that are not of interest, and these sequences reveal no information about the proportion of the target haplotypes. Described herein are methods that specifically target and/or preferentially enrich segments of DNA in the genome of the sample that are more likely to be polymorphic, in order to increase the yield of allelic information obtained by sequencing. Note that for the allele distributions measured in an enriched sample to be truly representative of the actual amounts present in the target individual, it is critical that there be little or no preferential enrichment of one allele compared with another at a given locus in the targeted segments. Methods currently known in the art for targeting polymorphic alleles are designed to ensure that at least some of any alleles present are detected. These methods were not, however, designed to measure the unbiased allele distributions of the polymorphic alleles present in the original mixture. It is not obvious that any particular method of target enrichment would produce an enriched sample in which the measured allele distributions represent the allele distributions of the original, unamplified sample more accurately than any other method. Although many enrichment methods might be expected, in theory, to accomplish such an aim, a person of ordinary skill in the art is well aware that current amplification, targeting and other preferential enrichment methods carry a great deal of stochastic or deterministic bias. One embodiment of the method described herein allows a plurality of alleles found in a DNA mixture, corresponding to a given locus in the genome, to be amplified or preferentially enriched in such a way that the degree of enrichment of each allele is nearly the same. Put another way, the method allows the relative quantity of the alleles present in the mixture to be increased as a whole, while the ratio between the alleles at each locus remains substantially the same as it was in the original DNA mixture. For some reported methods, preferential enrichment of loci can produce allelic biases of more than 1%, more than 2%, more than 5% and even more than 10%. This preferential enrichment may be due to capture bias when a capture-by-hybridization approach is used, or to amplification bias, which may be small for each cycle but can become large when compounded over 20, 30 or 40 cycles. For the purposes of this disclosure, a ratio that remains substantially the same means that the ratio of the alleles in the original mixture divided by the ratio of the alleles in the resulting mixture is between 0.95 and 1.05, between 0.98 and 1.02, between 0.99 and 1.01, between 0.995 and 1.005, between 0.998 and 1.002, between 0.999 and 1.001, or between 0.9999 and 1.0001. Note that the allele ratio calculation presented here may not be used to determine the ploidy state of the target individual and may serve only as a metric of allelic bias.
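The bias criterion above (original allele ratio divided by the allele ratio after enrichment) and the compounding of a small per-cycle bias can be illustrated with a minimal sketch. The function names, the allele counts and the simple geometric compounding model are assumptions made for illustration only; they are not taken from the specification.

```python
def allele_ratio(count_a: float, count_b: float) -> float:
    """Ratio of allele A to allele B at a single locus."""
    if count_b <= 0:
        raise ValueError("count_b must be positive")
    return count_a / count_b


def bias_metric(orig_a: float, orig_b: float,
                enriched_a: float, enriched_b: float) -> float:
    """Original allele ratio divided by the ratio after enrichment.

    A value of 1.0 means the enrichment preserved the ratio exactly.
    """
    return allele_ratio(orig_a, orig_b) / allele_ratio(enriched_a, enriched_b)


def ratio_preserved(metric: float, tolerance: float = 0.05) -> bool:
    """True if the metric lies within 1 +/- tolerance (0.05 gives 0.95 to 1.05)."""
    return 1.0 - tolerance <= metric <= 1.0 + tolerance


def compounded_bias(per_cycle_advantage: float, cycles: int) -> float:
    """Relative over-representation of one allele after `cycles` rounds,
    assuming (hypothetically) that it is copied (1 + per_cycle_advantage)
    times more efficiently than the other allele in every cycle."""
    return (1.0 + per_cycle_advantage) ** cycles


# Hypothetical counts: a 1:1 locus that reads 5200:5000 after enrichment.
metric = bias_metric(500, 500, 5200, 5000)
within_5_percent = ratio_preserved(metric, tolerance=0.05)
within_2_percent = ratio_preserved(metric, tolerance=0.02)
```

Under this simplified model, a per-cycle advantage of 1% compounds to an over-representation of roughly 35% after 30 cycles, which is why a bias that looks negligible in one cycle can matter after 20 to 40 cycles.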
In one embodiment, once a mixture has been preferentially enriched at the set of target loci, it may be sequenced using any previous, current or next-generation sequencing instrument that sequences a clonal sample (a sample generated from a single molecule; examples include ILLUMINA GAIIx, ILLUMINA HISEQ, LIFE TECHNOLOGIES SOLiD, 5500XL). The ratios can be evaluated by sequencing through the specific alleles within the targeted region. The sequencing reads can be analyzed and counted according to allele type, and the ratios of the different alleles determined accordingly. For variations that are one to a few bases in length, the alleles are detected by sequencing, and the sequencing read must span the allele in question in order to evaluate the allelic composition of the captured molecule. The total number of captured molecules assayed for genotype can be increased by increasing the length of the sequencing read. Full sequencing of all molecules would guarantee collection of the maximum amount of data available in the enriched pool. Sequencing is currently expensive, however, and a method that can measure allele distributions using fewer sequence reads would be of great value. In addition, there are technical limits on the maximum possible read length, as well as accuracy limits as read length increases. The alleles of greatest utility are one to a few bases in length, but in theory any allele shorter than the sequencing read can be used. Although allelic variation comes in all types, the examples provided herein focus on SNPs or on variants consisting of only a few neighboring base pairs. In many cases, larger variants such as segmental copy number variants can be detected through aggregation of these smaller variations, because the whole collection of SNPs within the segment is duplicated. Variants larger than a few bases, such as STRs, require special consideration; some targeting approaches work for them while others do not.
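The read-counting step described above can be sketched as follows. This is an illustrative simplification, not the procedure of the specification: reads are assumed to be gapless alignments given as a 0-based start coordinate plus a base string, and the function names are hypothetical. Only reads that span the polymorphic position contribute to the count, which is why longer reads increase the number of informative molecules.

```python
from collections import Counter


def count_alleles(reads, snp_pos):
    """Count the base observed at `snp_pos` across reads that span it.

    `reads` is an iterable of (start, sequence) pairs, where `start` is the
    0-based reference coordinate of the first base of a gapless alignment.
    """
    counts = Counter()
    for start, seq in reads:
        offset = snp_pos - start
        if 0 <= offset < len(seq):  # the read covers the polymorphic site
            counts[seq[offset]] += 1
    return counts


def allele_fractions(counts):
    """Convert allele counts to fractions of all informative reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical reads around a SNP at reference position 3.
reads = [(0, "ACGTA"), (2, "GTAC"), (3, "CAC"), (10, "AAAA")]
counts = count_alleles(reads, snp_pos=3)
fractions = allele_fractions(counts)
```

In this toy input the last read does not reach the SNP and is ignored, so three of the four reads are informative.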
There are multiple targeting approaches that can be used to specifically isolate and enrich one or more variant positions in the genome. Typically, these rely on the invariant sequence flanking the variant sequence. Others have reported targeting in the context of sequencing where the substrate is maternal plasma (see, e.g., Liao et al., Clin. Chem. 2011; 57(1): pp. 92-101). However, those approaches use targeting probes that target exons and do not focus on the polymorphic regions of the genome. In one embodiment, the method of the invention involves using targeting probes that focus exclusively or almost exclusively on polymorphic regions. In one embodiment, the method of the invention involves using targeting probes that focus exclusively or almost exclusively on SNPs. In some embodiments of the invention, the targeted polymorphic sites consist of at least 10% SNPs, at least 20% SNPs, at least 30% SNPs, at least 40% SNPs, at least 50% SNPs, at least 60% SNPs, at least 70% SNPs, at least 80% SNPs, at least 90% SNPs, at least 95% SNPs, at least 98% SNPs, at least 99% SNPs, at least 99.9% SNPs, or exclusively SNPs.
In one embodiment, the method of the invention may be used to determine genotypes (the base composition of the DNA at specific loci), and the relative proportions of those genotypes, from a mixture of DNA molecules, where those DNA molecules may originate from one or more genetically distinct individuals. In one embodiment, the method of the invention may be used to determine the genotypes at a set of polymorphic loci and the relative ratios of the amounts of the different alleles present at those loci. In one embodiment, the polymorphic loci may consist entirely of SNPs. In one embodiment, the polymorphic loci may comprise SNPs, simple tandem repeats and other polymorphisms. In one embodiment, the method of the invention may be used to determine the relative distribution of alleles at a set of polymorphic loci in a DNA mixture, where the DNA mixture comprises DNA originating from a mother and DNA originating from a fetus. In one embodiment, the joint allele distribution can be determined for a DNA mixture isolated from the blood of a pregnant woman. In one embodiment, the allele distributions at a set of loci may be used to determine the ploidy state of one or more chromosomes of a gestating fetus.
In one embodiment, the mixture of DNA molecules can be derived from DNA extracted from multiple cells of one individual. In one embodiment, the original collection of cells from which the DNA is derived may comprise a mixture of diploid or haploid cells of the same or of different genotypes, if the individual is mosaic (germline or somatic). In one embodiment, the mixture of DNA molecules can also be derived from DNA extracted from single cells. In one embodiment, the mixture of DNA molecules can also be derived from DNA extracted from a mixture of two or more cells of the same individual or of different individuals. In one embodiment, the mixture of DNA molecules can be derived from DNA isolated from biological material that has already been released from cells, such as blood plasma, which is known to contain cell-free DNA. In one embodiment, this biological material may be a mixture of DNA from one or more individuals, as is the case during pregnancy, where fetal DNA has been shown to be present in the mixture. In one embodiment, the biological material can be from a mixture of cells found in maternal blood, where some of the cells are fetal in origin. In one embodiment, the biological material can be cells from the blood of a pregnant woman that have been enriched for fetal cells.
Circularizing Probes
Some embodiments of the invention involve the use of "Linked Inverted Probes" (LIPs), which have been previously described in the literature, to amplify the target loci before or after amplification with non-LIP primers in the multiplex PCR methods of the invention. LIPs is a generic term intended to encompass technologies that involve the creation of a circular DNA molecule, where the probes are designed to hybridize to the targeted region of DNA on either side of a targeted allele, such that the addition of appropriate polymerases and/or ligases, together with the appropriate conditions, buffers and other reagents, completes the complementary, inverted region of DNA across the targeted allele and creates a circular loop of DNA that captures the information found in the targeted allele. LIPs may also be called pre-circularized probes, pre-circularizing probes or circularizing probes. A LIPs probe may be a linear DNA molecule between 50 and 500 nucleotides in length, and in one embodiment between 70 and 100 nucleotides in length; in some embodiments it may be longer or shorter than described herein. Other embodiments of the invention involve different incarnations of the LIPs technology, such as padlock probes and molecular inversion probes (MIPs).
One method of targeting specific locations for sequencing is to synthesize probes whose 3' and 5' ends anneal to the target DNA, in an inverted manner, at locations adjacent to and on either side of the targeted region, such that the addition of DNA polymerase and DNA ligase results in extension from the 3' end, adding bases complementary to the target molecule to the single-stranded probe (gap-fill), followed by ligation of the new 3' end to the 5' end of the original probe, producing a circular DNA molecule that can subsequently be isolated from background DNA. The probe ends are designed to flank the targeted region of interest. One aspect of this approach is commonly called MIPs and has been used in conjunction with array technologies to determine the nature of the filled-in sequence. One drawback of using MIPs to measure allele ratios is that the hybridization, circularization and amplification steps do not occur at equal rates for different alleles at the same locus. As a result, the measured allele ratios are not representative of the actual allele ratios present in the original mixture.
In one embodiment, the circularizing probes are constructed such that the region of the probe designed to hybridize upstream of the targeted polymorphic locus and the region of the probe designed to hybridize downstream of the targeted polymorphic locus are covalently connected through a non-nucleic acid backbone. This backbone can be any biocompatible molecule or combination of biocompatible molecules. Some examples of possible biocompatible molecules are poly(ethylene glycol), polycarbonates, polyurethanes, polyethylenes, polypropylenes, sulfone polymers, silicone, cellulose, fluoropolymers, acrylic compounds, styrene block copolymers and other block copolymers.
In one embodiment of the invention, this approach has been modified to be readily amenable to sequencing as a means of interrogating the filled-in sequence. To retain the allelic proportions of the original sample, at least one key consideration must be taken into account. The positions that vary among different alleles in the gap-fill region must not be too close to the probe binding sites, because initiation bias by the DNA polymerase can cause differential treatment of the variants. Another consideration is that additional variations, correlated with the variants in the gap-fill region, may be present in the probe binding sites, and these can result in unequal amplification of the different alleles. In one embodiment of the invention, the 3' ends and 5' ends of the pre-circularized probe are designed to hybridize to bases that are one or a few positions away from the variant positions (polymorphic sites) of the targeted allele. The number of bases between the polymorphic site (SNP or otherwise) and the base to which the 3' end and/or 5' end of the pre-circularized probe is designed to hybridize may be one base, two bases, three bases, four bases, five bases, six bases, seven to ten bases, eleven to fifteen bases, sixteen to twenty bases, twenty to thirty bases, or thirty to sixty bases. The forward and reverse primers may be designed to hybridize a different number of bases away from the polymorphic site. Circularizing probes can be generated in large numbers with current DNA synthesis technology, allowing very large numbers of probes to be generated and potentially pooled, which enables many loci to be interrogated simultaneously. The approach has been reported to work with more than 300,000 probes. Two papers that discuss a method involving circularizing probes that can be used to measure the genomic data of the target individual are: Porreca et al., Nature Methods, 2007, 4(11), pp. 931-936; and Turner et al., Nature Methods, 2009, 6(5), pp. 315-316. The methods described in these papers may be used in combination with other methods described herein. Certain steps of the methods from these two papers may be used in combination with steps from other methods described herein.
In some embodiments of the methods disclosed herein, the genetic material of the target individual is optionally amplified, followed by hybridization of the pre-circularized probes, a gap-fill to fill in the bases between the two ends of the hybridized probes, ligation of the two ends to form a circularized probe, and amplification of the circularized probe using, for example, rolling circle amplification. Once the desired target allelic information has been captured by circularizing appropriately designed oligonucleotide probes, as in the LIPs system, the genetic sequence of the circularized probes may be measured to give the desired sequence data. In one embodiment, the appropriately designed oligonucleotide probes may be circularized directly on unamplified genetic material of the target individual and amplified afterwards. Note that a number of amplification procedures may be used to amplify the original genetic material or the circularized LIPs, including rolling circle amplification, MDA or other amplification protocols. Different methods may be used to measure the genetic information on the target genome, for example high-throughput sequencing, Sanger sequencing, other sequencing methods, capture by hybridization, capture by circularization, multiplex PCR, other hybridization methods, and combinations thereof.
Once the genetic material of the individual has been measured using one or a combination of the above methods, an informatics-based method (such as the PARENTAL SUPPORT™ method), together with the appropriate genetic measurements, can be used to determine the ploidy state of one or more chromosomes of the individual and/or the genetic state of one allele or a set of alleles, in particular those alleles that are correlated with a disease or genetic state of interest. Note that the use of LIPs has been reported for multiplexed capture of genetic sequences followed by genotyping by sequencing. However, sequencing data produced by a LIPs-based strategy for amplifying the genetic material found in a single cell, a small number of cells, or extracellular DNA has not previously been used to determine the ploidy state of a target individual.
The application of an informatics-based method to determine the ploidy state of an individual from genetic data measured by hybridization arrays, such as the ILLUMINA INFINIUM array or the AFFYMETRIX gene chip, has been described in documents referenced elsewhere in this document. However, the method described herein shows improvements over methods previously described in the literature. For example, the LIPs-based approach followed by high-throughput sequencing unexpectedly provides better genotypic data, because the approach has a better capacity for multiplexing, better capture specificity, better uniformity and low allelic bias. Greater multiplexing allows more alleles to be targeted, giving more accurate results. Better uniformity results in more of the targeted alleles being measured, giving more accurate results. Lower rates of allelic bias result in lower rates of miscalls, giving more accurate results. More accurate results lead to improved clinical outcomes and better medical care.
It is important to note that LIPs may be used as a method of targeting specific loci in a DNA sample for genotyping by methods other than sequencing. For example, LIPs may be used to target DNA for genotyping with SNP arrays or other DNA- or RNA-based microarrays.
Ligation-Mediated PCR
Ligation-mediated PCR may be used to amplify the target loci before or after PCR amplification with non-ligated primers. Ligation-mediated PCR is a PCR method used to preferentially enrich a DNA sample by amplifying one or more loci in a DNA mixture, the method comprising: obtaining a set of primer pairs, where each primer in a pair contains a target-specific sequence and a non-target sequence, where the target-specific sequences are designed to anneal to a target region, one upstream and one downstream of the polymorphic site, and where each may be separated from the polymorphic site by 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11-20, 21-30, 31-40, 41-50, 51-100 or more than 100 bases; polymerizing DNA from the 3' end of the upstream primer to fill the single-stranded region between it and the 5' end of the downstream primer with nucleotides complementary to the target molecule; ligating the last polymerized base of the upstream primer to the adjacent 5' base of the downstream primer; and amplifying only the polymerized and ligated molecules, using the non-target sequences contained at the 5' end of the upstream primer and the 3' end of the downstream primer. Primer pairs for distinct targets may be mixed in the same reaction. The non-target sequences serve as universal sequences, so that all primer pairs that have been successfully polymerized and ligated can be amplified with a single pair of amplification primers.
Capture by Hybridization
In some embodiments, the method of the invention may involve using any of the following capture-by-hybridization methods, in addition to multiplex PCR, to amplify the target loci. Preferential enrichment of a specific set of sequences in a target genome can be accomplished in a number of ways. Elsewhere in this document there is a description of how LIPs can be used to target a specific set of sequences, but in all of those applications other targeting and/or preferential enrichment methods can be used equally well to the same end. One example of another targeting method is the capture-by-hybridization approach. Some examples of commercial capture-by-hybridization technologies include AGILENT's SURE SELECT and ILLUMINA's TRUSEQ. In capture by hybridization, a set of oligonucleotides that is complementary or mostly complementary to the desired target sequences is allowed to hybridize to a DNA mixture and is then physically separated from the mixture. Once the desired sequences have hybridized to the targeting oligonucleotides, physically removing the targeting oligonucleotides also removes the targeted sequences. Once the hybridized oligonucleotides have been removed, they can be heated above their melting temperature and amplified. One way to physically remove the targeting oligonucleotides is to bond them covalently to a solid support, for example a magnetic bead or a chip. Another way is to bond them covalently to a molecular moiety that has a strong affinity for another molecular moiety. An example of such a molecular pair is biotin and streptavidin, as used in SURE SELECT. Thus, the targeting oligonucleotides can be covalently attached to a biotin molecule, and after hybridization a solid support to which streptavidin is affixed can be used to pull down the biotinylated oligonucleotides, to which the targeted sequences are hybridized.
Hybrid capture involves hybridizing probes that are complementary to the targets of interest to the target molecules. Hybrid capture probes were originally developed to target and enrich large fractions of the genome with relative uniformity between targets. In that application, it was important that all targets be amplified with enough uniformity that all regions could be detected by sequencing, but no regard was paid to retaining the proportions of alleles in the original sample. Following capture, the alleles present in the sample can be determined by direct sequencing of the captured molecules. The sequencing reads can be analyzed and counted according to allele type. With current technology, however, the measured allele distributions of the captured sequences are typically not representative of the original allele distributions.
In one embodiment, the alleles are detected by sequencing. To capture the allele identity at the polymorphic site, the sequencing read must span the allele in question so that the allelic composition of the captured molecule can be evaluated. Because the captured molecules are often of variable length, a sequencing read cannot be guaranteed to overlap the variant position unless the entire molecule is sequenced. However, cost considerations, together with technical limits on the maximum possible length and accuracy of sequencing reads, make sequencing the entire molecule infeasible. In one embodiment, the read length can be increased from about 30 bases to about 50 or about 70 bases, which can greatly increase the number of reads that overlap the variant positions within the targeted sequences.
Another way to increase the number of reads that interrogate the position of interest is to decrease the length of the probe, as long as doing so does not introduce bias in the underlying enriched alleles. The synthesized probe should be long enough that two probes designed to hybridize to two different alleles found at one locus will hybridize with nearly equal affinity to the various alleles in the original sample. Methods currently known in the art describe probes that are typically longer than 120 bases. In a current embodiment, if the allele is one or a few bases long, the capture probes may be less than about 110 bases, less than about 100 bases, less than about 90 bases, less than about 80 bases, less than about 70 bases, less than about 60 bases, less than about 50 bases, less than about 40 bases, less than about 30 bases, or less than about 25 bases, and this is sufficient to ensure equal enrichment of all alleles. When the DNA mixture to be enriched by hybrid capture technology is a mixture comprising free-floating DNA isolated from blood, for example maternal blood, the average length of the DNA is quite short, typically less than 200 bases. The use of shorter probes gives a greater chance that the hybrid capture probes will capture the desired DNA fragments. Larger variations may require longer probes. In one embodiment, the variations of interest are one base (a SNP) to a few bases in length. In one embodiment, targeted regions in the genome can be preferentially enriched using hybrid capture probes that are shorter than 90 bases, and that can be less than 80 bases, less than 70 bases, less than 60 bases, less than 50 bases, less than 40 bases, less than 30 bases, or less than 25 bases. In one embodiment, to increase the chance that the desired allele is sequenced, the length of the probe designed to hybridize to the regions flanking the polymorphic allele location can be decreased from more than 90 bases to about 80 bases, about 70 bases, about 60 bases, about 50 bases, about 40 bases, about 30 bases, or about 25 bases.
A minimum overlap between the synthesized probe and the target molecule is required to enable capture. The synthesized probe can be made as short as possible while still being longer than this minimum required overlap. The effect of using a shorter probe to target a polymorphic region is that more of the captured molecules overlap the target allele region. The state of fragmentation of the original DNA molecules also affects the number of reads that overlap the targeted alleles. Some DNA samples, such as plasma samples, are already fragmented by biological processes that take place in vivo. Samples with longer fragments, however, benefit from fragmentation before sequencing library preparation and enrichment. When both probes and fragments are short (about 60-80 bp), maximum specificity can be achieved, with relatively few sequence reads failing to overlap the critical region of interest.
In one embodiment, the hybridization conditions can be adjusted to maximize the uniformity with which the different alleles present in the original sample are captured. In one embodiment, the hybridization temperature is decreased to minimize differences in hybridization bias between alleles. Methods known in the art avoid using lower temperatures for hybridization, because lowering the temperature increases hybridization of probes to unintended targets. However, when the goal is to preserve allele ratios with maximum fidelity, the use of lower hybridization temperatures provides optimally accurate allele ratios, despite the fact that the current art teaches away from this approach. The hybridization temperature can also be increased to require greater overlap between the target and the synthesized probe, so that only targets that substantially overlap the targeted region are captured. In some embodiments of the invention, the hybridization temperature is lowered from the normal hybridization temperature to about 40 °C, about 45 °C, about 50 °C, about 55 °C, about 60 °C, about 65 °C, or about 70 °C.
In one embodiment, the hybrid capture probes can be designed such that the region of the capture probe whose DNA is complementary to the DNA found in the regions flanking the polymorphic allele is not immediately adjacent to the polymorphic site. Instead, the capture probe can be designed such that the region of the capture probe that is designed to hybridize to the DNA flanking the polymorphic site of the target is separated from the portion of the capture probe that would be in van der Waals contact with the polymorphic site by a small distance equivalent in length to one or a small number of bases. In one embodiment, the hybrid capture probe is designed to hybridize to a region that flanks the polymorphic allele but does not cross it; this may be termed a flanking capture probe. The length of the flanking capture probe may be less than about 120 bases, less than about 110 bases, less than about 100 bases, less than about 90 bases, and can be less than about 80 bases, less than about 70 bases, less than about 60 bases, less than about 50 bases, less than about 40 bases, less than about 30 bases, or less than about 25 bases. The region of the genome targeted by the flanking capture probe may be separated from the polymorphic locus by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11-20, or more than 20 base pairs.
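The geometry of a flanking capture probe can be made concrete with a short coordinate sketch. It assumes a reference given as a plain string, a 0-based polymorphic site and a single-base polymorphism; the function name, the toy reference sequence and the choice to return the targeted reference regions (rather than the complementary probe sequences) are illustrative assumptions, not part of the specification.

```python
def flanking_targets(reference: str, site: int, probe_len: int, gap: int):
    """Return the upstream and downstream reference regions targeted by
    flanking capture probes of length `probe_len`, each separated from the
    polymorphic base at index `site` by `gap` bases.

    Neither region includes the polymorphic site itself.
    """
    if probe_len <= 0 or gap < 0:
        raise ValueError("probe_len must be positive and gap non-negative")
    up_end = site - gap              # exclusive end of the upstream region
    up_start = up_end - probe_len
    down_start = site + 1 + gap      # first base of the downstream region
    down_end = down_start + probe_len
    if up_start < 0 or down_end > len(reference):
        raise ValueError("probe region extends beyond the reference")
    return reference[up_start:up_end], reference[down_start:down_end]


# Toy reference with the polymorphic base written as the IUPAC code R
# (A or G) at index 10; the sequence itself is made up.
reference = "ACGTACGTAC" + "R" + "TTGGCCAATT"
upstream, downstream = flanking_targets(reference, site=10, probe_len=4, gap=2)
```

Setting `gap=0` places the regions immediately adjacent to the polymorphic site; a larger `gap` moves both regions away from it, corresponding to the separations of 1 to more than 20 base pairs listed above.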
A disease screening test based on targeted capture can be built on targeted sequence capture, for example custom targeted sequence capture such as that currently offered by AGILENT (SURE SELECT), ROCHE-NIMBLEGEN or ILLUMINA. Capture probes can be custom-designed to ensure the capture of various types of mutations. For point mutations, one or more probes that overlap the point mutation should be sufficient to capture the mutation and sequence it.
For small insertions or deletions, one or more probes that overlap the mutation may be sufficient to capture and sequence fragments comprising the mutation. Because probes are typically designed against the reference genome sequence, hybridization between the probe and a fragment carrying the mutation may be less efficient, limiting capture efficiency. To ensure capture of fragments comprising the mutation, two probes can be designed, one matching the normal allele and one matching the mutant allele. A longer probe may enhance hybridization. Multiple overlapping probes may enhance capture. Finally, placing a probe immediately adjacent to the mutation, but not overlapping it, may allow relatively similar capture efficiency for the normal and mutant alleles.
For simple tandem repeats (STRs), a probe that overlaps these highly variable sites is unlikely to capture the fragment well. To enhance capture, a probe can be placed adjacent to the variable site but not overlapping it. The fragment can then be sequenced as normal to reveal the length and composition of the STR.
For large deletions, a series of overlapping probes, a common approach currently used in exon capture systems, may work. With this approach, however, it may be difficult to determine whether or not an individual is heterozygous. Targeting and evaluating SNPs within the captured region could potentially reveal loss of heterozygosity across the region, indicating that the individual is a carrier. In one embodiment, non-overlapping or singleton probes can be placed across the potentially deleted region and the number of fragments captured used as a measure of heterozygosity. Where an individual carries a large deletion, one-half the number of fragments is expected to be available for capture relative to a non-deleted (diploid) reference locus. Consequently, the number of reads obtained from the deleted region should be roughly half the number obtained from a normal diploid locus. Aggregating and averaging the sequencing read depth from multiple singleton probes across the potentially deleted region may enhance the signal and improve the confidence of the diagnosis. The two approaches can also be combined: targeting SNPs to identify loss of heterozygosity, and using multiple singleton probes to obtain a quantitative measure of the number of underlying fragments from the locus. Either or both of these strategies may be combined with other strategies to better achieve the same end.
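The read-depth comparison described above can be sketched as a ratio of averaged depths. The per-probe depths, the tolerance and the category labels below are hypothetical values chosen for illustration; the specification states only that a carrier of a large deletion is expected to yield roughly half the reads of a normal diploid locus.

```python
from statistics import mean


def relative_depth(candidate_depths, reference_depths):
    """Mean read depth over singleton probes in the candidate region,
    divided by the mean depth over probes at a diploid reference locus."""
    return mean(candidate_depths) / mean(reference_depths)


def interpret_depth_ratio(ratio, tolerance=0.15):
    """Coarse interpretation of the depth ratio.

    The tolerance is an arbitrary illustrative value, not a validated cutoff.
    """
    if abs(ratio - 0.5) <= tolerance:
        return "consistent with a heterozygous deletion"
    if abs(ratio - 1.0) <= tolerance:
        return "consistent with two copies"
    return "inconclusive"


# Hypothetical per-probe read depths.
candidate = [48, 55, 51, 46]      # probes across the potentially deleted region
reference = [100, 104, 96, 100]   # probes at a non-deleted diploid locus
ratio = relative_depth(candidate, reference)
call = interpret_depth_ratio(ratio)
```

Averaging over several singleton probes, as in `relative_depth`, smooths probe-to-probe variation in capture efficiency, which is the stated reason for aggregating depth across the region.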
If, during cfDNA testing, a male fetus is detected (as indicated by the presence of Y-chromosome fragments captured and sequenced in the same test), then either an X-linked dominant mutation where both the mother and the father are unaffected, or a dominant mutation where the mother is unaffected, would indicate heightened risk to the fetus. Detection of two mutant recessive alleles within the same gene in an unaffected mother would imply that the fetus has inherited a mutant allele from the father and potentially a second mutant allele from the mother. In all cases, follow-up testing by amniocentesis or chorionic villus sampling may be indicated.
The disease filler test of catching based on target can with the non-invasive prenatal diagnosis testing combination of catching based on target about dysploidy.
There are a number of ways to decrease depth of read (DOR) variability: for example, primer concentrations can be increased, longer targeted amplification probes can be used, or more STA cycles can be run (such as more than 25, more than 30, more than 35, or even more than 40).
Exemplary Methods of Determining the Number of DNA Molecules in a Sample
Described herein is a method of determining the number of DNA molecules in a sample by generating a uniquely identified molecule for each original DNA molecule in the sample during the first round of DNA amplification. Described here is a procedure for accomplishing this, followed by single-molecule or clonal sequencing.
The approach entails targeting one or more specific loci and generating tagged copies of the original molecules in such a manner that most or all of the tagged molecules from each targeted locus carry a unique tag and can be distinguished from one another when this barcode is sequenced by clonal or single-molecule sequencing. Each unique sequenced barcode represents a unique molecule in the original sample. At the same time, the sequencing data are used to ascertain the locus from which the molecule originates. With this information, the number of unique molecules in the original sample can be determined for each locus.
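The counting step can be sketched in a few lines, assuming each sequencing read has already been reduced to a (locus, barcode) pair. The function name and the toy reads are hypothetical, and the sketch ignores sequencing errors in the barcode and the possibility that two original molecules receive the same barcode by chance.

```python
from collections import defaultdict


def unique_molecules_per_locus(reads):
    """Count distinct barcodes observed at each locus.

    `reads` is an iterable of (locus, barcode) pairs. Reads that share both
    locus and barcode are treated as copies of the same original molecule.
    """
    barcodes_by_locus = defaultdict(set)
    for locus, barcode in reads:
        barcodes_by_locus[locus].add(barcode)
    return {locus: len(codes) for locus, codes in barcodes_by_locus.items()}


# Hypothetical reads: locus identifier plus the sequenced random tag.
reads = [
    ("locus_1", "ACGTTGCA"),
    ("locus_1", "ACGTTGCA"),  # PCR duplicate of the first molecule
    ("locus_1", "TTGACCGA"),
    ("locus_2", "ACGTTGCA"),  # same tag at another locus is a different molecule
    ("locus_2", "GGCATTAC"),
    ("locus_2", "CCATGGTA"),
]
molecule_counts = unique_molecules_per_locus(reads)
```

Keying the sets by locus reflects the point made above that the sequencing data identify both the barcode and the locus of origin, so the same barcode seen at two loci is counted once at each.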
This method can be used in any application that requires quantitative evaluation of the number of molecules in an original sample. Furthermore, the number of unique molecules of one or more targets can be related to the number of unique molecules of one or more other targets to determine the relative copy number, allele distribution or allele ratio. Alternatively, the number of copies detected from the various targets can be modeled by a distribution in order to identify the most likely number of copies of the original targets. Applications include, but are not limited to, the detection of insertions and deletions, such as those found in carriers of Duchenne muscular dystrophy; the quantitation of deleted or duplicated chromosome segments, such as those observed in copy number variants; the chromosome copy number of samples from born individuals; and the chromosome copy number of samples from unborn individuals such as embryos or fetuses.
The method can be combined with simultaneous evaluation of variations contained in the targeted sequence. This can be used to determine the number of molecules representing each allele in the original sample. This copy number method can be combined with the following: evaluation of SNPs or other sequence variations to determine the chromosome copy number of born and unborn individuals; discrimination and quantification of copies from loci that have short sequence variations but where PCR may amplify from multiple target regions, such as in carrier detection of spinal muscular atrophy; and determination of the copy number of molecules from different sources in samples consisting of mixtures from different individuals, such as in the detection of fetal aneuploidy from cell-free DNA obtained from maternal plasma.
In one embodiment, the method, as it pertains to a single target locus, may comprise one or more of the following steps: (1) Designing a standard pair of oligomers for PCR amplification of a specific locus. (2) Adding, during synthesis, a sequence of specified bases with no or minimal complementarity to the target locus or the genome to the 5' end of one of the target-specific oligomers. This sequence, termed the tail, is a known sequence to be used for subsequent amplification, followed by a sequence of random nucleotides. These random nucleotides constitute the random region. The random region comprises a randomly generated nucleic acid sequence that probabilistically differs between each probe molecule. Consequently, following synthesis, the tailed oligomer pool will consist of a collection of oligomers that begin with a known sequence, followed by an unknown sequence that differs between molecules, followed by the target-specific sequence. (3) Performing one round of amplification (denaturation, annealing, extension) using only the tailed oligomer. (4) Adding exonuclease to the reaction, effectively stopping the PCR reaction, and incubating the reaction at the appropriate temperature to remove forward single-stranded oligonucleotides that did not anneal to template and extend to form a double-stranded product. (5) Incubating the reaction at a high temperature to denature the exonuclease and eliminate its activity. (6) Adding to the reaction a new oligonucleotide that is complementary to the tail of the oligomer used in the first reaction, along with the other target-specific oligomer, to enable PCR amplification of the product generated in the first round of PCR. (7) Continuing amplification to generate enough product for downstream clonal sequencing. (8) Measuring the amplified PCR product by any of a number of methods, for example clonal sequencing to a sufficient number of bases to span the sequence.
In one embodiment, the method involves targeting multiple loci in parallel or otherwise. Primers for different target loci can be generated independently and mixed to create multiplex PCR pools. In one embodiment, the original sample can be divided into sub-pools, and different loci can be targeted in each sub-pool before the sub-pools are recombined and sequenced. In one embodiment, the tagging step and a number of amplification cycles can be performed before the pool is subdivided, to ensure efficient targeting of all targets before splitting and to improve subsequent amplification by continuing amplification with smaller sets of primers in the subdivided pools.
One example of an application for which this technology is particularly useful is non-invasive prenatal aneuploidy diagnosis, where the ratio of alleles at a given locus, or the distribution of alleles at a number of loci, can be used to help determine the number of copies of a chromosome present in a fetus. In this context, it is desirable to amplify the DNA present in the initial sample while maintaining the relative amounts of the various alleles. In some circumstances, especially where there is a very small amount of DNA, for example fewer than 5,000, fewer than 1,000, fewer than 500, or fewer than 100 copies of the genome, one can encounter a phenomenon called bottlenecking. This occurs when there are only a small number of copies of any given allele in the initial sample, and amplification biases can cause the amplified pool of DNA to have ratios of those alleles that differ significantly from those in the initial mixture of DNA. By applying a unique or nearly unique set of barcodes to each strand of DNA before standard PCR amplification, it is possible to exclude n-1 copies of DNA from a set of n identical molecules of sequenced DNA that originated from the same original molecule.
For example, imagine a heterozygous SNP in the genome of an individual, and a mixture of DNA from that individual in which ten molecules of each allele are present in the original DNA sample. After amplification, there may be 100,000 molecules of DNA corresponding to that locus. Due to stochastic processes, the ratio of the DNA could be anywhere from 1:2 to 2:1; however, because each of the original molecules was tagged with a unique tag, it would be possible to determine that the DNA in the amplified pool originated from exactly 10 molecules of DNA from each allele. This method therefore gives a more accurate measure of the relative amount of each allele than a method that does not use this approach. For methods in which it is desirable to minimize bias in the relative amounts of alleles, this method will provide more accurate data.
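The example above can be reproduced numerically with a small sketch. The per-molecule read yields (6,000 and 4,000) are assumed values chosen only so that the pool totals 100,000 reads with a skewed ratio inside the 1:2 to 2:1 range; the tag strings and variable names are likewise hypothetical.

```python
def allele_ratio(counts):
    """Return the ratio of allele 1 to allele 2 for a dict of per-allele counts."""
    return counts["allele_1"] / counts["allele_2"]


# Ten starting molecules per allele, each carrying its own (hypothetical) unique tag.
tags = {
    "allele_1": [f"A1-{i}" for i in range(10)],
    "allele_2": [f"A2-{i}" for i in range(10)],
}

# Assumed amplification bias: reads obtained per original molecule of each allele.
reads_per_molecule = {"allele_1": 6000, "allele_2": 4000}

# Build the amplified pool as (allele, tag) reads; 100,000 reads in total.
reads = [
    (allele, tag)
    for allele, allele_tags in tags.items()
    for tag in allele_tags
    for _ in range(reads_per_molecule[allele])
]

# Count raw reads and, separately, distinct tags per allele.
raw_counts = {"allele_1": 0, "allele_2": 0}
unique_tags = {"allele_1": set(), "allele_2": set()}
for allele, tag in reads:
    raw_counts[allele] += 1
    unique_tags[allele].add(tag)
molecule_counts = {allele: len(seen) for allele, seen in unique_tags.items()}

raw_ratio = allele_ratio(raw_counts)            # 1.5, distorted by amplification bias
molecule_ratio = allele_ratio(molecule_counts)  # 1.0, the ratio in the original sample
print(raw_ratio, molecule_ratio)
```

Collapsing reads that share a tag removes the n-1 redundant copies of each original molecule, so the 10:10 starting ratio is recovered despite the uneven amplification.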
Association of a sequenced fragment with its target locus can be achieved in a number of ways. In one embodiment, a sequence of sufficient length is obtained from the targeted fragment to span the molecular barcode as well as a sufficient number of unique bases corresponding to the target sequence to allow unambiguous identification of the target locus. In another embodiment, the molecular barcoding primer that contains the randomly generated molecular barcode can also contain a locus-specific barcode (locus barcode) that identifies the target with which it is associated. This locus barcode is identical among all molecular barcoding primers for each individual target, and hence among all resulting amplicons, but different from those of all other targets. In one embodiment, the tagging method described herein can be combined with a one-sided nesting protocol.
In one embodiment, the design and generation of molecular barcoding primers can be reduced to practice as follows: the molecular barcoding primers may consist of a sequence that is not complementary to the target sequence, followed by a random molecular barcode region, followed by a target-specific sequence. The sequence 5' of the molecular barcode may be used for subsequent PCR amplification and may comprise sequences useful in converting the amplicon into a library for sequencing. The random molecular barcode sequence can be generated in a number of ways. The preferred method synthesizes the molecule tagging primer in such a way that all four bases are included in the reaction during synthesis of the barcode region. All or various combinations of bases may be specified using the IUPAC DNA ambiguity codes. In this way, the synthesized collection of molecules will contain a random mixture of sequences in the molecular barcode region. The length of the barcode region determines how many primers will contain unique barcodes. The number of unique sequences is related to the length of the barcode region as N^L, where N is the number of bases, typically 4, and L is the length of the barcode. A barcode of five bases can yield up to 1,024 unique sequences; a barcode of eight bases can yield 65,536 unique barcodes. In one embodiment, the DNA can be measured by a sequencing method in which the sequence data represent the sequence of a single molecule. This can include methods in which single molecules are sequenced directly, or methods in which single molecules are amplified to form clones that are detectable by the sequencing instrument but still represent single molecules, herein called clonal sequencing.
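The N^L relationship can be checked with a short sketch. The second function goes beyond the text: it assumes every barcode is drawn uniformly and independently (real oligonucleotide synthesis may deviate from this) and uses the standard occupancy formula to estimate how many distinct barcodes are expected among a given number of tagged molecules. Both function names are hypothetical.

```python
def barcode_space(length, alphabet_size=4):
    """Number of possible barcodes of the given length: N^L."""
    return alphabet_size ** length


def expected_distinct_barcodes(n_molecules, length, alphabet_size=4):
    """Expected count of distinct barcodes among n_molecules tagged molecules.

    Assumes each barcode is drawn uniformly and independently from the
    N^L possibilities (an idealization).
    """
    space = barcode_space(length, alphabet_size)
    return space * (1 - (1 - 1 / space) ** n_molecules)


print(barcode_space(5))  # 1024
print(barcode_space(8))  # 65536

# With 100 molecules and an 8-base barcode, nearly every molecule is expected
# to receive a distinct barcode under the uniform-draw assumption.
print(expected_distinct_barcodes(100, 8))
```

Under this idealization, the barcode length should be chosen so that the barcode space is much larger than the number of molecules expected per locus; otherwise distinct molecules will share barcodes and be undercounted.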
Exemplary methods and reagents for quantitation of amplified products
Quantitation of specific nucleic acid sequences of interest is typically performed by quantitative real-time PCR techniques such as TaqMan (Life Technologies), INVADER probes (Third Wave Technologies), and the like. Such techniques suffer from numerous shortcomings, such as a limited ability to analyze multiple sequences simultaneously in parallel (multiplexing) and the ability to provide accurate quantitative data for only a narrow range of possible amplification cycles (for example, when the plot of the logarithm of PCR amplification product versus the number of cycles is in the linear range). DNA sequencing techniques, particularly high-throughput next-generation sequencing techniques (often referred to as massively parallel sequencing techniques) such as those employed in the MiSeq (Illumina), HiSeq (Illumina), Ion Torrent (Life Technologies), Genome Analyzer IIx (Illumina), GS FLEX+ (Roche 454), and the like, can be used for quantitative measurement of the number of copies of a sequence of interest present in a sample, thereby providing quantitative information about the starting material, such as copy number or transcription level. High-throughput genetic sequencers are amenable to the use of barcoding (that is, sample tagging with distinctive nucleic acid sequences) to identify specific samples from individuals, thereby permitting the simultaneous analysis of multiple samples in a single run of the DNA sequencer. The number of times a given region of the genome in a library preparation (or other nucleic acid preparation of interest) is sequenced (the number of reads) will be proportional to the number of copies of that sequence in the genome of interest (or to the expression level, in the case of cDNA-containing preparations). However, the preparation and sequencing of genetic libraries (and similar genome-derived preparations) can introduce numerous biases that interfere with obtaining an accurate quantitative reading for the nucleic acid sequence of interest. For example, different nucleic acid sequences can amplify with different efficiencies during the nucleic acid amplification steps that take place during genetic library preparation or sample preparation.
The problem of differential amplification efficiency can be mitigated by using certain embodiments of the present invention. The invention includes various methods and compositions that relate to the use of standards included in amplification processes, which can be used to improve the accuracy of quantitation. The invention is of particular use in detecting fetal aneuploidy by analyzing cell-free fetal DNA in maternal blood, as described herein and elsewhere: U.S. Patent No. 8,008,018; U.S. Patent No. 7,332,277; PCT Published Application WO 2012/078792A2; and PCT Published Application WO 2011/146632A1, each of which is incorporated herein by reference in its entirety. Embodiments of the invention are also of use in detecting aneuploidy in embryos generated in vitro. Commercially significant aneuploidies that can be detected include aneuploidies of human chromosomes 13, 18, 21, X, and Y.
Embodiments of the invention can be used with human or non-human nucleic acids, and can be applied to nucleic acids of both animal and plant origin. Embodiments of the invention can also be used to detect and/or quantitate alleles for other genetic disorders characterized by deletions or insertions. Deletion-containing alleles can be detected in suspected carriers of the allele of interest.
One embodiment of the invention includes standards that are present in a known amount (relative or absolute). For example, consider a genetic library made from a genetic source that is disomic for chromosome 8 (containing locus A) and trisomic for chromosome 21 (containing locus B). A genetic library can be produced from this sample that will contain sequences in amounts that are a function of the number of chromosomes present in the sample, for example 200 copies of locus A and 300 copies of locus B. However, if locus A amplifies much more efficiently than locus B, then after PCR there may be 60,000 copies of the A amplicon and 30,000 copies of the B amplicon, thereby obscuring the true chromosomal copy number of the initial genomic sample when analyzed by high-throughput DNA sequencing (or other quantitative nucleic acid detection techniques). To mitigate this problem, a standard sequence for locus A is employed, wherein the standard sequence amplifies with essentially the same efficiency as locus A. Similarly, a standard sequence for locus B is created, wherein the standard sequence amplifies with essentially the same efficiency as locus B. The standard sequence for locus A and the standard sequence for locus B are added to the mixture prior to PCR (or other amplification technique). These standard sequences are present in known amounts (relative or absolute). Thus, if a 1:1 mixture of standard sequence A and standard sequence B were added to the mixture in the previous example (prior to amplification), 3,000 copies of the standard A amplicon and 1,000 copies of the standard B amplicon would be produced, showing that, under the same set of conditions, locus A amplifies 3 times more efficiently than locus B.
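The arithmetic of this example can be made explicit with a brief sketch that uses only the copy numbers given in the paragraph above. The variable names are hypothetical, and the sketch assumes, as the text does, that each standard amplifies with the same efficiency as its locus and that the two standards were added in a 1:1 ratio.

```python
# Amplicon copies after PCR, taken from the example in the text.
target_amplicons = {"A": 60000, "B": 30000}
standard_amplicons = {"A": 3000, "B": 1000}  # standards were added 1:1 before PCR

# Relative amplification efficiency of locus A versus locus B,
# read directly from the standards (equal input, unequal output).
relative_efficiency = standard_amplicons["A"] / standard_amplicons["B"]

# Dividing each target count by its own standard cancels the locus-specific
# amplification efficiency.
normalized = {
    locus: target_amplicons[locus] / standard_amplicons[locus]
    for locus in target_amplicons
}

# Corrected B:A ratio, to be compared with the true starting ratio of 300:200.
corrected_b_to_a = normalized["B"] / normalized["A"]

print(relative_efficiency)  # 3.0
print(normalized)           # {'A': 20.0, 'B': 30.0}
print(corrected_b_to_a)     # 1.5
```

The uncorrected amplicon counts suggest a B:A ratio of 0.5, whereas normalizing by the standards recovers 1.5, which matches the 3:2 ratio expected for a trisomic locus relative to a disomic one.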
In various embodiments, one or more selected regions of a genome containing a SNP (or other polymorphism) of interest can be specifically amplified and subsequently sequenced. This target-specific amplification can take place during the formation of a genetic library for sequencing. The library may contain numerous targeted regions for amplification. In some embodiments, the library contains at least 10; 100; 500; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 regions of interest. Examples of such libraries are described herein and can be found in U.S. Patent Application No. 2012/0270212, filed November 18, 2011, which is incorporated herein by reference in its entirety.
Many high-throughput DNA sequencing techniques require modification of the genetic starting material, for example ligation of universal priming sites and/or barcodes, so as to form libraries that facilitate the clonal amplification of small nucleic acid fragments prior to performing the subsequent sequencing reactions. In some embodiments, one or more standard sequences are added during the formation of a genetic library, or are added to a precursor component of the genetic library prior to amplification of the library. The standard sequences can be selected to mimic (while remaining distinguishable on the basis of nucleotide base sequence) the targeted genomic fragments to be prepared for sequencing by a high-throughput genetic sequencing technique. In one embodiment, the standard sequence can be identical to the targeted genomic fragment except for one, two, three, four to ten, or eleven to twenty nucleotides. In some embodiments, when the target genomic sequence contains a SNP, the standard sequence can be identical to the target sequence except for the nucleotide at the polymorphic base, which can be selected to be one of the four nucleotides that is not observed at that position in nature. The standard sequences can be used in highly multiplexed analysis of multiple target loci (such as polymorphic loci). Standard sequences can be added during the library formation process (prior to amplification) in known quantities (relative or absolute) so as to provide a gauge of greater accuracy in determining the amount of the target sequence of interest in the sample being analyzed. Knowledge of the known quantities of the standard sequences, used in conjunction with knowledge of the ploidy level of a sequencing library formed from a genome having a previously characterized ploidy level (for example, one known to be diploid for all autosomes), can be used to calibrate the amplification properties of each standard sequence with respect to its corresponding target sequence and to account for variations between batches of mixtures comprising multiple standard sequences.
Given that it is often necessary to analyze a large number of loci simultaneously, it is useful to produce a mixture comprising a large set of standard sequences. Embodiments of the invention include mixtures comprising multiple standard sequences. Ideally, the amount of each standard sequence in the mixture is known with a high degree of precision. However, this ideal is extremely difficult to achieve because, as a practical matter, there is a significant amount of variation in the quantity of each standard sequence in a mixture, especially in mixtures comprising a large number of different synthetic oligonucleotides. This variation has many sources, for example batch-to-batch variation in the efficiency of in vitro oligonucleotide synthesis, inaccuracies in volume measurement, and variations in pipetting. Moreover, this variation can occur between different batches of what are, in theory, the exact same set of standard sequences in the exact same amounts. Accordingly, it is of interest to calibrate each batch of standard sequences independently. Each batch of standard sequences can be calibrated against a reference genome having a known chromosomal composition. The batch of standard sequences can also be calibrated by sequencing the batch itself, with either a minimal amplification step or no amplification step in the sequencing protocol. Embodiments of the invention include calibrated mixtures of different standard sequences. Other embodiments of the invention include methods of calibrating mixtures of different standard sequences, and calibrated mixtures of different standard sequences made by the subject methods.
Various embodiments of the subject standard sequence mixtures, and of the methods for their use, can comprise at least 10; 100; 500; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 or more standard sequences, as well as various intermediate amounts. The number of standard sequences can be the same as the number of target sequences selected for analysis during the generation of a targeted library for DNA sequencing. However, in some embodiments it may be advantageous to use fewer standard sequences than the number of targeted regions in the library being constructed, so as to avoid encountering limits on the sequencing capacity of the high-throughput DNA sequencer being used. The number of standard sequences can be 50% or less, 40% or less, 30% or less, 20% or less, 10% or less, 5% or less, or 1% or less of the number of targeted regions, as well as various intermediate values. For example, if a genetic library is created using 15,000 pairs of primers targeting loci that contain specific SNPs, a suitable mixture containing 1,500 standard sequences corresponding to 1,500 of the 15,000 targeted loci can be added prior to the amplification step of library construction.
The amount of standard sequence added during library construction can vary widely between embodiments. In some embodiments, the amount of each standard sequence can be approximately the same as the predicted amount of the target sequence present in the genomic material sample used for library preparation. In other embodiments, the amount of each standard sequence can be greater or less than the predicted amount of the target sequence present in that sample. Although the initial relative amounts of the target sequence and the standard sequence are not critical to the function of the invention, the amount is preferably within the range of 100 times greater to 100 times less than the amount of the target sequence present in the genomic material sample used for library preparation. An excessive amount of standard can consume too much of the instrument's sequencing capacity in a given run of the DNA sequencer. Too low an amount of standard sequence will yield insufficient data to aid in the analysis of variation in amplification efficiency.
The standard sequences can be selected to be very similar in nucleotide base sequence to the amplified regions of interest; preferably, a standard sequence has exactly the same primer binding sites as the genomic region being analyzed, that is, the "target sequence." The standard sequence must be distinguishable from the corresponding target sequence at a given locus. For convenience, this distinguishable region of the standard sequence will be referred to as the "marker sequence." In some embodiments, the marker sequence region of the target sequence contains the polymorphic region, for example a SNP, and can be flanked on both sides by primer binding regions. The standard sequences can be selected to closely match the GC content of the corresponding target sequences. In some embodiments, the primer binding regions of the standard sequences are flanked by universal priming sites. These universal priming sites are selected to match the universal priming sites used in the genomic library for analysis. In other embodiments, the standard sequences do not have universal priming sites, and the universal priming sites are added during library generation. Standard sequences are typically provided in single-stranded form. The standard sequences are defined with respect to the corresponding target sequences and the sequence-specific reagents used to amplify the target sequences. In some embodiments, the target sequence present in the nucleic acid sample for analysis contains a polymorphism of interest, for example a SNP, a deletion, or an insertion. The standard sequence is a synthetic polynucleotide that is similar in nucleotide base sequence to the target sequence but distinguishable from it by at least one nucleotide base difference, thereby providing a mechanism for distinguishing amplicon sequences derived from the standard sequence from amplicon sequences derived from the target sequence. Standard sequences are selected so as to have essentially the same amplification properties as the corresponding target sequence when amplified with the same set of amplification reagents (for example, PCR primers). In some embodiments, the standard sequence can have the same primer binding sites as the corresponding target sequence. In other embodiments, the standard sequence can have primer binding sites different from those of the corresponding target sequence. In some embodiments, standard sequences can be selected to produce amplicons of the same length as the amplicons produced from the corresponding target sequence. In other embodiments, standard sequences can be selected to produce amplicons that differ slightly in length from the amplicons produced from the corresponding target sequence.
After the amplification reactions have been completed, the library is sequenced on a high-throughput DNA sequencer, wherein individual molecules are clonally amplified and sequenced. The number of sequence reads for each allele of the target sequence is counted, as is the number of sequence reads for the standard sequence corresponding to that target sequence. The process is also carried out for at least one other pair of target sequence and corresponding standard sequence. Consider, for example, locus A: X_a1 reads are produced for allele 1 of locus A, X_a2 reads are produced for allele 2 of locus A, and X_aC reads are produced for standard sequence A. The ratio of (X_a1 plus X_a2) to X_aC is determined for each locus of interest. As discussed earlier, the process can be performed on a reference genome, for example a genome known to be diploid for all chromosomes. The process can be repeated many times to provide a large number of read values, so that the mean number of reads and the standard deviation of the number of reads can be determined. The process is performed with a mixture comprising a large number of different standard sequences corresponding to different loci. By assuming (1) that X_a1 plus X_a2 corresponds to the known number of chromosomes, for example 2 for a normal human female genome, and (2) that the standard sequence has amplification (and detectability) properties similar to those of its corresponding natural locus, the relative quantities of the different standard sequences in a multiplex standard mixture can be determined. The calibrated multiplex standard sequence mixture can then be used to adjust for variability in amplification efficiency between the different loci in a multiplex amplification reaction.
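One way the calibration described above might be carried out is sketched below; it is an illustration under stated assumptions, not the method of the disclosure. It assumes the two conditions given in the text (the reference genome carries a known copy number at each locus, and each standard behaves like its locus), and additionally assumes that the same batch of standards is spiked into the test sample in the same proportion to genomic material as in the reference run. All read counts, locus names, and function names are hypothetical.

```python
def calibrate_standards(reference_reads, reference_copies=2):
    """Estimate the relative amount of each standard from a reference run.

    `reference_reads` maps locus -> (x_a1, x_a2, x_ac): reads for allele 1,
    allele 2, and the locus's standard sequence. The result is expressed in
    copies per reference genome, given `reference_copies` at each locus.
    """
    return {
        locus: reference_copies * x_ac / (x_a1 + x_a2)
        for locus, (x_a1, x_a2, x_ac) in reference_reads.items()
    }


def estimate_copy_number(sample_reads, standard_amounts):
    """Estimate per-locus copy number in a test sample using calibrated standards."""
    return {
        locus: standard_amounts[locus] * (x_a1 + x_a2) / x_ac
        for locus, (x_a1, x_a2, x_ac) in sample_reads.items()
    }


# Hypothetical reference run on a genome assumed diploid at both loci.
reference_reads = {"locus_A": (500, 500, 250), "locus_B": (300, 300, 600)}
standard_amounts = calibrate_standards(reference_reads)

# Hypothetical test sample spiked with the same calibrated standard mixture.
sample_reads = {"locus_A": (520, 480, 250), "locus_B": (450, 450, 600)}
copy_numbers = estimate_copy_number(sample_reads, standard_amounts)

print(standard_amounts)  # {'locus_A': 0.5, 'locus_B': 2.0}
print(copy_numbers)      # {'locus_A': 2.0, 'locus_B': 3.0}
```

In practice, the reference run would be repeated, as the text notes, so that the mean and standard deviation of each ratio can be used instead of a single set of read counts.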
Other embodiments of the invention include methods and compositions for measuring the copy number of specific genes of interest, including duplications and mutant genes characterized by large deletions that would interfere with quantitation by sequencing. Sequencing has difficulty detecting alleles that carry such deletions. Including standard sequences in the amplification process can mitigate this problem.
In one embodiment of the invention, the target sequence for analysis is a gene that has a wild-type (that is, functional) form and a mutant form characterized by a deletion. An example of such a gene is SMN1, an allele of which carries a deletion responsible for the genetic disease spinal muscular atrophy (SMA). It is of interest to detect individuals carrying the mutant form of the gene by means of high-throughput genetic sequencing technology. Applying such techniques to the detection of deletion mutations can be problematic, particularly because there is no sequence to observe in sequencing (in contrast to detecting a simple point mutation or SNP). Such embodiments employ (1) a pair of amplification primers specific for the gene of interest, wherein the amplification primers will amplify the gene of interest (or a portion thereof) and will not significantly amplify the mutant allele; (2) a standard sequence corresponding to the wild-type allele of the gene of interest (that is, the target sequence) but differing from it by at least one detectable nucleotide base; (3) a pair of amplification primers specific for a second target sequence that serves as a reference sequence; and (4) a standard sequence corresponding to the reference sequence.
In one embodiment of the invention, a method is provided for measuring the copy number of a gene of interest, wherein the gene of interest has a notable allele comprising a deletion. The method can employ amplification reagents specific for the gene of interest, for example PCR primers that are specific for the gene of interest in that they amplify at least a portion of the gene of interest, the entire gene of interest, or a region adjacent to the gene of interest, but do not amplify the deletion-containing allele of that gene. In addition, the method employs a standard sequence corresponding to the gene of interest, wherein the standard sequence differs from the gene of interest by at least one nucleotide base (so that the sequence of the standard can be readily distinguished from the naturally occurring gene of interest). Typically, the standard sequence will contain the same primer binding sites as the gene of interest so as to minimize any amplification bias between the gene of interest and its corresponding standard sequence. The reaction will also comprise amplification reagents specific for a reference sequence. The reference sequence is a sequence of known (or at least presumed known) copy number in the genome to be analyzed. The reaction further comprises a standard sequence corresponding to the reference sequence. Typically, the standard sequence corresponding to the reference sequence will contain the same primer binding sites as the reference sequence so as to minimize any amplification bias between the reference sequence and its corresponding standard sequence.
Exemplary nucleic acid samples
In some embodiments, the genetic material can be prepared and/or purified. A number of standard procedures for accomplishing this are known in the art. In some embodiments, the sample can be centrifuged to separate its layers. In some embodiments, the DNA can be isolated by filtration. In some embodiments, preparation of the DNA can involve amplification, separation, purification by chromatography, liquid-liquid separation, isolation, preferential enrichment, preferential amplification, targeted amplification, or any of a number of other techniques known in the art or described herein.
In some embodiments, a method disclosed herein can be used in situations where a very small amount of DNA is present, such as in in vitro fertilization, or in forensic situations where one or a few cells are available (typically fewer than ten cells, fewer than twenty cells, or fewer than forty cells). In these embodiments, a method disclosed herein serves to make ploidy calls from a small amount of DNA that is not contaminated by other DNA, but where ploidy calling is very difficult because of the small amount of DNA. In some embodiments, a method disclosed herein can be used in situations where the target DNA is contaminated with the DNA of another individual, for example in maternal blood in the context of prenatal diagnosis, paternity testing, or products of conception testing. Another situation in which these methods would be particularly advantageous is cancer testing, where only one or a small number of cells are present among a larger number of normal cells. The genetic measurements made as part of these methods can be performed on any sample comprising DNA or RNA, for example, but not limited to: blood, plasma, body fluids, urine, hair, tears, saliva, tissue, skin, fingernails, blastomeres, embryos, amniotic fluid, chorionic villus samples, feces, bile, lymph, cervical mucus, semen, or other cells or materials comprising nucleic acids. In one embodiment, a method disclosed herein can be run with nucleic acid detection methods such as sequencing, microarrays, qPCR, digital PCR, or other methods used to measure nucleic acids. If for some reason it is found to be desirable, the ratios of the allele count probabilities at a locus can be calculated, and the allele ratios can be used to determine ploidy state in combination with some of the methods described herein, provided the methods are compatible. In some embodiments, a method disclosed herein involves calculating, on a computer, allele ratios at a plurality of polymorphic loci from DNA measurements made on the processed samples. In some embodiments, a method disclosed herein involves calculating, on a computer, allele ratios at a plurality of polymorphic loci from DNA measurements made on the processed samples, together with any combination of the other improvements described herein.
In some embodiments, this method can be used to genotype a single cell, a small number of cells, two to five cells, six to ten cells, ten to twenty cells, twenty to fifty cells, fifty to one hundred cells, one hundred to one thousand cells, or a small amount of extracellular DNA, for example from one to ten picograms, from ten to one hundred picograms, from one hundred picograms to one nanogram, from one to ten nanograms, from ten to one hundred nanograms, or from one hundred nanograms to one microgram.
Exemplary RNA expression studies
The multiplex PCR methods of the invention can be used to increase the number of target loci that can be evaluated in gene expression profiling experiments. For example, the expression levels of thousands of genes can be monitored simultaneously to determine whether a person has a sequence (such as a polymorphism or other mutation) associated with a disease (such as cancer) or with an increased risk of disease. These methods can be used to identify sequences (such as polymorphisms or other mutations) associated with an increased or decreased risk of a disease (such as cancer) by comparing gene expression (such as the expression of particular mRNA alleles) in samples from patients with and without the disease. In addition, the effects of particular treatments, diseases, or developmental stages on gene expression can be determined. Similarly, these methods can be used to identify genes whose expression is altered in response to pathogens or other organisms by comparing gene expression in infected and uninfected cells or tissues. In these methods, the number of sequencing reads can be adjusted on the basis of the frequency of the polymorphism to be analyzed, so that if the polymorphism is present, enough reads are performed for it to be detected.
In certain embodiments, use the sample of reversed transcriptive enzyme (RT) amplification containing RNA (such as mRNA) and then use archaeal dna polymerase (PCR) to increase gained DNA (such as cDNA).RT and PCR step can perform successively or separately perform in same reaction volume.Any one in primer storehouse of the present invention may be used in this reverse transcriptase polymerase chain reaction (RT-PCR) method.In different embodiments, use the mixture of oligodeoxythymidylic acid, random primer, oligodeoxythymidylic acid and random primer or to target gene seat, there is specific primer and perform reverse transcription.In order to avoid the genomic dna that polluted of increasing, the design of primers of RT-PCR can be become make a part for a primer to hybridize to 3 ' end of an exon and another part of described primer hybridizes to 5 ' end of neighboring exons.This kind of primer annealing to the cDNA synthesized from montage mRNA, but unannealed to genomic dna.In order to detect the amplification of the DNA polluted, RT-PCR primer contains the region of at least one intron to being designed to side joint.The product increased from cDNA (intronless) is less than those amplification from genomic dna (containing intron).The difference in size of product is for detecting the existence of the DNA polluted.In certain embodiments, only have when mRNA sequence is known, the primer annealing sites of at least 300-400 the base pair of just selecting to be separated by, because may contain splice junction from the fragment of this size of eukaryotic DNA.Or, can by DNA enzymatic processing sample with the DNA polluted that degrades.
Exemplary methods for paternity testing
The multiplex PCR methods of the present invention may be used to improve the accuracy of paternity testing, because so many target loci can be analyzed at once (see, for example, U.S. Publication No. 2012/0122701, filed December 22, 2011, which is hereby incorporated by reference in its entirety). For example, the multiplex PCR methods can allow thousands of polymorphic loci (such as SNPs) to be analyzed for use in the PARENTAL SUPPORT algorithm described herein to determine whether an alleged father is the biological father of a fetus. In certain embodiments, the method involves (i) simultaneously amplifying multiple polymorphic loci on genetic material from the alleged father, including at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci, to produce a first set of amplified products; (ii) simultaneously amplifying the corresponding polymorphic loci on a mixed sample of DNA derived from a blood sample of the pregnant mother to produce a second set of amplified products, wherein the mixed sample of DNA comprises fetal DNA and maternal DNA; (iii) determining, on a computer, the probability that the alleged father is the biological father of the fetus using genotype measurements based on the first and second sets of amplified products; and (iv) using the determined probability to establish whether the alleged father is the biological father of the fetus. In different embodiments, the method further comprises simultaneously amplifying the corresponding polymorphic loci on genetic material from the mother to produce a third set of amplified products, wherein the probability that the alleged father is the biological father of the fetus is determined using genotype measurements based on the first, second, and third sets of amplified products.
Exemplary methods for embryo characterization and selection
The multiplex PCR methods of the present invention may be used to improve embryo selection for in vitro fertilization by allowing thousands of target loci to be analyzed at once (see, for example, U.S. Publication No. 2011/0092763, filed May 27, 2008; filed December 22, 2011; hereby incorporated by reference in its entirety). For example, the multiplex PCR methods can allow thousands of polymorphic loci (such as SNPs) to be analyzed for use in the PARENTAL SUPPORT algorithm described herein in order to select an embryo from a set of embryos for in vitro fertilization.
In certain embodiments, the invention provides methods for estimating the relative likelihood that each embryo in a set of embryos will develop as desired. In certain embodiments, the method involves contacting a sample from each embryo with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci to produce a reaction mixture for each embryo, wherein each sample is derived from one or more cells from the embryo. In certain embodiments, each reaction mixture is subjected to primer extension reaction conditions to produce amplified products. In certain embodiments, the method comprises determining, on a computer, one or more characteristics of at least one cell from each embryo based on the amplified products, and estimating, on a computer, the relative likelihood that each embryo will develop as desired based on the one or more characteristics of the at least one cell from each embryo. In certain embodiments, the method comprises determining at least one characteristic using an informatics-based method, such as the PARENTAL SUPPORT algorithm described herein. In certain embodiments, the characteristic comprises a ploidy state. In certain embodiments, the characteristic is selected from the group consisting of aneuploidy, euploidy, mosaicism, nullisomy, monosomy, uniparental disomy, trisomy, tetrasomy, type of aneuploidy, unmatched copy error trisomy, matched copy error trisomy, maternal origin of aneuploidy, paternal origin of aneuploidy, presence or absence of a disease-linked gene, chromosomal identity of any aneuploid chromosome, an abnormal genetic condition, a deletion or duplication, the likelihood of the characteristic, and combinations thereof. The characteristic may relate to a chromosome selected from the group consisting of chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, chromosome 17, chromosome 18, chromosome 19, chromosome 20, chromosome 21, chromosome 22, the X chromosome, the Y chromosome, and combinations thereof.
Exemplary methods for prenatal diagnosis
The multiplex PCR methods of the present invention may be used to improve prenatal diagnostic methods, such as determining the ploidy state of fetal chromosomes. Given the large number of target loci that can be amplified simultaneously, more accurate determinations can be made.
In one embodiment, the invention provides an ex vivo method for determining the ploidy state of a chromosome in a gestating fetus from genotype data measured on a mixed sample of DNA (that is, DNA from the mother of the fetus and DNA from the fetus), optionally together with genotype data measured on a sample of genetic material from the mother and possibly also from the father. The determination is made by using a joint distribution model to create a set of expected allele distributions for the different possible fetal ploidy states given the parental genotype data, comparing the expected allele distributions with the actual allele distributions measured in the mixed sample, and selecting the ploidy state whose expected allele distribution pattern most closely matches the observed allele distribution pattern. In one embodiment, the mixed sample is derived from maternal blood, maternal serum, or maternal plasma. In one embodiment, the mixed sample of DNA may be preferentially enriched at target loci (for example, multiple polymorphic loci). In one embodiment, the preferential enrichment is performed in a way that minimizes allelic bias. In one embodiment, the invention relates to a DNA composition that has been preferentially enriched at multiple loci such that allelic bias is low. In one embodiment, the allele distributions are measured by sequencing DNA from the mixed sample. In one embodiment, the joint distribution model assumes that the alleles are distributed binomially. In one embodiment, the set of expected joint allele distributions is created for genetically linked loci while taking into account existing recombination frequencies from various sources, for example data from the International HapMap Consortium.
In one embodiment, the invention provides methods for non-invasive prenatal diagnosis (NPD), specifically for determining the aneuploidy state of a fetus by observing allele measurements at multiple polymorphic loci in genotype data measured on DNA mixtures, where certain allele measurements indicate an aneuploid fetus and other allele measurements indicate a euploid fetus. In one embodiment, the genotype data is measured by sequencing DNA mixtures derived from maternal plasma. In one embodiment, the DNA sample may be preferentially enriched for DNA molecules corresponding to the multiple loci whose allele distributions are to be calculated. In one embodiment, a sample of DNA comprising only or almost only genetic material from the mother, and possibly also a sample of DNA comprising only or almost only genetic material from the father, is measured. In one embodiment, the genetic measurements of one or both parents, together with the estimated fetal fraction, are used to create multiple expected allele distributions corresponding to different possible underlying genetic states of the fetus; the expected allele distributions may be termed hypotheses. In one embodiment, the maternal genetic data is not determined by measuring genetic material that is exclusively or almost exclusively maternal in nature; rather, it is estimated from genetic measurements made on maternal plasma, which contains a mixture of maternal and fetal DNA. In certain embodiments, the hypotheses may comprise the ploidy of the fetus at one or more chromosomes, which segments of which chromosomes the fetus inherited from which parent, and combinations thereof. In certain embodiments, the ploidy state of the fetus is determined by comparing the observed allele measurements with the different hypotheses, at least some of which correspond to different ploidy states, and selecting the ploidy state that corresponds to the hypothesis most likely to be correct given the observed allele measurements. In one embodiment, this method uses allele measurement data from some or all of the measured SNPs, regardless of whether the loci are homozygous or heterozygous, and therefore does not rely on alleles at heterozygous loci only. This method may not be appropriate where the genetic data pertains to only one polymorphic locus. This method is particularly advantageous when the genetic data comprises data for more than ten or more than twenty polymorphic loci on the target chromosome. It is especially advantageous when the genetic data comprises data for more than 50, more than 100, or more than 200 polymorphic loci on the target chromosome. In certain embodiments, the genetic data may comprise data for more than 500, more than 1,000, more than 2,000, or more than 5,000 polymorphic loci on the target chromosome.
In one embodiment, the methods disclosed herein yield a quantitative measure of the number of independent observations of each allele at a polymorphic locus. This contrasts with most methods, such as microarrays or qualitative PCR, which provide information about the ratio of two alleles but do not quantify the number of independent observations of either allele. With other methods that do provide quantitative information about the number of independent observations, only the ratio is used in the ploidy calculation, and the quantitative information itself is not put to use. To illustrate the importance of retaining information about the number of independent observations, consider a sample locus with two alleles, A and B. In a first experiment, 20 A alleles and 20 B alleles are observed; in a second experiment, 200 A alleles and 200 B alleles are observed. In both experiments the ratio A/(A+B) equals 0.5, but the second experiment conveys more information than the first about the certainty of the frequency of the A or B allele. Some methods by others involve averaging or summing the allele ratios (channel ratios, that is, x_i/y_i) of individual alleles and analyzing this ratio, either by comparing it with a reference chromosome or by using a rule about how this ratio is expected to behave in particular situations. Such methods involve no allele weighting; they assume that the amount of PCR product for each allele can be made approximately equal and that all alleles behave the same way. Such methods have a number of disadvantages and, more importantly, preclude the use of a number of improvements described elsewhere in this disclosure.
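The point about the two experiments can be made concrete with a short sketch. The code below is only an illustration, not the method of this disclosure: it uses the Wilson score interval (an assumed choice of interval, with an assumed z of 1.96 for roughly 95% coverage) to show that the same ratio of 0.5 carries a much tighter uncertainty when it rests on 400 observations rather than 40. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(a_count, b_count, z=1.96):
    """Wilson score interval for the A-allele fraction A/(A+B).

    z=1.96 corresponds to roughly 95% coverage (an illustrative choice).
    """
    n = a_count + b_count
    if n <= 0:
        raise ValueError("need at least one observation")
    p = a_count / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half

first = wilson_interval(20, 20)     # first experiment: 20 A, 20 B
second = wilson_interval(200, 200)  # second experiment: 200 A, 200 B

for name, (lo, hi) in (("first", first), ("second", second)):
    print(f"{name}: interval width = {hi - lo:.3f}")
```

Both intervals are centered on 0.5, but the second is considerably narrower, which is the information a ratio-only method discards.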
In one embodiment, the methods disclosed herein explicitly model the allele frequency distributions expected in disomy as well as the multiple allele frequency distributions that may be expected in trisomy resulting from nondisjunction during meiosis I, nondisjunction during meiosis II, and/or nondisjunction during mitosis early in fetal development. To illustrate why this matters, imagine a case with no crossovers: nondisjunction during meiosis I would result in a trisomy in which two different homologs were inherited from one parent, whereas nondisjunction during meiosis II or during mitosis early in fetal development would result in two copies of the same homolog from one parent. Each scenario would result in different expected allele frequencies at each polymorphic locus and, because of genetic linkage, at all loci considered jointly. Crossovers, which exchange genetic material between homologs, make the inheritance pattern more complex; in one embodiment, the present method accommodates this by using recombination rate information in addition to the physical distance between loci. In one embodiment, to better distinguish meiosis I nondisjunction from meiosis II or mitotic nondisjunction, the present method incorporates into the model a crossover probability that increases with distance from the centromere. Meiosis II and mitotic nondisjunction can be distinguished by the fact that mitotic nondisjunction typically results in identical or nearly identical copies of one homolog, whereas the two homologs present after a meiosis II nondisjunction event often differ because of one or more crossovers during gametogenesis.
In certain embodiments, the methods disclosed herein involve comparing the observed allele measurements with theoretical hypotheses corresponding to possible fetal genetic aneuploidy, and do not involve a step of quantifying the ratio of alleles at heterozygous loci. When the number of loci is below about 20, a ploidy determination made using a method that quantifies the ratio of alleles at heterozygous loci and one made using a method that compares the observed allele measurements with allele distribution hypotheses corresponding to possible fetal genetic states may give similar results. When the number of loci exceeds 50, however, the two methods are likely to give significantly different results; when the number of loci exceeds 400, 1,000, or 2,000, the two methods are likely to give increasingly different results. These differences arise because a method that quantifies the ratio of alleles at heterozygous loci, without independently measuring the magnitude of each allele, and then aggregates or averages the ratios precludes the use of techniques such as a joint distribution model, linkage analysis, a binomial distribution model, and/or other advanced statistical techniques. A method that compares the observed allele measurements with theoretical allele distribution hypotheses corresponding to possible fetal genetic states can use these techniques, which can substantially increase the accuracy of the determination.
In one embodiment, the methods disclosed herein use a joint distribution model to determine whether the distribution of observed allele measurements indicates a euploid or an aneuploid fetus. The use of a joint distribution model differs from, and is a significant improvement over, methods that determine heterozygosity rates by treating polymorphic loci independently, in that the resulting determinations are significantly more accurate. Without being bound by any particular theory, it is believed that one reason for the higher accuracy is that the joint distribution model takes into account the linkage between SNPs and the likelihood that crossovers occurred during the meiosis that gave rise to the gametes that formed the embryo that grew into the fetus. The purpose of using linkage when creating the expected distribution of allele measurements for one or more hypotheses is that it yields expected allele measurement distributions that correspond to reality considerably better than those created without linkage. For example, imagine two SNPs, 1 and 2, located close to one another, where the mother is A at SNP 1 and A at SNP 2 on one homolog, and B at SNP 1 and B at SNP 2 on the second homolog. If the father is A at both SNPs on both homologs, and a B is measured at SNP 1 of the fetus, this indicates that the second homolog was inherited by the fetus, and therefore that a B is much more likely to be present at SNP 2 of the fetus. A model that takes linkage into account would predict this, whereas a model that does not would not. Alternatively, if the mother is AB at SNP 1 and AB at the nearby SNP 2, two hypotheses corresponding to maternal trisomy at that location can be used: one involving a matched copy error (nondisjunction in meiosis II or in mitosis early in fetal development) and one involving an unmatched copy error (nondisjunction in meiosis I). In the case of a matched copy error trisomy, if the fetus inherited AA from the mother at SNP 1, then the fetus is much more likely to have inherited either AA or BB, but not AB, from the mother at SNP 2. In the case of an unmatched copy error, the fetus would inherit AB from the mother at both SNPs. The allele distribution hypotheses produced by a ploidy calling method that takes linkage into account would make these predictions, and would therefore correspond to the actual allele measurements to a considerably greater extent than those of a ploidy calling method that does not take linkage into account. Note that a linkage approach is not possible when using a method that relies on calculating allele ratios and aggregating those ratios.
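The two-SNP example above can be worked through numerically. The following is a minimal sketch under stated assumptions: a single recombination fraction between SNP 1 and SNP 2 (the parameter `recomb_fraction` and the value 0.01 are hypothetical), equal transmission probability for the two maternal homologs, and a father who is AA at both SNPs. It enumerates the maternal transmissions and contrasts a linkage-aware model with one that treats the SNPs independently.

```python
from itertools import product

def prob_b_at_snp2_given_b_at_snp1(recomb_fraction, use_linkage=True):
    """P(fetus carries maternal B at SNP 2 | maternal B observed at SNP 1).

    Maternal homologs are (A, A) and (B, B); the father is AA at both SNPs,
    so any B in the fetus must come from the mother's second homolog.
    """
    joint_b1_b2 = 0.0
    marginal_b1 = 0.0
    for homolog_at_snp1, switched in product((0, 1), (False, True)):
        p_start = 0.5  # either maternal homolog is transmitted at SNP 1
        if use_linkage:
            p_switch = recomb_fraction if switched else 1.0 - recomb_fraction
        else:
            p_switch = 0.5  # SNP 2 treated as independent of SNP 1
        p = p_start * p_switch
        homolog_at_snp2 = homolog_at_snp1 ^ int(switched)
        if homolog_at_snp1 == 1:          # homolog 1 carries B at SNP 1
            marginal_b1 += p
            if homolog_at_snp2 == 1:      # and B at SNP 2
                joint_b1_b2 += p
    return joint_b1_b2 / marginal_b1

linked = prob_b_at_snp2_given_b_at_snp1(0.01)
unlinked = prob_b_at_snp2_given_b_at_snp1(0.01, use_linkage=False)
print(linked, unlinked)
```

With tight linkage the conditional probability is one minus the recombination fraction, whereas the independent model stays at one half regardless of what was seen at SNP 1.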
It is believed that one reason ploidy determinations made using a method that compares the observed allele measurements with theoretical hypotheses corresponding to possible fetal genetic states are more accurate is that, when sequencing is used to measure the alleles, this method can glean more information from data on alleles where the total number of reads is low than other methods can; for example, a method that relies on calculating and aggregating allele ratios would produce disproportionately weighted random noise. For example, imagine a case in which sequencing is used to measure the alleles and there is a set of loci for which only five sequence reads were detected per locus. In one embodiment, for each allele, the data may be compared with the hypothesized allele distribution and weighted according to the number of sequence reads; the data from these measurements would therefore be appropriately weighted and incorporated into the overall determination. This contrasts with a method that quantifies the ratio of alleles at heterozygous loci, because such a method could only calculate ratios of 0%, 20%, 40%, 60%, 80%, or 100% as the possible allele ratios, none of which may be close to the expected allele ratio. In this latter case, the calculated allele ratios would either have to be discarded because of insufficient reads, or else would carry disproportionate weight and introduce random noise into the determination, thereby reducing its accuracy. In one embodiment, the individual allele measurements may be treated as independent measurements, where the relationship between measurements made on alleles at the same locus is no different from the relationship between measurements made on alleles at different loci.
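The five-read case can be sketched as follows. This is an illustrative example only: the read counts and the two expected A fractions (0.50 and 0.65) are hypothetical stand-ins for two hypotheses, and a plain binomial likelihood is assumed. It shows that per-locus ratios are coarse with five reads, while summing binomial log-likelihoods across loci lets each locus contribute in proportion to its read count.

```python
import math

def binom_log_pmf(k, n, p):
    """Log of the binomial probability of k A-reads out of n reads."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1.0 - p)

def log_likelihood(loci, expected_a_fraction):
    """Sum per-locus log-likelihoods; each locus is weighted by its read count
    through the binomial exponents rather than by a single ratio."""
    return sum(binom_log_pmf(a, a + b, expected_a_fraction) for a, b in loci)

# With five reads, a per-locus ratio can only take these values.
possible_ratios = [k / 5 for k in range(6)]

# Hypothetical (A, B) read counts at 20 loci with five reads each.
loci = [(3, 2)] * 10 + [(2, 3)] * 10

# Two hypothetical expected A fractions standing in for two hypotheses.
ll_h1 = log_likelihood(loci, 0.50)
ll_h2 = log_likelihood(loci, 0.65)
print(possible_ratios)
print("hypothesis 1 fits better:", ll_h1 > ll_h2)
```

No single locus shows a ratio of 0.5, yet the pooled likelihood still favors the hypothesis with an expected A fraction of 0.5, without discarding any low-depth locus.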
In one embodiment, the methods disclosed herein determine whether the distribution of observed allele measurements indicates a euploid or an aneuploid fetus without comparing any metric against allele measurements observed on a reference chromosome that is expected to be disomic (termed the RC method). This is a significant improvement over methods, such as those using shotgun sequencing, that detect aneuploidy by evaluating the proportion of randomly sequenced fragments from the suspect chromosome relative to one or more presumed disomic reference chromosomes. The RC method gives incorrect results if the presumed disomic reference chromosome is not in fact disomic. This can occur where the aneuploidy is more substantial than a single-chromosome trisomy, or where the fetus is triploid and all autosomes are trisomic. In the case of a female triploid (69,XXX) fetus, there are in fact no disomic autosomes at all. The methods described herein do not require a reference chromosome and can correctly identify trisomic autosomes in a female triploid fetus. For each chromosome, hypothesis, child fraction, and noise level, a joint distribution model may be fit without any of the following: reference chromosome data, an overall child fraction estimate, or a fixed reference hypothesis.
In one embodiment, the methods disclosed herein demonstrate how observing allele distributions at polymorphic loci can be used to determine the ploidy state of a fetus with greater accuracy than methods in the prior art. In one embodiment, the method uses targeted sequencing to obtain mixed maternal-fetal genotypes, and optionally the mother's and/or father's genotypes, at multiple SNPs in order first to establish the expected allele frequency distributions under the different hypotheses, and then to observe the quantitative allele information obtained on the maternal-fetal mixture and evaluate which hypothesis fits the data best; the genetic state corresponding to the hypothesis with the best fit to the data is called as the correct genetic state. In one embodiment, the methods disclosed herein also use the degree of fit to generate a confidence that the called genetic state is the correct genetic state. In one embodiment, the methods disclosed herein use algorithms that analyze the distribution of alleles found at loci with different parental contexts and compare the observed allele distributions with the expected allele distributions for different ploidy states for the different parental contexts (different parental genotype patterns). This differs from, and is an improvement over, methods that do not use approaches able to estimate the number of independent instances of each allele at each locus in a mixed maternal-fetal sample. In one embodiment, the methods disclosed herein use the observed allele distributions measured at loci where the mother is heterozygous to determine whether the distribution of observed allele measurements indicates a euploid or an aneuploid fetus. This differs from, and is an improvement over, methods that do not use observed allele distributions at loci where the mother is heterozygous because, in cases where the DNA is not preferentially enriched, or is preferentially enriched for loci that are not known to be highly informative for that particular target individual, it allows about twice as much genetic measurement data from a set of sequence data to be used in the ploidy determination, resulting in a more accurate determination.
In one embodiment, the methods disclosed herein use a joint distribution model that assumes that the allele frequencies at each locus are multinomial (and thus binomial when the SNPs are biallelic) in nature. In certain embodiments, the joint distribution model uses a beta-binomial distribution. When a measurement technique such as sequencing provides a quantitative measure for each allele present at each locus, a binomial model can be applied to each locus, and both the underlying allele frequency and the confidence in that frequency can be ascertained. With methods known in the art that generate ploidy calls from allele ratios, or methods in which the quantitative allele information is discarded, the certainty in the observed ratio cannot be ascertained. The present method differs from, and is an improvement over, methods that calculate allele ratios and aggregate those ratios to make a ploidy call, because any method that calculates an allele ratio at a particular locus and then aggregates those ratios necessarily assumes that the measured intensities or counts indicating the amount of DNA from any given allele or locus are distributed in a Gaussian fashion. The methods disclosed herein do not involve calculating allele ratios. In certain embodiments, the methods disclosed herein may incorporate into the model the number of observations of each allele at multiple loci. In certain embodiments, the methods disclosed herein may involve calculating the expected distributions themselves, which allows the use of a joint binomial distribution model that may be more accurate than any model that assumes a Gaussian distribution of allele measurements. The likelihood that the binomial distribution model is significantly more accurate than the Gaussian distribution model increases with the number of loci. For example, when fewer than 20 loci are interrogated, the likelihood that the binomial distribution model is significantly better is low. However, when more than 100, or especially more than 400, more than 1,000, or more than 2,000 loci are used, the binomial distribution model is very likely to be significantly more accurate than the Gaussian distribution model, and thereby to yield a more accurate ploidy determination. The likelihood that the binomial distribution model is significantly more accurate than the Gaussian distribution model also increases with the number of observations at each locus. For example, when fewer than 10 distinct sequences are observed at each locus, the likelihood that the binomial distribution model is significantly better is low. However, when more than 50, or especially more than 100, more than 200, or more than 300 sequence reads are used per locus, the binomial distribution model is very likely to be significantly more accurate than the Gaussian distribution model, and thereby to yield a more accurate ploidy determination.
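As an illustration of the beta-binomial option mentioned above, the sketch below evaluates a beta-binomial probability mass function for the number of A reads at one locus and compares its spread with a plain binomial of the same mean. The shape parameters (`alpha`, `beta`) and the read depth are hypothetical values chosen for the example; the text does not specify how they would be set.

```python
import math

def log_beta(a, b):
    """Log of the Beta function, via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binom_pmf(k, n, alpha, beta):
    """P(k A-reads out of n) under a beta-binomial with shape (alpha, beta)."""
    log_p = (math.log(math.comb(n, k))
             + log_beta(k + alpha, n - k + beta)
             - log_beta(alpha, beta))
    return math.exp(log_p)

def binom_pmf(k, n, p):
    """P(k A-reads out of n) under a plain binomial with A fraction p."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def variance(pmf_values):
    """Variance of a distribution given as a list indexed by count k."""
    mean = sum(k * q for k, q in enumerate(pmf_values))
    return sum((k - mean) ** 2 * q for k, q in enumerate(pmf_values))

n = 50                      # hypothetical read depth at one locus
alpha, beta = 20.0, 20.0    # hypothetical shape parameters, mean 0.5
bb = [beta_binom_pmf(k, n, alpha, beta) for k in range(n + 1)]
bi = [binom_pmf(k, n, 0.5) for k in range(n + 1)]
print("beta-binomial is wider:", variance(bb) > variance(bi))
```

The beta-binomial keeps the count-based structure of the binomial while allowing extra locus-to-locus variability, which is one plausible reason to prefer it for sequencing counts.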
In one embodiment, the methods disclosed herein use sequencing to measure the number of instances of each allele at each locus in a DNA sample. Each sequencing read may be mapped to a specific locus and treated as a binary sequence read; alternatively, the probability of the identity of the read and/or of the mapping may be incorporated as part of the sequence read, resulting in a probabilistic sequence read, that is, the probable whole or fractional number of sequence reads that map to a given locus. Using the binary counts or the probability of counts, it is possible to use a binomial distribution for each set of measurements, which allows a confidence interval to be calculated around the number of counts. This ability to use the binomial distribution allows more accurate ploidy estimates and more precise confidence intervals to be calculated. This differs from, and is an improvement over, methods that use intensities to measure the amount of an allele present, for example methods that use microarrays or methods that use a fluorescence reader to measure the intensity of fluorescently tagged DNA in electrophoretic bands.
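The distinction between binary and probabilistic sequence reads can be sketched as below. The input format (tuples of locus, allele, and a per-read probability that the base call and mapping are correct) and the function name `allele_counts` are assumptions made for this example, not a format defined in the text.

```python
from collections import defaultdict

def allele_counts(reads, probabilistic=True):
    """Accumulate per-(locus, allele) counts from (locus, allele, prob) tuples.

    prob is an assumed per-read probability that the base call and mapping
    are correct. With probabilistic=False every read counts as 1 (binary).
    """
    counts = defaultdict(float)
    for locus, allele, prob in reads:
        if not 0.0 <= prob <= 1.0:
            raise ValueError("prob must be in [0, 1]")
        counts[(locus, allele)] += prob if probabilistic else 1.0
    return dict(counts)

# Hypothetical reads for illustration.
reads = [
    ("snp1", "A", 0.99),
    ("snp1", "A", 0.90),
    ("snp1", "B", 0.60),
    ("snp2", "B", 0.95),
]

binary = allele_counts(reads, probabilistic=False)
fractional = allele_counts(reads)
print(binary[("snp1", "A")], fractional[("snp1", "A")])
```

Either kind of count can then feed a binomial model per locus; the fractional version simply down-weights reads whose identity or mapping is uncertain.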
In one embodiment, the methods disclosed herein use aspects of the present data set to determine parameters for the expected allele frequency distribution for that data set. This is an improvement over methods that use a training set of data or prior data sets to set the parameters for the present expected allele frequency distributions, or possibly expected allele ratios. This is because different sets of conditions are involved in the collection and measurement of every genetic sample, and thus a method that uses data from the present data set to determine the parameters of the joint distribution model to be used in the ploidy determination for that sample will tend to be more accurate.
In one embodiment, the methods disclosed herein use a maximum likelihood technique to determine whether the distribution of observed allele measurements indicates a euploid or an aneuploid fetus. The use of a maximum likelihood technique differs from, and is a significant improvement over, methods that use a single hypothesis rejection technique, in that the resulting determinations are significantly more accurate. One reason is that single hypothesis rejection techniques set cutoff thresholds based on only one measurement distribution rather than two, meaning that the thresholds are usually not optimal. Another reason is that the maximum likelihood technique allows the cutoff threshold to be optimized for each individual sample, rather than a single cutoff threshold being determined for use with all samples regardless of the particular characteristics of each individual sample. A further reason is that the use of a maximum likelihood technique allows a confidence to be calculated for each ploidy call. The ability to calculate a confidence for each call allows a practitioner to know which calls are accurate and which are more likely to be wrong. In certain embodiments, a wide variety of methods may be combined with a maximum likelihood estimation technique to enhance the accuracy of the ploidy calls. In one embodiment, the maximum likelihood technique may be used in combination with the method described in U.S. Patent 7,888,017. In one embodiment, the maximum likelihood technique may be used in combination with a method that uses targeted PCR amplification to amplify the DNA in the mixed sample, followed by sequencing and analysis using a read counting method, such as that used by TANDEM DIAGNOSTICS, as presented at the International Congress of Human Genetics 2011 in Montreal in October 2011. In one embodiment, the methods disclosed herein involve estimating the fetal fraction of DNA in the mixed sample and using that estimate to calculate both the ploidy call and the confidence of the ploidy call. Note that this is both different and distinct from methods that use the estimated fetal fraction as a screen for sufficient fetal fraction, followed by a ploidy call made using a single hypothesis rejection technique that neither takes the fetal fraction into account nor produces a confidence calculation for the call.
In an embodiment, a method disclosed herein takes into account the tendency of the data to be noisy and to contain errors by attaching a probability to each measurement. Using maximum likelihood techniques to choose the correct hypothesis from the set of hypotheses formed from the measurement data with attached probabilistic estimates makes it more likely that incorrect measurements will be discounted and that correct measurements will be used in the calculations that lead to the ploidy call. To be more precise, this method systematically reduces the influence of incorrectly measured data on the ploidy determination. This improves on methods in which all data are assumed to be equally correct, or in which outlying data are arbitrarily excluded from the calculations leading to a ploidy call. Existing methods that use channel ratio measurements claim to extend to multiple SNPs by averaging the individual SNP channel ratios. Failing to weight individual SNPs by their expected measurement variance, based on SNP quality and observed depth of read, reduces the accuracy of the resulting statistic and significantly reduces the accuracy of the ploidy call, especially in borderline cases.
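The weighting point made above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the measurement variance of each SNP's allele ratio is binomial, p(1 - p)/depth, and it ignores any separate SNP-quality term. The function names and inputs are hypothetical and are not taken from this disclosure.

```python
def weighted_allele_ratio(ratios, depths, expected_ratio):
    """Inverse-variance weighted mean of per-SNP allele ratios.

    Assumption: each SNP's variance is binomial, p * (1 - p) / depth,
    where p is the expected ratio and depth is the observed read depth.
    """
    if not 0.0 < expected_ratio < 1.0:
        raise ValueError("expected_ratio must lie strictly between 0 and 1")
    variances = [expected_ratio * (1.0 - expected_ratio) / d for d in depths]
    weights = [1.0 / v for v in variances]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def unweighted_allele_ratio(ratios):
    """Plain average, which treats every SNP as equally reliable."""
    return sum(ratios) / len(ratios)
```

With one shallow SNP (depth 10, ratio 0.5) and one deep SNP (depth 1000, ratio 0.1), the plain average is pulled far toward the shallow, noisy SNP, while the weighted estimate stays close to the deep one.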
In an embodiment, a method disclosed herein does not presuppose knowledge of which SNPs or other polymorphic loci are heterozygous in the fetus. This allows a ploidy call to be made in cases where paternal genotypic information is not available. This improves on methods in which knowledge of which SNPs are heterozygous must be available in advance in order to select appropriate loci to target, or to interpret the genetic measurements made on the mixed fetal/maternal DNA sample.
The methods described herein are particularly advantageous when used on samples where only a small amount of DNA is available, or where the percentage of fetal DNA is low. This is due to the correspondingly higher allele dropout rate that occurs when only a small amount of DNA is available, and/or the correspondingly higher fetal allele dropout rate when the percentage of fetal DNA is low in a mixed sample of fetal and maternal DNA. A high allele dropout rate, meaning that a large percentage of the alleles of the target individual were not measured, results in inaccurate fetal fraction calculations and inaccurate ploidy determinations. Because a method disclosed herein may use a joint distribution model that takes into account the linkage in inheritance patterns between SNPs, significantly more accurate ploidy determinations may be made. The methods described herein allow accurate ploidy determinations to be made when the percentage of DNA molecules in the mixture that are fetal is less than 40%, less than 30%, less than 20%, less than 10%, less than 8%, and even less than 6%.
In an embodiment, it is possible to determine the ploidy state of an individual based on measurements made when that individual's DNA is mixed with the DNA of a related individual. In an embodiment, the mixture of DNA is the free-floating DNA found in maternal plasma, which may include DNA from the mother, with known karyotype and known genotype, mixed with DNA of the fetus, with unknown karyotype and unknown genotype. It is possible to use the known genotypic information from one or both parents to predict a plurality of potential genetic states of the DNA in the mixed sample for different ploidy states, different chromosome contributions from each parent to the fetus, and, optionally, different fetal DNA fractions in the mixture. Each potential composition may be referred to as a hypothesis. The ploidy state of the fetus can then be determined by looking at the actual measurements and determining which potential compositions are most likely given the observed data.
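The idea of predicting the mixture under each hypothesis and then picking the hypothesis that best explains the observed data can be sketched as follows. This is a simplified illustration under stated assumptions, not the model of this disclosure: only loci where the mother is AA and the father is BB are considered, the mother is assumed disomic, reads are modelled as noise-free binomial draws, the fetal fraction is treated as known, and all hypotheses receive a uniform prior. All function names are hypothetical.

```python
import math


def expected_b_fraction(fetal_fraction, maternal_copies, paternal_copies):
    """Expected fraction of B reads at an AA|BB locus in the mixed sample.

    Assumptions: the mother is AA and disomic, the father is BB, so every
    paternally inherited fetal chromosome carries B and every other
    chromosome in the mixture carries A.
    """
    fetal_total = maternal_copies + paternal_copies
    mixture_total = 2.0 * (1.0 - fetal_fraction) + fetal_fraction * fetal_total
    return fetal_fraction * paternal_copies / mixture_total


def most_likely_hypothesis(snp_counts, fetal_fraction, hypotheses):
    """Return (best hypothesis, posterior per hypothesis), uniform prior.

    snp_counts is a list of (b_reads, total_reads) pairs. The binomial
    coefficient is dropped because it is identical for every hypothesis.
    Hypotheses must have at least one paternal copy so that log(p) is defined.
    """
    log_likelihoods = {}
    for m, f in hypotheses:
        p = expected_b_fraction(fetal_fraction, m, f)
        log_likelihoods[(m, f)] = sum(
            b * math.log(p) + (n - b) * math.log(1.0 - p) for b, n in snp_counts
        )
    # Subtract the peak before exponentiating to avoid numerical underflow.
    peak = max(log_likelihoods.values())
    unnormalized = {h: math.exp(ll - peak) for h, ll in log_likelihoods.items()}
    total = sum(unnormalized.values())
    posterior = {h: v / total for h, v in unnormalized.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior
```

The posterior value attached to the winning hypothesis plays the role of the per-call confidence; in this toy model it depends on the fetal fraction and the read depth of the particular sample rather than on a fixed, sample-independent cutoff.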
Further discussion of the points above may be found elsewhere in this document.
Non-invasive prenatal diagnosis (NPD)
The process of non-invasive prenatal diagnosis involves a number of steps. Some of the steps may include: (1) obtaining the genetic material from the fetus; (2) enriching, ex vivo, the genetic material of the fetus that may be in a mixed sample; (3) amplifying the genetic material, ex vivo; (4) preferentially enriching specific loci in the genetic material, ex vivo; (5) measuring the genetic material, ex vivo; and (6) analyzing the genotypic data, on a computer and ex vivo. Methods to practice these six and other relevant steps are described herein. At least some of the method steps are not directly applied to the body. In an embodiment, the present invention relates to methods of treatment and diagnosis applied to tissue and other biological materials isolated and separated from the body. At least some of the method steps are executed on a computer.
Some embodiments of the present invention allow a clinician to determine the genetic state of a fetus gestating in a mother in a non-invasive manner, such that the health of the baby is not put at risk by the collection of fetal genetic material and the mother is not required to undergo an invasive procedure. Moreover, in certain aspects, the present invention allows the fetal genetic state to be determined with high accuracy, significantly greater than that of, for example, non-invasive maternal serum analyte based screens, such as the triple test, that are in wide use in prenatal care.
The high accuracy of the methods disclosed herein is a result of the informatics approach to the analysis of genotype data described herein. Modern technological advances have made it possible to measure large amounts of genetic information from a genetic sample using methods such as high-throughput sequencing and genotyping arrays. The methods disclosed herein allow a clinician to take greater advantage of the large amounts of data available and to make a more accurate diagnosis of the fetal genetic state. The details of a number of embodiments are given below. Different embodiments may involve different combinations of the aforementioned steps. Various combinations of the different embodiments of the different steps may be used interchangeably.
In an embodiment, a blood sample is taken from a pregnant mother, and the free-floating DNA in the plasma of the mother's blood, which contains a mixture of DNA of maternal origin and DNA of fetal origin, is isolated and used to determine the ploidy state of the fetus. In an embodiment, a method disclosed herein involves preferential enrichment of those DNA sequences in a mixture of DNA that correspond to polymorphic alleles, in a way that keeps the allele ratios and/or allele distributions mostly consistent upon enrichment. In an embodiment, a method disclosed herein involves highly efficient targeted PCR-based amplification such that a high percentage of the resulting molecules correspond to targeted loci. In an embodiment, a method disclosed herein involves sequencing a mixture of DNA that contains both DNA of maternal origin and DNA of fetal origin. In an embodiment, a method disclosed herein involves using measured allele distributions to determine the ploidy state of a fetus gestating in a mother. In an embodiment, a method disclosed herein involves reporting the determined ploidy state to a clinician. In an embodiment, a method disclosed herein involves taking a clinical action, for example, performing follow-up invasive testing such as chorionic villus sampling or amniocentesis, preparing for the birth of a trisomic individual, or an elective termination of a trisomic fetus.
This application makes reference to U.S. Utility Application No. 11/603,406, filed November 28, 2006 (U.S. Publication No. 20070184467); U.S. Utility Application No. 12/076,348, filed March 17, 2008 (U.S. Publication No. 20080243398); PCT Application No. PCT/US09/52730, filed August 4, 2009 (PCT Publication No. WO/2010/017214); PCT Application No. PCT/US10/050824, filed September 30, 2010 (PCT Publication No. WO/2011/041485); U.S. Utility Application No. 13/110,685, filed May 18, 2011; and PCT Application No. PCT/12/58578, filed October 3, 2012, each of which is incorporated herein by reference in its entirety. Some of the vocabulary used in this filing may have its antecedents in these references. Some of the concepts described herein may be better understood in light of the concepts found in these references.
Screening Maternal Blood Comprising Free-Floating Fetal DNA
The methods described herein may be used to help determine the genotype of a child, fetus, or other target individual where the genetic material of the target is found in the presence of a quantity of other genetic material. In some embodiments, the genotype may refer to the ploidy state of one or a plurality of chromosomes, it may refer to one or a plurality of disease-linked alleles, or some combination thereof. In the present invention, the discussion focuses on determining the genetic state of a fetus where fetal DNA is found in maternal blood, but this example is not meant to limit the possible contexts to which this method may be applied. In addition, the method may be applicable in cases where the amount of target DNA is in any proportion to the non-target DNA; for example, the target DNA could make up anywhere between 0.000001% and 99.999999% of the DNA present. In addition, the non-target DNA does not necessarily need to be from one individual, or even from a related individual, as long as genetic data from some or all of the relevant non-target individuals is known. In an embodiment, a method disclosed herein can be used to determine genotypic data of a fetus from maternal blood that contains fetal DNA. It may also be used where there are multiple fetuses in the uterus of a pregnant woman, or where other contaminating DNA may be present in the sample, for example from other already born siblings.
This technique may make use of the phenomenon of fetal blood cells gaining access to maternal circulation through the placental villi. Ordinarily, only a very small number of fetal cells enter the maternal circulation in this fashion (not enough to produce a positive Kleihauer-Betke test for fetal-maternal hemorrhage). The fetal cells can be sorted out and analyzed by a variety of techniques to look for particular DNA sequences, but without the risks that invasive procedures inherently have. This technique may also make use of the phenomenon of free-floating fetal DNA gaining access to maternal circulation by DNA release following apoptosis of placental tissue, where the placental tissue in question contains DNA of the same genotype as the fetus. The free-floating DNA found in maternal plasma has been shown to contain fetal DNA in proportions as high as 30-40%.
In an embodiment, blood may be drawn from a pregnant woman. Research has shown that maternal blood may contain a small amount of free-floating DNA from the fetus, in addition to free-floating DNA of maternal origin. In addition, there may also be enucleated fetal blood cells comprising DNA of fetal origin, alongside the many blood cells of maternal origin, which typically do not contain nuclear DNA. Many methods are known in the art to isolate fetal DNA or to create fractions enriched in fetal DNA. For example, chromatography has been shown to create certain fractions that are enriched in fetal DNA.
Once a sample of maternal blood, plasma, or other fluid has been drawn in a relatively non-invasive manner and is in hand, and that sample contains an amount of fetal DNA, either cellular or free-floating, either enriched in its proportion to the maternal DNA or in its original ratio, one may genotype the DNA found in said sample. In some embodiments, the blood may be drawn using a needle to withdraw blood from a vein, for example, the basilic vein. The method described herein can be used to determine genotypic data of the fetus. For example, it can be used to determine the ploidy state at one or more chromosomes, and it can be used to determine the identity of one or a set of SNPs, including insertions, deletions, and translocations. It can be used to determine one or more haplotypes, including the parent of origin of one or more genotypic features.
It should be noted that this method will work with any nucleic acids that can be used for any genotyping and/or sequencing method, such as the ILLUMINA INFINIUM ARRAY platform, AFFYMETRIX GENECHIP, ILLUMINA GENOME ANALYZER, or LIFE TECHNOLOGIES' SOLID SYSTEM. This includes free-floating DNA extracted from plasma or amplifications thereof (e.g. whole genome amplification, PCR), and genomic DNA from other cell types (e.g. human lymphocytes from whole blood) or amplifications thereof. For preparation of the DNA, any extraction or purification method that generates genomic DNA suitable for one of these platforms will work as well. This method could work equally well with samples of RNA. In an embodiment, the samples may be stored in a way that minimizes degradation (e.g. below freezing, at about -20 °C, or at a lower temperature).
Parental Support
Some embodiments may be used in combination with the PARENTAL SUPPORT™ (PS) method, embodiments of which are described in U.S. Application No. 11/603,406 (U.S. Publication No. 20070184467), U.S. Application No. 12/076,348 (U.S. Publication No. 20080243398), U.S. Application No. 13/110,685, PCT Application No. PCT/US09/52730 (PCT Publication No. WO/2010/017214), and PCT Application No. PCT/US10/050824 (PCT Publication No. WO/2011/041485), which are incorporated herein by reference in their entirety. PARENTAL SUPPORT™ is an informatics-based approach that can be used to analyze genetic data. In some embodiments, the methods disclosed herein may be considered part of the PARENTAL SUPPORT™ method. In some embodiments, the PARENTAL SUPPORT™ method is a collection of methods that may be used to determine, with high accuracy, the genetic data of a target individual from one or a small number of cells from that individual, or from a mixture of DNA consisting of DNA from the target individual and DNA from one or a plurality of other individuals, specifically to determine disease-related alleles, other alleles of interest, and/or the ploidy state of one or a plurality of chromosomes in the target individual. PARENTAL SUPPORT™ may refer to any of these methods. PARENTAL SUPPORT™ is an example of an informatics-based method. An exemplary embodiment of the PARENTAL SUPPORT™ method is illustrated in Figures 29-31G and described in Experiment 19.
The PARENTAL SUPPORT™ method makes use of known parental genetic data, i.e. haplotypic and/or diploid genetic data of the mother and/or the father, together with knowledge of the mechanism of meiosis and the imperfect measurement of the target DNA and possibly of one or more related individuals, along with population-based crossover frequencies, in order to reconstruct, in silico, with a high degree of confidence, the genotype at a plurality of alleles, and/or the ploidy state of an embryo or of any target cell(s), and the target DNA at the location of key loci. The PARENTAL SUPPORT™ method can reconstruct not only single nucleotide polymorphisms (SNPs) that were measured poorly, but also insertions and deletions, and SNPs or whole regions of DNA that were not measured at all. Furthermore, the PARENTAL SUPPORT™ method can both measure multiple disease-linked loci and screen for aneuploidy from a single cell. In some embodiments, the PARENTAL SUPPORT™ method may be used to characterize one or more cells from embryos biopsied during an IVF cycle to determine the genetic condition of the one or more cells.
The PARENTAL SUPPORT™ method allows the cleaning of noisy genetic data. This may be done by inferring the correct genetic alleles in the target genome (embryo) using the genotypes of related individuals (parents) as a reference. PARENTAL SUPPORT™ may be particularly relevant where only a small quantity of genetic material is available (e.g. PGD) and where direct measurements of the genotypes are inherently noisy due to the limited amount of genetic material. PARENTAL SUPPORT™ may also be particularly relevant where only a small fraction of the genetic material available is from the target individual (e.g. NPD) and where direct measurements of the genotypes are inherently noisy due to the contaminating DNA signal from another individual. The PARENTAL SUPPORT™ method is able to reconstruct highly accurate ordered diploid allele sequences on the embryo, together with the copy number of chromosome segments, even though the conventional, unordered diploid measurements may be characterized by high rates of allele dropout, drop-in, variable amplification bias, and other errors. The method may employ both an underlying genetic model and an underlying model of measurement error. The genetic model may determine both the allele probabilities at each SNP and the crossover probabilities between SNPs. Allele probabilities may be modeled at each SNP based on data obtained from the parents, and crossover probabilities between SNPs may be modeled based on data obtained from the HapMap database, as developed by the International HapMap Project. Given the proper underlying genetic model and measurement error model, maximum a posteriori (MAP) estimation may be used, with modifications for computational efficiency, to estimate the correct, ordered allele values at each SNP in the embryo.
The techniques outlined above are, in some cases, able to determine the genotype of an individual given a very small amount of DNA originating from that individual. This could be the DNA from one or a small number of cells, or it could be the small amount of fetal DNA found in maternal blood.
Hypotheses
In the context of the present invention, a hypothesis refers to a possible genetic state. It may refer to a possible ploidy state. It may refer to a possible allelic state. A set of hypotheses may refer to a set of possible genetic states, a set of possible allelic states, a set of possible ploidy states, or combinations thereof. In some embodiments, a set of hypotheses may be designed such that one hypothesis from the set will correspond to the actual genetic state of any given individual. In some embodiments, a set of hypotheses may be designed such that every possible genetic state may be described by at least one hypothesis from the set. In some embodiments of the present invention, one aspect of a method is to determine which hypothesis corresponds to the actual genetic state of the individual in question.
In another embodiment of the present invention, one step involves creating a hypothesis. In some embodiments, it may be a copy number hypothesis. In some embodiments, it may involve a hypothesis concerning which segments of a chromosome from each of the related individuals correspond genetically to which segments, if any, of the other related individuals. Creating a hypothesis may refer to the act of setting the limits of the variables such that the entire set of possible genetic states under consideration is encompassed by those variables.
A "copy number hypothesis," also called a "ploidy hypothesis" or a "ploidy state hypothesis," may refer to a hypothesis concerning a possible ploidy state for a given chromosome copy, chromosome type, or section of a chromosome in the target individual. It may also refer to the ploidy state at more than one of the chromosome types in the individual. A set of copy number hypotheses may refer to a set of hypotheses where each hypothesis corresponds to a different possible ploidy state in an individual. A set of hypotheses may concern a set of possible ploidy states, a set of possible parental haplotype contributions, a set of possible fetal DNA percentages in the mixed sample, or combinations thereof.
A normal individual contains one of each chromosome type from each parent. However, due to errors in meiosis and mitosis, it is possible for an individual to have 0, 1, 2, or more of a given chromosome type from each parent. In practice, it is rare to see more than two of a given chromosome from one parent. In the present invention, some embodiments only consider the possible hypotheses where 0, 1, or 2 copies of a given chromosome come from a parent; it is a trivial extension to consider more or fewer possible copies originating from a parent. In some embodiments, for a given chromosome, there are nine possible hypotheses: the three possible hypotheses concerning 0, 1, or 2 chromosomes of maternal origin, multiplied by the three possible hypotheses concerning 0, 1, or 2 chromosomes of paternal origin. Let (m, f) refer to the hypothesis where m is the number of a given chromosome inherited from the mother, and f is the number of a given chromosome inherited from the father. The nine hypotheses are therefore (0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), and (2,2). These may also be written as H00, H01, H02, H10, H11, H12, H20, H21, and H22. The different hypotheses correspond to different ploidy states. For example, (1,1) refers to a normal disomic chromosome; (2,1) refers to a maternal trisomy, and (0,1) refers to a paternal monosomy. In some embodiments, the case where two chromosomes are inherited from one parent and one chromosome is inherited from the other parent may be further differentiated into two cases: one where the two chromosomes are identical (matched copy error), and one where the two chromosomes are homologous but not identical (unmatched copy error). In these embodiments, there are 16 possible hypotheses. It should be understood that it is possible to use other sets of hypotheses and a different number of hypotheses.
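The nine (m, f) copy number hypotheses described above can be enumerated mechanically. The following is a minimal sketch; the function names and the "Hmf" string labels are illustrative conventions chosen here, and the finer matched/unmatched distinction is not modelled.

```python
from itertools import product


def copy_number_hypotheses(max_copies=2):
    """All (m, f) pairs, where m and f are the numbers of copies of a given
    chromosome inherited from the mother and from the father respectively."""
    return list(product(range(max_copies + 1), repeat=2))


def label(hypothesis):
    """Short label such as 'H11' for the hypothesis (1, 1)."""
    m, f = hypothesis
    return f"H{m}{f}"


def total_copies(hypothesis):
    """Total number of copies of the chromosome in the target individual."""
    return sum(hypothesis)
```

Under these conventions, `(1, 1)` is the disomic hypothesis with two total copies, and `(2, 1)` and `(1, 2)` are the two trisomic hypotheses with three total copies.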
In some embodiments of the present invention, the ploidy hypothesis refers to a hypothesis concerning which chromosome from other related individuals corresponds to a chromosome found in the target individual's genome. In some embodiments, a key to the method is the fact that related individuals can be expected to share haplotype blocks. Using measured genetic data from related individuals, along with knowledge of which haplotype blocks match between the target individual and the related individual, it is possible to infer the correct genetic data for a target individual with higher confidence than by using the target individual's genetic measurements alone. As such, in some embodiments, the ploidy hypothesis may concern not only the number of chromosomes, but also which chromosomes in related individuals are identical, or nearly identical, to one or more chromosomes in the target individual.
Once the set of hypotheses has been defined, the algorithms, when operating on the input genetic data, may output a determined statistical probability for each of the hypotheses under consideration. The probabilities of the various hypotheses may be determined by mathematically calculating, for each hypothesis, the value of its probability, as specified by one or more of the expert techniques, algorithms, and/or methods described elsewhere in this document, using the relevant genetic data as input.
Once the probabilities of the different hypotheses have been estimated, as determined by a plurality of techniques, they may be combined. This may entail, for each hypothesis, multiplying the probabilities determined by each technique. The products of the probabilities of the hypotheses may be normalized. Note that one ploidy hypothesis refers to one possible ploidy state for a chromosome.
The process of "combining probabilities," also called "combining hypotheses" or combining the results of expert techniques, is a concept that should be familiar to one skilled in the art of linear algebra. One possible way to combine probabilities is as follows. When an expert technique is used to evaluate a set of hypotheses given a set of genetic data, the output of the method is a set of probabilities that are associated, in a one-to-one fashion, with the hypotheses in the set. When a set of probabilities determined by a first expert technique, each associated with one of the hypotheses in the set, is combined with a set of probabilities determined by a second expert technique, each associated with the same set of hypotheses, the two sets of probabilities are multiplied. This means that, for each hypothesis in the set, the two probabilities associated with that hypothesis, as determined by the two expert techniques, are multiplied together, and the corresponding product is the output probability. This process may be expanded to any number of expert techniques. If only one expert technique is used, then the output probabilities are the same as the input probabilities. If more than two expert techniques are used, then the relevant probabilities may be multiplied at the same time. The products may be normalized so that the probabilities of the hypotheses in the set sum to 100%.
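The multiply-then-normalize procedure described above can be sketched in a few lines. This is an illustrative sketch only: each technique's output is assumed to be a dictionary keyed by the same hypothesis labels, and the function name is hypothetical.

```python
import math


def combine_probabilities(*technique_outputs):
    """Multiply per-hypothesis probabilities across techniques, then normalize.

    Each argument maps hypothesis -> probability, and all arguments must
    cover exactly the same set of hypotheses.
    """
    if not technique_outputs:
        raise ValueError("at least one set of probabilities is required")
    hypotheses = set(technique_outputs[0])
    for output in technique_outputs:
        if set(output) != hypotheses:
            raise ValueError("all techniques must cover the same hypotheses")
    products = {
        h: math.prod(output[h] for output in technique_outputs)
        for h in technique_outputs[0]
    }
    total = sum(products.values())
    if total == 0:
        raise ValueError("every hypothesis has zero combined probability")
    # Normalize so the combined probabilities sum to 1 (i.e. 100%).
    return {h: p / total for h, p in products.items()}
```

Note that an uninformative technique, one that assigns equal probability to every hypothesis, leaves the combined result unchanged after normalization.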
In some embodiments, if the combined probability for a given hypothesis is greater than the combined probability for any of the other hypotheses, then that hypothesis may be considered to be the most likely. In some embodiments, a hypothesis may be determined to be the most likely, and the ploidy state or other genetic state may be called, if the normalized probability is greater than a threshold. In an embodiment, this may mean that the number and identity of the chromosomes associated with that hypothesis are called as the ploidy state. In an embodiment, this may mean that the identity of the alleles associated with that hypothesis is called as the allelic state. In some embodiments, the threshold may be between about 50% and about 80%. In some embodiments, the threshold may be between about 80% and about 90%. In some embodiments, the threshold may be between about 90% and about 95%. In some embodiments, the threshold may be between about 95% and about 99%. In some embodiments, the threshold may be between about 99% and about 99.9%. In some embodiments, the threshold may be above about 99.9%.
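The thresholded calling rule can be sketched as follows. The default threshold of 0.99 is an arbitrary value picked for illustration from the ranges listed above, and the function name and the choice to return a "no call" as None are assumptions made here.

```python
def call_state(normalized_probabilities, threshold=0.99):
    """Return (called hypothesis or None, confidence of the top hypothesis).

    A call is made only when the most likely hypothesis has a normalized
    probability strictly greater than the threshold.
    """
    best = max(normalized_probabilities, key=normalized_probabilities.get)
    confidence = normalized_probabilities[best]
    if confidence > threshold:
        return best, confidence
    return None, confidence
```

Returning the confidence alongside the call, including when no call is made, keeps the information a practitioner would need to decide whether follow-up testing is warranted.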
Parental Contexts
The parental context refers to the genetic state of a given allele on each of the two relevant chromosomes for one or both of the two parents of the target. Note that in an embodiment, the parental context does not refer to the allelic state of the target; rather, it refers to the allelic state of the parents. The parental context for a given SNP may consist of four base pairs, two maternal and two paternal; they may be the same as or different from one another. It is typically written as "m1m2|f1f2," where m1 and m2 are the genetic states of the given SNP on the two maternal chromosomes, and f1 and f2 are the genetic states of the given SNP on the two paternal chromosomes. In some embodiments, the parental context may be written as "f1f2|m1m2." Note that the subscripts "1" and "2" refer to the genotype, at the given allele, of the first and second chromosome; also note that the choice of which chromosome is labeled "1" and which is labeled "2" is arbitrary.
Note that in the present invention, A and B are often used to generically represent base pair identities; A or B could equally well represent C (cytosine), G (guanine), A (adenine), or T (thymine). For example, if, at a given SNP-based allele, the mother's genotype is T at that SNP on one chromosome and G at that SNP on the homologous chromosome, and the father's genotype at that allele is G at that SNP on both of the homologous chromosomes, one may say that the target individual's allele has the parental context AB|BB; it could also be said that the allele has the parental context AB|AA. Note that, in theory, any of the four possible nucleotides could occur at a given allele, and thus it is possible, for example, for the mother to have a genotype of AT and the father to have a genotype of GC at a given allele. However, empirical data indicate that in most cases only two of the four possible base pairs are observed at a given allele. It is possible, for example when using single tandem repeats, to have more than two, more than four, and even more than ten parental contexts. In the present invention, the discussion assumes that only two possible base pairs will be observed at a given allele, although the embodiments disclosed herein could be modified to take into account the cases where this assumption does not hold.
A "parental context" may refer to a set or subset of target SNPs that have the same parental context. For example, if one were to measure 1,000 alleles on a given chromosome of a target individual, then the context AA|BB could refer to the set of all alleles in the group of 1,000 alleles where the genotype of the mother of the target is homozygous, and the genotype of the father of the target is homozygous, but where the maternal genotype and the paternal genotype are dissimilar at that locus. If the parental data is not phased, and thus AB = BA, then there are nine possible parental contexts: AA|AA, AA|AB, AA|BB, AB|AA, AB|AB, AB|BB, BB|AA, BB|AB, and BB|BB. If the parental data is phased, and thus AB ≠ BA, then there are 16 different possible parental contexts: AA|AA, AA|AB, AA|BA, AA|BB, AB|AA, AB|AB, AB|BA, AB|BB, BA|AA, BA|AB, BA|BA, BA|BB, BB|AA, BB|AB, BB|BA, and BB|BB. Every SNP allele on a chromosome, excluding some SNPs on the sex chromosomes, has one of these parental contexts. The set of SNPs wherein the parental context for one parent is heterozygous may be referred to as the heterozygous context.
Use of Parental Contexts in NPD
Non-invasive prenatal diagnosis is an important technique that can be used to determine the genetic state of a fetus from genetic material obtained in a non-invasive manner, for example from a blood draw on the pregnant mother. The blood can be separated and the plasma isolated, followed by isolation of the plasma DNA. Size selection can be used to isolate DNA of the appropriate length. The DNA may be preferentially enriched at a set of loci. This DNA can then be measured by a number of means, such as by hybridizing to a genotyping array and measuring the fluorescence, or by sequencing on a high-throughput sequencer.
When sequencing is used for ploidy calling of a fetus in the context of non-invasive prenatal diagnosis, there are a number of ways to use the sequence data. The most common way is to simply count the number of reads that map to a given chromosome. For example, imagine that one is trying to determine the ploidy state of chromosome 21 of the fetus. Further imagine that the DNA in the sample consists of 10% DNA of fetal origin and 90% DNA of maternal origin. In this case, one could look at the average number of reads on a chromosome that can be expected to be disomic, for example chromosome 3, and compare that to the number of reads on chromosome 21, where the reads are adjusted for the number of base pairs on each chromosome that are part of a unique sequence. If the fetus were euploid, one would expect the amount of DNA per unit of genome to be about equal at all locations (subject to stochastic variation). On the other hand, if the fetus were trisomic at chromosome 21, one would expect slightly more DNA per genetic unit from chromosome 21 than from other locations on the genome. Specifically, one would expect about 5% more DNA from chromosome 21 in the mixture. When sequencing is used to measure the DNA, one would expect about 5% more uniquely mappable reads per unique segment from chromosome 21 than from the other chromosomes. One could use the observation of an amount of DNA from a particular chromosome that is higher than a certain threshold, when adjusted for the number of sequences that are uniquely mappable to that chromosome, as the basis for an aneuploidy diagnosis. Another method that may be used to detect aneuploidy is similar to that above, except that parental contexts are taken into account.
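The figure of about 5% follows from simple copy counting: with fetal fraction f, a trisomic chromosome contributes 2(1 - f) + 3f copies per genome equivalent against 2 for a disomic reference, a ratio of 1 + f/2, which is 1.05 when f = 0.10. The sketch below works this through; the function names are hypothetical, the mother is assumed to be disomic, and the read-count normalization is a simplified stand-in for the mappability adjustment described above.

```python
def expected_read_ratio(fetal_fraction, fetal_copies,
                        maternal_copies=2, reference_copies=2):
    """Expected DNA amount on the test chromosome relative to a disomic
    reference chromosome, per unit of uniquely mappable sequence."""
    mixture_copies = ((1.0 - fetal_fraction) * maternal_copies
                      + fetal_fraction * fetal_copies)
    return mixture_copies / reference_copies


def observed_read_ratio(test_reads, test_unique_bp,
                        reference_reads, reference_unique_bp):
    """Read density on the test chromosome divided by read density on the
    reference chromosome, adjusting for uniquely mappable base pairs."""
    test_density = test_reads / test_unique_bp
    reference_density = reference_reads / reference_unique_bp
    return test_density / reference_density
```

Because the excess shrinks in proportion to the fetal fraction (only 2.5% at f = 0.05), a fixed read-count cutoff applied to all samples becomes harder to set well as the fetal fraction falls.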
When considering which alleles to target, one may consider the likelihood that some parental contexts are more informative than others. For example, AA|BB and the symmetric context BB|AA are the most informative contexts, because the fetus is known to carry an allele that is different from the mother's. For reasons of symmetry, the AA|BB and BB|AA contexts may both be referred to as AA|BB. Another set of informative parental contexts is AA|AB and BB|AB, because in these cases the fetus has a 50% chance of carrying an allele that the mother does not have. For reasons of symmetry, the AA|AB and BB|AB contexts may both be referred to as AA|AB. A third set of informative parental contexts is AB|AA and AB|BB, because in these cases the fetus carries a known paternal allele, and that allele is also present in the maternal genome. For reasons of symmetry, the AB|AA and AB|BB contexts may both be referred to as AB|AA. A fourth parental context is AB|AB, where the fetus has an unknown allelic state and, whatever that state is, it is one in which the mother has the same alleles. The fifth parental context is AA|AA, where the mother and father are both homozygous for the same allele.
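The five contexts and their symmetry can be expressed compactly. The sketch below, in Python, collapses a mother|father genotype pair to its canonical context by swapping the A and B labels where needed, and returns the probability, as stated above, that the fetus carries an allele the mother lacks. The function names are hypothetical and the code assumes biallelic SNPs written with the letters A and B.

```python
_SWAP = str.maketrans("AB", "BA")

# Probability that the fetus carries an allele absent from the mother,
# keyed by canonical parental context (mother|father).
_NON_MATERNAL_ALLELE_PROB = {
    "AA|BB": 1.0,   # fetus must inherit B from the father
    "AA|AB": 0.5,   # father transmits B half of the time
    "AB|AA": 0.0,   # paternal allele is also present in the mother
    "AB|AB": 0.0,
    "AA|AA": 0.0,
}


def parental_context(mother: str, father: str) -> str:
    """Return the canonical context, merging symmetric cases such as
    BB|AA into AA|BB."""
    m = "".join(sorted(mother))
    f = "".join(sorted(father))
    # Relabel alleles so the mother is never BB, and AB|BB becomes AB|AA.
    if m == "BB" or (m == "AB" and f == "BB"):
        m = "".join(sorted(m.translate(_SWAP)))
        f = "".join(sorted(f.translate(_SWAP)))
    return f"{m}|{f}"


def prob_non_maternal_allele(mother: str, father: str) -> float:
    """Chance that the fetus carries an allele the mother does not have."""
    return _NON_MATERNAL_ALLELE_PROB[parental_context(mother, father)]
```

For example, `parental_context("BB", "AB")` collapses to the AA|AB context, for which the fetus has a 50% chance of carrying a non-maternal allele.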
Different Implementations of the Presently Disclosed Embodiments
Methods are disclosed herein for determining the ploidy state of a target individual. The target individual may be a blastomere, an embryo, or a fetus. In some embodiments of the present invention, a method for determining the ploidy state of one or more chromosomes in a target individual may include any of the steps described in this document, and combinations thereof.
In some embodiments, the source of the genetic material to be used in determining the genetic state of the fetus may be fetal cells, such as nucleated fetal red blood cells, isolated from the maternal blood. The method may involve obtaining a blood sample from the pregnant mother. The method may involve isolating a fetal red blood cell using visual techniques, based on the idea that a certain combination of colors is uniquely associated with nucleated red blood cells, and that a similar combination of colors is not associated with any other cell present in the maternal blood. The combination of colors associated with nucleated red blood cells may include the red color of the hemoglobin around the nucleus, which may be made more distinct by staining, and the color of the nuclear material, which can be stained, for example, blue. By isolating the cells from maternal blood and spreading them over a slide, and then identifying those points at which one sees both red (from the hemoglobin) and blue (from the nuclear material), one may be able to identify the location of nucleated red blood cells. One may then extract those nucleated red blood cells using a micromanipulator, and use genotyping and/or sequencing techniques to measure aspects of the genotype of the genetic material in those cells.
In one embodiment, the nucleated red blood cells may be stained with a dye that fluoresces only in the presence of fetal hemoglobin and not maternal hemoglobin, thereby removing the ambiguity as to whether a nucleated red blood cell is derived from the mother or the fetus. Some embodiments of the present invention may involve staining or otherwise marking nuclear material. Some embodiments of the present invention may involve specifically marking fetal nuclear material using fetal-cell-specific antibodies.
There are many other ways to isolate fetal cells from maternal blood, or fetal DNA from maternal blood, or to enrich samples of fetal genetic material in the presence of maternal genetic material. Some of these methods are listed here, but this is not intended to be an exhaustive list. Some appropriate techniques are listed here for convenience: using fluorescently or otherwise tagged antibodies; size exclusion chromatography; magnetically or otherwise labeled affinity tags; epigenetic differences, such as differential methylation between the maternal and fetal cells at specific alleles; density gradient centrifugation followed by CD45/14 depletion and CD71-positive selection from CD45/14-negative cells; single or double Percoll gradients with different osmolalities; or the galactose-specific lectin method.
In one embodiment of the present invention, the target individual is a fetus, and the different genotype measurements are made on a plurality of DNA samples from the fetus. In some embodiments of the invention, the fetal DNA samples are from isolated fetal cells, where the fetal cells may be mixed with maternal cells. In some embodiments of the invention, the fetal DNA samples are from free-floating fetal DNA, where the fetal DNA may be mixed with free-floating maternal DNA. In some embodiments, the fetal DNA samples may be derived from maternal plasma or maternal blood that contains a mixture of maternal DNA and fetal DNA. In some embodiments, the fetal DNA may be mixed with maternal DNA in maternal:fetal ratios ranging from 99.9%:0.1% to 99%:1%; 99%:1% to 90%:10%; 90%:10% to 80%:20%; 80%:20% to 70%:30%; 70%:30% to 50%:50%; 50%:50% to 10%:90%; 10%:90% to 1%:99%; or 1%:99% to 0.1%:99.9%.
The genetic data of the target individual and/or of a related individual can be transformed from a molecular state to an electronic state by measuring the appropriate genetic material using tools and/or techniques taken from a group including, but not limited to: genotyping microarrays and high-throughput sequencing. Some high-throughput sequencing methods include Sanger DNA sequencing, pyrosequencing, the ILLUMINA SOLEXA platform, ILLUMINA's GENOME ANALYZER or the 454 sequencing platform, the HELICOS true single-molecule sequencing platform, HALCYON MOLECULAR's electron microscope sequencing method, or any other sequencing method from APPLIED BIOSYSTEM. All of these methods physically transform the genetic data stored in a sample of DNA into a set of genetic data that is typically stored in a memory device en route to being processed.
The genetic data of a related individual may be measured by analyzing substances taken from a group including, but not limited to: the individual's bulk diploid tissue, one or more diploid cells from the individual, one or more haploid cells from the individual, one or more blastomeres from the target individual, extracellular genetic material found on the individual, extracellular genetic material from the individual found in maternal blood, cells from the individual found in maternal blood, one or more embryos created from a gamete or gametes of the related individual, one or more blastomeres taken from such an embryo, extracellular genetic material found on the related individual, genetic material known to have originated from the related individual, and combinations thereof.
In some embodiments, a set of at least one ploidy state hypothesis may be created for each of the chromosome types of interest of the target individual. Each of the ploidy state hypotheses may refer to one possible ploidy state of the chromosome or chromosome segment of the target individual. The set of hypotheses may include some or all of the possible ploidy states that the chromosome of the target individual may be expected to have. Some of the possible ploidy states may include nullisomy, monosomy, disomy, uniparental disomy, euploidy, trisomy, matching trisomy, unmatching trisomy, maternal trisomy, paternal trisomy, tetrasomy, balanced (2:2) tetrasomy, unbalanced (3:1) tetrasomy, pentasomy, hexasomy, other aneuploidy, and combinations thereof. Any of these aneuploid states may be mixed or partial aneuploidy, such as unbalanced translocations, balanced translocations, Robertsonian translocations, recombinations, deletions, insertions, crossovers, and combinations thereof.
In some embodiments, knowledge of the determined ploidy state may be used to make a clinical decision. This knowledge, typically stored as a physical arrangement of matter in a memory device, may then be transformed into a report. The report may then be acted upon. For example, the clinical decision may be to terminate the pregnancy; alternatively, the clinical decision may be to continue the pregnancy. In some embodiments, the clinical decision may involve an intervention designed to decrease the severity of the phenotypic presentation of a genetic disorder, or a decision to take relevant steps to prepare for a special needs child.
In one embodiment of the present invention, any of the methods described herein may be modified to allow for multiple targets to come from the same target individual, for example multiple blood draws from the same pregnant mother. This may improve the accuracy of the model, as multiple genetic measurements may provide more data with which the target genotype may be determined. In one embodiment, one set of target genetic data serves as the primary data that is reported, and the other serves as data to double-check the primary target genetic data. In one embodiment, a plurality of sets of genetic data, each measured from genetic material taken from the target individual, are considered in parallel, and thus both sets of target genetic data serve to help determine which sections of the parental genetic data, measured with high accuracy, compose the fetal genome.
In one embodiment, the method may be used for the purpose of paternity testing. For example, given the SNP-based genotypic information from the mother and from a man who may or may not be the genetic father, and the genotypic information measured from the mixed sample, it is possible to determine whether the genotypic information of the man indeed represents the actual genetic father of the gestating fetus. A simple way to do this is to look at the contexts where the mother is AA and the possible father is AB or BB. In these cases, one may expect to see the father's contribution half of the time (AA|AB) or all of the time (AA|BB), respectively. Taking the expected allele dropout (ADO) into account, it is straightforward to determine whether the fetal SNPs that are observed are correlated with those of the possible father.
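One way to make the comparison concrete is a per-locus likelihood ratio. The Python sketch below is an illustration under stated assumptions, not the disclosed procedure: it considers only loci where the mother is AA, treats loci as independent, models ADO as a single probability that a present fetal B allele goes undetected, and compares the alleged father against a random man whose chance of transmitting B equals a population B-allele frequency. All names and parameters are hypothetical.

```python
import math

# Probability that a father with this genotype transmits the B allele.
TRANSMIT_B = {"AB": 0.5, "BB": 1.0}


def paternity_log_likelihood_ratio(loci, ado_rate, pop_b_freq):
    """Log likelihood ratio of 'alleged father is the genetic father' versus
    'an unrelated man is the father', over loci where the mother is AA.

    loci: iterable of (father_genotype, b_allele_seen_in_fetus) pairs
    ado_rate: assumed chance that a present fetal B allele is not detected
    pop_b_freq: assumed population frequency of the B allele
    """
    if not (0.0 < ado_rate < 1.0 and 0.0 < pop_b_freq < 1.0):
        raise ValueError("ado_rate and pop_b_freq must lie strictly in (0, 1)")
    llr = 0.0
    for father_genotype, b_seen in loci:
        p_father = TRANSMIT_B[father_genotype] * (1.0 - ado_rate)
        p_random = pop_b_freq * (1.0 - ado_rate)
        if b_seen:
            llr += math.log(p_father / p_random)
        else:
            llr += math.log((1.0 - p_father) / (1.0 - p_random))
    return llr
```

A clearly positive value supports the alleged father and a clearly negative value argues against him; where to draw the decision boundary is left open here.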
One embodiment of the present of invention can be as follows: pregnant woman wants the fetus knowing her whether to suffer from Down's syndrome and/or whether it can suffer from cystic fibrosis, and she does not wish to bear the child of any one suffered from these patient's condition.Doctor obtains her blood, and dyes to make it clearly present redness to hemochrome with a kind of mark, and dyes to make it clearly present blueness to nuclear substance with another kind of mark.Known maternal red blood cell is normally seedless, and a high proportion of fetal cell contains nucleus, and doctor can by differentiating that those cells of the red and blue two kinds of colors of display visually isolate multiple erythroblast.Doctor picks up these cells with micromanipulator from slide glass and they is delivered to laboratory, increases and gene type to ten separate cells.By using gene to measure, PARENTAL SUPPORT tMmethod can determine that six in ten cells are maternal blood cells, and in ten cells four are fetal cells.If pregnant mothers has given birth to child, so PARENTAL SUPPORT tMthe reliable allelotrope interpretation by making fetal cell can also be used for and show that they are dissimilar and determine that fetal cell is different from the cell going out to bear child with those having gone out to bear child.It should be noted that this method is conceptually similar with male parent testing example of the present invention.The quality of the genetic data measured from fetal cell is owing to being difficult to, to the unicellular gene type and may be very bad of carrying out, comprise multiple allelic loss.Clinicist can use PARENTAL SUPPORT tMuse the reliable DNA measuring result of measured foetal DNA and parent to infer each side of Fetal genome with high accuracy, thus convert the storage fetus genetic state predicted on computers by being included in from the genetic data on the genetic material of fetus.Clinicist can determine ploidy state and the 
multiple relevant disease linked gene of presence or absence of fetus.Result shows that fetus is euploid, and is not the carrier of cystic fibrosis, and mother determines to continue gestation.
In one embodiment of the present invention, a pregnant mother would like to determine whether her fetus is afflicted with any whole-chromosome abnormalities. She goes to her doctor and gives a sample of her blood, and she and her husband give samples of their own DNA from cheek swabs. A laboratory researcher genotypes the parental DNA by amplifying it with the MDA protocol and measuring the parental genetic data at a large number of SNPs with ILLUMINA INFINIUM arrays. The researcher then spins down the blood, takes the plasma, and isolates a sample of free-floating DNA using size exclusion chromatography. Alternatively, the researcher uses one or more fluorescent antibodies, such as one that is specific to fetal hemoglobin, to isolate a nucleated fetal red blood cell. The researcher then takes the isolated or enriched fetal genetic material and amplifies it using a library of 70-mer oligonucleotides appropriately designed such that the two ends of each oligonucleotide correspond to the flanking sequences on either side of a target allele. Upon addition of a polymerase, a ligase, and the appropriate reagents, the oligonucleotides undergo gap-filling circularization, capturing the desired allele. An exonuclease is added and then heat-inactivated, and the products are used directly as a template for PCR amplification. The PCR products are sequenced on an ILLUMINA GENOME ANALYZER. The sequence reads are used as input for the PARENTAL SUPPORT™ method, which then predicts the ploidy state of the fetus.
In another embodiment, a couple in which the mother is pregnant and of advanced maternal age wants to know whether the gestating fetus has Down syndrome, Turner syndrome, Prader-Willi syndrome, or some other whole-chromosome abnormality. The obstetrician draws blood from the mother and the father. The blood is sent to a laboratory, where a technician centrifuges the maternal sample to isolate the plasma and the buffy coat. The DNA in the buffy coat and in the paternal blood sample is transformed through amplification, and the genetic data encoded in the amplified genetic material is further transformed from molecularly stored genetic data into electronically stored genetic data by running the genetic material on a high-throughput sequencer to measure the parental genotypes. The plasma sample is preferentially enriched at a set of loci using a 5,000-plex hemi-nested targeted PCR method. The mixture of DNA fragments is prepared into a DNA library suitable for sequencing. The DNA is then sequenced using a high-throughput sequencing method, for example on the ILLUMINA GAIIx GENOME ANALYZER. The sequencing transforms the information that is encoded molecularly in the DNA into information that is encoded electronically in computer hardware. An informatics-based technique that includes the presently disclosed embodiments, such as PARENTAL SUPPORT™, may be used to determine the ploidy state of the fetus. This may involve calculating, on a computer, allele count probabilities at a plurality of polymorphic loci from the DNA measurements made on the prepared sample; creating, on a computer, a plurality of ploidy hypotheses, each pertaining to a different possible ploidy state of the chromosome; building, on a computer, a joint distribution model for the expected allele counts at the plurality of polymorphic loci on the chromosome for each ploidy hypothesis; determining, on a computer, a relative probability of each of the ploidy hypotheses using the joint distribution model and the allele counts measured on the prepared sample; and calling the ploidy state of the fetus by selecting the ploidy state corresponding to the hypothesis with the greatest probability. It is determined that the fetus has Down syndrome. A report is printed out, or sent electronically to the pregnant woman's obstetrician, who transmits the diagnosis to the woman. The woman, her husband, and the doctor sit down and discuss their options. The couple decides to terminate the pregnancy based on the knowledge that the fetus is afflicted with a trisomic condition.
In one embodiment, a company may decide to offer a diagnostic technology designed to detect aneuploidy in a gestating fetus from a maternal blood draw. Their product may involve a mother presenting to her obstetrician, who may draw her blood. The obstetrician may also collect a genetic sample from the father of the fetus. A clinician may isolate the plasma from the maternal blood and purify the DNA from the plasma. A clinician may also isolate the buffy coat layer from the maternal blood and prepare the DNA from the buffy coat. A clinician may also prepare the DNA from the paternal genetic sample. The clinician may use molecular biology techniques described in this disclosure to append universal amplification tags to the DNA derived from the plasma sample. The clinician may amplify the universally tagged DNA. The clinician may preferentially enrich the DNA by a number of techniques, including capture by hybridization and targeted PCR. The targeted PCR may involve nesting, hemi-nesting, or semi-nesting, or any other approach that results in efficient enrichment of the plasma-derived DNA. The targeted PCR may be massively multiplexed, for example with 10,000 primers in one reaction volume, where the primers target SNPs on chromosomes 13, 18, 21, and X, loci common to both X and Y, and optionally SNPs on other chromosomes. The selective enrichment and/or amplification may involve tagging each individual molecule with different tags, molecular barcodes, tags for amplification, and/or tags for sequencing. The clinician may then sequence the plasma sample, and possibly also the prepared maternal and/or paternal DNA. The molecular biology steps may be executed either wholly or in part by a diagnostic box. The sequence data may be fed into a single computer, or into another type of computing platform such as may be found in "the cloud". The computing platform may calculate allele counts at the targeted polymorphic loci from the measurements made by the sequencer. The computing platform may create a plurality of ploidy hypotheses pertaining to nullisomy, monosomy, disomy, matched trisomy, and unmatched trisomy for each of chromosomes 13, 18, 21, X, and Y. The computing platform may build a joint distribution model for the expected allele counts at the targeted loci on the chromosome for each ploidy hypothesis for each of the five chromosomes being interrogated. The computing platform may determine the probability that each of the ploidy hypotheses is true using the joint distribution model and the allele counts measured on the preferentially enriched DNA derived from the plasma sample. The computing platform may call the ploidy state of the fetus, for each of chromosomes 13, 18, 21, X, and Y, by selecting the ploidy state corresponding to the relevant hypothesis with the greatest probability. A report may be generated comprising the called ploidy states, and it may be sent to the obstetrician electronically, displayed on an output device, or delivered to the obstetrician as a printed hard copy. The obstetrician may inform the patient and optionally the father of the fetus, and they may decide which clinical options are open to them and which is most desirable.
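The hypothesis-testing steps above can be illustrated with a deliberately simplified model. The Python sketch below makes several assumptions that are not taken from the source: it uses only SNPs in the AA|BB context (mother AA, father BB), models the B-read count at each locus as binomial, treats loci as independent so the joint log-likelihood is a sum over loci, considers only three hypotheses, and weights them equally. The names `expected_b_fraction` and `call_ploidy` are hypothetical.

```python
import math

# Fetal (A copies, B copies) at an AA|BB locus under each hypothesis.
HYPOTHESES = {
    "disomy": (1, 1),
    "maternal trisomy": (2, 1),
    "paternal trisomy": (1, 2),
}


def expected_b_fraction(fetal_fraction, fetal_a, fetal_b):
    """Expected fraction of B reads in plasma when the mother is AA."""
    total = 2 * (1 - fetal_fraction) + fetal_fraction * (fetal_a + fetal_b)
    return fetal_fraction * fetal_b / total


def log_binomial_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def call_ploidy(counts, fetal_fraction):
    """counts: list of (b_reads, total_reads) per locus.
    Returns the most probable hypothesis and the relative probabilities."""
    log_like = {}
    for name, (a, b) in HYPOTHESES.items():
        p = expected_b_fraction(fetal_fraction, a, b)
        # Independence across loci is an assumption of this sketch.
        log_like[name] = sum(log_binomial_pmf(k, n, p) for k, n in counts)
    peak = max(log_like.values())
    weights = {h: math.exp(v - peak) for h, v in log_like.items()}
    total = sum(weights.values())
    probs = {h: w / total for h, w in weights.items()}
    return max(probs, key=probs.get), probs
```

The same selection step would be repeated for each chromosome being interrogated, with a hypothesis set appropriate to that chromosome.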
In another embodiment, a pregnant woman, hereafter referred to as "the mother", may decide that she wants to know whether her fetus is carrying any genetic abnormalities or other conditions. She may want to ensure that there are no gross abnormalities before she is confident to continue the pregnancy. She may go to her obstetrician, who may take a sample of her blood. The obstetrician may also take a genetic sample from her, such as a cheek swab. The obstetrician may also take a genetic sample from the father of the fetus, such as a cheek swab, a sperm sample, or a blood sample. The obstetrician may send the samples to a clinician. The clinician may enrich the fraction of free-floating fetal DNA in the maternal blood sample. The clinician may enrich the fraction of enucleated fetal blood cells in the maternal blood sample. The clinician may use various aspects of the methods described herein to determine genetic data of the fetus. That genetic data may include the ploidy state of the fetus and/or the identity of one or more disease-linked alleles in the fetus. A report may be generated summarizing the results of the prenatal diagnosis. The report may be transmitted or mailed to the doctor, who may tell the mother the genetic state of the fetus. The mother may decide to discontinue the pregnancy based on the fact that the fetus has one or more chromosomal or genetic abnormalities, or undesirable conditions. She may also decide to continue the pregnancy based on the fact that the fetus does not have any gross chromosomal or genetic abnormalities, or any genetic conditions of interest.
Another example may involve a woman who has been artificially inseminated by a sperm donor and is pregnant. She wants to minimize the risk that the fetus she is carrying has a genetic disease. She has her blood drawn at a phlebotomist, techniques described in this disclosure are used to isolate three nucleated fetal red blood cells, and a tissue sample is also collected from the mother and from the genetic father. The genetic material from the fetus and from the mother and father is amplified as appropriate and genotyped using the ILLUMINA INFINIUM BEADARRAY, and the methods described herein clean and phase the parental and fetal genotypes with high accuracy, as well as make ploidy calls for the fetus. The fetus is found to be euploid, phenotypic susceptibilities are predicted from the reconstructed fetal genotype, and a report is generated and sent to the mother's physician so that they can decide what clinical decisions may be best.
In one embodiment, the raw genetic material of the mother and the father is transformed by way of amplification into an amount of DNA that is similar in sequence but larger in quantity. Then, by way of a genotyping method, the genotypic data that is encoded by nucleic acids is transformed into genetic measurements that may be stored physically and/or electronically on a memory device, such as those described above. The relevant algorithms that make up the PARENTAL SUPPORT™ algorithm, relevant parts of which are discussed in detail herein, are translated into a computer program using a programming language. Then, through the execution of the computer program on the computer hardware, the physically encoded bits and bytes, instead of being arranged in a pattern that represents raw measurement data, are transformed into a pattern that represents a high-confidence determination of the ploidy state of the fetus. The details of this transformation will depend on the data itself and on the computer language and hardware system used to execute the method described herein. Then, the data that is physically configured to represent a high-quality ploidy determination of the fetus is transformed into a report, which may be sent to a health care practitioner. This transformation may be carried out using a printer or a computer display. The report may be a printed copy, on paper or another suitable medium, or it may be electronic. In the case of an electronic report, it may be transmitted, and it may be physically stored on a memory device at a location on a computer accessible by the health care practitioner; it may also be displayed on a screen so that it may be read. In the case of a screen display, the data may be transformed to a readable format by causing the physical transformation of pixels on the display device. The transformation may be accomplished by way of physically firing electrons at a screen, or by way of altering an electric charge that physically changes the transparency of a specific set of pixels on a screen that may lie in front of a substrate that emits or absorbs photons. This transformation may be accomplished by way of changing the nanoscale orientation of the molecules in a liquid crystal, for example from nematic to cholesteric or smectic phase, at a specific set of pixels. This transformation may be accomplished by way of an electric current causing photons to be emitted from a specific set of pixels made from a plurality of light-emitting diodes arranged in a meaningful pattern. This transformation may be accomplished by any other way used to display information, such as a computer screen, or some other output device or way of transmitting information. The health care practitioner may then act on the report, such that the data in the report is transformed into an action. The action may be to continue or discontinue the pregnancy, in which case a gestating fetus with a genetic abnormality is transformed into a non-living fetus. The transformations listed herein may be aggregated, such that, for example, one may transform the genetic material of a pregnant mother and the father, through a number of steps outlined in this disclosure, into a medical decision consisting of aborting a fetus with genetic abnormalities, or consisting of continuing the pregnancy. Alternatively, one may transform a set of genotypic measurements into a report that helps a physician treat his pregnant patient.
In one embodiment of the present invention, the method described herein can be used to determine the ploidy state of a fetus even when the host mother, that is, the woman who is pregnant, is not the biological mother of the fetus she is carrying. In one embodiment of the present invention, the method described herein can be used to determine the ploidy state of a fetus using only the maternal blood sample, without the need for a paternal genetic sample.
Some of the math in the presently disclosed embodiments makes hypotheses concerning a limited number of states of aneuploidy. In some cases, for example, only zero, one, or two chromosomes are expected to originate from each parent. In some embodiments of the present invention, the mathematical derivations can be expanded to take into account other forms of aneuploidy, such as tetrasomy (where three chromosomes originate from one parent), pentasomy, hexasomy, and so on, without changing the fundamental concepts of the invention. At the same time, it is possible to focus on a smaller number of ploidy states, for example only trisomy and disomy. Note that ploidy determinations that indicate a non-whole number of chromosomes may indicate mosaicism in a sample of genetic material.
In some embodiments, the genetic abnormality is a type of aneuploidy, such as Down syndrome (or trisomy 21), Edwards syndrome (trisomy 18), Patau syndrome (trisomy 13), Turner syndrome (45X), Klinefelter syndrome (a male with 2 X chromosomes), Prader-Willi syndrome, and DiGeorge syndrome (UPD 15). Congenital disorders, such as those listed in the prior sentence, are commonly undesirable, and the knowledge that a fetus is afflicted with one or more phenotypic abnormalities may provide the basis for a decision to terminate the pregnancy, to take necessary precautions to prepare for the birth of a special needs child, or to take some therapeutic approach meant to lessen the severity of a chromosomal abnormality.
In some embodiments, the methods described herein can be used at a very early gestational age, for example as early as four weeks, as early as five weeks, as early as six weeks, as early as seven weeks, as early as eight weeks, as early as nine weeks, as early as ten weeks, as early as eleven weeks, and as early as twelve weeks.
In some embodiments, a method disclosed herein is used in the context of pre-implantation genetic diagnosis (PGD) for embryo selection during in vitro fertilization, where the target individual is an embryo, and the parental genotypic data can be used to make ploidy determinations about the embryo from sequencing data from a single-cell or two-cell biopsy from a day-3 embryo, or from a trophectoderm biopsy from a day-5 or day-6 embryo. In a PGD setting, only the child DNA is measured, and only a small number of cells are tested, generally one to five but possibly as many as ten, twenty, or fifty. The total number of starting copies of the A and B alleles (at a SNP) is then trivially determined by the child genotype and the number of cells. In NPD, the number of starting copies is very high, and so the allele ratio after PCR is expected to accurately reflect the starting ratio. However, the small number of starting copies in PGD means that contamination and imperfect PCR efficiency have a non-trivial effect on the allele ratio following PCR. This effect may be more important than the depth of read in predicting the variance in the allele ratio measured after sequencing. The distribution of the measured allele ratio given a known child genotype may be created by Monte Carlo simulation of the PCR process, based on the PCR probe efficiency and the probability of contamination. Given an allele ratio distribution for each possible child genotype, the likelihood of each hypothesis can be calculated as described for NIPD.
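A Monte Carlo simulation of this kind can be sketched in a few lines of Python. The model below is an assumption made for illustration: each molecule is duplicated in each cycle with probability equal to the PCR efficiency, and contamination is modeled as at most one extra starting copy of a randomly chosen allele. The parameter names and default values are hypothetical and are not taken from the source.

```python
import random


def amplify(copies, cycles, efficiency, rng):
    """Each existing molecule is copied with probability `efficiency`
    in every cycle (an assumed, simplified PCR model)."""
    for _ in range(cycles):
        copies += sum(1 for _ in range(copies) if rng.random() < efficiency)
    return copies


def simulate_allele_ratios(genotype, n_cells, cycles=10, efficiency=0.8,
                           contamination_prob=0.05, trials=1000, seed=0):
    """Return simulated A/(A+B) ratios after PCR for a known child genotype.

    genotype: string such as "AA", "AB", or "BB" (requires n_cells >= 1)
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        # Starting copies follow from the genotype and the number of cells.
        a = genotype.count("A") * n_cells
        b = genotype.count("B") * n_cells
        # Assumed contamination model: one stray copy of a random allele.
        if rng.random() < contamination_prob:
            if rng.random() < 0.5:
                a += 1
            else:
                b += 1
        a = amplify(a, cycles, efficiency, rng)
        b = amplify(b, cycles, efficiency, rng)
        ratios.append(a / (a + b))
    return ratios


if __name__ == "__main__":
    ratios = simulate_allele_ratios("AB", n_cells=1)
    print(min(ratios), max(ratios))
```

Running the simulation once per possible child genotype gives an empirical ratio distribution for each, against which a measured ratio can then be scored.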
Maximum likelihood estimation
Most methods known in the art for detecting the presence or absence of a biological phenomenon or medical condition involve the use of a single hypothesis rejection test, in which a metric that is correlated with the condition is measured: if the metric is on one side of a given threshold, the condition is present, while if the metric falls on the other side of the threshold, the condition is absent. A single hypothesis rejection test looks only at the null distribution when deciding between the null and alternate hypotheses. Without taking the alternate distribution into account, one cannot estimate the likelihood of each hypothesis given the observed data, and therefore one cannot calculate a confidence on the call. Hence, with a single hypothesis rejection test, one gets a yes or no answer without a sense of the confidence associated with the specific case.
In some embodiments, the method disclosed herein is able to detect the presence or absence of a biological phenomenon or medical condition using a maximum likelihood method. This is a substantial improvement over a method using a single hypothesis rejection technique, because the threshold for calling the absence or presence of the condition can be adjusted as appropriate for each case. This is particularly relevant for diagnostic techniques that aim to determine the presence or absence of aneuploidy in a gestating fetus from genetic data obtained from the mixture of fetal and maternal DNA present in the free-floating DNA found in maternal plasma. This is because, as the fraction of fetal DNA in the plasma-derived fraction changes, the optimal threshold for calling aneuploidy versus euploidy changes. As the fetal fraction drops, the distribution of data associated with an aneuploidy becomes increasingly similar to the distribution of data associated with a euploidy.
The maximum likelihood estimation method uses the distribution associated with each hypothesis to estimate the likelihood of the data conditioned on that hypothesis. These conditional probabilities can then be converted into a hypothesis call and a confidence. Similarly, the maximum a posteriori estimation method uses the same conditional probabilities as the maximum likelihood estimate, but also incorporates population priors when choosing the best hypothesis and determining the confidence.
Therefore, the use of a maximum likelihood estimation (MLE) technique, or the closely related maximum a posteriori (MAP) technique, gives two advantages: first, it increases the chance of a correct call, and second, it allows a confidence to be calculated for each call. In one embodiment, selecting the ploidy state corresponding to the hypothesis with the greatest probability is carried out using maximum likelihood estimation or maximum a posteriori estimation. In one embodiment, a method is disclosed for determining the ploidy state of a gestating fetus that involves taking any method currently known in the art that uses a single-hypothesis rejection technique and reformulating it so that it uses an MLE or MAP technique. Some examples of methods that can be significantly improved by applying these techniques can be found in U.S. Patent 8,008,018, U.S. Patent 7,888,017, and U.S. Patent 7,332,277.
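The difference between an MLE call and a MAP call, and the way a confidence falls out of the same numbers, can be illustrated with a minimal Python sketch. The function name `call_hypothesis`, the hypothesis labels, and the numeric log-likelihoods and priors are all hypothetical values chosen for illustration; they are not taken from the disclosure.

```python
import math

def call_hypothesis(log_liks, priors=None):
    """Pick the best hypothesis and a confidence from per-hypothesis log-likelihoods.

    With priors=None all hypotheses are weighted equally (an MLE call);
    with priors supplied, log priors are added (a MAP call).
    Priors must be strictly positive.
    """
    names = list(log_liks)
    if priors is None:
        priors = {h: 1.0 / len(names) for h in names}
    scores = {h: log_liks[h] + math.log(priors[h]) for h in names}
    top = max(scores.values())
    # Subtract the maximum before exponentiating to avoid underflow.
    weights = {h: math.exp(s - top) for h, s in scores.items()}
    best = max(scores, key=scores.get)
    return best, weights[best] / sum(weights.values())

# Hypothetical log-likelihoods of the data under three ploidy hypotheses.
log_liks = {"disomy": -1204.2, "trisomy": -1203.9, "monosomy": -1250.0}
# Hypothetical population priors.
priors = {"disomy": 0.98, "trisomy": 0.015, "monosomy": 0.005}

mle_call, mle_conf = call_hypothesis(log_liks)
map_call, map_conf = call_hypothesis(log_liks, priors)
```

In this toy case the two likelihoods are close, so the MLE call has a confidence only slightly above one half, and the prior is enough to change the MAP call. A single-threshold test would return a bare yes or no in both situations.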
In one embodiment, a method is described for determining the presence or absence of fetal aneuploidy in a maternal plasma sample comprising fetal and maternal genomic DNA, the method comprising: obtaining a maternal plasma sample; measuring the DNA fragments found in the plasma sample with a high-throughput sequencer; mapping the sequences to chromosomes and determining the number of sequence reads that map to each chromosome; calculating the fraction of fetal DNA in the plasma sample; using the fetal fraction and the number of sequence reads that map to one or more reference chromosomes expected to be euploid, calculating the expected distribution of the amount of a target chromosome that would be present if that target chromosome were euploid, and one or more expected distributions that would be expected if that chromosome were aneuploid; and using MLE or MAP to determine which of the distributions is most likely to be correct, thereby indicating the presence or absence of fetal aneuploidy. In one embodiment, measuring the DNA from the plasma may involve massively parallel shotgun sequencing. In one embodiment, measuring the DNA from the plasma sample may involve sequencing DNA that has been preferentially enriched, for example through targeted amplification, at a plurality of polymorphic or non-polymorphic loci. The plurality of loci may be designed to target one or a small number of suspected aneuploid chromosomes and one or a small number of reference chromosomes. The purpose of the preferential enrichment is to increase the number of sequence reads that are informative for the ploidy determination.
Ploidy Calling Informatics Method
Described herein is a method for determining the ploidy state of a fetus given sequence data. In some embodiments, this sequence data may be measured on a high-throughput sequencer. In some embodiments, the sequence data may be measured on DNA that originated from free-floating DNA isolated from maternal blood, where the free-floating DNA comprises some DNA of maternal origin and some DNA of fetal/placental origin. This section describes one embodiment of the invention in which the ploidy state of the fetus is determined assuming that the fraction of fetal DNA in the analyzed mixture is not known and is estimated from the data. It also describes an embodiment in which the fraction of fetal DNA ("fetal fraction"), or the percentage of fetal DNA in the mixture, can be measured by another method and is assumed to be known when determining the ploidy state of the fetus. In some embodiments, the fetal fraction can be calculated using only the genotyping measurements made on the maternal blood sample itself, which is a mixture of fetal and maternal DNA. In some embodiments, the fraction may also be calculated using the measured or otherwise known genotype of the mother and/or the measured or otherwise known genotype of the father. In another embodiment, the ploidy state of the fetus can be determined solely from the fraction of fetal DNA calculated for the chromosome in question compared with the fraction of fetal DNA calculated for a reference chromosome assumed to be disomic.
In a preferred embodiment, assume that, for a particular chromosome, we observe and analyze N SNPs, for which we have:
● A set of NR free-floating DNA sequence measurements $S = (s_1, \ldots, s_{NR})$. Because this method uses SNP measurements, all sequence data corresponding to non-polymorphic loci can be disregarded. In a simplified version, where we have (A, B) counts on each SNP, with A and B corresponding to the two alleles present at a given locus, S can be written as $S = ((a_1, b_1), \ldots, (a_N, b_N))$, where $a_i$ is the A count on SNP i, $b_i$ is the B count on SNP i, and $\sum_{i=1}^{N} (a_i + b_i) = NR$.
● Parent data, consisting of:
○ genotypes from a SNP microarray or other intensity-based genotyping platform: mother $M = (m_1, \ldots, m_N)$, father $F = (f_1, \ldots, f_N)$, where $m_i, f_i \in \{AA, AB, BB\}$;
○ and/or sequence data measurements: NRM mother measurements $SM = (sm_1, \ldots, sm_{NRM})$ and NRF father measurements $SF = (sf_1, \ldots, sf_{NRF})$. As in the simplification above, if we have (A, B) counts on each SNP, then $SM = ((am_1, bm_1), \ldots, (am_N, bm_N))$ and $SF = ((af_1, bf_1), \ldots, (af_N, bf_N))$.
Collectively, the mother, father, and child data are denoted D = (M, F, SM, SF, S). Note that the parent data are desirable and improve the accuracy of the algorithm, but they are not required, especially the father data. This means that even in the absence of mother and/or father data, it is still possible to obtain very accurate copy number results.
The best copy number estimate $H^*$ can be derived by maximizing the data log-likelihood LIK(D|H) over all hypotheses (H) considered. In particular, it is possible to determine the relative probability of each ploidy hypothesis using the joint distribution model and the allele counts measured on the prepared sample, and to use those relative probabilities to determine the hypothesis most likely to be correct, as follows:

$$H^* = \arg\max_{H} \; LIK(D \mid H)$$
Similarly, the a posteriori hypothesis likelihood given the data can be written as:

$$P(H \mid D) \propto P(D \mid H) \cdot priorprob(H)$$
where priorprob(H) is the prior probability assigned to each hypothesis H, based on model design and prior knowledge.
Priors can also be used to find the maximum a posteriori estimate:

$$H_{MAP} = \arg\max_{H} \; \left[ LIK(D \mid H) + \log priorprob(H) \right]$$
In one embodiment, the copy number hypotheses that may be considered are:
● monosomy:
○ maternal H10 (one copy from the mother)
○ paternal H01 (one copy from the father)
● disomy: H11 (one copy each from the mother and the father)
● simple trisomy, with no crossovers considered:
○ maternal: H21_matched (two identical copies from the mother, one copy from the father), H21_unmatched (both copies from the mother, one copy from the father)
○ paternal: H12_matched (one copy from the mother, two identical copies from the father), H12_unmatched (one copy from the mother, both copies from the father)
● composite trisomy, allowing for crossovers (using a joint distribution model):
○ maternal H21 (two copies from the mother, one from the father),
○ paternal H12 (one copy from the mother, two copies from the father)
In other embodiments, other ploidy states may be considered, such as nullisomy (H00), uniparental disomy (H20 and H02), and tetrasomy (H04, H13, H22, H31 and H40).
If there are no crossovers, each trisomy, whether of mitotic, meiosis I, or meiosis II origin, would be one of the matched or unmatched trisomies. Because of crossovers, a true trisomy is usually a combination of the two. First, a method of deriving hypothesis likelihoods for simple hypotheses is described. Then, a method of deriving hypothesis likelihoods for composite hypotheses is described, which combines the individual SNP likelihoods with crossovers.
LIK(D|H) for a Simple Hypothesis
In one embodiment, LIK(D|H) may be determined for simple hypotheses as follows. For a simple hypothesis H, the log-likelihood LIK(D|H) of hypothesis H on a whole chromosome may be calculated as the sum of the log-likelihoods of the individual SNPs, assuming a known or derived child fraction cf:

$$LIK(D \mid H) = \sum_{i} LIK(D \mid H, cf, i)$$

In one embodiment, it is possible to derive cf from the data.
This hypothesis does not assume any linkage between SNPs, and therefore does not use a joint distribution model.
In some embodiments, the log-likelihood may be determined on a per-SNP basis. On a particular SNP i, assuming fetal ploidy hypothesis H and fetal DNA percentage cf, the log-likelihood of the observed data D is defined as:

$$LIK(D \mid H, cf, i) = \log P(D \mid H, cf, i) = \log \sum_{m, f, c} P(D \mid m, f, c, H, cf, i) \; p(c \mid m, f, H) \; p(m \mid i) \; p(f \mid i)$$

where m are the possible true mother genotypes, f are the possible true father genotypes, with $m, f \in \{AA, AB, BB\}$, and c are the possible child genotypes given hypothesis H. In particular, for monosomy $c \in \{A, B\}$; for disomy $c \in \{AA, AB, BB\}$; for trisomy $c \in \{AAA, AAB, ABB, BBB\}$.
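Genotype prior frequency: p(m|i) is the general prior probability of mother genotype m on SNP i, based on the known population frequency at SNP i, denoted $pA_i$. In particular:

$$p(AA \mid pA_i) = (pA_i)^2; \quad p(AB \mid pA_i) = 2\,pA_i\,(1 - pA_i); \quad p(BB \mid pA_i) = (1 - pA_i)^2$$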
The father genotype probability p(f|i) may be determined in an analogous fashion.
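As an illustration of how a population allele frequency can be turned into genotype priors, the following Python sketch assumes Hardy-Weinberg proportions. That assumption, and the function name `genotype_priors`, are choices made for this example only.

```python
def genotype_priors(pA):
    """Prior probabilities of the genotypes AA, AB, BB at a SNP.

    pA is the population frequency of allele A at that SNP.
    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    if not 0.0 <= pA <= 1.0:
        raise ValueError("allele frequency must be in [0, 1]")
    return {
        "AA": pA * pA,
        "AB": 2.0 * pA * (1.0 - pA),
        "BB": (1.0 - pA) * (1.0 - pA),
    }

# Example: a SNP where allele A has a population frequency of 30%.
priors = genotype_priors(0.3)
```

The same function would serve for both p(m|i) and p(f|i), since both parents are drawn from the same population frequency at SNP i.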
True child probability: p(c|m, f, H) is the probability of obtaining true child genotype = c, given parents m and f and assuming hypothesis H; this can be calculated easily. For example, p(c|m, f, H) for H11, H21 matched, and H21 unmatched is given below.
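Data likelihood: P(D|m, f, c, H, cf, i) is the probability of the given data D on SNP i, given true mother genotype m, true father genotype f, true child genotype c, hypothesis H, and child fraction cf. It can be broken down into the probabilities of the mother, father, and child data as follows:

$$P(D \mid m, f, c, H, cf, i) = P(SM \mid m, i) \; P(M \mid m, i) \; P(SF \mid f, i) \; P(F \mid f, i) \; P(S \mid m, c, H, cf, i)$$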
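Mother SNP array data likelihood: assuming that the SNP array genotypes are correct, the probability of the mother SNP array genotype data $m_i$ at SNP i, compared with the true genotype m, is simply

$$P(M \mid m, i) = \begin{cases} 1 & \text{if } m_i = m \\ 0 & \text{if } m_i \neq m \end{cases}$$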
Mother sequence data likelihood: for counts $S_i = (am_i, bm_i)$, with no extra noise or bias involved, the probability of the mother sequence data at SNP i is the binomial probability defined as $P(SM \mid m, i) = P_{X|m}(am_i)$, where $X|m \sim Binom(p_m(A),\, am_i + bm_i)$, with $p_m(A)$ defined as
m | AA | AB | BB | A | B | no call
p(A) | 1 | 0.5 | 0 | 1 | 0 | 0.5
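The binomial mother-sequence likelihood and the p(A) lookup table above can be sketched in Python as follows. The dictionary and function names are hypothetical, and the key `"nocall"` stands for the "no call" column of the table.

```python
from math import comb

# Probability of reading an A given the (possibly dropped-out) genotype,
# copied from the table in the text.
P_A_GIVEN_GENOTYPE = {
    "AA": 1.0, "AB": 0.5, "BB": 0.0,
    "A": 1.0, "B": 0.0, "nocall": 0.5,
}

def mother_seq_likelihood(m, am, bm):
    """P(SM | m, i): binomial probability of am A-reads out of am + bm reads."""
    p = P_A_GIVEN_GENOTYPE[m]
    return comb(am + bm, am) * p ** am * (1.0 - p) ** bm

# A heterozygous mother with 48 A reads and 52 B reads is plausible;
# the same counts are impossible for a noise-free homozygous AA mother.
lik_ab = mother_seq_likelihood("AB", 48, 52)
lik_aa = mother_seq_likelihood("AA", 48, 52)
```

Because this version has no noise term, a single unexpected read drives the likelihood of a homozygous genotype to exactly zero, which is one motivation for the noise and bias extensions discussed later in the text.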
Father data likelihood: a similar equation applies to the father data likelihood.
Note that it is possible to determine the child genotype without the parent data, especially the father data. For example, if no father genotype data F are available, one may simply use P(F|f, i) = 1. If no father sequence data SF are available, one may simply use P(SF|f, i) = 1.
In some embodiments, the method involves building, for each ploidy hypothesis, a joint distribution model for the expected allele counts at a plurality of polymorphic loci on the chromosome; one way of accomplishing this is described here. Free fetal DNA data likelihood: P(S|m, c, H, cf, i) is the probability of the free fetal DNA sequence data on SNP i, given true mother genotype m, true child genotype c, child copy number hypothesis H, and assuming child fraction cf. It is in fact the probability of the sequence data S on SNP i given μ(m, c, cf, H), the true probability of A content on SNP i:

$$P(S \mid m, c, H, cf, i) = P(S \mid \mu(m, c, cf, H), i)$$

For counts, where $S_i = (a_i, b_i)$, with no extra noise or bias in the data,

$$P(S \mid \mu(m, c, cf, H), i) = P_X(a_i)$$

where $X \sim Binom(p(A),\, a_i + b_i)$ with $p(A) = \mu(m, c, cf, H)$. In a more complex case, where the exact alignment and the (A, B) counts per SNP are not known, $P(S \mid \mu(m, c, cf, H), i)$ is a combination of integrated binomials.
True probability of A content: μ(m, c, cf, H), the true probability of A content on SNP i in this mother/child mixture, assuming true mother genotype = m, true child genotype = c, and overall child fraction = cf, is defined as

$$\mu(m, c, cf, H) = \frac{\#A(m)\,(1 - cf) + \#A(c)\,cf}{n_m\,(1 - cf) + n_c\,cf}$$

where #A(g) = the number of A's in genotype g, $n_m = 2$ is the somy of the mother, and $n_c$ is the ploidy of the child under hypothesis H (1 for monosomy, 2 for disomy, 3 for trisomy).
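A minimal Python sketch of the A-content calculation for a mother/child mixture follows. The formula used, a copy-number-weighted average of the A counts in the two genotypes, is this sketch's reading of the definition above, with the child's ploidy taken from the length of the child genotype string; the function name `a_content` is hypothetical.

```python
def a_content(m, c, cf, n_mother=2):
    """Expected fraction of A alleles in a mother/child DNA mixture.

    m  : true mother genotype, e.g. "AB"
    c  : true child genotype, e.g. "A" (monosomy), "AB" (disomy), "AAB" (trisomy)
    cf : child fraction of the mixture, between 0 and 1
    Assumed formula: (#A(m)(1-cf) + #A(c)cf) / (n_m(1-cf) + n_c cf).
    """
    n_child = len(c)  # ploidy of the child under the hypothesis
    numerator = m.count("A") * (1.0 - cf) + c.count("A") * cf
    denominator = n_mother * (1.0 - cf) + n_child * cf
    return numerator / denominator

# Mother AA, disomic child AB, 10% child fraction.
mu_disomy = a_content("AA", "AB", 0.10)
# Mother AB, trisomic child AAB, 20% child fraction.
mu_trisomy = a_content("AB", "AAB", 0.20)
```

The resulting value is what the text uses as p(A) in the binomial model for the plasma read counts on that SNP.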
Using a Joint Distribution Model: LIK(D|H) for a Composite Hypothesis
In some embodiments, the method involves building, for each ploidy hypothesis, a joint distribution model for the expected allele counts at a plurality of polymorphic loci on the chromosome; one way of accomplishing this is described here. In many cases a trisomy is not purely matched or unmatched, because of crossovers, so this section derives results for the composite hypotheses H21 (maternal trisomy) and H12 (paternal trisomy), which combine matched and unmatched trisomy and account for possible crossovers.
In the case of trisomy, if there were no crossovers, the trisomy would simply be either a matched or an unmatched trisomy. Matched trisomy is the case where the child inherits two copies of the identical chromosome segment from one parent. Unmatched trisomy is the case where the child inherits one copy of each homologous chromosome segment from that parent. Because of crossovers, some segments of a chromosome may have matched trisomy while other parts have unmatched trisomy. This section describes how to build a joint distribution model for the heterozygosity rates of a set of alleles; that is, how to build, for one or more hypotheses, a joint distribution model for the expected allele counts at a plurality of loci.
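Suppose that on SNP i, $LIK(D \mid H_m, i)$ is the fit for the matched hypothesis $H_m$, $LIK(D \mid H_u, i)$ is the fit for the unmatched hypothesis $H_u$, and pc(i) = the probability of a crossover between SNPs i-1 and i. The full likelihood can then be calculated as:

$$LIK(D \mid H) = \log \sum_{E \in \{H_m, H_u\}} \exp\big( LIK(D \mid E, 1{:}N) \big)$$

where $LIK(D \mid E, 1{:}N)$ is the likelihood of ending in hypothesis E for SNPs 1:N, and E is the hypothesis of the last SNP, $E \in \{H_m, H_u\}$. Recursively, one can calculate:

$$LIK(D \mid E, 1{:}i) = LIK(D \mid E, i) + \log\Big( \exp\big(LIK(D \mid E, 1{:}i-1)\big)\,(1 - pc(i)) + \exp\big(LIK(D \mid \sim E, 1{:}i-1)\big)\,pc(i) \Big)$$

where ~E is the hypothesis other than E, the hypotheses considered being $H_m$ and $H_u$. In particular, the likelihood of SNPs 1:i can be calculated from the likelihood of SNPs 1:(i-1) under either the same hypothesis with no crossover or the opposite hypothesis with a crossover, multiplied by the likelihood of SNP i.
For SNP 1, i = 1: $LIK(D \mid E, 1{:}1) = LIK(D \mid E, 1)$.
For SNP 2, i = 2: $LIK(D \mid E, 1{:}2) = LIK(D \mid E, 2) + \log\big( \exp(LIK(D \mid E, 1))\,(1 - pc(2)) + \exp(LIK(D \mid \sim E, 1))\,pc(2) \big)$,
and so on for i = 3:N.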
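The recursion over SNPs described above has the shape of a two-state forward pass, and a minimal Python sketch is given here. It assumes that the per-SNP fits are log-likelihoods, that every crossover probability lies strictly between 0 and 1, and that the final likelihood sums over both possible end states; the function names are hypothetical.

```python
import math

def logaddexp(x, y):
    """log(exp(x) + exp(y)) computed without overflow."""
    hi, lo = max(x, y), min(x, y)
    return hi + math.log1p(math.exp(lo - hi))

def composite_loglik(lik_m, lik_u, pc):
    """Log-likelihood of a composite trisomy hypothesis over N SNPs.

    lik_m[i], lik_u[i] : log-likelihood of SNP i under matched / unmatched trisomy
    pc[i]              : crossover probability between SNP i-1 and SNP i,
                         with 0 < pc[i] < 1 (pc[0] is unused)
    """
    end_m, end_u = lik_m[0], lik_u[0]  # SNP 1: no transition term
    for i in range(1, len(lik_m)):
        stay, cross = math.log1p(-pc[i]), math.log(pc[i])
        # Either remain in the same state (no crossover) or switch (crossover).
        new_m = lik_m[i] + logaddexp(end_m + stay, end_u + cross)
        new_u = lik_u[i] + logaddexp(end_u + stay, end_m + cross)
        end_m, end_u = new_m, new_u
    return logaddexp(end_m, end_u)

# Hypothetical per-SNP likelihoods for three SNPs.
lik_m = [math.log(x) for x in (0.2, 0.5, 0.1)]
lik_u = [math.log(x) for x in (0.3, 0.4, 0.6)]
pc = [0.0, 0.1, 0.2]
total = composite_loglik(lik_m, lik_u, pc)
```

Working in log space keeps the running totals finite even over thousands of SNPs, where the raw products of probabilities would underflow.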
In some embodiments, the child fraction may be determined. The child fraction may refer to the proportion of sequences in a mixture of DNA that originate from the child. In the context of non-invasive prenatal diagnosis, the child fraction may refer to the proportion of sequences in the maternal plasma that originate from the fetus or from the portion of the placenta with the fetal genotype. It may also refer to the child fraction in a sample of DNA that has been prepared from the maternal plasma and may be enriched in fetal DNA. One purpose of determining the child fraction in a sample of DNA is for use in an algorithm that makes ploidy calls on the fetus; the child fraction may therefore refer to whatever sample of DNA was analyzed by sequencing for the purpose of non-invasive prenatal diagnosis.
Some of the algorithms presented in this disclosure as part of a method of non-invasive prenatal aneuploidy diagnosis assume a known child fraction, which may not always be the case. In one embodiment, it is possible to find the most likely child fraction by maximizing the likelihood for disomy on selected chromosomes, with or without parental data.
In particular, let LIK(D|H11, cf, chr) be the log-likelihood described above for the disomy hypothesis and child fraction cf on chromosome chr. For the selected chromosomes in Cset (usually 1:16), assumed to be euploid, the full likelihood is:

$$LIK(cf) = \sum_{chr \in Cset} LIK(D \mid H11, cf, chr)$$

The most likely child fraction $cf^*$ is derived as $cf^* = \arg\max_{cf} LIK(cf)$.
Any set of chromosomes may be used. It is also possible to derive the child fraction without assuming euploidy on the reference chromosomes. Using this method, the child fraction can be determined in any of the following situations: (1) array data on the parents and shotgun sequencing data on the maternal plasma are available; (2) array data on the parents and targeted sequencing data on the maternal plasma are available; (3) targeted sequencing data on both the parents and the maternal plasma are available; (4) targeted sequencing data on both the mother and the maternal plasma fraction are available; (5) targeted sequencing data on the maternal plasma fraction are available; (6) other combinations of parental and child fraction measurements.
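The child-fraction search over a set of reference chromosomes can be sketched in Python as a simple grid search. The caller supplies the disomy log-likelihood as a function; the grid, the function names, and the toy likelihood used in the example are all hypothetical stand-ins, not the model of the disclosure.

```python
def estimate_child_fraction(disomy_loglik, chromosomes, grid=None):
    """Return the child fraction that maximizes the summed disomy log-likelihood.

    disomy_loglik(cf, chrom) : log-likelihood of the data on one chromosome
                               under disomy with child fraction cf
    chromosomes              : reference chromosomes assumed to be euploid
    grid                     : candidate child fractions (default 0.001 .. 0.499)
    """
    if grid is None:
        grid = [k / 1000 for k in range(1, 500)]

    def total(cf):
        return sum(disomy_loglik(cf, chrom) for chrom in chromosomes)

    return max(grid, key=total)

# Toy stand-in for the real likelihood: peaked at a child fraction of 0.12
# on every chromosome. A real implementation would sum per-SNP terms.
def toy_loglik(cf, chrom):
    return -1000.0 * (cf - 0.12) ** 2

cf_star = estimate_child_fraction(toy_loglik, range(1, 17))
```

A grid search is used here only because it is easy to verify; any bounded one-dimensional optimizer would serve the same purpose.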
In some embodiments, the informatics method may incorporate data dropouts; this may result in ploidy determinations of higher accuracy. Elsewhere in this disclosure it has been assumed that the probability of obtaining an A is a direct function of the true mother genotype, the true child genotype, the fraction of the child in the mixture, and the child copy number. It is also possible for mother or child alleles to drop out; for example, instead of measuring the true child AB in the mixture, only sequences mapping to allele A may be measured. The parent dropout rate for genomic Illumina data may be denoted $d_{pg}$, the parent dropout rate for sequence data $d_{ps}$, and the child dropout rate for sequence data $d_{cs}$. In some embodiments, the mother dropout rate may be assumed to be zero and the child dropout rates are relatively low; in that case, the results are not severely affected by dropouts. In some embodiments, the possibility of allele dropouts may be large enough to have a significant effect on the predicted ploidy call. For such cases, allele dropouts are incorporated into the algorithm as follows:
Parent SNP array data dropouts: for the mother genomic data M, suppose the genotype after dropout is $m_d$; then

$$P(M \mid m, i) = \sum_{m_d} P(M \mid m_d, i) \; P(m_d \mid m)$$

where $P(M \mid m_d, i)$ is as defined previously, and $P(m_d \mid m)$ is the likelihood of genotype $m_d$ after possible dropout given the true genotype m, for dropout rate d, defined as follows
A similar equation applies to the father SNP array data.
Parent sequence data dropouts: for the mother sequence data SM,

$$P(SM \mid m, i) = \sum_{m_d} P_{X|m_d}(am_i) \; P(m_d \mid m)$$

where $P(m_d \mid m)$ is defined as in the previous section and $P_{X|m_d}(am_i)$ is the probability from a binomial distribution as defined previously in the parent data likelihood section. A similar equation applies to the paternal sequence data.
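Free-floating DNA sequence data dropouts:

$$P(S \mid m, c, H, cf, i) = \sum_{m_d, c_d} P(S \mid \mu(m_d, c_d, cf, H), i) \; P(m_d \mid m) \; P(c_d \mid c)$$

where $P(S \mid \mu(m_d, c_d, cf, H), i)$ is as defined in the section on the free-floating data likelihood.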
In one embodiment, $P(m_d \mid m)$ is the probability of the observed mother genotype $m_d$ given the true mother genotype m, assuming dropout rate $d_{ps}$, and $P(c_d \mid c)$ is the probability of the observed child genotype $c_d$ given the true child genotype c, assuming dropout rate $d_{cs}$. If $nA_T$ = the number of A alleles in the true genotype c, $nA_D$ = the number of A alleles in the observed genotype $c_d$, where $nA_T \geq nA_D$, and similarly $nB_T$ = the number of B alleles in the true genotype c, $nB_D$ = the number of B alleles in the observed genotype $c_d$, where $nB_T \geq nB_D$, and d = the dropout rate, then

$$P(c_d \mid c) = \binom{nA_T}{nA_D} d^{\,nA_T - nA_D} (1 - d)^{nA_D} \times \binom{nB_T}{nB_D} d^{\,nB_T - nB_D} (1 - d)^{nB_D}$$
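The dropout model just described can be illustrated with a short Python sketch in which each allele copy is assumed to drop out independently with rate d, so that the surviving A and B copies each follow a binomial distribution. The independence assumption and the function name `dropout_prob` are this sketch's own; an empty string stands for an observed genotype in which every copy has dropped out.

```python
from math import comb

def dropout_prob(true_gt, obs_gt, d):
    """Probability of observing obs_gt after dropout, given true genotype true_gt.

    Assumes each allele copy drops out independently with probability d.
    Genotypes are strings such as "AB", "AAB", "A", or "" (nothing observed).
    """
    nA_t, nB_t = true_gt.count("A"), true_gt.count("B")
    nA_d, nB_d = obs_gt.count("A"), obs_gt.count("B")
    if nA_d > nA_t or nB_d > nB_t:
        return 0.0  # dropout can only remove alleles, never add them
    keep_a = comb(nA_t, nA_d) * d ** (nA_t - nA_d) * (1.0 - d) ** nA_d
    keep_b = comb(nB_t, nB_d) * d ** (nB_t - nB_d) * (1.0 - d) ** nB_d
    return keep_a * keep_b

# True child genotype AB with a 10% dropout rate.
p_ab_to_a = dropout_prob("AB", "A", 0.1)   # the B copy drops out, the A copy survives
p_aa_to_a = dropout_prob("AA", "A", 0.1)   # exactly one of the two A copies drops out
```

Summing the function over every observable genotype of a given true genotype returns 1, which is a quick sanity check on any table of dropout probabilities built this way.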
In one embodiment, the informatics method may incorporate random and consistent bias. In an ideal world there is no per-SNP consistent sampling bias or random noise (beyond the binomial distribution variation) in the number of sequence counts. In particular, on SNP i, for mother genotype m, true child genotype c, and child fraction cf, with X = the number of A's in the set of (A+B) reads on SNP i, X behaves as X ~ Binomial(p, A+B), where p = μ(m, c, cf, H) = the true probability of A content.
In one embodiment, the informatics method may incorporate random bias. As is often the case, suppose there is a bias in the measurements, so that the probability of obtaining an A on this SNP equals q, which differs slightly from p as defined above. How much q differs from p depends on the accuracy of the measurement process and a number of other factors, and can be quantified by the standard deviation of q away from p. In one embodiment, q can be modeled as having a beta distribution, with parameters α and β determined by the mean of that distribution, centered at p, and some specified standard deviation s. In particular, this gives $X \mid q \sim Bin(q, D_i)$, where $q \sim Beta(\alpha, \beta)$ and $D_i$ is the number of reads on SNP i. If we let $E(q) = p$ and $V(q) = s^2$, the parameters can be derived as $\alpha = pN$, $\beta = (1 - p)N$, where $N = \frac{p(1 - p)}{s^2} - 1$.
This is the definition of a beta-binomial distribution, in which one samples from a binomial distribution with a variable parameter q, where q follows a beta distribution with mean p. So, in a setup with no bias, on SNP i, the parent sequence data (SM) probability, assuming true mother genotype (m) and given the mother sequence A count on SNP i ($am_i$) and the mother sequence B count on SNP i ($bm_i$), may be calculated as:
$P(SM \mid m, i) = P_{X|m}(am_i)$, where $X|m \sim Binom(p_m(A),\, am_i + bm_i)$
Now, including random bias with standard deviation s, this becomes:
$X|m \sim BetaBinom(p_m(A),\, am_i + bm_i,\, s)$
In a setup with no bias, the maternal plasma DNA sequence data (S) probability, assuming true mother genotype (m), true child genotype (c), child fraction (cf), and child hypothesis H, and given the free-floating DNA sequence A count on SNP i ($a_i$) and the free-floating sequence B count on SNP i ($b_i$), may be calculated as:

$$P(S \mid m, c, H, cf, i) = P_X(a_i)$$

where $X \sim Binom(p(A),\, a_i + b_i)$ with $p(A) = \mu(m, c, cf, H)$, the true probability of A content.
In one embodiment, including random bias with standard deviation s, this becomes $X \sim BetaBinom(p(A),\, a_i + b_i,\, s)$, where the amount of extra variation is specified by the deviation parameter s, or equivalently by N. The smaller the value of s (or the larger the value of N), the closer this distribution is to the regular binomial distribution. The amount of bias, that is, N above, can be estimated from the unambiguous contexts AA|AA, BB|BB, AA|BB, BB|AA, and the estimated N can be used in the probability above. Depending on the behavior of the data, N may be made a constant irrespective of the depth of read $a_i + b_i$, or a function of $a_i + b_i$, so that the bias is smaller for larger depths of read.
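The beta-binomial model with mean p and standard deviation s can be sketched in Python using only the standard library. The parameterization assumed here follows from matching the mean and variance of a beta distribution: alpha = pN and beta = (1 - p)N with N = p(1 - p)/s^2 - 1, which requires 0 < p < 1 and s^2 < p(1 - p). The function names are hypothetical.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def betabinom_pmf(k, n, p, s):
    """P(X = k) for X ~ BetaBinom with n reads, mean A-probability p,
    and standard deviation s of the underlying beta distribution.
    Requires 0 < p < 1 and s**2 < p * (1 - p)."""
    N = p * (1.0 - p) / s ** 2 - 1.0
    alpha, beta = p * N, (1.0 - p) * N
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))

# 20 reads, expected A fraction 0.3, with moderate and with very small extra noise.
noisy = [betabinom_pmf(k, 20, 0.3, 0.05) for k in range(21)]
nearly_binomial = betabinom_pmf(6, 20, 0.3, 0.001)
```

As s shrinks (equivalently, as N grows), the probabilities approach those of the plain binomial with the same p, consistent with the remark in the text.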
In one embodiment, the informatics method may incorporate consistent per-SNP bias. Because of artifacts of the sequencing process, some SNPs may have consistently lower or higher counts irrespective of the true amount of A content. Suppose SNP i consistently adds a bias of $w_i$ percent to the number of A counts. In some embodiments, this bias can be estimated from a set of training data derived under the same conditions and added back into the parent sequence data estimate as:
$P(SM \mid m, i) = P_{X|m}(am_i)$, where $X|m \sim BetaBinom(p_m(A) + w_i,\, am_i + bm_i,\, s)$
and into the free-floating DNA sequence data probability estimate as:
$P(S \mid m, c, H, cf, i) = P_X(a_i)$, where $X \sim BetaBinom(p(A) + w_i,\, a_i + b_i,\, s)$.
In some embodiments, the method may be written specifically to take into account additional noise, differential sample quality, differential SNP quality, and random sampling bias. An example is given here. This method has been shown to be particularly useful with data generated using the massively multiplexed mini-PCR protocol, and was used in Experiments 7 through 13. The method involves several steps, each of which introduces a different kind of noise and/or bias into the final model:
(1) Suppose the first sample, which comprises a mixture of maternal and fetal DNA, contains an original amount of DNA of size $N_0$ molecules, usually in the range 1,000-40,000, where p = the true % reference.
(2) In the amplification using the universal ligation adaptors, assume that $N_1$ molecules are sampled; usually $N_1 \approx N_0/2$ molecules, and random sampling bias is introduced by the sampling. The amplified sample may contain $N_2$ molecules, where $N_2 \gg N_1$. Let $X_1$ represent the amount of reference loci (on a per-SNP basis) among the $N_1$ sampled molecules; the variation in $p_1 = X_1/N_1$ introduces random sampling bias throughout the rest of the protocol. This sampling bias is included in the model by using a beta-binomial (BB) distribution instead of a simple binomial distribution. The parameter N of the beta-binomial distribution may be estimated later, on a per-sample basis, from training data after adjusting for leakage and amplification bias, on SNPs with 0 < p < 1. Leakage is the tendency for a SNP to be read incorrectly.
(3) The amplification step amplifies any allelic bias, so amplification bias is introduced through possibly uneven amplification. Suppose that one allele at a locus is amplified f times and another allele at that locus is amplified g times, where $f = g e^{b}$ and b = 0 indicates no bias. The bias parameter b is centered at 0 and indicates how much more or less the A allele is amplified than the B allele on a particular SNP. The parameter b may differ from SNP to SNP. The bias parameter b may be estimated on a per-SNP basis, for example from training data.
(4) The sequencing step involves sequencing a sample of the amplified molecules. Leakage may occur in this step, leakage being the situation in which a SNP is read incorrectly. Leakage may result from any number of problems, and may cause a SNP to be read not as the true allele A but as another allele B found at that locus, or as an allele C or D not typically found at that locus. Suppose the sequencing measures the sequence data of a number of DNA molecules from an amplified sample of size $N_3$, where $N_3 < N_2$. In some embodiments, $N_3$ may be in the range of 20,000 to 100,000; 100,000 to 500,000; 500,000 to 4,000,000; 4,000,000 to 20,000,000; or 20,000,000 to 100,000,000. Each sampled molecule has a probability $p_g$ of being read correctly, in which case it shows up correctly as allele A. With probability $1 - p_g$ the sample is read incorrectly as an allele unrelated to the original molecule, and then looks like allele A with probability $p_r$, allele B with probability $p_m$, or allele C or allele D with probability $p_o$, where $p_r + p_m + p_o = 1$. The parameters $p_g$, $p_r$, $p_m$, $p_o$ are estimated on a per-SNP basis from training data.
Different protocols may involve similar steps, with variations in the molecular biology steps resulting in different amounts of random sampling, different levels of amplification, and different leakage bias. The following model applies equally well to each of these cases. The model for the amount of DNA sampled, on a per-SNP basis, is given by:
$$X_3 \sim BetaBinomial\big( L(F(p, b),\, p_r,\, p_g),\; N \cdot H(p, b) \big)$$
where p = the true amount of reference DNA, b = the per-SNP bias, and, as described above, $p_g$ is the probability of a correct read and $p_r$ is the probability that a read is read incorrectly but serendipitously looks like the correct allele, in the case of a bad read as described above, and:
$$F(p, b) = \frac{p e^{b}}{p e^{b} + (1 - p)}, \quad H(p, b) = \frac{\big(e^{b} p + (1 - p)\big)^2}{e^{b}}, \quad L(p, p_r, p_g) = p \cdot p_g + p_r \cdot (1 - p_g)$$
In some embodiments, the method uses a beta-binomial distribution instead of a simple binomial distribution; this takes care of the random sampling bias. The parameter N of the beta-binomial distribution is estimated on a per-sample basis as needed. Using the bias corrections F(p, b) and H(p, b) instead of just p takes care of the amplification bias. The bias parameter b is estimated on a per-SNP basis from training data ahead of time.
In some embodiments, the method uses the leakage correction $L(p, p_r, p_g)$ instead of just p; this takes care of the leakage bias, that is, varying SNP and sample quality. In some embodiments, the parameters $p_g$, $p_r$, $p_o$ are estimated on a per-SNP basis from training data ahead of time. In some embodiments, the parameters $p_g$, $p_r$, $p_o$ may be updated with the current sample to account for varying sample quality.
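The three correction functions F, H, and L given above translate directly into code. The Python sketch below computes the two arguments of the beta-binomial model for one SNP; the function names and the example parameter values are hypothetical.

```python
from math import exp

def amplification_mean(p, b):
    """F(p, b): reference fraction after amplification with per-SNP bias b."""
    return p * exp(b) / (p * exp(b) + (1.0 - p))

def amplification_scale(p, b):
    """H(p, b): factor applied to the beta-binomial parameter N."""
    return (exp(b) * p + (1.0 - p)) ** 2 / exp(b)

def leakage(p, p_r, p_g):
    """L(p, p_r, p_g): fraction read as reference, allowing for misreads."""
    return p * p_g + p_r * (1.0 - p_g)

def beta_binomial_arguments(p, b, p_r, p_g, N):
    """Arguments of X_3 ~ BetaBinomial(L(F(p, b), p_r, p_g), N * H(p, b))."""
    return leakage(amplification_mean(p, b), p_r, p_g), N * amplification_scale(p, b)

# With no amplification bias (b = 0) and perfect reads (p_g = 1),
# the model reduces to the uncorrected p and N.
unbiased = beta_binomial_arguments(0.5, 0.0, 0.3, 1.0, 1000)
# A SNP whose reference allele amplifies somewhat better (b = 0.2),
# with 1% of reads misread.
biased_mean, biased_n = beta_binomial_arguments(0.5, 0.2, 0.3, 0.99, 1000)
```

Keeping the three corrections as separate functions mirrors the text, where amplification bias and leakage are estimated from training data independently of each other.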
The model described herein is quite general and can account for both differential sample quality and differential SNP quality. Different samples and SNPs are treated differently, as exemplified by the fact that some embodiments use beta-binomial distributions whose mean and variance are functions of the original amount of DNA as well as of sample and SNP quality.
Platform Modeling
Consider a single SNP where the expected allele ratio present in the plasma is r (based on the maternal and fetal genotypes). The expected allele ratio is defined as the expected fraction of A alleles in the combined maternal and fetal DNA. For maternal genotype $g_m$ and child genotype $g_c$, the expected allele ratio is given by equation 1, assuming that the genotypes are likewise represented as allele ratios.
$$r = f g_c + (1 - f)\, g_m \quad (1)$$
The observation at the SNP consists of the numbers of mapped reads with each allele present, $n_a$ and $n_b$, which sum to the depth of read d. Assume that thresholds have already been applied to the mapping probabilities and phred scores, so that the mappings and allele observations can be considered correct. A phred score is a numerical measure that relates to the probability that a particular measurement at a particular base is wrong. In one embodiment, where the base has been measured by sequencing, the phred score may be calculated from the ratio of the dye intensity corresponding to the called base to the dye intensity of the other bases. The simplest model for the observation likelihood is a binomial distribution, which assumes that each of the d reads is drawn independently from a large pool with allele ratio r. Equation 2 describes this model.
$$P(n_a, n_b \mid r) = p_{bino}(n_a;\, n_a + n_b,\, r) = \binom{n_a + n_b}{n_a} r^{n_a} (1 - r)^{n_b} \quad (2)$$
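Equations 1 and 2 can be written out as a short Python sketch. Genotypes are represented as allele ratios (0 for BB, 0.5 for AB, 1 for AA), following the convention stated in the text; the function names are hypothetical.

```python
from math import comb

def expected_allele_ratio(f, g_c, g_m):
    """Equation 1: expected A fraction in plasma for fetal fraction f,
    child genotype g_c and maternal genotype g_m (both as allele ratios)."""
    return f * g_c + (1.0 - f) * g_m

def binomial_likelihood(n_a, n_b, r):
    """Equation 2: probability of n_a A-reads and n_b B-reads at allele ratio r."""
    return comb(n_a + n_b, n_a) * r ** n_a * (1.0 - r) ** n_b

# Mother BB (0), fetus AB (0.5), fetal fraction 10%.
r = expected_allele_ratio(0.10, 0.5, 0.0)
# Likelihood of seeing 5 A reads out of 100 at that ratio.
lik = binomial_likelihood(5, 95, r)
```

When r is exactly 0 or 1, this model assigns zero probability to any read of the unexpected allele, which is the limitation the text addresses next with corrected allele ratios.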
The binomial model can be extended in a number of ways. When the maternal and fetal genotypes are either all A or all B, the expected allele ratio in plasma will be 0 or 1, and the binomial probability will not be well defined. In practice, unexpected alleles are sometimes observed. In one embodiment, a corrected allele ratio can be used to allow for a small number of unexpected alleles. In one embodiment, training data can be used to model the rate at which the unexpected allele appears on each SNP, and this model can be used to correct the expected allele ratio. When the expected allele ratio is not 0 or 1, the observed allele ratio may not converge to the expected allele ratio even at sufficiently high depth of read, because of amplification bias or other phenomena. The allele ratio can then be modeled as a beta distribution centered at the expected allele ratio, leading to a beta-binomial distribution for $P(n_a, n_b \mid r)$, which has higher variance than the binomial.
The platform model for the response at a single SNP will be defined as $F(a, b, g_c, g_m, f)$ (3), that is, the probability of observing $n_a = a$ and $n_b = b$ given the maternal and fetal genotypes, which also depends on the fetal fraction through equation 1. The functional form of F may be a binomial distribution, a beta-binomial distribution, or a similar function as discussed above.
$$F(a, b, g_c, g_m, f) = P(n_a = a, n_b = b \mid g_c, g_m, f) = P(n_a = a, n_b = b \mid r(g_c, g_m, f)) \quad (3)$$
In one embodiment, the child fraction may be determined as follows. A maximum likelihood estimate of the fetal fraction f for a prenatal test may be derived without the use of paternal information. This may be relevant where the paternal genetic data are not available, for example where the father of record is not actually the genetic father of the fetus. The fetal fraction is estimated from the set of SNPs where the maternal genotype is 0 or 1, resulting in a set of only two possible fetal genotypes. Define $S_0$ as the set of SNPs with maternal genotype 0 and $S_1$ as the set of SNPs with maternal genotype 1. The possible fetal genotypes on $S_0$ are 0 and 0.5, giving a set of possible allele ratios $R_0(f) = \{0, f/2\}$. Similarly, $R_1(f) = \{1 - f/2, 1\}$. This method can be trivially extended to include SNPs where the maternal genotype is 0.5, but these SNPs will be less informative because of the larger set of possible allele ratios.
Define $N_{a0}$ and $N_{b0}$ as the vectors formed by $n_{as}$ and $n_{bs}$ for SNPs s in $S_0$, and $N_{a1}$ and $N_{b1}$ similarly for $S_1$. The maximum likelihood estimate $\hat{f}$ of f is defined by equation 4.
$$\hat{f} = \arg\max_{f} \; P(N_{a0}, N_{b0} \mid f) \; P(N_{a1}, N_{b1} \mid f) \quad (4)$$
Assuming that the allele counts at each SNP are independent conditioned on the SNP's plasma allele ratio, the probabilities can be expressed as products over the SNPs in each set (5).
$$P(N_{a0}, N_{b0} \mid f) = \prod_{s \in S_0} P(n_{as}, n_{bs} \mid f) \quad (5)$$
$$P(N_{a1}, N_{b1} \mid f) = \prod_{s \in S_1} P(n_{as}, n_{bs} \mid f)$$
The dependence on $f$ is through the sets of possible allele ratios $R_0(f)$ and $R_1(f)$. The SNP probability $P(n_{as}, n_{bs} \mid f)$ can be approximated by assuming the maximum likelihood genotype conditioned on $f$. At reasonably high fetal fraction and read depth, the maximum likelihood genotype will be selected with high confidence. For example, at a fetal fraction of 10% and a read depth of 1000, consider a SNP at which the mother has genotype zero. The expected allele ratios are 0 and 5%, which are easily distinguished at sufficiently high read depth. Substituting the estimated child genotype into equation 5 yields the complete equation (6) for the fetal fraction estimate.
$$\hat{f} = \arg\max_{f} \left[ \prod_{s \in S_0} \max_{r_s \in R_0(f)} P(n_{as}, n_{bs} \mid r_s) \prod_{s \in S_1} \max_{r_s \in R_1(f)} P(n_{as}, n_{bs} \mid r_s) \right] \tag{6}$$
The fetal fraction must lie in the range [0, 1], so the optimization can easily be implemented by a constrained one-dimensional search.
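The constrained search over the fetal fraction can be sketched as a simple grid search. This is a minimal sketch assuming a binomial response model, allele ratios expressed as the expected fraction of A reads, and a small error rate `EPS` that keeps ratios away from exactly 0 or 1; the function names, `EPS`, and the grid resolution are illustrative assumptions rather than part of the disclosure.

```python
import math

EPS = 1e-3  # assumed per-read error rate; keeps ratios strictly inside (0, 1)


def log_lik(a, b, r):
    """Binomial log-likelihood of a A reads and b B reads at ratio r.

    The binomial coefficient is dropped because it does not depend on r.
    """
    r = min(max(r, EPS), 1.0 - EPS)
    return a * math.log(r) + b * math.log(1.0 - r)


def log_objective(f, s0, s1):
    """Log of the maximized product for a candidate fetal fraction f.

    s0 and s1 are lists of (n_a, n_b) counts at SNPs where the maternal
    genotype is 0 or 1. At each SNP the more likely of the two possible
    allele ratios is taken (maximum likelihood fetal genotype).
    """
    total = 0.0
    for a, b in s0:
        total += max(log_lik(a, b, r) for r in (0.0, f / 2.0))
    for a, b in s1:
        total += max(log_lik(a, b, r) for r in (1.0 - f / 2.0, 1.0))
    return total


def estimate_fetal_fraction(s0, s1, steps=1000):
    """Grid search for f over [0, 1] in increments of 1/steps."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda f: log_objective(f, s0, s1))
```

With idealized counts at a read depth of 1000, for example (50, 950) at heterozygous-fetus SNPs in the first set and (950, 50) in the second, the search returns a fetal fraction of 0.10 under these assumptions.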
When the read depth is low or the noise level is high, it may be preferable not to assume the maximum likelihood genotype, since that assumption can produce artificially high confidence. An alternative is to sum over the possible genotypes at each SNP, which gives the following expression (7) for $P(n_a, n_b \mid f)$ for a SNP in $S_0$. The prior probability $P(r)$ can be assumed uniform over $R_0(f)$, or can be based on population frequencies. The extension to the set $S_1$ is trivial.
$$P(n_a, n_b \mid f) = \sum_{r \in R_0(f)} P(n_a, n_b \mid r)\, P(r) \tag{7}$$
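The sum over possible genotypes for a SNP in the first set can be sketched as follows. This is a minimal sketch assuming a binomial response model and a small assumed error rate `EPS` so that a ratio of exactly 0 remains usable; the function names and the default uniform prior argument are illustrative assumptions.

```python
import math

EPS = 1e-3  # assumed error rate; a ratio of exactly 0 is replaced by EPS


def binomial_pmf(a, b, r):
    """P(n_a=a, n_b=b | r), evaluated in log space for numerical stability."""
    r = min(max(r, EPS), 1.0 - EPS)
    n = a + b
    log_p = (math.lgamma(n + 1) - math.lgamma(a + 1) - math.lgamma(b + 1)
             + a * math.log(r) + b * math.log(1.0 - r))
    return math.exp(log_p)


def snp_likelihood_s0(a, b, f, prior=(0.5, 0.5)):
    """P(n_a=a, n_b=b | f) for a SNP with maternal genotype 0.

    Sums over the two possible allele ratios {0, f/2}, weighted by the
    prior on each ratio (uniform by default).
    """
    ratios = (0.0, f / 2.0)
    return sum(p * binomial_pmf(a, b, r) for r, p in zip(ratios, prior))
```

Because both candidate ratios contribute, a SNP with a single A read out of 20 is not forced onto one genotype, which avoids the artificially high confidence that a hard genotype choice would give at low read depth.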
In certain embodiments, the probability can be derived as follows. A confidence can be calculated from the data likelihoods of the two hypotheses $H_t$ and $H_f$. The likelihood of each hypothesis is derived from the response model, the estimated fetal fraction, the maternal genotype, the allele population frequencies, and the plasma allele counts.
The following notation is defined:
Assuming that the observation at each SNP is independent conditioned on the plasma allele ratio, the likelihood of a paternity hypothesis is the product of the likelihoods over the SNPs. The following equations derive the likelihood for a single SNP. Equation 8 is a general expression for the likelihood of any hypothesis $h$, which is then broken down into the specific cases of $H_t$ and $H_f$.
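$$P(n_a, n_b \mid h, G_m, G_{tf}, f) = \sum_{G_c \in \{AA, AB, BB\}} P(n_a, n_b \mid G_m, G_c, f)\, P(G_c \mid G_m, G_{tf}, h)$$
$$= \sum_{G_c \in \{AA, AB, BB\}} F(n_a, n_b, G_c, G_m, f)\, P(G_c \mid G_m, G_{tf}, h) \tag{8}$$
In the case of $H_t$, the alleged father is assumed to be the true father, and the fetal genotypes are inherited from the maternal genotype and the alleged father genotype, according to equation 9.
$$P(n_a, n_b \mid H_t, G_m, G_{tf}, f) = \sum_{G_c \in \{AA, AB, BB\}} P(n_a, n_b \mid G_m, G_c, f)\, P(G_c \mid G_m, G_{tf}) \tag{9}$$
In the case of $H_f$, the alleged father is assumed not to be the true father. The best estimate of the true father's genotype is given by the population frequencies at each SNP. The probabilities of the child genotypes are therefore determined by the known maternal genotype and the population frequencies, as in equation 10.
$$P(n_a, n_b \mid H_f, G_m, G_{tf}, f) = \sum_{G_c \in \{AA, AB, BB\}} P(n_a, n_b \mid G_m, G_c, f)\, P(G_c \mid G_m, \text{population frequency}) \tag{10}$$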
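The confidence $C_p$ in correct paternity is calculated from the products over SNPs of the two likelihoods using Bayes' rule (11).
$$C_p = \frac{\prod_s P(n_{as}, n_{bs} \mid H_t, G_m, G_{tf}, f)}{\prod_s P(n_{as}, n_{bs} \mid H_t, G_m, G_{tf}, f) + \prod_s P(n_{as}, n_{bs} \mid H_f, G_m, G_{tf}, f)} \tag{11}$$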
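The combination of the two per-SNP likelihoods into a paternity confidence can be sketched as follows. This is a minimal sketch assuming that the per-SNP log-likelihoods under each hypothesis have already been computed and that the two hypotheses carry equal prior weight; the function name and the log-space formulation are illustrative assumptions.

```python
import math


def paternity_confidence(loglik_true, loglik_false):
    """Confidence that the alleged father is the biological father.

    loglik_true and loglik_false hold the per-SNP log-likelihoods of the
    plasma allele counts under the two hypotheses. Equal prior weight on
    the two hypotheses is an assumption of this sketch.
    """
    lt = sum(loglik_true)    # log of the product over SNPs, father is true
    lf = sum(loglik_false)   # log of the product over SNPs, father is not
    m = max(lt, lf)          # subtract the maximum to avoid underflow
    wt = math.exp(lt - m)
    wf = math.exp(lf - m)
    return wt / (wt + wf)
```

Working with sums of logarithms rather than raw products matters in practice, because the product of likelihoods over thousands of SNPs would otherwise underflow to zero.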
Maximum likelihood model using percent fetal fraction
Determining the ploidy state of a fetus by measuring the free-floating DNA contained in maternal serum, or by measuring the genotypic material in any mixed sample, is a non-trivial task. A number of methods exist, for example read count analysis, which presumes that if the fetus is trisomic for a particular chromosome, the overall amount of DNA from that chromosome found in the maternal blood will be elevated relative to a reference chromosome. One way to detect trisomy in such a fetus is to normalize the amount of DNA expected for each chromosome, for example according to the number of SNPs in the analysis set that correspond to a given chromosome, or according to the number of uniquely mappable portions of the chromosome. Once the measurements have been normalized, any chromosome for which the measured amount of DNA exceeds a certain threshold is determined to be trisomic. This approach is described in Fan et al., PNAS 2008; 105(42): 16266-16271, and in Chiu et al., BMJ 2011; 342:c7401. In the Chiu et al. paper, the normalization was accomplished by calculating a Z score as follows:
Z score for percentage chromosome 21 in test case = ((percentage chromosome 21 in test case) - (mean percentage chromosome 21 in reference controls)) / (standard deviation of percentage chromosome 21 in reference controls).
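The Z score normalization above can be sketched in a few lines. The function name and the use of the sample standard deviation are assumptions for illustration (the formula above does not say which standard deviation estimator is used), and the percentages in the usage note are hypothetical values, not data from the cited papers.

```python
import statistics


def chr21_z_score(test_pct, reference_pcts):
    """Z score of the chromosome 21 percentage in a test case.

    reference_pcts holds the chromosome 21 percentages measured in the
    reference controls. The sample standard deviation is an assumption.
    """
    mean = statistics.mean(reference_pcts)
    sd = statistics.stdev(reference_pcts)
    return (test_pct - mean) / sd
```

For hypothetical reference percentages of 1.30, 1.32, 1.28 and 1.30, a test case at 1.35 gives a Z score slightly above 3. A fixed cutoff applied to this score takes no account of how much fetal DNA the sample contains.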
These methods use a single hypothesis rejection approach to determine the fetal ploidy state. However, they have some significant shortcomings. Because these methods for determining fetal ploidy do not vary with the percentage of fetal DNA in the sample, they use a single cutoff value; as a result, the accuracy of the determinations is not optimal, and cases in which the percentage of fetal DNA in the mixture is relatively low suffer the worst accuracy.
In one embodiment, the instant method is used to determine the ploidy state of a fetus and involves taking into account the fraction of fetal DNA in the sample. In another embodiment of the invention, the method involves the use of maximum likelihood estimation. In one embodiment, the instant method involves calculating the percentage of DNA in the sample that is of fetal or placental origin. In one embodiment, the threshold for calling aneuploidy is adjusted adaptively based on the calculated percentage of fetal DNA. In certain embodiments, the method for estimating the percentage of DNA of fetal origin in a mixture of DNA comprises: obtaining a mixed sample that comprises genetic material from the mother and genetic material from the fetus; obtaining genetic material from the father of the fetus; measuring the DNA in the mixed sample; measuring the DNA in the father sample; and calculating the percentage of DNA of fetal origin in the mixed sample using the DNA measurements of the mixed sample and of the father sample.
In one embodiment of the invention, the fraction of fetal DNA, or the percentage of fetal DNA in the mixture, can be measured. In certain embodiments, the fraction can be calculated using only the genotyping measurements made on the maternal plasma sample itself, which is a mixture of fetal and maternal DNA. In certain embodiments, the fraction can also be calculated using the measured or otherwise known maternal genotype and/or the measured or otherwise known paternal genotype. In certain embodiments, the percentage of fetal DNA can be calculated using measurements made on the mixture of maternal and fetal DNA together with knowledge of the parental contexts. In one embodiment, the fraction of fetal DNA can be calculated using population frequencies to adjust the model for the probability of particular allele measurements.
In one embodiment of the invention, a confidence can be calculated on the accuracy of the determination of the fetal ploidy state. In one embodiment, the confidence of the hypothesis with the greatest likelihood ($H_{\text{major}}$) can be calculated as $(1 - H_{\text{major}}) / \sum (\text{all } H)$. The confidence of a hypothesis can be determined if the distributions of all of the hypotheses are known, and the distributions of all of the hypotheses can be determined if the parental genotype information is known. A confidence in the ploidy determination can be calculated if the expected distribution of data for a euploid fetus and the expected distribution of data for an aneuploid fetus are known, and these expected distributions can be calculated if the parental genotype data are known. In one embodiment, knowledge of the distribution of a test statistic around the normal hypothesis and around the abnormal hypothesis can be used both to determine the reliability of the call and to refine the threshold so as to make a more reliable call. This is particularly useful when the amount and/or percentage of fetal DNA in the mixture is low. It helps to avoid the situation in which a fetus that is actually aneuploid is found to be euploid because a test statistic, such as the Z statistic, does not exceed a threshold that was optimized for cases with a higher percentage of fetal DNA.
In one embodiment, the method disclosed herein can be used to determine fetal aneuploidy by determining the number of copies of maternal and fetal target chromosomes in a mixture of maternal and fetal genetic material. The method may entail obtaining maternal tissue comprising both maternal and fetal genetic material; in certain embodiments this maternal tissue may be maternal plasma or a tissue isolated from maternal blood. The method may also entail obtaining a mixture of maternal and fetal genetic material from the maternal tissue by processing that tissue. The method may entail distributing the genetic material obtained into a plurality of reaction samples, so as to randomly provide individual reaction samples that comprise a target sequence from a target chromosome and individual reaction samples that do not, for example by performing high-throughput sequencing on the sample. The method may entail analyzing the target sequences of genetic material present or absent in the individual reaction samples to provide a first number of binary results representing the presence or absence of a presumably euploid fetal chromosome in the reaction samples, and a second number of binary results representing the presence or absence of a possibly aneuploid fetal chromosome in the reaction samples. Either number of binary results may be calculated, for example, by an informatics technique that counts the sequence reads that map to a particular chromosome, a particular region of a chromosome, a particular locus, or a set of loci. The method may involve normalizing the number of binary events based on the chromosome length, the length of the chromosomal region, or the number of loci in the set. The method may entail calculating, using the first number, an expected distribution of the number of binary results for a presumably euploid fetal chromosome in the reaction samples. The method may entail calculating an expected distribution of the number of binary results for a presumably aneuploid fetal chromosome in the reaction samples using the first number and an estimated fraction of fetal DNA found in the mixture, for example by multiplying the expected read count distribution of the number of binary results for a presumably euploid fetal chromosome by (1 + n/2), where n is the estimated fetal fraction. In certain embodiments, the sequence reads may be treated as probabilistic mappings rather than binary results; this would yield higher accuracy but require more computing power. The fetal fraction may be estimated by a number of methods, some of which are described elsewhere in this disclosure. The method may involve using a maximum likelihood approach to determine whether the second number corresponds to the possibly aneuploid fetal chromosome being euploid or being aneuploid. The method may involve calling the ploidy state of the fetus to be the ploidy state corresponding to the hypothesis with the maximum likelihood of being correct given the measured data.
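The maximum likelihood comparison of the euploid and trisomy hypotheses can be sketched as follows. This is a minimal sketch that assumes normalized read counts follow a Poisson distribution (the disclosure does not fix the form of the expected distribution) and that the two hypotheses carry equal prior weight; the function names are illustrative assumptions.

```python
import math


def poisson_log_pmf(k, mu):
    """Log of the Poisson probability of k reads given expectation mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def call_ploidy(observed_reads, expected_euploid_reads, fetal_fraction):
    """Pick the more likely of the euploid and trisomy hypotheses.

    The expected count under trisomy is the euploid expectation scaled
    by (1 + n/2), where n is the estimated fetal fraction.
    """
    mu_euploid = expected_euploid_reads
    mu_trisomy = expected_euploid_reads * (1.0 + fetal_fraction / 2.0)
    log_liks = {
        "euploid": poisson_log_pmf(observed_reads, mu_euploid),
        "trisomy": poisson_log_pmf(observed_reads, mu_trisomy),
    }
    best = max(log_liks, key=log_liks.get)
    # Normalize in log space to get a confidence for the chosen hypothesis.
    m = max(log_liks.values())
    weights = {h: math.exp(v - m) for h, v in log_liks.items()}
    confidence = weights[best] / sum(weights.values())
    return best, confidence
```

Because the trisomy expectation moves with the fetal fraction, the effective decision boundary adapts per sample instead of relying on one cutoff; at a low fetal fraction and low read count the returned confidence stays close to 0.5, signalling that more reads are needed.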
It should be noted that a maximum likelihood model can be used to increase the accuracy of any method that determines the ploidy state of a fetus. Similarly, a confidence can be calculated for any method that determines the ploidy state of a fetus. The use of a maximum likelihood model would improve the accuracy of any method in which the ploidy determination is made using a single hypothesis rejection technique. A maximum likelihood model can be used with any method for which a likelihood distribution can be calculated for both the normal and the abnormal case. The use of a maximum likelihood model implies the ability to calculate a confidence for a ploidy call.
Further discussion of the method
In one embodiment, the method disclosed herein uses a quantitative measure of the number of independent observations of each allele at a polymorphic locus, and does not involve calculating the ratio of the alleles. This differs from methods such as some microarray-based methods, which provide information about the ratio of two alleles at a locus but do not quantify the number of independent observations of either allele. Some methods known in the art can provide quantitative information about the number of independent observations, but the calculations that lead to the ploidy determination use only the allele ratios and not the quantitative information. To illustrate the importance of retaining information about the number of independent observations, consider a sample locus with two alleles, A and B. In a first experiment, 20 A alleles and 20 B alleles are observed; in a second experiment, 200 A alleles and 200 B alleles are observed. In both experiments the ratio A/(A+B) equals 0.5, but the second experiment conveys more information than the first about the certainty of the frequency of the A or B allele. The instant method does not use the allele ratios; instead, it uses the quantitative data to model more accurately the most likely allele frequencies at each polymorphic locus.
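The difference between the two experiments can be made concrete with the binomial standard error of the observed ratio. This is a minimal sketch; the function name is an illustrative assumption, and the binomial standard error is used here only to quantify the example, not as the disclosed model.

```python
import math


def ratio_standard_error(n_a, n_b):
    """Binomial standard error of the observed ratio A/(A+B)."""
    n = n_a + n_b
    r = n_a / n
    return math.sqrt(r * (1.0 - r) / n)


# Same ratio of 0.5 in both experiments, very different certainty.
print(ratio_standard_error(20, 20))    # about 0.079
print(ratio_standard_error(200, 200))  # 0.025
```

Both experiments report the same ratio, yet the uncertainty on the underlying allele frequency is roughly three times smaller in the second, which is the information lost when only the ratio is retained.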
In one embodiment, the instant method builds a genetic model for aggregating the measurements from multiple polymorphic loci, in order to better distinguish trisomy from disomy and to determine the type of trisomy. In addition, the instant method incorporates genetic linkage information to enhance its accuracy. This is in contrast to some methods known in the art in which allele ratios are averaged across all polymorphic loci on a chromosome. The method disclosed herein explicitly models the allele frequency distributions expected in disomy and in trisomy resulting from nondisjunction during meiosis I, nondisjunction during meiosis II, and nondisjunction during mitosis early in fetal development. To illustrate why this is important: in the absence of crossovers, nondisjunction during meiosis I would result in a trisomy in which two different homologs are inherited from one parent, whereas nondisjunction during meiosis II or during mitosis early in fetal development would result in two copies of the same homolog from one parent. Each scenario produces different expected allele frequencies at each polymorphic locus and also at all physically linked loci (i.e., loci on the same chromosome) considered jointly. Crossovers, which exchange genetic material between homologs, make the inheritance pattern more complex; the instant method accommodates this by using genetic linkage information, i.e., recombination rate information and the physical distance between loci. To better distinguish meiosis I nondisjunction from meiosis II or mitotic nondisjunction, the instant method incorporates into the model a crossover probability that increases with distance from the centromere. Meiosis II and mitotic nondisjunction can be distinguished by the fact that mitotic nondisjunction typically results in identical or nearly identical copies of one homolog, whereas the two homologs present following a meiosis II nondisjunction event often differ because of one or more crossovers during gametogenesis.
In one embodiment, the instant method may not determine the haplotypes of the parents if disomy is assumed. In one embodiment, in the case of trisomy, the instant method can make a determination about the haplotypes of one or both parents by using the fact that the plasma contains two copies from one parent, and the parental phase information can be determined by noting which two copies have been inherited from the parent in question. Specifically, a child can inherit either two identical copies from the parent (matched trisomy) or both of the parent's copies (unmatched trisomy). At each SNP, the likelihood of the matched trisomy and of the unmatched trisomy can be calculated. A ploidy calling method that does not use a linkage model accounting for crossovers would calculate the overall likelihood of trisomy as a simple weighted average of the matched and unmatched trisomies over the whole chromosome. However, because of the biological mechanisms that give rise to segregation errors and crossovers, the trisomy can change from matched to unmatched (and vice versa) along a chromosome only where a crossover occurs. The instant method takes the likelihood of crossovers into account probabilistically, resulting in ploidy calls of greater accuracy than methods that do not.
In one embodiment, a reference chromosome is used to determine the child fraction and the noise level, either as point values or as a probability distribution. In one embodiment, the child fraction, noise level, and/or probability distribution are determined using only the genetic information available from the chromosome whose ploidy state is being determined. The instant method works without a reference chromosome and without fixing a particular child fraction or noise level. This is a significant improvement over, and a point of differentiation from, methods known in the art in which genetic data from a reference chromosome are required to calibrate the child fraction and chromosome behavior.
In an embodiment where no reference chromosome is needed to determine the fetal fraction, the hypothesis is determined as follows:
$$H^{*} = \arg\max_{H} \; \mathrm{LIK}(D \mid H) \cdot \mathrm{priorprob}(H)$$
In the algorithm with a reference chromosome, the reference chromosome is typically assumed to be disomic, and one can then either (a) fix the most likely child fraction and random noise level $N$ based on this assumption and the reference chromosome data:
$$[\mathit{cfr}^{*}, N^{*}] = \arg\max_{\mathit{cfr}, N} \; \mathrm{LIK}(D_{\text{ref}} \mid \text{disomy}, \mathit{cfr}, N)$$
and then reduce
$$\mathrm{LIK}(D \mid H) = \mathrm{LIK}(D \mid H, \mathit{cfr}^{*}, N^{*})$$
or (b) estimate the distribution of the child fraction and noise level based on this assumption and the reference chromosome data. Specifically, rather than fixing a single value for $\mathit{cfr}$ and $N$, a probability $p(\mathit{cfr}, N)$ is assigned over the wider range of possible $\mathit{cfr}$, $N$ values:
$$p(\mathit{cfr}, N) \propto \mathrm{LIK}(D_{\text{ref}} \mid \text{disomy}, \mathit{cfr}, N) \cdot \mathrm{priorprob}(\mathit{cfr}, N)$$
where $\mathrm{priorprob}(\mathit{cfr}, N)$ is the prior probability of a particular child fraction and noise level, determined by prior knowledge and experiment. If desired, it can simply be uniform over the range of $\mathit{cfr}$, $N$. One can then write:
$$\mathrm{LIK}(D \mid H) = \sum_{\mathit{cfr}, N} \mathrm{LIK}(D \mid H, \mathit{cfr}, N) \cdot p(\mathit{cfr}, N)$$
Both of the above approaches give good results.
It should be noted that in some cases using a reference chromosome is undesirable, impossible, or infeasible. In such a case, the best ploidy call can be derived for each chromosome individually. Specifically:
$$\mathrm{LIK}(D \mid H) = \sum_{\mathit{cfr}, N} \mathrm{LIK}(D \mid H, \mathit{cfr}, N) \cdot p(\mathit{cfr}, N \mid H)$$
$p(\mathit{cfr}, N \mid H)$ can be determined as above for each chromosome individually, assuming hypothesis $H$, rather than only for the reference chromosome assuming disomy. With this approach it is possible, for each chromosome and each hypothesis, to keep both the noise and child fraction parameters fixed, to fix either one of them, or to keep both in probabilistic form.
Measurements of DNA are noisy and/or error prone, especially when the amount of DNA is small or when the DNA is mixed with contaminating DNA. This noise reduces the accuracy of the genotypic data and of the ploidy calls. In certain embodiments, platform modeling or some other method of noise modeling may be used to counter the deleterious effects of noise on the ploidy determination. The instant method uses a joint model of both channels, which accounts for the random noise due to the amount of input DNA, the DNA quality, and/or the protocol quality.
This is in contrast to some methods known in the art in which the ploidy determination is made using the ratio of allele intensities at a locus. Such an approach precludes accurate SNP noise modeling. Specifically, errors in the measurements typically do not depend specifically on the measured channel intensity ratio, yet a ratio-based approach reduces the model to one-dimensional information. Accurate modeling of noise, channel quality, and channel interaction requires a two-dimensional joint model, which cannot be built from allele ratios.
Specifically, projecting the two-channel information onto the ratio $r$ (where $f(x, y)$ is $r = x/y$) does not lend itself to accurate modeling of channel noise and bias. The noise at a particular SNP is not a function of the ratio, i.e., $\text{noise}(x, y) \neq f(x, y)$, but is in fact a joint function of both channels. For example, in the binomial model, the noise of the measured ratio has variance $r(1 - r)/(x + y)$, which is not purely a function of $r$. In a model that includes channel bias or noise, suppose that at SNP $i$ the observed value of channel X is $x = a_i X + b_i$, where $X$ is the true channel value and $b_i$ is the extra channel bias and random noise. Similarly, suppose that $y = c_i Y + d_i$. The observed ratio $r = x/y$ cannot accurately predict the true ratio $X/Y$ or model the residual noise, because $(a_i X + b_i)/(c_i Y + d_i)$ is not a function of $X/Y$.
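The point that the observed ratio is not a function of the true ratio can be checked numerically. This is a minimal sketch; the function names and the default gain and bias values are hypothetical and chosen only to make the effect visible.

```python
def observed_ratio(X, Y, a=1.0, b=5.0, c=1.0, d=5.0):
    """Observed x/y when x = a*X + b and y = c*Y + d.

    The gains a, c and biases b, d are hypothetical per-SNP values.
    """
    return (a * X + b) / (c * Y + d)


def ratio_noise_variance(r, x, y):
    """Binomial variance of a measured ratio r at channel counts x and y."""
    return r * (1.0 - r) / (x + y)


# Two SNPs with the same true ratio X/Y = 2 but different signal levels.
low_signal = observed_ratio(20, 10)        # 25 / 15
high_signal = observed_ratio(2000, 1000)   # 2005 / 1005
```

Both SNPs have a true ratio of 2, yet the observed ratios differ (about 1.67 versus about 2.00) because the additive bias weighs more heavily at low signal. Likewise, the variance of a measured ratio of 0.5 is ten times larger at counts of 20 and 20 than at 200 and 200, so the ratio alone cannot carry the noise model.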
The method disclosed herein describes an effective way to model noise and bias using joint binomial distributions over all of the individually measured channels. The relevant equations can be found elsewhere in this document, in the sections that discuss per-SNP consistent bias, P(good), P(ref | bad), and P(mut | bad), which effectively adjust for SNP behavior. In one embodiment, the instant method uses a beta-binomial distribution, which avoids the limiting practice of relying on allele ratios alone and instead models the behavior based on both channel counts.
In one embodiment, the method disclosed herein can call the ploidy of a gestating fetus from the genetic data found in maternal plasma by using all available measurements. In one embodiment, the method disclosed herein can call the ploidy of a gestating fetus from the genetic data found in maternal plasma by using the measurements from only a subset of parental contexts. Some methods known in the art use only measured genetic data where the parental context is AA|BB, that is, where both parents are homozygous at a given locus but for different alleles. One problem with this approach is that only a small proportion of polymorphic loci are from the AA|BB context, typically less than 10%. In one embodiment of the method disclosed herein, the method does not use genetic measurements of the maternal plasma made at loci where the parental context is AA|BB. In one embodiment, the instant method uses plasma measurements only for those polymorphic loci with the AA|AB, AB|AA, and AB|AB parental contexts.
Some methods known in the art involve averaging allele ratios from SNPs in the AA|BB context, where both parental genotypes are present, and claim to determine the ploidy call from the average allele ratio on these SNPs. This approach suffers from significant inaccuracy owing to differential SNP behavior. It should be noted that this approach assumes that both parental genotypes are known. By contrast, in certain embodiments, the instant method uses a joint channel distribution model that does not assume the presence of either parent and does not assume uniform SNP behavior. In certain embodiments, the instant method accounts for different SNP behavior and weighting. In certain embodiments, the instant method does not require knowledge of one or both parental genotypes. An example of how the instant method may accomplish this follows:
In certain embodiments, the log-likelihood of a hypothesis can be determined on a per-SNP basis. At a particular SNP $i$, assuming fetal ploidy hypothesis $H$ and percent fetal DNA $cf$, the log-likelihood of the observed data $D$ is defined as:
$$\mathrm{LIK}(D \mid H, i) = \log P(D \mid H, cf, i) = \log \left( \sum_{m, f, c} P(D \mid m, f, c, H, cf, i)\, P(c \mid m, f, H)\, P(m \mid i)\, P(f \mid i) \right)$$
where $m$ ranges over the possible true maternal genotypes, $f$ ranges over the possible true paternal genotypes, $m, f \in \{AA, AB, BB\}$, and $c$ ranges over the possible child genotypes given the hypothesis $H$. Specifically, for monosomy $c \in \{A, B\}$; for disomy $c \in \{AA, AB, BB\}$; for trisomy $c \in \{AAA, AAB, ABB, BBB\}$. It should be noted that including parental genotype data typically results in more accurate ploidy determinations, but parental genotype data are not necessary for the instant method to work well.
Some methods known in the art involve averaging allele ratios from SNPs where the mother is homozygous but a different allele is measured in the plasma (the AA|AB or AA|BB contexts), and claim to determine the ploidy call from the average allele ratio on these SNPs. This approach is intended for cases where the paternal genotype is not available. It should be noted that it is questionable how accurately one can claim that the plasma is heterozygous at a particular SNP in the absence of a homozygous and opposite father (BB): at low child fraction, what looks like the presence of a B allele may simply be noise, and what looks like the absence of B may be simple allele dropout in the fetal measurements. Even where the heterozygosity of the plasma can actually be determined, this approach will not be able to distinguish paternal trisomies. Specifically, for SNPs where the mother is AA and some B is measured in the plasma, if the father is GG, the resulting child genotype is AGG, giving an average ratio of 33% A (for a child fraction of 100%). But where the father is AG, the resulting child genotype may be AGG for a matched trisomy, contributing a ratio of 33% A, or AAG for an unmatched trisomy, pulling the average ratio toward 66% A. Given that many trisomies occur on chromosomes with crossovers, the chromosome as a whole can have anywhere between no unmatched trisomy and all unmatched trisomy, and this ratio can vary anywhere between 33% and 66%. For a plain disomy, the ratio should be around 50%. Without a linkage model or an accurate error model of the average, this approach would miss many cases of paternal trisomy. By contrast, the method disclosed herein assigns parental genotype probabilities to each candidate parental genotype, based on the available genotype information and population frequencies, and does not explicitly require the parental genotypes. In addition, the method disclosed herein can detect trisomy whether or not parental genotype data are present, and can compensate by identifying the possible points of crossover from matched to unmatched trisomy using a linkage model.
Some methods known in the art claim a method for averaging allele ratios from SNPs where neither the maternal nor the paternal genotype is known, and for determining the ploidy call from the average ratio on these SNPs. However, a method for accomplishing these ends is not disclosed. The method disclosed herein can make accurate ploidy calls in this situation, and its reduction to practice is disclosed elsewhere in this document, using a joint probability maximum likelihood method and optionally using SNP noise and bias models as well as a linkage model.
Some methods known in the art involve averaging allele ratios and claim to determine the ploidy call from the average allele ratio at one or a few SNPs. However, such methods do not make use of the concept of linkage. The methods disclosed herein do not suffer from these drawbacks.
Using sequence length as a prior to determine the origin of DNA
It has been reported that the distributions of sequence length differ for maternal and fetal DNA, with fetal DNA generally being shorter. In one embodiment of the invention, prior knowledge in the form of empirical data can be used to construct prior distributions for the expected length of both maternal DNA, P(X | maternal), and fetal DNA, P(X | fetal). Given a new, unidentified DNA sequence of length x, it is possible to assign a probability that the sequence is maternal or fetal DNA, based on the prior likelihood of x given maternal or fetal origin. Specifically, if P(x | maternal) > P(x | fetal), the DNA sequence can be classified as maternal, with probability P(maternal | x) = P(x | maternal) / [P(x | maternal) + P(x | fetal)]; and if P(x | maternal) < P(x | fetal), the DNA sequence can be classified as fetal, with probability P(fetal | x) = P(x | fetal) / [P(x | maternal) + P(x | fetal)]. In one embodiment of the invention, distributions of maternal and fetal sequence lengths that are specific to a sample can be determined by considering the sequences that can be assigned as maternal or fetal with high probability, and those sample-specific distributions can then be used as the expected size distributions for that sample.
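The length-based classification can be sketched as follows. This is a minimal sketch that assumes normally distributed fragment lengths with hypothetical means and standard deviations; in practice the length distributions would be built from empirical data as described above. The function names and the constants `MATERNAL` and `FETAL` are illustrative assumptions.

```python
import math

# Hypothetical (mean, standard deviation) of fragment length in base pairs.
# Real priors would be constructed from empirical data.
MATERNAL = (166.0, 20.0)
FETAL = (143.0, 20.0)


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def classify_fragment(x):
    """Return (label, probability) for a DNA sequence of length x.

    The probability is the likelihood of the chosen origin divided by
    the sum of the two likelihoods, as in the text above.
    """
    p_m = normal_pdf(x, *MATERNAL)
    p_f = normal_pdf(x, *FETAL)
    if p_m > p_f:
        return "maternal", p_m / (p_m + p_f)
    return "fetal", p_f / (p_m + p_f)
```

Fragments assigned with high probability by such a rule could then be used to re-estimate sample-specific length distributions, in the manner the paragraph above describes.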
Variable read depth to minimize sequencing cost
In many clinical trials of diagnostics, for example in Chiu et al., BMJ 2011; 342:c7401, a protocol with a number of parameters is set, and the same protocol is then executed with the same parameters for every patient in the trial. When sequencing is used as the method of measuring genetic material to determine the ploidy state of a fetus gestating in a mother, one pertinent parameter is the number of reads. The number of reads may refer to the number of actual reads, the number of intended reads, fractional lanes, full lanes, or full flow cells on a sequencer. In these studies, the number of reads is typically set at a level that ensures that all or nearly all of the samples achieve the desired level of accuracy. Sequencing is currently an expensive technology, costing roughly 200 dollars per 5 million mappable reads, and although the price is dropping, any method that allows a sequencing-based diagnostic to operate at a similar level of accuracy with fewer reads will necessarily save a considerable amount of money.
The accuracy of a ploidy determination typically depends on a number of factors, including the number of reads and the fraction of fetal DNA in the mixture. Accuracy is typically higher when the fraction of fetal DNA in the mixture is higher, and it is also typically higher when the number of reads is greater. It is possible for the ploidy state to be determined with comparable accuracy in two cases, where the first case has a lower fraction of fetal DNA in the mixture than the second and more reads are sequenced in the first case than in the second. The estimated fraction of fetal DNA in the mixture can be used as a guide in determining the number of reads needed to achieve a given level of accuracy.
In one embodiment of the invention, a set of samples can be run in which different samples in the set are sequenced to different read depths, the number of reads run on each sample being chosen to achieve a given level of accuracy in view of the calculated fraction of fetal DNA in each mixture. In one embodiment of the invention, this may entail making a measurement of the mixed sample to determine the fraction of fetal DNA in the mixture; this estimation of the fetal fraction may be done by sequencing, by TAQMAN, by qPCR, by SNP arrays, or by any method that can distinguish different alleles at a given locus. The need for a fetal fraction estimate can be eliminated by including hypotheses that cover all, or a selected set of, fetal fractions in the set of hypotheses considered when comparing with the actual measured data. After the fraction of fetal DNA in the mixture has been determined, the number of sequences to be read for each sample can be determined.
In one embodiment of the invention, 100 pregnant women visit their respective OBs, and their blood is drawn into blood tubes containing an anti-lysant and/or a reagent to inactivate DNase. Each takes home a kit for the father of her gestating fetus, who provides a saliva sample. Both sets of genetic material for all 100 couples are sent back to the laboratory, where the maternal blood is spun down and the buffy coat and the plasma are isolated. The plasma comprises a mixture of maternal DNA and placentally derived DNA. The maternal buffy coat and the paternal blood are genotyped using a SNP array, and the DNA in the maternal plasma samples is targeted with SURESELECT hybridization probes. The DNA pulled down with the probes is used to generate 100 tagged libraries, one for each of the maternal samples, with each sample tagged with a different tag. A fraction is withdrawn from each library, and the fractions are mixed together and added in multiplexed fashion to two lanes of an ILLUMINA HISEQ DNA sequencer, where each lane yields approximately 50 million mappable reads, giving approximately 100 million mappable reads across the 100 multiplexed mixtures, or approximately 1 million reads per sample. The sequence reads are used to determine the fraction of fetal DNA in each mixture. Fifty of the samples have more than 15% fetal DNA in the mixture, and the 1 million reads are sufficient to determine the ploidy state of those fetuses with 99.9% confidence.
Of the remaining mixtures, 25 contain between 10% and 15% fetal DNA; a fraction of each of the relevant libraries prepared from these mixtures is multiplexed and run on one lane of the HISEQ, generating an additional 2 million reads for each sample. The two sets of sequence data for each of the mixtures with between 10% and 15% fetal DNA are added together, and the resulting 3 million reads per sample are sufficient to determine the ploidy state of those fetuses with 99.9% confidence.
Of the remaining mixtures, 13 contain between 6% and 10% fetal DNA; a fraction of each of the relevant libraries prepared from these mixtures is multiplexed and run on one lane of the HISEQ, generating an additional 4 million reads for each sample. The two sets of sequence data for each of the mixtures with between 6% and 10% fetal DNA are added together, and the resulting 5 million total reads per mixture are sufficient to determine the ploidy state of those fetuses with 99.9% confidence.
Of the remaining mixtures, 8 contain between 4% and 6% fetal DNA; a fraction of each of the relevant libraries prepared from these mixtures is multiplexed and run on one lane of the HISEQ, generating an additional 6 million reads for each sample. The two sets of sequence data for each of the mixtures with between 4% and 6% fetal DNA are added together, and the resulting 7 million total reads per mixture are sufficient to determine the ploidy state of those fetuses with 99.9% confidence.
The remaining four mixtures all contain between 2% and 4% fetal DNA; a fraction of each of the relevant libraries prepared from these mixtures is multiplexed and run on one lane of the HISEQ, generating an additional 12 million reads for each sample. The two sets of sequence data for each of the mixtures with between 2% and 4% fetal DNA are added together, and the resulting 13 million total reads per mixture are sufficient to determine the ploidy state of those fetuses with 99.9% confidence.
This method requires six lanes of sequencing on a HISEQ machine to achieve 99.9% accuracy across the 100 samples. If the same number of runs had been required for every sample to ensure that every ploidy determination was made with 99.9% accuracy, 25 lanes of sequencing would have been needed; and if a no-call rate or error rate of 4% were tolerated, 14 lanes of sequencing would have sufficed.
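The lane arithmetic of this example can be checked with a short sketch. The tier sizes, reads per sample, and the figure of roughly 50 million mappable reads per lane are taken from the example above; the function names and the simple "total reads divided by lane capacity" model are illustrative assumptions, not part of the disclosed method.

```python
import math

READS_PER_LANE = 50_000_000   # approximate mappable reads per HISEQ lane (from the example)
INITIAL_READS = 1_000_000     # reads per sample from the first two-lane run

# (fetal-fraction tier, number of samples, additional reads per sample)
TIERS = [
    ("over 15%", 50, 0),
    ("10-15%", 25, 2_000_000),
    ("6-10%", 13, 4_000_000),
    ("4-6%", 8, 6_000_000),
    ("2-4%", 4, 12_000_000),
]

def lanes_needed(total_reads, reads_per_lane=READS_PER_LANE):
    """Whole lanes required to produce total_reads (assumed model: simple division)."""
    return math.ceil(total_reads / reads_per_lane)

def tiered_total_reads(tiers=TIERS, initial=INITIAL_READS):
    """Total reads when each tier is topped up only as far as it needs."""
    return sum(n * (initial + extra) for _, n, extra in tiers)

n_samples = sum(n for _, n, _ in TIERS)
tiered_lanes = lanes_needed(tiered_total_reads())

# One reading of the 14-lane figure: give every sample 7 million reads, which
# is enough for every tier except the four samples with 2-4% fetal DNA
# (4% of the 100 samples), which would then be no-calls.
no_call_lanes = lanes_needed(n_samples * 7_000_000)

print(n_samples, tiered_lanes, no_call_lanes)
```

Under these assumptions the tiered design totals 298 million reads, which fits in six lanes, and the uniform 7-million-read design totals 700 million reads, or 14 lanes.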
Using raw genotyping data
There are a number of methods that can accomplish NPD using fetal genetic information measured on the fetal DNA found in maternal blood. Some of these methods involve measuring the fetal DNA using SNP arrays, some involve untargeted sequencing, and some involve targeted sequencing. The targeted sequencing may target SNPs, STRs, other polymorphic loci, non-polymorphic loci, or some combination thereof. Some of these methods may involve a commercial or proprietary allele caller that calls the identity of the alleles from the intensity data produced by the sensors of the measuring machine. For example, the ILLUMINA INFINIUM system and the AFFYMETRIX GENECHIP microarray system involve beads or microchips with attached DNA sequences that can hybridize to complementary segments of DNA; upon hybridization, there is a detectable change in the fluorescent properties of the sensor molecule. There are also sequencing methods, for example the ILLUMINA SOLEXA GENOME SEQUENCER or the ABI SOLID GENOME SEQUENCER, in which the genetic sequence of DNA fragments is determined; upon extension of the DNA strand complementary to the strand being sequenced, the identity of the extended nucleotide is typically detected via a fluorescent or radioactive tag attached to the complementary nucleotide. In all of these methods, the genotypic or sequencing data are typically determined on the basis of fluorescent or other signals, or the lack thereof. These systems are typically combined with low-level software packages that make specific allele calls (secondary genetic data) from the analog output of the fluorescent or other detection device (raw, or primary, genetic data). For example, in the case of a given allele on a SNP array, the software will make a call that a certain SNP is present or absent depending on whether the measured fluorescence intensity is above or below a certain threshold. Similarly, the output of a sequencer is a chromatogram that indicates the level of fluorescence detected for each of the dyes, and the software will make a call that a certain base pair is A, T, C, or G. High-throughput sequencers typically make a series of such measurements, called a read, that represents the most likely structure of the DNA sequence that was sequenced. The direct analog output of the chromatogram is defined here to be the raw genetic data, and the base pair and SNP calls made by the software are considered here to be the secondary genetic data. In one embodiment, raw data refers to the raw intensity data that is the unprocessed output of a genotyping platform, where the genotyping platform may be a SNP array or a sequencing platform. Secondary genetic data refers to the processed genetic data, in which an allele call has been made, or the sequence data have been assigned base pairs, and/or the sequence reads have been mapped to the genome.
Many higher-level applications make use of these allele calls, SNP calls, and sequence reads, that is, the secondary genetic data that the genotyping software produces. For example, DNANEXUS, ELAND, or MAQ will take the sequencing reads and map them to the genome. For example, in the context of non-invasive prenatal diagnosis, complex informatics such as PARENTAL SUPPORT™ may use a large number of SNP calls to determine the genotype of an individual. In addition, in the context of preimplantation genetic diagnosis, it is possible to take a set of sequence reads that are mapped to the genome and, by taking a normalized count of the reads mapped to each chromosome or chromosome segment, to determine the ploidy state of an individual. In the context of non-invasive prenatal diagnosis, it may be possible to take a set of sequence reads measured on the DNA present in maternal plasma and map them to the genome. One may then take a normalized count of the reads mapped to each chromosome or chromosome segment and use those data to determine the ploidy state of an individual. For example, it may be possible to conclude that those chromosomes with a disproportionately large number of reads are trisomic in the fetus gestating in the mother from whom the blood was drawn.
In reality, however, the initial output of the measuring instruments is an analog signal. When the software associated with the sequencing instrument calls a certain base pair, for example when it calls the base pair a T, the call is in reality the one that the software considers most likely. In some cases, however, the call may be of low confidence; for example, the analog signal may indicate that the particular base pair is only 90% likely to be a T and 10% likely to be an A. In another example, the genotype-calling software associated with a SNP array reader may call a certain allele to be G. In reality, however, the underlying analog signal may indicate that the allele is only 70% likely to be G and 30% likely to be T. In these cases, when higher-level applications use the genotype calls and sequence calls made by the lower-level software, they lose some information. That is, the raw genetic data, as measured directly by the genotyping platform, may be messier than the secondary genetic data determined by the attached software packages, but it contains more information. In mapping secondary genetic data sequences to the genome, many reads are thrown out because some bases are not read with sufficient clarity and/or the mapping is not clear. When the raw genetic data sequence reads are used, all or many of the reads that would have been thrown out upon conversion to secondary genetic data sequence reads can be used by treating the reads in a probabilistic manner.
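The information lost by hard calling can be illustrated with a minimal sketch. It assumes, purely for illustration, that each observation arrives as a dictionary of per-base probabilities; the 90/10 and 70/30 splits echo the examples above, while the function names, the data layout, and the 0.95 confidence cutoff are hypothetical.

```python
def hard_call_counts(observations, min_conf=0.95):
    """Count only confident calls, discarding low-confidence observations
    (a stand-in for how secondary genetic data might be produced)."""
    counts = {}
    for probs in observations:
        base, p = max(probs.items(), key=lambda kv: kv[1])
        if p >= min_conf:
            counts[base] = counts.get(base, 0) + 1
    return counts

def soft_call_counts(observations):
    """Keep every observation and accumulate expected (fractional) counts."""
    counts = {}
    for probs in observations:
        for base, p in probs.items():
            counts[base] = counts.get(base, 0.0) + p
    return counts

# Hypothetical observations, each a probability distribution over bases.
observations = [
    {"T": 0.90, "A": 0.10},   # called T with only 90% confidence
    {"G": 0.70, "T": 0.30},   # called G with only 70% confidence
    {"T": 0.99, "A": 0.01},   # confident T
]

hard = hard_call_counts(observations)
soft = soft_call_counts(observations)
print(hard)
print(soft)
```

With the hypothetical cutoff, the hard caller keeps one of the three observations, whereas the probabilistic treatment retains all three as weighted evidence.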
In one embodiment of the invention, the higher-level software does not rely on the allele calls, SNP calls, or sequence reads determined by the lower-level software. Instead, the higher-level software bases its calculations on the analog signals measured directly from the genotyping platform. In one embodiment of the invention, an informatics-based method such as PARENTAL SUPPORT™ is modified so that its ability to reconstruct the genetic data of the embryo/fetus/child is engineered to use directly the raw genetic data as measured by the genotyping platform. In one embodiment of the invention, an informatics-based method such as PARENTAL SUPPORT™ is able to make allele calls and/or chromosome copy number calls using raw genetic data, without using secondary genetic data. In one embodiment of the invention, all genetic calls, SNP calls, sequence reads, and sequence mappings are treated in a probabilistic manner by using the raw intensity data as measured directly by the genotyping platform, rather than by converting the raw genetic data to secondary genetic calls. In one embodiment, the DNA measurements from the prepared sample that are used in calculating allele count probabilities and determining the relative probability of each hypothesis comprise raw genetic data.
In certain embodiments, the method can increase the accuracy of the genetic data of a target individual by incorporating genetic data of at least one related individual, the method comprising: obtaining raw genetic data specific to the genome of the target individual and genetic data specific to the genome of the related individual; creating a set of one or more hypotheses concerning which segments of which chromosomes from the related individual may correspond to those segments in the genome of the target individual; determining the probability of each hypothesis given the raw genetic data of the target individual and the genetic data of the related individual; and using the probabilities associated with each hypothesis to determine the most likely state of the actual genetic material of the target individual. In certain embodiments, the method can determine the number of copies of a chromosome segment in the genome of a target individual, the method comprising: creating a set of copy number hypotheses about how many copies of the chromosome segment are present in the genome of the target individual; incorporating raw genetic data from the target individual and genetic information from one or more related individuals into a data set; estimating the characteristics of the platform response associated with the data set, where the platform response may vary from one experiment to another; computing the conditional probability of each copy number hypothesis given the data set and the platform response characteristics; and determining the copy number of the chromosome segment based on the most probable copy number hypothesis. In one embodiment, the method can determine the ploidy state of at least one chromosome in a target individual, the method comprising: obtaining raw genetic data from the target individual and from one or more related individuals; creating a set of at least one ploidy state hypothesis for each chromosome of the target individual; using one or more expert techniques to determine the statistical probability of each ploidy state hypothesis in the set, for each expert technique used, given the obtained genetic data; combining, for each ploidy state hypothesis, the statistical probabilities determined by the one or more expert techniques; and determining the ploidy state of each chromosome in the target individual based on the combined statistical probabilities of each ploidy state hypothesis. In one embodiment, the method can determine the allelic state in a set of alleles in a target individual, in one or both parents of the target individual, and optionally in one or more related individuals, the method comprising: obtaining raw genetic data from the target individual, from one or both parents, and from any related individuals; creating a set of at least one allelic hypothesis for the target individual, for one or both parents, and optionally for the one or more related individuals, where the hypotheses describe possible allelic states in the set of alleles; determining the statistical probability of each allelic hypothesis in the set of hypotheses given the obtained genetic data; and determining the allelic state of each allele in the set of alleles for the target individual, for one or both parents, and optionally for the one or more related individuals, based on the statistical probabilities of each allelic hypothesis.
In certain embodiments, the genetic data of the mixed sample may comprise sequence data that do not map uniquely to the human genome. In certain embodiments, the genetic data of the mixed sample may comprise sequence data that map to a plurality of locations in the genome, where each possible mapping is associated with a probability that the given mapping is correct. In certain embodiments, the sequence reads are not assumed to be associated with a particular position in the genome. In certain embodiments, the sequence reads are associated with a plurality of positions in the genome, each with an associated probability of belonging to that position.
Counting methods for determining chromosome copy number
In one aspect, the invention features methods for testing for an abnormal distribution of a fetal chromosome by comparing the number of sequence tags that align to different chromosomes (see, e.g., U.S. Patent No. 8,296,076, filed April 20, 2012, which is hereby incorporated by reference in its entirety). As known in the art, the term "sequence tag" refers to a relatively short (e.g., 15-100) nucleic acid sequence that can be used to identify a certain larger sequence, e.g., one that maps to a chromosome, genomic region, or gene. In certain embodiments, the method involves (i) contacting a sample comprising a mixture of maternal and fetal DNA with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci to produce a reaction mixture, where the target loci are from a plurality of different chromosomes, and where the plurality of different chromosomes comprises at least one first chromosome suspected of having an abnormal distribution in the sample and at least one second chromosome presumed to be normally distributed in the sample; (ii) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products; (iii) sequencing the amplified products to obtain a plurality of sequence tags that align to the target loci, where the sequence tags are long enough to be assigned to a specific target locus; (iv) assigning, on a computer, the plurality of sequence tags to their corresponding target loci; (v) determining, on a computer, the number of sequence tags that align to the target loci of the first chromosome and the number of sequence tags that align to the target loci of the second chromosome; and (vi) comparing, on a computer, the numbers from step (v) to determine the presence or absence of an abnormal distribution of the first chromosome.
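Steps (iv) through (vi) above can be sketched as follows. This is a minimal illustration only: it assumes each sequence tag has already been aligned to a named target locus, and it compares the two counts with a binomial z-score, which is an assumption introduced here rather than a statistic named in the text. The locus names, the locus-to-chromosome map, and the function names are hypothetical.

```python
import math
from collections import Counter

def count_tags_by_chromosome(tag_loci, locus_to_chrom):
    """Steps (iv)-(v): assign each tag's locus to its chromosome and count.
    Tags whose locus is not in the target panel are ignored."""
    counts = Counter()
    for locus in tag_loci:
        chrom = locus_to_chrom.get(locus)
        if chrom is not None:
            counts[chrom] += 1
    return counts

def binomial_z_score(first_count, second_count, expected_fraction=0.5):
    """Step (vi), under an assumed binomial model: how far the first
    chromosome's share of tags deviates from the share expected when
    both chromosomes are normally distributed."""
    n = first_count + second_count
    observed = first_count / n
    se = math.sqrt(expected_fraction * (1.0 - expected_fraction) / n)
    return (observed - expected_fraction) / se

# Hypothetical panel: two loci on chromosome 21 and one on chromosome 1.
locus_to_chrom = {"L1": "chr21", "L2": "chr21", "L3": "chr1"}
tag_loci = ["L1", "L2", "L1", "L3", "L9"]   # "L9" is off-panel

counts = count_tags_by_chromosome(tag_loci, locus_to_chrom)
z_balanced = binomial_z_score(5000, 5000)
z_excess = binomial_z_score(5250, 5000)
print(counts, z_balanced, round(z_excess, 2))
```

In practice the expected fraction would depend on how many target loci lie on each chromosome and on locus-specific amplification efficiency; the value 0.5 is used here only to keep the sketch simple.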
In one aspect, the invention provides methods for detecting the presence or absence of a fetal aneuploidy by comparing the relative frequencies of target amplicons between chromosomes (see, e.g., PCT Publication No. WO2012/103031, filed January 23, 2012, which is hereby incorporated by reference in its entirety). In certain embodiments, the method involves (i) contacting a sample comprising a mixture of maternal and fetal DNA with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different non-polymorphic target loci to produce a reaction mixture, where the target loci are from a plurality of different chromosomes; (ii) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products comprising target amplicons; (iii) quantifying, on a computer, the relative frequency of the target amplicons from the first and second chromosomes of interest; (iv) comparing, on a computer, the relative frequencies of the target amplicons from the first and second chromosomes of interest; and (v) identifying the presence or absence of an aneuploidy based on the compared relative frequencies of the first and second chromosomes of interest. In certain embodiments, the first chromosome is a chromosome suspected of being euploid. In certain embodiments, the second chromosome is a chromosome suspected of being aneuploid.
Combination methods for prenatal diagnosis
There are many methods that may be used for prenatal diagnosis or prenatal screening of aneuploidy or other genetic defects. One such method, described elsewhere in this document and in U.S. Utility Application Serial No. 11/603,406, filed November 28, 2006; U.S. Utility Application Serial No. 12/076,348, filed March 17, 2008; and PCT Application No. PCT/S09/52730, uses the genetic data of related individuals to increase the accuracy with which the genetic data of a target individual, such as a fetus, is known or estimated. Other methods used for prenatal diagnosis involve measuring the levels of certain hormones in maternal blood, where those hormones are correlated with various genetic abnormalities. An example is the triple test, in which the levels of several (commonly two, three, four, or five) different hormones are measured in maternal blood. Where multiple methods are used to determine the likelihood of a given outcome, and none of the methods is definitive in and of itself, the information given by those methods can be combined to make a prediction that is more accurate than that of any individual method. In the triple test, combining the information given by the three hormones can yield a prediction of genetic abnormalities that is more accurate than the individual hormone levels would give.
Disclosed herein is a method for making more accurate predictions about the genetic state of a fetus, in particular the possibility of genetic abnormalities in a fetus, that comprises combining predictions of genetic abnormalities in the fetus, where those predictions were made using a variety of methods. A "more accurate" method may refer to a method for diagnosing an abnormality that has a lower false negative rate at a given false positive rate. In a favored embodiment of the invention, one or more of the predictions are made based on the genetic data known about the fetus, where the genetic knowledge was determined using the PARENTAL SUPPORT™ method, that is, using genetic data of individuals related to the fetus to determine the genetic data of the fetus with greater accuracy. In certain embodiments, the genetic data may include the ploidy states of the fetus. In certain embodiments, the genetic data may refer to a set of allele calls on the genome of the fetus. In certain embodiments, some of the predictions may have been made using the triple test. In certain embodiments, some of the predictions may have been made using measurements of other hormone levels in maternal blood. In certain embodiments, predictions made by methods considered diagnostic may be combined with predictions made by methods considered screening. In certain embodiments, the method involves measuring maternal blood levels of alpha-fetoprotein (AFP). In certain embodiments, the method involves measuring maternal blood levels of unconjugated estriol (uE3). In certain embodiments, the method involves measuring maternal blood levels of beta human chorionic gonadotropin (β-hCG). In certain embodiments, the method involves measuring maternal blood levels of invasive trophoblast antigen (ITA). In certain embodiments, the method involves measuring maternal blood levels of inhibin. In certain embodiments, the method involves measuring maternal blood levels of pregnancy-associated plasma protein A (PAPP-A). In certain embodiments, the method involves measuring maternal blood levels of other hormones or maternal serum markers. In certain embodiments, some of the predictions may have been made using other methods. In certain embodiments, some of the predictions may have been made using a fully integrated test, such as one that combines ultrasound and a blood test at around 12 weeks of pregnancy with a second blood test at around 16 weeks. In certain embodiments, the method involves measuring the fetal nuchal translucency (NT). In certain embodiments, the method involves using the measured levels of the aforementioned hormones to make predictions. In certain embodiments, the method involves a combination of the aforementioned methods.
There are many ways to combine the predictions. For example, the hormone measurements could be converted into multiples of the median (MoM) and then into likelihood ratios (LRs). Similarly, other measurements could be transformed into LRs using a mixture model of NT distributions. The LRs for NT and the biochemical markers could be multiplied by the age-related and gestation-related risk to derive the risk for various conditions, such as trisomy 21. Detection rates (DRs) and false positive rates (FPRs) could be calculated by taking the proportions with risks above a given risk threshold.
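The combination described above can be sketched in a few lines. The sketch takes the likelihood ratios as given (converting MoM values into LRs requires marker-specific distributions that are not specified here), treats the markers as independent so that their LRs can be multiplied, and works in odds form; those modeling choices, the function names, and all numeric values are illustrative assumptions.

```python
def posterior_risk(prior_risk, likelihood_ratios):
    """Multiply the prior (age- and gestation-related) odds by each marker's
    likelihood ratio, then convert the resulting odds back to a probability."""
    odds = prior_risk / (1.0 - prior_risk)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

def proportion_at_or_above(risks, threshold):
    """Proportion of cases whose risk meets the threshold. Applied to affected
    cases this gives the detection rate (DR); applied to unaffected cases it
    gives the false positive rate (FPR)."""
    return sum(r >= threshold for r in risks) / len(risks)

# Hypothetical patient: prior risk 1 in 500, LR of 8 from NT and 3 from serum markers.
risk = posterior_risk(1 / 500, [8.0, 3.0])

# Hypothetical risk scores for cases with known outcomes.
affected_risks = [0.60, 0.40, 0.05]
unaffected_risks = [0.01, 0.20, 0.02, 0.03]
dr = proportion_at_or_above(affected_risks, threshold=0.10)
fpr = proportion_at_or_above(unaffected_risks, threshold=0.10)
print(round(risk, 4), dr, fpr)
```

Moving the risk threshold trades detection rate against false positive rate, which is why the comparison of methods above is made at a fixed false positive rate.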
In one embodiment, a method of calling the ploidy state involves combining the relative probabilities of each of the ploidy hypotheses determined using the joint distribution model and the allele count probabilities with relative probabilities of each of the ploidy hypotheses that are calculated using statistical techniques taken from other methods that determine a risk score for a fetus being trisomic, including but not limited to: a read count analysis, a comparison of heterozygosity rates, a statistic that is available only when parental genetic information is used, the probability of normalized genotype signals for certain parent contexts, a statistic that is calculated using an estimated fetal fraction of the first sample or the prepared sample, and combinations thereof.
Another method could involve a situation with four measured hormone levels, where the probability distribution of those hormones is known: $p(x_1, x_2, x_3, x_4 \mid e)$ for the euploid case and $p(x_1, x_2, x_3, x_4 \mid a)$ for the aneuploid case. One could then measure the probability distribution of the DNA measurements, $g(y \mid e)$ and $g(y \mid a)$ for the euploid and aneuploid cases, respectively. Assuming that they are independent given the euploid/aneuploid hypothesis, one could combine them as $p(x_1, x_2, x_3, x_4 \mid a)\, g(y \mid a)$ and $p(x_1, x_2, x_3, x_4 \mid e)\, g(y \mid e)$, and then multiply each by the prior, $p(a)$ and $p(e)$ respectively, given the maternal age. One could then choose the one that is highest.
In one embodiment, it is possible to invoke the central limit theorem to assume that the distribution $g(y \mid a \text{ or } e)$ is Gaussian, and to measure its mean and standard deviation by looking at multiple samples. In another embodiment, one could assume that they are not independent given the outcome and collect enough samples to estimate the joint distribution $p(x_1, x_2, x_3, x_4 \mid a \text{ or } e)$.
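A minimal sketch of this combination follows, assuming the independence and Gaussian simplifications just described. The hormone likelihoods are passed in as already-evaluated numbers, and every parameter value, prior, and function name is a hypothetical placeholder rather than a figure from the source.

```python
import math

def gaussian_pdf(y, mean, sd):
    """Density of a normal distribution, used here to model g(y | hypothesis)."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def combined_posterior(p_x, g_params, y, priors):
    """Combine p(x1..x4 | h) * g(y | h) * p(h) for each hypothesis h and normalize.

    p_x      -- {'e': p(x | e), 'a': p(x | a)}, hormone likelihoods already evaluated
    g_params -- {'e': (mean, sd), 'a': (mean, sd)} for the DNA measurement y
    priors   -- {'e': p(e), 'a': p(a)}, e.g. derived from maternal age
    """
    unnormalized = {
        h: p_x[h] * gaussian_pdf(y, *g_params[h]) * priors[h] for h in priors
    }
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}

# Hypothetical numbers chosen so that the DNA measurement is equally likely
# under both hypotheses and only the hormones and the prior discriminate.
post = combined_posterior(
    p_x={"e": 0.2, "a": 0.6},
    g_params={"e": (0.0, 1.0), "a": (2.0, 1.0)},
    y=1.0,
    priors={"e": 0.99, "a": 0.01},
)
best = max(post, key=post.get)
print(best, round(post[best], 4))
```

If the hormone and DNA measurements are not independent given the outcome, the product form no longer applies and the joint distribution would have to be estimated directly, as the second embodiment above notes.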
In one embodiment, the ploidy state of the target individual is determined to be the ploidy state associated with the hypothesis whose probability is greatest. In some cases, one hypothesis will have a normalized, combined probability greater than 90%. Each hypothesis is associated with one ploidy state or a set of ploidy states, and the ploidy state associated with the hypothesis whose normalized, combined probability is greater than 90%, or greater than some other threshold value such as 50%, 80%, 95%, 98%, 99%, or 99.9%, may be selected as the determined ploidy state; that value serves as the threshold required for a hypothesis to be called.
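The thresholded call can be sketched as below. Returning no call when the best hypothesis falls short of the threshold is an assumption made for illustration; the hypothesis labels, probabilities, and function name are likewise hypothetical.

```python
def call_ploidy(hypothesis_probs, threshold=0.90):
    """Normalize the combined probabilities and call the most probable
    hypothesis only if it exceeds the threshold; otherwise return None
    for the call (assumed behavior: a no-call)."""
    total = sum(hypothesis_probs.values())
    normalized = {h: p / total for h, p in hypothesis_probs.items()}
    best = max(normalized, key=normalized.get)
    if normalized[best] > threshold:
        return best, normalized[best]
    return None, normalized[best]

confident = call_ploidy({"disomy": 0.95, "trisomy": 0.04, "monosomy": 0.01})
ambiguous = call_ploidy({"disomy": 6.0, "trisomy": 4.0})
strict = call_ploidy({"disomy": 0.95, "trisomy": 0.05}, threshold=0.999)
print(confident, ambiguous, strict)
```

Raising the threshold lowers the error rate at the cost of more no-calls, which is the trade-off behind the choice among values such as 90%, 99%, or 99.9%.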
DNA from children of previous pregnancies in maternal blood
One difficulty in non-invasive prenatal diagnosis is differentiating fetal cells from the current pregnancy from fetal cells from previous pregnancies. Some believe that genetic material from prior pregnancies disappears after some time, but conclusive evidence has not been shown. In one embodiment of the invention, it is possible to determine the fetal DNA of paternal origin present in the maternal blood (that is, DNA that the fetus inherited from the father) using the PARENTAL SUPPORT™ (PS) method and knowledge of the paternal genome. This method may use phased parental genetic information. It is possible to phase the parental genotype from unphased genotypic information using grandparental genetic data (such as genetic data measured from sperm of the grandfather), or genetic data from other born children or from a sample of a miscarriage. Unphased genetic information could also be phased by way of HapMap-based phasing or by haplotyping of paternal cells. Successful haplotyping has been demonstrated by arresting cells at the phase of mitosis when the chromosomes are tight bundles and using microfluidics to place separate chromosomes in separate wells. In another embodiment, it is possible to use the phased parental haplotypic data to detect the presence of more than one homolog from the father, implying that genetic material from more than one child is present in the blood. By focusing on chromosomes that are expected to be euploid in a fetus, one could rule out the possibility that the fetus is afflicted with a trisomy. In addition, it is possible to determine whether the fetal DNA is not from the current father, in which case other methods, such as the triple test, could be used to predict genetic abnormalities.
There may be other sources of fetal genetic material available via methods other than a blood draw. In the case of the fetal genetic material available in maternal blood, there are two main categories: (1) whole fetal cells, for example nucleated fetal red blood cells or erythroblasts, and (2) free-floating fetal DNA. In the case of whole fetal cells, there is some evidence that fetal cells can persist in maternal blood for an extended period of time, such that it is possible to isolate from a pregnant woman a cell that contains DNA from a child or fetus of a prior pregnancy. There is also evidence that free-floating fetal DNA is cleared from the system in a matter of weeks. One challenge is how to determine the identity of the individual whose genetic material is contained in the cell, namely to ensure that the measured genetic material is not from a fetus of a prior pregnancy. In one embodiment of the invention, knowledge of the maternal genetic material can be used to ensure that the genetic material in question is not maternal genetic material. There are a number of methods to accomplish this, including informatics-based methods such as PARENTAL SUPPORT™, as described in this document or in any of the patents referenced in this document.
In one embodiment of the invention, the blood drawn from the pregnant mother may be separated into a fraction comprising free-floating fetal DNA and a fraction comprising nucleated red blood cells. The free-floating DNA may optionally be enriched, and the genotypic information of the DNA may be measured. From the genotypic information measured on the free-floating DNA, knowledge of the maternal genotype may be used to determine aspects of the fetal genotype. These aspects may refer to the ploidy state and/or a set of allele identities. Individual nucleated red blood cells may then be genotyped using methods described elsewhere in this document and in other referenced patents, especially those mentioned in the first section of this document. Knowledge of the maternal genome allows one to determine whether any given single blood cell is genetically maternal. The aspects of the fetal genotype determined as described above allow one to determine whether the single blood cell is genetically derived from the fetus that is currently gestating. In essence, this aspect of the invention allows one to use genetic knowledge of the mother, possibly genetic information from other related individuals such as the father, and the genetic information measured on the free-floating DNA found in maternal blood to determine whether an isolated nucleated cell found in maternal blood is (a) genetically maternal, (b) genetically from the fetus currently gestating, or (c) genetically from a fetus of a prior pregnancy.
Prenatal sex chromosome aneuploidy determination
In methods known in the art, people attempting to determine the sex of a gestating fetus from the blood of the mother have used the fact that fetal free-floating DNA (fffDNA) is present in the plasma of the mother. If Y-specific loci can be detected in the maternal plasma, this implies that the gestating fetus is male. However, with methods known in the art, failure to detect Y-specific loci in the plasma does not always guarantee that the gestating fetus is female, because in some cases the amount of fffDNA is too low to ensure that Y-specific loci would be detected in the case of a male fetus.
A novel method is presented here that does not require the measurement of Y-specific nucleic acids, that is, DNA from loci that are exclusively paternally derived. The previously disclosed Parental Support method uses crossover frequency data, parental genotypic data, and informatics techniques to determine the ploidy state of a gestating fetus. The sex of a fetus is simply the ploidy state of the fetus at the sex chromosomes. A child that is XX is female, and a child that is XY is male. The method described herein can also determine the ploidy state of the fetus. Note that sexing is effectively synonymous with ploidy determination of the sex chromosomes; in the case of sexing, the child is often assumed to be euploid, so there are fewer possible hypotheses.
The method disclosed herein involves looking at loci that are common to both the X and Y chromosomes to create a baseline for the expected amount of fetal DNA present for a fetus. Those regions that are specific only to the X chromosome can then be interrogated to determine whether the fetus is female or male. In the case of a male, we expect to see less fetal DNA from loci that are specific to the X chromosome than from loci that are specific to both X and Y. In contrast, in female fetuses, we expect the amount of DNA in each of these groups to be the same. The DNA in question can be measured by any technique that can quantify the amount of DNA present in a sample, for example qPCR, SNP arrays, genotyping arrays, or sequencing. For DNA that is exclusively from one individual, we would expect to see the following:
For fetal DNA mixed with maternal DNA, where the fraction of fetal DNA in the mixture is F and the fraction of maternal DNA in the mixture is M, such that F + M = 100%, we would expect to see the following:
In the case where F and M are known, the expected ratios can be computed, and the observed data can be compared to the expected data. In the case where M and F are not known, a threshold can be selected based on historical data. In both cases, the amount of DNA measured at loci specific to both X and Y can be used as a baseline, and the test for the sex of the fetus can be based on the amount of DNA observed at loci specific to only the X chromosome. If that amount is lower than the baseline by an amount roughly equal to 1/2 F, or by an amount that causes it to fall below a predefined threshold, the fetus is determined to be male; if that amount is about equal to the baseline, or is not lower by an amount that causes it to fall below a predefined threshold, the fetus is determined to be female.
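The decision rule above can be sketched as follows. The sketch assumes the two DNA amounts have been normalized so that a female fetus gives a ratio of 1 between X-only loci and loci shared by X and Y; a male fetus then gives (M + F/2)/(M + F) = 1 - F/2. Placing the cutoff midway between the two expected ratios when F is known is an illustrative choice, and the function names and numeric values are hypothetical.

```python
def expected_x_to_baseline_ratio(fetal_fraction, fetal_sex):
    """Expected ratio of DNA at X-only loci to DNA at loci shared by X and Y,
    with F + M = 1 and amounts normalized so a female fetus gives 1."""
    if fetal_sex == "female":
        return 1.0
    if fetal_sex == "male":
        return 1.0 - fetal_fraction / 2.0
    raise ValueError("fetal_sex must be 'female' or 'male'")

def call_fetal_sex(x_only_amount, baseline_amount, fetal_fraction=None, threshold=None):
    """Call male if the observed ratio falls below the cutoff, female otherwise.
    With a known fetal fraction the cutoff is the midpoint of the two expected
    ratios (an assumed choice); otherwise a historical threshold must be given."""
    ratio = x_only_amount / baseline_amount
    if fetal_fraction is not None:
        cutoff = 1.0 - fetal_fraction / 4.0
    elif threshold is not None:
        cutoff = threshold
    else:
        raise ValueError("provide fetal_fraction or threshold")
    return "male" if ratio < cutoff else "female"

print(expected_x_to_baseline_ratio(0.10, "male"))
print(call_fetal_sex(0.95, 1.00, fetal_fraction=0.10))
print(call_fetal_sex(1.00, 1.00, fetal_fraction=0.10))
print(call_fetal_sex(0.96, 1.00, threshold=0.98))
```

Because the male and female expectations differ by only F/2, the separation shrinks as the fetal fraction falls, so measurement noise matters most in low-fraction samples.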
In another embodiment, one can examine only those loci that are common to both the X and Y chromosomes, often termed the Z chromosome. A subset of the loci on the Z chromosome are typically always A on the X chromosome and always B on the Y chromosome. If SNPs from the Z chromosome are found to have a B genotype, the fetus is called male; if SNPs from the Z chromosome are found to have only the A genotype, the fetus is called female. In another embodiment, one can examine loci that are found only on the X chromosome. Contexts such as AA|B are particularly informative, because the presence of a B indicates that the fetus has an X chromosome from the father. Contexts such as AB|B are also informative, because we expect B to be present only about half as often in the case of a female fetus as in the case of a male fetus. In another embodiment, one can examine SNPs on the Z chromosome where both the A and B alleles are present on both the X and Y chromosomes, and where it is known which SNPs are from the paternal Y chromosome and which are from the paternal X chromosome.
In one embodiment, it is possible to amplify single nucleotide positions known to vary between the homologous non-recombining (HNR) regions shared by the Y chromosome and the X chromosome. Within this HNR region, the sequence is largely identical between the X and Y chromosomes. Within this identical region are single nucleotide positions that, while invariant among X chromosomes and among Y chromosomes in the population, differ between the X and Y chromosomes. Each PCR assay can amplify a sequence from loci that are present on both the X and Y chromosomes. Within each amplified sequence there will be a single base that can be detected using sequencing or some other method.
In one embodiment, the sex of the fetus can be determined from cell-free fetal DNA found in maternal plasma, the method comprising some or all of the following steps: 1) designing PCR primers (regular or mini-PCR, with multiplexing if desired) to amplify X/Y variant single nucleotide positions within the HNR region; 2) obtaining maternal plasma; 3) PCR-amplifying targets from the maternal plasma using the HNR X/Y PCR assays; 4) sequencing the amplicons; 5) examining the sequence data for the presence of the Y allele within one or more of the amplified sequences. The presence of one or more Y alleles indicates a male fetus. The absence of Y alleles from all amplicons indicates a female fetus.
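Step 5 of the procedure above reduces to a simple check over per-amplicon allele counts. The following is a minimal sketch, assuming the reads have already been assigned to the X or Y allele at each variant position; the function name, the input layout, and the `min_y_reads` parameter are hypothetical. The default of 1 mirrors the "one or more" rule in the text; a real assay would presumably need a higher cutoff to tolerate sequencing error, which the source does not specify.

```python
from typing import Mapping, Tuple


def call_sex_from_hnr_reads(
    allele_counts: Mapping[str, Tuple[int, int]],
    min_y_reads: int = 1,
) -> str:
    """Call fetal sex from HNR amplicon read counts.

    allele_counts maps an amplicon name to (X-allele reads, Y-allele reads).
    min_y_reads is a hypothetical noise cutoff, not a value from the source.
    """
    for _x_reads, y_reads in allele_counts.values():
        if y_reads >= min_y_reads:
            # A Y allele in any amplicon indicates a male fetus.
            return "male"
    # No amplicon shows a Y allele: female fetus.
    return "female"
```

Because the mother contributes no Y alleles, any Y-allele reads must come from the fetus, which is why a single positive amplicon is sufficient under the rule stated in the text.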
In one embodiment, targeted sequencing can be used to measure the DNA in maternal plasma and/or the parental genotypes. In one embodiment, all sequences that clearly originate from paternally sourced DNA can be ignored. For example, in the context AA|AB, one can count the number of A sequences and ignore all B sequences. To determine the heterozygosity rate for the above algorithm, one can compare the number of observed A sequences with the expected total number of sequences for the given probe. There are several ways to calculate the expected number of sequences for each probe on a per-sample basis. In one embodiment, it is possible to use historical data to determine what fraction of all sequence reads belongs to each specific probe, and then to use this empirical fraction together with the total number of sequence reads to estimate the number of sequences at each probe. Another approach is to target some known homozygous alleles and then use historical data to relate the number of reads at each probe to the number of reads at the known homozygous alleles. For each sample, one can then measure the number of reads at the homozygous alleles and use this measurement, together with the empirically derived relationship, to estimate the number of sequence reads at each probe.
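The first of the two estimation approaches above (historical per-probe read fractions scaled by the sample's total read count) can be sketched as follows. This is a sketch under assumptions: the function names and dictionary layout are hypothetical, and the historical fractions are renormalized so that they sum to one, which the source does not state.

```python
from typing import Dict, Mapping


def expected_reads_per_probe(historical_fraction: Mapping[str, float],
                             total_reads: int) -> Dict[str, float]:
    """Scale historical per-probe read fractions to this sample's read depth."""
    total_fraction = sum(historical_fraction.values())
    if total_fraction <= 0:
        raise ValueError("historical fractions must sum to a positive value")
    return {probe: total_reads * fraction / total_fraction
            for probe, fraction in historical_fraction.items()}


def observed_to_expected(observed_a_reads: Mapping[str, int],
                         expected: Mapping[str, float]) -> Dict[str, float]:
    """Ratio of observed A reads to the expected total reads at each probe."""
    return {probe: observed_a_reads.get(probe, 0) / expected_reads
            for probe, expected_reads in expected.items()
            if expected_reads > 0}
```

With historical fractions of 0.25 and 0.75 for two probes and 1,000 total reads, the expected counts are 250 and 750; comparing the observed A counts against these values gives the per-probe ratio that the text uses to assess heterozygosity.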
In certain embodiments, it is possible to determine the sex of the fetus by combining the predictions obtained by a plurality of methods. In certain embodiments, the plurality of methods is taken from the methods described in this disclosure. In certain embodiments, at least one of the plurality of methods is taken from the methods described in this disclosure.
In certain embodiments, the method described herein can be used to determine the ploidy state of a gestating fetus. In one embodiment, the ploidy calling method uses loci that are specific to the X chromosome or common to both the X and Y chromosomes, but does not make use of any Y-specific loci. In one embodiment, the ploidy calling method uses one or more of the following: loci that are specific to the X chromosome, loci that are common to both the X and Y chromosomes, and loci that are specific to the Y chromosome. In one embodiment, where the ratios of the sex chromosomes are similar, for example 45,X (Turner syndrome), 46,XX (normal female), and 47,XXX (trisomy X), the cases can be distinguished by comparing the observed allele distributions with the allele distributions expected under the various hypotheses. In another embodiment, this can be accomplished by comparing the relative number of sequence reads for the sex chromosomes with that for one or more reference chromosomes that are presumed to be euploid. Note also that these methods can be extended to include aneuploid cases.
Single-Gene Disease Screening
In one embodiment, the method for determining the ploidy state of the fetus can be extended to test for single-gene disorders at the same time. Single-gene disease diagnosis uses the same targeted approach as aneuploidy testing and requires additional specific targets. In one embodiment, single-gene NPD diagnosis is performed through linkage analysis. In many cases, direct testing of the cfDNA sample is unreliable, because the presence of maternal DNA makes it virtually impossible to determine whether the fetus has inherited the mother's mutation. Detection of a unique paternally derived allele is less challenging, but it is fully informative only if the disease is dominant and carried by the father, which limits the utility of the approach. In one embodiment, the method involves PCR or a related amplification method.
In certain embodiments, the method involves phasing the abnormal allele with the closely linked surrounding SNPs in the parents, using information from first-degree relatives. Parental Support can then be run on the targeted sequencing data obtained from these SNPs to determine which homologs (normal or abnormal) the fetus inherited from both parents. As long as the SNPs are sufficiently linked, the inheritance of the genotype by the fetus can be determined very reliably. In certain embodiments, the method comprises (a) adding to the multiplex pool used for aneuploidy testing a set of SNP loci that densely flank a specified set of common diseases; (b) reliably phasing the alleles from these added SNPs with the normal and abnormal alleles, based on genetic data from various relatives; and (c) reconstructing the fetal haplotypes, that is, the set of phased SNP alleles on the inherited maternal and paternal homologs in the region surrounding the disease locus, to determine the fetal genotype. In certain embodiments, additional probes that are closely linked to the disease-linked locus are added to the set of polymorphic loci used for aneuploidy testing.
Reconstructing the fetal diplotype is challenging because the sample is a mixture of maternal and fetal DNA. In certain embodiments, the method uses information from relatives to phase the SNPs and the disease alleles, and then takes into account the physical distance between the SNPs and recombination data from location-specific recombination likelihoods, together with the data observed from genetic measurements of the maternal plasma, to obtain the most likely genotype of the fetus.
In one embodiment, a number of additional probes per disease-linked locus are included in the set of targeted polymorphic loci; the number of additional probes per disease-linked locus can be between 4 and 10, between 11 and 20, between 21 and 40, between 41 and 60, between 61 and 80, or a combination thereof.
Phasing diploid data from a parent can be challenging, and there are a number of ways in which it can be accomplished. Some are discussed in this disclosure, and others are described in greater detail elsewhere (see, for example, PCT Publication No. WO2009105531, filed February 9, 2009, and PCT Publication No. WO2010017214, filed August 4, 2009, each of which is hereby incorporated by reference in its entirety). In one embodiment, a parent can be phased by inference by measuring tissue from the parent that is haploid, for example by measuring one or more sperm or eggs. In one embodiment, a parent can be phased by inference using the measured genotype data of a first-degree relative (for example, the parent's own parents or siblings). In one embodiment, a parent can be phased by dilution, in which the DNA is diluted in one or more wells to the point where no more than approximately one copy of each haplotype is expected in each well, and the DNA in the one or more wells is then measured. In one embodiment, the parental genotypes can be phased by using a computer program that infers the most likely phase from population-based haplotype frequencies. In one embodiment, a parent can be phased if the phased haplotype data of the other parent and the unphased genetic data of one or more genetic offspring of the parents are known. In certain embodiments, the genetic offspring of the parents can be one or more embryos, fetuses, and/or born children. Some of these methods, and other methods for phasing one or both parents, are disclosed in greater detail in, for example, U.S. Publication No. 2011/0033862, filed August 19, 2010; U.S. Publication No. 2011/0178719, filed February 3, 2011; U.S. Publication No. 2007/0184467, filed November 22, 2006; and U.S. Publication No. 2008/0243398, filed March 17, 2008, each of which is hereby incorporated by reference in its entirety.
Fetal Genome Reconstruction
In one aspect, the invention features methods for determining a haplotype of a fetus. In various embodiments, the method allows one to determine which polymorphic loci (such as SNPs) were inherited by the fetus and to reconstruct which homologs (including recombination events) are present in the fetus, and thereby to interpolate the sequence between the polymorphic loci. If desired, essentially the entire genome of the fetus can be reconstructed. If some uncertainty remains in the genome of the fetus (such as in the intervals where a crossover occurred), this uncertainty can be minimized, if desired, by analyzing additional polymorphic loci. In various embodiments, the polymorphic loci are chosen to cover one or more of the chromosomes at a density that reduces any uncertainty to a desired level. This method is well suited to detecting polymorphisms or other mutations of interest in a fetus, because it bases their detection on linkage (such as the presence of linked polymorphic loci in the fetal genome) rather than on direct detection of the polymorphism or other mutation of interest in the fetal genome. For example, if a parent is a carrier of a mutation associated with cystic fibrosis (CF), a nucleic acid sample that includes maternal DNA from the mother of the fetus and fetal DNA from the fetus can be analyzed to determine whether the fetal DNA includes the haplotype containing the CF mutation. In particular, polymorphic loci can be analyzed to determine whether the fetal DNA includes the haplotype containing the CF mutation, without having to detect the CF mutation itself in the fetal DNA.
In certain embodiments, the method involves determining a parental haplotype (for example, a haplotype of the mother or father of the fetus). In certain embodiments, this determination is made without using data from a relative of the mother or father. In certain embodiments, a parental haplotype is determined using a dilution approach followed by SNP genotyping or sequencing, as described herein and elsewhere (see, for example, U.S. Publication No. 2011/0033862, filed August 19, 2010, which is hereby incorporated by reference in its entirety). Because the DNA is diluted, more than one haplotype is unlikely to be present in the same fraction (or tube). Thus, there may effectively be a single molecule of DNA in the tube, which allows the haplotype on a single DNA molecule to be determined. In certain embodiments, the method includes dividing a DNA sample into a plurality of fractions such that at least one of the fractions includes one chromosome or one chromosome segment from a pair of chromosomes, and genotyping the DNA sample in at least one of the fractions (for example, determining the presence of two or more polymorphic loci), thereby determining a parental haplotype. In certain embodiments, the genotyping involves sequencing (such as shotgun sequencing). In certain embodiments, the genotyping involves using a SNP array to detect polymorphic loci, such as at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci. In certain embodiments, the genotyping involves multiplex PCR. In certain embodiments, the method involves contacting a fraction of the sample with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci (such as SNPs) to produce a reaction mixture; subjecting the reaction mixture to primer extension reaction conditions to produce amplified products; and measuring the amplified products with a high-throughput sequencer to produce sequencing data.
In certain embodiments, a haplotype of the mother is determined by any of the methods described herein using data from a relative of the mother. In certain embodiments, a haplotype of the father is determined by any of the methods described herein using data from a relative of the father. In certain embodiments, haplotypes are determined for both the father and the mother. In certain embodiments, a SNP array is used to determine the presence of at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci in a DNA sample from the mother (or father) and from a relative of the mother (or father). In certain embodiments, the method involves contacting a DNA sample from the mother (or father) and/or from a relative of the mother (or father) with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci (such as SNPs) to produce a reaction mixture; subjecting the reaction mixture to primer extension reaction conditions to produce amplified products; and measuring the amplified products with a high-throughput sequencer to produce sequencing data. The parental haplotype can be determined from the SNP array or sequencing data. In certain embodiments, the parental data can be phased by methods described or referred to elsewhere in this document.
This parental haplotype data can be used to determine whether the fetus inherited the parental haplotype. In certain embodiments, a nucleic acid sample that includes maternal DNA from the mother of the fetus and fetal DNA from the fetus is analyzed using a SNP array to detect at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci. In certain embodiments, a nucleic acid sample that includes maternal DNA from the mother of the fetus and fetal DNA from the fetus is analyzed by contacting the sample with a library of primers that simultaneously hybridize to at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci (such as SNPs) to produce a reaction mixture. In certain embodiments, the reaction mixture is subjected to primer extension reaction conditions to produce amplified products. In certain embodiments, the amplified products are measured with a high-throughput sequencer to produce sequencing data. In various embodiments, the SNP array or sequencing data is used to determine the parental haplotype by modeling the dependence between polymorphic alleles on a chromosome using data about the probability of crossovers at different locations in that chromosome (for example, by using recombination data, such as the data found in the HapMap database, to create a recombination risk score for any interval). In certain embodiments, allele counts at the polymorphic loci are calculated on a computer based on the sequencing data. In certain embodiments, a plurality of ploidy hypotheses, each pertaining to a different possible ploidy state of the chromosome, is created on a computer; for each ploidy hypothesis, a model (such as a joint distribution model) of the expected allele counts at the polymorphic loci on the chromosome is built on a computer; the relative probability of each of the ploidy hypotheses is determined on a computer using the joint distribution model and the allele counts; and the ploidy state of the fetus is called by selecting the ploidy state corresponding to the hypothesis with the greatest probability. In certain embodiments, the steps of building the joint distribution model for the allele counts and determining the relative probability of each hypothesis are performed using a method that does not require the use of a reference chromosome.
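The hypothesis-selection step described above (score each ploidy hypothesis against the allele counts, then pick the most probable one) can be illustrated with a deliberately simplified sketch. The assumptions are significant and should be read as such: loci are treated as independent and binomially distributed, whereas the text describes a joint distribution model that captures dependence between linked loci; equal prior probabilities are assumed; and the function names and the expected allele fractions supplied by the caller are hypothetical.

```python
import math
from typing import Dict, List, Mapping, Sequence, Tuple


def binom_log_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability of k successes in n trials."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # avoid log(0) at the boundaries
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def call_ploidy(
    allele_counts: Sequence[Tuple[int, int]],
    expected_fraction: Mapping[str, Sequence[float]],
) -> Tuple[str, Dict[str, float]]:
    """Pick the hypothesis with the highest relative probability.

    allele_counts: (A-allele reads, total reads) at each polymorphic locus.
    expected_fraction: hypothesis name -> expected A-allele fraction per locus.
    Simplification: loci are scored independently (no linkage modeling).
    """
    log_like: Dict[str, float] = {}
    for name, fractions in expected_fraction.items():
        if len(fractions) != len(allele_counts):
            raise ValueError(f"hypothesis {name!r} has the wrong number of loci")
        log_like[name] = sum(binom_log_pmf(a, n, p)
                             for (a, n), p in zip(allele_counts, fractions))
    # Normalize in log space to get relative probabilities (equal priors).
    top = max(log_like.values())
    weights = {name: math.exp(value - top) for name, value in log_like.items()}
    norm = sum(weights.values())
    probs = {name: weight / norm for name, weight in weights.items()}
    best = max(probs, key=lambda name: probs[name])
    return best, probs
```

Because each hypothesis is scored only against the allele counts for the chromosome in question, this style of comparison does not require a reference chromosome, which is the property the final sentence of the paragraph above highlights.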
In certain embodiments, the fetal haplotype is determined for one or more chromosomes taken from the group consisting of chromosomes 13, 18, 21, X, and Y. In certain embodiments, the fetal haplotype is determined for all of the fetal chromosomes. In various embodiments, the method determines essentially the entire genome of the fetus. In certain embodiments, the haplotype is determined for at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the genome of the fetus. In certain embodiments, the haplotype determination of the fetus includes information about which allele is present at at least 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci.
Compositions of DNA
When performing an informatics analysis on sequencing data measured on a mixture of fetal and maternal DNA to determine genomic information pertaining to the fetus (for example, the ploidy state of the fetus), it can be advantageous to measure the allele distributions at a set of alleles. Unfortunately, in many cases, such as when attempting to determine the ploidy state of a fetus from the DNA mixture found in the plasma of a maternal blood sample, the amount of DNA available is not sufficient to measure the allele distributions in the mixture directly with good fidelity. In these cases, amplification of the DNA mixture will provide sufficient numbers of DNA molecules for the desired allele distributions to be measured with good fidelity. However, the amplification methods currently in common use for amplifying DNA for sequencing are often very biased, meaning that they do not amplify both alleles at a polymorphic locus by the same amount. A biased amplification can result in allele distributions that are quite different from the allele distributions in the original mixture. For most purposes, highly accurate measurements of the relative amounts of the alleles present at polymorphic loci are not needed. In contrast, in one embodiment of this disclosure, amplification or enrichment methods that specifically enrich polymorphic alleles and preserve allelic ratios are advantageous.
A number of methods are described herein that can be used to preferentially enrich a sample of DNA at a plurality of loci in a way that minimizes allelic bias. Some examples use circularizing probes to target a plurality of loci, where the 3' ends and 5' ends of the pre-circularized probes are designed to hybridize to bases that are one or a few positions away from the polymorphic sites of the targeted alleles. Another example uses PCR probes, where the 3' end of the PCR probe is designed to hybridize to bases that are one or a few positions away from the polymorphic sites of the targeted alleles. Another example uses a split-and-pool approach to create mixtures of DNA in which the preferentially enriched loci are enriched with low allelic bias, without the drawbacks of direct multiplexing. Another example uses a hybrid capture approach, where the capture probes are designed such that the region of the capture probe that hybridizes to the DNA flanking the polymorphic site of the target is separated from the polymorphic site by one or a small number of bases.
Where the allele distributions measured at a set of polymorphic loci are used to determine the ploidy state of an individual, it is desirable to preserve the relative amounts of the alleles in a sample of DNA as it is prepared for genetic measurement. This preparation can involve WGA amplification, targeted amplification, selective enrichment techniques, hybrid capture techniques, circularizing probes, or other methods intended to amplify the amount of DNA and/or selectively enhance the presence of DNA molecules that correspond to certain alleles.
In some embodiments of the invention, there is a set of DNA probes designed to target loci where the loci have maximal minor allele frequencies. In some embodiments of the invention, there is a set of probes designed to target loci at which the fetus has the maximum likelihood of carrying a highly informative SNP. In some embodiments of the invention, there is a set of probes designed to target loci, where the probes are optimized for a given population subgroup. In some embodiments of the invention, there is a set of probes designed to target loci, where the probes are optimized for a given mix of population subgroups. In some embodiments of the invention, there is a set of probes designed to target loci, where the probes are optimized for a given pair of parents who are from different population subgroups with different minor allele frequency profiles. In some embodiments of the invention, there is a circularized strand of DNA that comprises at least one base pair annealed to a piece of DNA of fetal origin. In some embodiments of the invention, there is a circularized strand of DNA that comprises at least one base pair annealed to a piece of DNA of placental origin. In some embodiments of the invention, there is a circularized strand of DNA in which at least some of the nucleotides are annealed to DNA of fetal origin. In some embodiments of the invention, there is a circularized strand of DNA in which at least some of the nucleotides are annealed to DNA of placental origin. In some embodiments of the invention, there is a set of probes in which some of the probes target single tandem repeats and some of the probes target single nucleotide polymorphisms. In certain embodiments, the loci are selected for the purpose of non-invasive prenatal diagnosis. In certain embodiments, the probes are used for the purpose of non-invasive prenatal diagnosis. In certain embodiments, the loci are targeted using a method that can include circularizing probes, MIPs, capture-by-hybridization probes, probes on a SNP array, or combinations thereof. In certain embodiments, the probes are used as circularizing probes, MIPs, capture-by-hybridization probes, probes on a SNP array, or combinations thereof. In certain embodiments, the loci are sequenced for the purpose of non-invasive prenatal diagnosis.
Where the relative informativeness of sequences is greater when they are combined with the relevant parental contexts, it follows that maximizing the number of sequence reads that contain a SNP for which the parental context is known maximizes the information in the set of sequencing reads on the mixed sample. In one embodiment, the number of sequence reads that contain a SNP for which the parental context is known can be enhanced by using qPCR to preferentially amplify specific sequences. In one embodiment, the number of sequence reads that contain a SNP for which the parental context is known can be enhanced by using circularizing probes (for example, MIPs) to preferentially amplify specific sequences. In one embodiment, the number of sequence reads that contain a SNP for which the parental context is known can be enhanced by using capture-by-hybridization methods (for example, SureSelect) to preferentially amplify specific sequences. Different methods can be used to enhance the number of sequence reads that contain a SNP for which the parental context is known. In one embodiment, the targeting can be accomplished by extension ligation, ligation without extension, capture by hybridization, or PCR.
In a sample of fragmented genomic DNA, a fraction of the DNA sequences map uniquely to individual chromosomes; other DNA sequences can be found on different chromosomes. Note that DNA found in plasma, whether maternal or fetal in origin, is typically fragmented, often at lengths under 500 bp. In a typical genomic sample, roughly 3.3% of the mappable sequences will map to chromosome 13; 2.2% will map to chromosome 18; 1.35% will map to chromosome 21; 4.5% will map to the X chromosome in a female; 2.25% will map to the X chromosome in a male; and 0.73% will map to the Y chromosome in a male. These are the chromosomes that are most likely to be aneuploid in a fetus. In addition, among short sequences, approximately 1 in 20 will contain a SNP, counting the SNPs contained in dbSNP. The proportion may well be higher, given that there may be many SNPs that have not yet been discovered.
In one embodiment of the invention, targeting methods can be used to enhance the fraction of DNA in a sample that maps to a given chromosome, such that the fraction significantly exceeds the percentages listed above as typical for genomic samples. In one embodiment of the invention, targeting methods can be used to enhance the fraction of DNA in a sample such that the percentage of sequences that contain a SNP is significantly greater than what is typically found in genomic samples. In one embodiment of the invention, targeting methods can be used to target DNA from a chromosome or from a set of SNPs in a mixture of maternal and fetal DNA for the purposes of prenatal diagnosis.
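One simple way to quantify "significantly exceeds" is to express the observed mapped fraction as a fold change over the typical genomic percentages quoted above. The sketch below does only that; the dictionary keys, the function name, and the use of a plain ratio as the enrichment measure are assumptions made for illustration, while the baseline percentages are the ones given in the text.

```python
# Typical percentage of mappable sequences per chromosome in an unenriched
# genomic sample, as quoted in the text. The key names are hypothetical.
BASELINE_PERCENT = {
    "chr13": 3.3,
    "chr18": 2.2,
    "chr21": 1.35,
    "chrX_female": 4.5,
    "chrX_male": 2.25,
    "chrY_male": 0.73,
}


def fold_enrichment(mapped_reads: int,
                    total_mappable_reads: int,
                    chromosome: str) -> float:
    """Observed mapped percentage divided by the typical genomic percentage."""
    if total_mappable_reads <= 0:
        raise ValueError("total_mappable_reads must be positive")
    observed_percent = 100.0 * mapped_reads / total_mappable_reads
    return observed_percent / BASELINE_PERCENT[chromosome]
```

For instance, if 330 of 1,000 mappable reads fall on chromosome 13, the observed 33% is roughly tenfold the typical 3.3%, which would indicate a strongly enriched sample.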
Note that a method has been reported (U.S. Patent No. 7,888,017) for determining fetal aneuploidy by counting the number of reads that map to a suspect chromosome and comparing it with the number of reads that map to a reference chromosome, under the assumption that an excess of reads on the suspect chromosome corresponds to three copies of that chromosome in the fetus. Those methods for prenatal diagnosis do not make use of targeting of any sort, nor do they describe the use of targeting for prenatal diagnosis.
By using targeting approaches in sequencing the mixed sample, it may be possible to achieve a given level of accuracy with fewer sequence reads. The accuracy can refer to sensitivity, it can refer to specificity, or it can refer to some combination thereof. The desired level of accuracy can be between 90% and 95%; between 95% and 98%; between 98% and 99%; between 99% and 99.5%; between 99.5% and 99.9%; between 99.9% and 99.99%; between 99.99% and 99.999%; or between 99.999% and 100%. Levels of accuracy above 95% can be referred to as high accuracy.
A number of published methods in the prior art demonstrate how the ploidy state of a fetus can be determined from a mixed sample of maternal and fetal DNA, for example: G.J.W. Liao et al., Clinical Chemistry 2011; 57(1), pp. 92-101. These methods focus on thousands of locations along each chromosome. The number of locations along a chromosome that can be targeted while still yielding a high-accuracy ploidy determination of the fetus from a mixed sample of DNA, for a given number of sequence reads, is unexpectedly low. In one embodiment of the invention, an accurate ploidy determination can be made by using targeted sequencing with any targeting method, for example qPCR, ligation-mediated PCR, other PCR methods, capture by hybridization, or circularizing probes, where the number of loci along a chromosome that need to be targeted can be between 5,000 and 2,000 loci; between 2,000 and 1,000 loci; between 1,000 and 500 loci; between 500 and 300 loci; between 300 and 200 loci; between 200 and 150 loci; between 150 and 100 loci; between 100 and 50 loci; between 50 and 20 loci; or between 20 and 10 loci. Optimally, it can be between 100 and 500 loci. The high level of accuracy can be achieved by targeting a small number of loci and performing an unexpectedly small number of sequence reads. The number of reads can be between 100 million and 50 million reads; between 50 million and 20 million reads; between 20 million and 10 million reads; between 10 million and 5 million reads; between 5 million and 2 million reads; between 2 million and 1 million reads; between 1 million and 500,000 reads; between 500,000 and 200,000 reads; between 200,000 and 100,000 reads; between 100,000 and 50,000 reads; between 50,000 and 20,000 reads; between 20,000 and 10,000 reads; or below 10,000 reads. Fewer reads are needed when larger amounts of input DNA are available.
In certain embodiments, there is a composition comprising a mixture of DNA of fetal origin and DNA of maternal origin, wherein the percentage of sequences that map uniquely to chromosome 13 is greater than 4%, greater than 5%, greater than 6%, greater than 7%, greater than 8%, greater than 9%, greater than 10%, greater than 12%, greater than 15%, greater than 20%, greater than 25%, or greater than 30%. In some embodiments of the invention, there is a composition comprising a mixture of DNA of fetal origin and DNA of maternal origin, wherein the percentage of sequences that map uniquely to chromosome 18 is greater than 3%, greater than 4%, greater than 5%, greater than 6%, greater than 7%, greater than 8%, greater than 9%, greater than 10%, greater than 12%, greater than 15%, greater than 20%, greater than 25%, or greater than 30%. In some embodiments of the invention, there is a composition comprising a mixture of DNA of fetal origin and DNA of maternal origin, wherein the percentage of sequences that map uniquely to chromosome 21 is greater than 2%, greater than 3%, greater than 4%, greater than 5%, greater than 6%, greater than 7%, greater than 8%, greater than 9%, greater than 10%, greater than 12%, greater than 15%, greater than 20%, greater than 25%, or greater than 30%. In some embodiments of the invention, there is a composition comprising a mixture of DNA of fetal origin and DNA of maternal origin, wherein the percentage of sequences that map uniquely to the X chromosome is greater than 6%, greater than 7%, greater than 8%, greater than 9%, greater than 10%, greater than 12%, greater than 15%, greater than 20%, greater than 25%, or greater than 30%. In some embodiments of the invention, there is a composition comprising a mixture of DNA of fetal origin and DNA of maternal origin, wherein the percentage of sequences that map uniquely to the Y chromosome is greater than 1%, greater than 2%, greater than 3%, greater than 4%, greater than 5%, greater than 6%, greater than 7%, greater than 8%, greater than 9%, greater than 10%, greater than 12%, greater than 15%, greater than 20%, greater than 25%, or greater than 30%.
In certain embodiments, a composition is described comprising a mixture of DNA of fetal origin and DNA of maternal origin, wherein the percentage of sequences that map uniquely to a chromosome and that contain at least one single nucleotide polymorphism is greater than 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1%, 1.2%, 1.4%, 1.6%, 1.8%, 2%, 2.5%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 12%, 15%, or 20%, and wherein the chromosome is taken from the group consisting of chromosomes 13, 18, 21, X, and Y. In some embodiments of the invention, there is a composition comprising a mixture of DNA of fetal origin and DNA of maternal origin, wherein the percentage of sequences that map uniquely to a chromosome and that contain at least one single nucleotide polymorphism from a set of single nucleotide polymorphisms is greater than 0.15%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1%, 1.2%, 1.4%, 1.6%, 1.8%, 2%, 2.5%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 12%, 15%, or 20%, wherein the chromosome is taken from the group consisting of chromosomes 13, 18, 21, X, and Y, and wherein the number of single nucleotide polymorphisms in the set is between 1 and 10, between 10 and 20, between 20 and 50, between 50 and 100, between 100 and 200, between 200 and 500, between 500 and 1,000, between 1,000 and 2,000, between 2,000 and 5,000, between 5,000 and 10,000, between 10,000 and 20,000, between 20,000 and 50,000, or between 50,000 and 100,000.
In theory, each amplification cycle doubles the amount of DNA present; in practice, the degree of amplification per cycle is slightly less than two. In theory, amplification, including targeted amplification, amplifies a DNA mixture without bias; in practice, different alleles tend to be amplified to different extents. When DNA is amplified, the degree of allelic bias typically increases with the number of amplification steps. In certain embodiments, the methods described herein involve amplifying DNA with a low level of allelic bias. Because allelic bias typically compounds with each additional cycle, the per-cycle allelic bias can be determined by calculating the nth root of the overall bias, where n is the base-2 logarithm of the degree of enrichment. In certain embodiments, there is a composition comprising a second mixture of DNA, wherein the second mixture of DNA has been preferentially enriched at a plurality of polymorphic loci from a first mixture of DNA, wherein the degree of enrichment is at least 10, at least 100, at least 1,000, at least 10,000, at least 100,000, or at least 1,000,000, and wherein the ratio of alleles at each locus in the second mixture of DNA differs from the ratio of alleles at that locus in the first mixture of DNA by a factor that is, on average, less than 1,000%, 500%, 200%, 100%, 50%, 20%, 10%, 5%, 2%, 1%, 0.5%, 0.2%, 0.1%, 0.05%, 0.02%, or 0.01%. In certain embodiments, there is a composition comprising a second mixture of DNA, wherein the second mixture of DNA has been preferentially enriched at a plurality of polymorphic loci from a first mixture of DNA, wherein the per-cycle allelic bias for the plurality of polymorphic loci is, on average, less than 10%, 5%, 2%, 1%, 0.5%, 0.2%, 0.1%, 0.05%, or 0.02%. In certain embodiments, the plurality of polymorphic loci comprises at least 10 loci, at least 20 loci, at least 50 loci, at least 100 loci, at least 200 loci, at least 500 loci, at least 1,000 loci, at least 2,000 loci, at least 5,000 loci, at least 10,000 loci, at least 20,000 loci, or at least 50,000 loci.
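The per-cycle bias calculation described above can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the function name is hypothetical, and it assumes that both the overall bias and the enrichment are expressed as multiplicative factors (for example, a 20% overall bias is passed as 1.20).

```python
import math

def per_cycle_allelic_bias(overall_bias_factor: float, enrichment: float) -> float:
    """Return the per-cycle bias factor: the n-th root of the overall bias.

    n is the base-2 logarithm of the enrichment, as stated in the text.
    Assumption: both arguments are multiplicative factors, so a 20%
    overall bias is 1.20 and a 1,000-fold enrichment is 1000.
    """
    if enrichment <= 1 or overall_bias_factor <= 0:
        raise ValueError("enrichment must exceed 1 and bias must be positive")
    n = math.log2(enrichment)  # effective number of doublings
    return overall_bias_factor ** (1.0 / n)

# Example: a 1,024-fold enrichment corresponds to n = 10 doublings.
per_cycle = per_cycle_allelic_bias(1.20, 1024)
print(f"per-cycle bias: {(per_cycle - 1) * 100:.2f}%")
```

Compounding the returned factor over n cycles recovers the overall bias, which is the property the nth-root definition relies on.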
some embodiments
In certain embodiments, disclosed herein is a method for generating a report disclosing the determined ploidy state in a gestating fetus, the method comprising: obtaining a first sample that contains DNA from the mother of the fetus and DNA from the fetus; obtaining genotypic data from one or both parents of the fetus; preparing the first sample by isolating the DNA so as to obtain a prepared sample; measuring the DNA in the prepared sample at a plurality of polymorphic loci; calculating, on a computer, allele counts or allele count probabilities at the plurality of polymorphic loci from the DNA measurements made on the prepared sample; creating, on a computer, a plurality of ploidy hypotheses concerning expected allele count probabilities at the plurality of polymorphic loci on the chromosome for different possible ploidy states of the chromosome; building, on a computer, a joint distribution model for the allele count probabilities of each polymorphic locus on the chromosome for each ploidy hypothesis, using the genotypic data from one or both parents of the fetus; determining, on a computer, a relative probability of each of the ploidy hypotheses using the joint distribution model and the allele count probabilities calculated for the prepared sample; calling the ploidy state of the fetus by selecting the ploidy state corresponding to the hypothesis with the greatest probability; and generating a report disclosing the determined ploidy state.
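The hypothesis-selection steps above can be illustrated with a deliberately simplified sketch. It is not the joint distribution model of the disclosure: it assumes only loci where the mother is AA and the father is BB, treats loci as independent binomial draws, assumes a known fetal fraction (defined here as the fraction of genome equivalents that are fetal), uses a uniform prior, and considers three illustrative hypotheses. All names are hypothetical.

```python
import math

# Fetal (B copies, total copies) at loci where the mother is AA and the
# father is BB, for three illustrative ploidy hypotheses.
FETAL_COPIES = {
    "disomy": (1, 2),            # fetus AB
    "maternal_trisomy": (1, 3),  # fetus AAB
    "paternal_trisomy": (2, 3),  # fetus ABB
}

def expected_b_fraction(hypothesis, fetal_fraction):
    """Expected B-allele fraction in plasma under one hypothesis."""
    b, total = FETAL_COPIES[hypothesis]
    maternal_copies = 2 * (1 - fetal_fraction)  # maternal DNA carries no B here
    fetal_copies = total * fetal_fraction
    return b * fetal_fraction / (maternal_copies + fetal_copies)

def log_binomial(k, n, p):
    """Log of the binomial probability of k B reads out of n."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def call_ploidy(allele_counts, fetal_fraction):
    """allele_counts: list of (b_reads, total_reads), one pair per locus."""
    log_like = {}
    for h in FETAL_COPIES:
        p = expected_b_fraction(h, fetal_fraction)
        log_like[h] = sum(log_binomial(k, n, p) for k, n in allele_counts)
    top = max(log_like.values())  # subtract the maximum for numerical stability
    weights = {h: math.exp(v - top) for h, v in log_like.items()}
    norm = sum(weights.values())
    probs = {h: w / norm for h, w in weights.items()}
    return max(probs, key=probs.get), probs

# Twenty loci, each with 95 B reads out of 1,000, at 10% fetal fraction.
counts = [(95, 1000)] * 20
call, probs = call_ploidy(counts, fetal_fraction=0.10)
print(call, probs)
```

The normalized probabilities double as a simple confidence for the call; a fuller model would also marginalize over the possible fetal genotypes given both parental genotypes.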
In certain embodiments, the method is used to determine the ploidy state of a plurality of gestating fetuses in a plurality of respective mothers, the method further comprising: determining the percentage of DNA that is of fetal origin in each of the prepared samples; and wherein the step of measuring the DNA in the prepared sample is performed by sequencing a number of DNA molecules in each of the prepared samples, where more DNA molecules are sequenced from those prepared samples that have a smaller fraction of fetal DNA than from those prepared samples that have a larger fraction of fetal DNA.
In certain embodiments, the method is used to determine the ploidy state of a plurality of gestating fetuses in a plurality of respective mothers, wherein, for each of the fetuses, the DNA in the prepared sample is measured by sequencing a first fraction of the prepared DNA sample to give a first set of measurements, the method further comprising: making a first relative probability determination for each of the ploidy hypotheses for each of the fetuses, given the first set of DNA measurements; resequencing a second fraction of the prepared sample from those fetuses for which the first relative probability determination indicates that a ploidy hypothesis corresponding to an aneuploid fetus has a significant but not conclusive probability, to give a second set of measurements; making a second relative probability determination for the ploidy hypotheses for those fetuses using the second set of measurements and, optionally, also the first set of measurements; and calling the ploidy state of each fetus whose second sample was resequenced by selecting the ploidy state corresponding to the hypothesis with the greatest probability, as determined by the second relative probability determination.
In certain embodiments, a composition of matter is disclosed, the composition of matter comprising: a sample of preferentially enriched DNA, wherein the sample of preferentially enriched DNA has been preferentially enriched at a plurality of polymorphic loci from a first sample of DNA, wherein the first sample of DNA consisted of a mixture of maternal DNA and fetal DNA derived from maternal plasma, wherein the degree of enrichment is at least a factor of 2, and wherein the average allelic bias between the first sample and the preferentially enriched sample is selected from the group consisting of less than 2%, less than 1%, less than 0.5%, less than 0.2%, less than 0.1%, less than 0.05%, less than 0.02%, and less than 0.01%. In certain embodiments, a method of creating such a sample of preferentially enriched DNA is disclosed.
In certain embodiments, a method is disclosed for determining the presence or absence of a fetal aneuploidy in a maternal tissue sample comprising fetal and maternal genomic DNA, wherein the method comprises: (a) obtaining a mixture of fetal and maternal genomic DNA from the maternal tissue sample; (b) selectively enriching the mixture of fetal and maternal DNA at a plurality of polymorphic alleles; (c) distributing selectively enriched fragments from the mixture of fetal and maternal genomic DNA of step (a) to provide reaction samples comprising a single genomic DNA molecule or amplification products of a single genomic DNA molecule; (d) conducting massively parallel DNA sequencing of the selectively enriched fragments of genomic DNA in the reaction samples of step (c) to determine the sequence of the selectively enriched fragments; (e) identifying the chromosomes to which the sequences obtained in step (d) belong; (f) analyzing the data of step (d) to determine (i) the number of fragments of genomic DNA from step (d) that belong to at least one first target chromosome that is presumed to be diploid in both the mother and the fetus, and (ii) the number of fragments of genomic DNA from step (d) that belong to a second target chromosome, wherein the second chromosome is suspected to be aneuploid in the fetus; (g) calculating an expected distribution of the number of fragments of genomic DNA from step (d) for the second target chromosome if the second target chromosome is euploid, using the number determined in step (f) part (i); (h) calculating an expected distribution of the number of fragments of genomic DNA from step (d) for the second target chromosome if the second target chromosome is aneuploid, using the first number determined in step (f) part (i) and an estimated fraction of fetal DNA found in the mixture of step (b); and (i) using a maximum likelihood or maximum a posteriori approach to determine whether the number of fragments of genomic DNA determined in step (f) part (ii) is more likely to be part of the distribution calculated in step (g) or the distribution calculated in step (h); thereby indicating the presence or absence of a fetal aneuploidy.
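Steps (g) through (i) can be sketched as a comparison of two expected count distributions. The sketch below is illustrative only and rests on stated assumptions: read counts are modeled as Poisson, the aneuploidy considered is a trisomy (which scales the expected count by 1 + f/2 for fetal fraction f), and `euploid_ratio` is a hypothetical calibration constant giving the expected target-to-reference read ratio in a euploid pregnancy.

```python
import math

def poisson_log_pmf(k, lam):
    """Log of the Poisson probability of observing k events with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def classify_target(ref_reads, target_reads, fetal_fraction, euploid_ratio):
    """Choose between the euploid and trisomy hypotheses by maximum likelihood.

    euploid_ratio is an assumed calibration constant, not a value taken
    from the disclosure.
    """
    # Step (g): expected count if the target chromosome is euploid.
    expected_euploid = ref_reads * euploid_ratio
    # Step (h): a trisomic fetus adds half a fetal genome copy, so the
    # expected count grows by a factor of (1 + fetal_fraction / 2).
    expected_trisomy = expected_euploid * (1 + fetal_fraction / 2)
    # Step (i): pick the hypothesis under which the observation is more likely.
    ll_euploid = poisson_log_pmf(target_reads, expected_euploid)
    ll_trisomy = poisson_log_pmf(target_reads, expected_trisomy)
    label = "trisomy" if ll_trisomy > ll_euploid else "euploid"
    return label, ll_euploid, ll_trisomy

label, ll_e, ll_t = classify_target(
    ref_reads=100_000, target_reads=42_000,
    fetal_fraction=0.10, euploid_ratio=0.4)
print(label)
```

A maximum a posteriori variant would add the log prior of each hypothesis to the corresponding log-likelihood before comparing.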
exemplary method for cancer diagnostics
It should be noted that it has been demonstrated that DNA originating from a cancer living in a host can be found in the blood of the host. In the same way that genetic diagnoses can be made from measurements of the mixed DNA found in maternal blood, genetic diagnoses can equally well be made from measurements of the mixed DNA found in host blood. The genetic diagnoses may include aneuploidy states or gene mutations. Any claim of the present invention that reads on determining the ploidy state or genetic state of a fetus from measurements made on maternal blood can equally well read on determining the ploidy state or genetic state of a cancer from measurements made on host blood.
In certain embodiments, the method of the invention allows the ploidy state of a cancer to be determined, the method comprising: obtaining a mixed sample that contains genetic material from the host and genetic material from the cancer; measuring the DNA in the mixed sample; calculating the fraction of DNA in the mixed sample that is of cancer origin; and determining the ploidy state of the cancer using the measurements made on the mixed sample and the calculated fraction. In certain embodiments, the method may further comprise administering a cancer therapy based on the determination of the ploidy state of the cancer. In certain embodiments, the method may further comprise administering a cancer therapy based on the determination of the ploidy state of the cancer, wherein the cancer therapy is taken from the group comprising a pharmaceutical, a biologic therapy, an antibody-based therapy, and combinations thereof.
exemplary implementation method
Any of the embodiments disclosed herein may be implemented in digital electronic circuitry, integrated circuitry, specially designed ASICs (application-specific integrated circuits), computer hardware, firmware, software, or combinations thereof. Apparatus of the presently disclosed embodiments can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method steps of the presently disclosed embodiments can be performed by a programmable processor executing a program of instructions to perform functions of the presently disclosed embodiments by operating on input data and generating output. The presently disclosed embodiments can be implemented advantageously in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; in any case, the language can be a compiled or an interpreted language. A computer program may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be executed or interpreted on one computer, or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.
Computer-readable storage media, as used herein, refers to physical or tangible storage (as opposed to signals) and includes, without limitation, volatile and non-volatile, removable and non-removable media implemented in any method or technology for the tangible storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer-readable storage media include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid-state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other physical or material medium that can be used to tangibly store the desired information, data, or instructions and that can be accessed by a computer or processor.
Any of the methods described herein may include the output of data in a physical format, such as on a computer screen or on a paper printout. In the explanations of any embodiments elsewhere in this document, it should be understood that the described methods may be combined with the output of actionable data in a format that can be acted upon by a physician. In addition, the described methods may be combined with the actual execution of a clinical decision that results in a clinical treatment, or with the execution of a clinical decision to take no action. Some of the embodiments described in this document for determining genetic data pertaining to a target individual may be combined with the decision to select one or more embryos for transfer in the context of IVF, optionally combined with the process of transferring the embryo to the womb of the prospective mother. Some of the embodiments described in this document for determining genetic data pertaining to a target individual may be combined with notifying a medical professional of a potential chromosomal abnormality, or the lack thereof, optionally combined with the decision to abort, or not to abort, a fetus in the context of prenatal diagnosis. Some of the embodiments described herein may be combined with the output of actionable data and with the execution of a clinical decision that results in a clinical treatment, or the execution of a clinical decision to take no action.
exemplary diagnostic box
In one embodiment, the present invention comprises a diagnostic box that is capable of partly or completely carrying out any of the methods described herein. In one embodiment, the diagnostic box may be located in a physician's office, a hospital laboratory, or any suitable location reasonably close to the point of patient care. The box may run the entire method in a fully automated fashion, or it may require one or more steps to be completed manually by a technician. In one embodiment, the box can at least analyze the genotypic data measured on maternal plasma. In one embodiment, the box may be linked to means for transmitting the genotypic data measured on the diagnostic box to an external computation facility, which may then analyze the genotypic data and possibly also generate a report. The diagnostic box may include a robotic unit capable of transferring aqueous or liquid samples from one container to another. It may comprise a number of solid and liquid reagents. It may comprise a high-throughput sequencer. It may comprise a computer.
experimental section
The presently disclosed embodiments are described in the following examples, which are set forth to aid in understanding the invention and should not be construed to limit in any way the scope of the invention as defined in the claims above. The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to use the described embodiments; they are not intended to limit the scope of the invention, nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to the numbers used (e.g., amounts, temperatures, etc.), but some experimental error and deviation should be accounted for. Unless indicated otherwise, parts are parts by volume and temperature is in degrees Celsius. It should be understood that variations in the methods as described may be made without changing the fundamental aspects that the experiments are meant to illustrate.
Experiment 1
The objective was to show that a Bayesian maximum likelihood estimation (MLE) algorithm that uses parental genotypes to calculate fetal fraction improves the accuracy of non-invasive prenatal trisomy diagnosis compared with published methods.
Simulated sequencing data for maternal cfDNA were created by sampling reads obtained from trisomy 21 and corresponding maternal cell lines. The rate of correct disomy and trisomy calls was determined from 500 simulations at various fetal fractions, both for a published method (Chiu et al., BMJ 2011; 342:c7401) and for our MLE-based algorithm. We validated the simulations by obtaining 5 million shotgun reads from four pregnant mothers and the corresponding fathers, collected under an IRB-approved protocol. Parental genotypes were obtained on a 290K SNP array. (See Figure 14.)
In simulations, the MLE-based method achieved 99.0% accuracy for fetal fractions as low as 9% and reported confidences that corresponded well to the overall accuracy. We validated these results using four real samples, for which we obtained all correct calls with a calculated confidence exceeding 99%. In contrast, our implementation of the published algorithm of Chiu et al. required a fetal fraction of 18% to achieve 99.0% accuracy and achieved only 87.8% accuracy at 9% fetal DNA.
At the fetal fractions expected during the first and early second trimester, determining fetal fraction from parental genotypes in conjunction with the MLE-based method achieves greater accuracy than published algorithms. Furthermore, the method disclosed herein produces a confidence metric that is crucial in determining the reliability of the result, especially at low fetal fractions, where ploidy detection is more difficult. Published methods call ploidy with a less accurate threshold method based on large sets of disomy training data, an approach that predefines the false positive rate. In addition, without a confidence metric, published methods are at risk of reporting false negative results when there is insufficient fetal cfDNA to make a call. In certain embodiments, a confidence estimate is calculated for the called ploidy state.
Experiment 2
The objective was to improve the non-invasive detection of fetal trisomy 18, 21, and X, particularly in samples with a low fetal fraction, by using a targeted sequencing approach combined with parental genotypes and Hapmap data in a Bayesian maximum likelihood estimation (MLE) algorithm.
Maternal samples from four euploid and two trisomy-positive pregnancies, together with the corresponding paternal samples, were obtained under an IRB-approved protocol from patients whose fetal karyotype was known. Maternal cfDNA was extracted from plasma, and roughly 10 million sequence reads were obtained following preferential enrichment targeting specific SNPs. Parental samples were sequenced similarly to obtain genotypes.
The algorithm correctly called chromosome 18 and chromosome 21 disomy for all euploid samples and for the normal chromosomes of the aneuploid samples. The trisomy 18 and trisomy 21 calls were correct, as were the chromosome X copy numbers in male and female fetuses. The confidence produced by the algorithm exceeded 98% in all cases.
The method accurately reported the ploidy of all tested chromosomes in the six samples, including samples comprising less than 12% fetal DNA, which account for roughly 30% of first and early second trimester samples. The key difference between the MLE algorithm of the invention and published methods is that it uses parental genotypes and Hapmap data to improve accuracy and to generate a confidence metric. At low fetal fractions, all methods become less accurate; it is important to correctly identify samples that lack sufficient fetal cfDNA for a reliable call. Others have used chromosome Y specific probes to estimate the fetal fraction of male fetuses, but concurrent parental genotyping enables estimation of the fetal fraction for both sexes. Another inherent limitation of published methods that use untargeted shotgun sequencing is that the accuracy of ploidy calling varies among chromosomes because of differences in factors such as GC richness. The targeted sequencing approach of the invention is largely independent of such chromosome-scale variation and yields more consistent performance between chromosomes.
Experiment 3
The objective was to determine whether trisomy can be detected with high confidence in a triploid fetus, using novel informatics to analyze SNP loci of free-floating fetal DNA in maternal plasma.
Following an abnormal ultrasound, 20 mL of blood was drawn from a pregnant patient. After centrifugation, maternal DNA was extracted from the buffy coat (DNEASY, QIAGEN); cell-free DNA was extracted from the plasma (QIAAMP, QIAGEN). Targeted sequencing was applied to SNP loci on chromosomes 2, 21, and X in both DNA samples. Maximum-likelihood Bayesian estimation selected the most probable hypothesis from the set of all possible ploidy states. The method determines the fetal DNA fraction, the ploidy state, and an explicit confidence in the ploidy determination. No assumptions are made about the ploidy of a reference chromosome. The diagnostic uses a test statistic that is independent of sequence read counts, on which the current state of the art relies.
The method of the invention accurately diagnosed trisomy of chromosomes 2 and 21. The child fraction was estimated at 11.9% [CI 11.7-12.1]. The fetus was found to have one maternal and two paternal copies of chromosomes 2 and 21, with a confidence of effectively 1 (error probability < 10^-30). This was achieved with 92,600 and 258,100 reads on chromosomes 2 and 21, respectively.
This is the first demonstration of non-invasive prenatal diagnosis of trisomic chromosomes from maternal blood where the fetus was triploid, as confirmed by metaphase karyotype. Existing methods of non-invasive diagnosis would not detect aneuploidy in this sample. Current methods rely on a surplus of sequence reads on a trisomic chromosome relative to disomic reference chromosomes; a triploid fetus, however, has no disomic reference. Furthermore, existing methods would not achieve a similarly high-confidence ploidy determination at this fetal DNA fraction and number of sequence reads. It is straightforward to extend the method to all 24 chromosomes.
Experiment 4
The following protocol was used for 800-plex amplification of DNA isolated from the maternal plasma of a euploid pregnancy, and of genomic DNA from a trisomy 21 cell line, using standard PCR (meaning that no nesting was used). Library preparation and amplification involved single-tube blunt ending followed by A-tailing. Adaptor ligation was run using the ligation kit found in the AGILENT SURESELECT kit, and PCR was run for 7 cycles. Then, 15 cycles of STA (95°C for 30 s; 72°C for 1 min; 60°C for 4 min; 65°C for 1 min; 72°C for 30 s) were run using 800 different primer pairs targeting SNPs on chromosomes 2, 21, and X. The reaction was run at a primer concentration of 12.5 nM. The DNA was then sequenced on an ILLUMINA IIGAX sequencer. The sequencer output 1.9 million reads, of which 92% mapped to the genome; of the reads that mapped to the genome, more than 99% mapped to one of the regions targeted by the targeted primers. The numbers were essentially the same for the plasma DNA and the genomic DNA. Figure 15 shows the ratio of the two alleles for the approximately 780 SNPs detected by the sequencer in the genomic DNA taken from a cell line known to be trisomic at chromosome 21. Note that allele ratios are plotted here for ease of visualization, because allele distributions are not straightforward to read visually. The circles represent SNPs on disomic chromosomes, and the stars represent SNPs on the trisomic chromosome. Figure 16 is another representation of the same data as in Figure X, where the Y-axis is the relative number of A and B measured for each SNP, and where the X-axis is the SNP number, with the SNPs separated by chromosome. In Figure 16, SNPs 1 to 312 are found on chromosome 2, SNPs 313 to 605 are found on chromosome 21, which is trisomic, and SNPs 606 to 800 are on chromosome X. The data from chromosomes 2 and X show a disomic chromosome, as the relative sequence counts lie in three clusters: AA at the top of the graph, BB at the bottom of the graph, and AB in the middle of the graph. The data from chromosome 21, which is trisomic, show four clusters: AAA at the top of the graph, AAB around the 0.65 line (2/3), ABB around the 0.35 line (1/3), and BBB at the bottom of the graph.
Figure 17 A-D shows data for the same 800-plex protocol, but measured on DNA amplified from four plasma samples from pregnant women. For these four samples, we expect to see seven clusters of dots: (1) along the top of the graph are those loci where both the mother and the fetus are AA; (2) slightly below the top of the graph are those loci where the mother is AA and the fetus is AB; (3) slightly above the 0.5 line are those loci where the mother is AB and the fetus is AA; (4) along the 0.5 line are those loci where the mother and the fetus are both AB; (5) slightly below the 0.5 line are those loci where the mother is AB and the fetus is BB; (6) slightly above the bottom of the graph are those loci where the mother is BB and the fetus is AB; (7) along the bottom of the graph are those loci where both the mother and the fetus are BB. The smaller the fetal fraction, the smaller the separation between clusters (1) and (2), between clusters (3), (4), and (5), and between clusters (6) and (7). The separation is expected to be half of the fraction of DNA that is of fetal origin. For example, if the DNA is 20% fetal and 80% maternal, we expect (1) through (7) to be centered at 1.0, 0.9, 0.6, 0.5, 0.4, 0.1, and 0.0, respectively; see, for example, Figure 17D, POOL1_BC5_ reference ratio. If instead the DNA is 8% fetal and 92% maternal, we expect (1) through (7) to be centered at 1.00, 0.96, 0.54, 0.50, 0.46, 0.04, and 0.00, respectively; see, for example, Figure 17B, POOL1_BC2_ reference ratio. If no fetal DNA is detected, we do not expect to see (2), (3), (5), or (6); alternatively, we can say that the separation is zero, so that (1) and (2) lie on top of each other, as do (3), (4), and (5), and also (6) and (7); see, for example, Figure 17C, POOL1_BC7_ reference ratio. Note that the fetal fraction for Figure 17A, POOL1_BC1_ reference ratio, is approximately 25%.
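The expected cluster centers quoted above follow from weighting the maternal and fetal A-allele fractions by their share of the DNA. The short sketch below reproduces that arithmetic; the function name is hypothetical, and it assumes a disomic chromosome and an unbiased measurement.

```python
def cluster_centers(fetal_fraction):
    """Expected A-allele fraction for the seven mother/fetus genotype
    combinations on a disomic chromosome, in the order (1) to (7)."""
    contexts = [("AA", "AA"), ("AA", "AB"), ("AB", "AA"), ("AB", "AB"),
                ("AB", "BB"), ("BB", "AB"), ("BB", "BB")]

    def a_fraction(genotype):
        return genotype.count("A") / 2

    f = fetal_fraction
    # Maternal DNA contributes (1 - f) of the molecules, fetal DNA contributes f.
    return [round((1 - f) * a_fraction(mother) + f * a_fraction(fetus), 4)
            for mother, fetus in contexts]

print(cluster_centers(0.20))  # 20% fetal, 80% maternal
print(cluster_centers(0.08))  # 8% fetal, 92% maternal
```

With a fetal fraction of zero, clusters (1) and (2), clusters (3) to (5), and clusters (6) and (7) coincide, matching the no-fetal-DNA case described in the text.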
Experiment 5
Most methods of DNA amplification and measurement produce some allelic bias, in which the intensities or counts detected for the two alleles typically found at a locus do not represent the actual amounts of the alleles in the DNA sample. For example, for a single individual at a heterozygous locus, we expect to see a 1:1 ratio of the two alleles, which is the theoretical ratio expected for a heterozygous locus; because of allelic bias, however, we may see 55:45 or even 60:40. Note also that, in the context of sequencing, if the depth of read is low, simple random noise alone can produce significant allelic bias. In one embodiment, if it is possible to model the behavior of each SNP such that a consistent bias is observed for particular alleles, that bias can be corrected. Figure 18 shows the fraction of the data that can be explained by binomial variance, before and after bias correction. In Figure 18, the stars represent the allelic bias observed in the raw sequence data for the 800-plex experiment; the circles represent the allelic bias after correction. Note that if there were no allelic bias at all, we would expect the data to fall along the line x=y. A similar set of data, produced by amplifying DNA using a 150-plex targeted amplification, fell very close to the 1:1 line after bias correction.
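The remark about read depth can be made concrete with the standard deviation of a binomial proportion. The sketch below is only an illustration of sampling noise at a heterozygous locus; the function name and the example depths are assumptions, not values from the experiments.

```python
import math

def het_ratio_std(depth, p=0.5):
    """Standard deviation of the observed allele fraction at a locus with
    true allele fraction p when `depth` reads are sampled (binomial model)."""
    return math.sqrt(p * (1 - p) / depth)

for depth in (20, 100, 2000):
    sd = het_ratio_std(depth)
    print(f"depth {depth:>5}: allele fraction 0.50 +/- {sd:.3f}")
```

At a depth of 20 reads, one standard deviation is already about 0.11, so an observed 60:40 split at a heterozygous locus is unremarkable; at a depth of 2,000 reads the same split would point to a systematic bias rather than noise.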
Experiment 6
Universal amplification of DNA using ligated adaptors and primers specific to the adaptor tags, where the primer annealing and extension times are limited to a few minutes, has the effect of enriching the proportion of shorter DNA strands. Most library protocols designed to create DNA libraries suitable for sequencing contain such a step, and example protocols have been published and are well known to those skilled in the art. In some embodiments of the invention, adaptors containing a universal tag are ligated to the plasma DNA and amplified using primers specific to the adaptor tag. In certain embodiments, the universal tag can be the same tag as is used for sequencing, it can be a universal tag used only for PCR amplification, or it can be a set of tags. Because fetal DNA is typically short in nature, whereas maternal DNA can be both short and long, this approach has the effect of enriching the proportion of fetal DNA in the mixture. The free-floating DNA, which is thought to be DNA from apoptotic cells and which contains both fetal and maternal DNA, is short, mostly under 200 bp. Cellular DNA released by cell lysis, a common phenomenon after phlebotomy, is typically almost exclusively maternal and is also quite long, mostly above 500 bp. Therefore, blood samples that have been left standing for more than a few minutes will contain a mixture of short (fetal + maternal) and longer (maternal) DNA. Performing a universal amplification with relatively short extension times on maternal plasma, followed by targeted amplification, tends to increase the relative proportion of fetal DNA compared with plasma that has been amplified using targeted amplification alone. This can be seen in Figure 19, which shows the fetal percentage measured when the input is plasma DNA (vertical axis) versus the fetal percentage measured when the input DNA is plasma DNA from which a library has been prepared using the ILLUMINA GAIIx library preparation protocol. All of the dots fall below the line, indicating that the library preparation step enriches the fraction of DNA that is of fetal origin. Two plasma samples that were red, indicating hemolysis and therefore an increased amount of long maternal DNA from cell lysis, show a particularly significant enrichment of the fetal fraction when library preparation is performed before targeted amplification. The method disclosed herein is particularly useful in cases where there is hemolysis, or where some other situation has occurred in which cells containing relatively long strands of contaminating DNA have lysed, contaminating the mixed sample of short DNA with long DNA. The relatively short annealing and extension times are typically between 30 seconds and 2 minutes, though they can be as short as 5 or 10 seconds or less, or as long as 5 or 10 minutes.
Experiment 7
The following protocol was used for 1,200-plex amplification of DNA isolated from maternal plasma from a euploid pregnancy, and of genomic DNA from a triploidy 21 cell line, using both a direct PCR protocol and a semi-nested approach. Library preparation and amplification involved single-tube blunt ending followed by A-tailing. Adaptor ligation was run using a modification of the ligation kit found in the AGILENT SURESELECT kit, and PCR was run for 7 cycles. The targeted primer pool contained 550 assays for SNPs from chromosome 21 and 325 assays for SNPs from each of chromosomes 1 and X. Both protocols involved 15 cycles of STA (95 °C for 30 s; 72 °C for 1 min; 60 °C for 4 min; 65 °C for 30 s; 72 °C for 30 s) using a primer concentration of 16 nM. The semi-nested PCR protocol involved a second amplification of 15 cycles of STA (95 °C for 30 s; 72 °C for 1 min; 60 °C for 4 min; 65 °C for 30 s; 72 °C for 30 s) using an inner forward tag concentration of 29 nM and a reverse tag concentration of 1 μM or 0.1 μM. The DNA was then sequenced with an ILLUMINA IIGAX sequencer. For the direct PCR protocol, 73% of the reads mapped to the genome; for the semi-nested protocol, 97.2% of the sequence reads mapped to the genome. The semi-nested protocol therefore yielded approximately 30% more information, presumably mostly due to the elimination of the primers most likely to cause primer dimers.
The depth of read variability tends to be higher when using the semi-nested protocol than when using the direct PCR protocol (see Figure 20), where the diamonds refer to the depth of read for loci run with the semi-nested protocol and the squares refer to the depth of read for loci run with no nesting. The SNPs are arranged by depth of read for the diamonds, so the diamonds all fall on a curved line, while the squares appear only loosely correlated. The arrangement of the SNPs is arbitrary; it is the height of a point, not its left-to-right position, that denotes depth of read.
In some embodiments, the methods described herein can achieve excellent depth of read (DOR) variances. For example, in one version of this experiment (Figure 21) using 1,200-plex direct PCR amplification of genomic DNA, of the 1,200 assays: 1,186 assays had a DOR greater than 10; the average depth of read was 400; and 1,063 assays (88.6%) had a depth of read between 200 and 800, an ideal window in which the number of reads for each allele is high enough to give meaningful data but not so high that the marginal usefulness of those reads is particularly small. Only 12 alleles had a higher depth of read, with the highest at 1,035 reads. The standard deviation of the DOR was 290, the average DOR was 453, the coefficient of variation of the DOR was 64%, there were 950,000 total reads, and 63.1% of the reads mapped to the genome. In another experiment (Figure 22) using a 1,200-plex semi-nested protocol, the DOR was higher. The standard deviation of the DOR was 583, the average DOR was 630, the coefficient of variation of the DOR was 93%, there were 870,000 total reads, and 96.3% of the reads mapped to the genome. Note that in both of these cases the SNPs are arranged by depth of read for the mother, so the curved line represents the maternal depth of read. The difference between child and father is not significant; only the trend is significant for the purpose of this explanation.
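The coefficient of variation quoted above is the standard deviation of the depth of read divided by its mean. As a minimal sketch (the function name is an illustrative choice, not something defined in this document), the two reported percentages can be reproduced from the reported standard deviations and averages:

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Return the coefficient of variation (std / mean) as a percentage."""
    if mean <= 0:
        raise ValueError("mean depth of read must be positive")
    return 100.0 * std_dev / mean


# Standard deviation and average DOR reported for the two 1,200-plex runs.
direct_cv = coefficient_of_variation(290, 453)       # direct PCR (Figure 21)
semi_nested_cv = coefficient_of_variation(583, 630)  # semi-nested (Figure 22)

print(f"direct: {direct_cv:.0f}%, semi-nested: {semi_nested_cv:.0f}%")
```

Rounded to whole percentages, these agree with the 64% and 93% stated in the text.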
Experiment 8
In one experiment, the semi-nested 1,200-plex PCR protocol was used to amplify DNA from one cell and from three cells. This experiment is relevant to prenatal aneuploidy testing using fetal cells isolated from maternal blood, and to preimplantation genetic diagnosis using biopsied blastomeres or trophectoderm samples. There were 3 replicates of 1 cell and of 3 cells from 2 individuals (46XY and 47XX+21) per condition. The assays targeted chromosomes 1, 21 and X. Three different lysis methods were used: ARCTURUS, MPERv2 and alkaline lysis. Sequencing was run with 48 samples multiplexed in one sequencing lane. The algorithm returned correct ploidy calls for each of the three chromosomes and for each of the replicates.
Experiment 9
In one experiment, four maternal plasma samples were prepared and amplified using a hemi-nested 9,600-plex protocol. The samples were prepared in the following way: up to 40 mL of maternal blood was centrifuged to isolate the buffy coat and the plasma. Maternal genomic DNA was prepared from the buffy coat, and paternal DNA was prepared from a blood sample or saliva sample. Cell-free DNA in the maternal plasma was isolated using the QIAGEN CIRCULATING NUCLEIC ACID kit and eluted in 45 μL TE buffer according to the manufacturer's instructions. Universal ligation adaptors were appended to the ends of each molecule in 35 μL of purified plasma DNA, and libraries were amplified for 7 cycles using adaptor-specific primers. Libraries were purified with AGENCOURT AMPURE beads and eluted in 50 μL water.
3 μL of the DNA was amplified with 15 cycles of STA (95 °C for 10 min for initial polymerase activation; then 15 cycles of 95 °C for 30 s, 72 °C for 10 s, 65 °C for 1 min, 60 °C for 8 min, 65 °C for 3 min and 72 °C for 30 s; and a final extension at 72 °C for 2 min) using 9,600 target-specific tagged reverse primers at a primer concentration of 14.5 nM and one library-adaptor-specific forward primer at 500 nM.
The hemi-nested PCR protocol involved a second amplification of a dilution of the first STA product for 15 cycles of STA (95 °C for 10 min for initial polymerase activation; then 15 cycles of 95 °C for 30 s, 65 °C for 1 min, 60 °C for 5 min, 65 °C for 5 min and 72 °C for 30 s; and a final extension at 72 °C for 2 min) using a reverse tag concentration of 1000 nM and a concentration of 16.6 nM for each of the 9,600 target-specific forward primers.
An aliquot of the STA products was then amplified by standard PCR for 10 cycles with 1 μM of tag-specific forward and barcoded reverse primers to generate barcoded sequencing libraries. An aliquot of each library was mixed with libraries carrying different barcodes and purified using a spin column.
In this way, 9,600 primers were used in the single-well reactions; the primers were designed to target SNPs found on chromosomes 1, 2, 13, 18, 21, X and Y. The amplicons were then sequenced using an ILLUMINA GAIIX sequencer. Per sample, approximately 3.9 million reads were generated by the sequencer, of which 3.7 million reads (94%) mapped to the genome; of those, 2.9 million reads (74%) mapped to targeted SNPs, with an average depth of read of 344 and a median depth of read of 255. The fetal fractions for the four samples were found to be 9.9%, 18.9%, 16.3% and 21.2%.
The relevant maternal and paternal genomic DNA samples were amplified using a semi-nested 9,600-plex protocol and sequenced. The semi-nested protocol differs in that it applies 9,600 outer forward primers and tagged reverse primers at 7.3 nM in the first STA. The thermocycling conditions and composition of the second STA, and the barcoding PCR, were the same as for the hemi-nested protocol.
The sequencing data were analyzed using the informatics methods disclosed herein, and the ploidy state was called at six chromosomes for the fetuses whose DNA was present in the 4 maternal plasma samples. The ploidy calls for all 28 chromosomes in the set were made correctly with confidences above 99.2%, except for one chromosome that was called correctly but with a confidence of 83%.
Figure 23 shows the depth of read for the 9,600-plex hemi-nested approach along with the depth of read for the 1,200-plex semi-nested approach described in Experiment 7; the number of SNPs with a depth of read greater than 100, greater than 200 and greater than 400 was significantly higher than in the 1,200-plex protocol. The number of reads at the 90th percentile can be divided by the number of reads at the 10th percentile to give a dimensionless metric that indicates the uniformity of the depth of read; the smaller the number, the more uniform (narrow) the depth of read. The average ratio of the 90th percentile to the 10th percentile is 11.5 for the method run in Experiment 9 and 5.6 for the method run in Experiment 7. For a given protocol plexity, a narrower depth of read gives better sequencing efficiency, because fewer sequence reads are needed to ensure that a certain percentage of reads exceed a read number threshold.
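The 90th/10th percentile ratio described above is straightforward to compute from per-SNP depths of read. The following is a minimal sketch using Python's standard library; the function name and the two depth lists are invented for illustration and are not data from these experiments. The percentile interpolation used by `statistics.quantiles` is also an assumption, since the text does not say how the percentiles were computed.

```python
import statistics


def depth_uniformity_ratio(depths):
    """Ratio of the 90th-percentile depth of read to the 10th-percentile depth."""
    # n=10 returns nine cut points: the 10th, 20th, ..., 90th percentiles.
    deciles = statistics.quantiles(depths, n=10)
    p10, p90 = deciles[0], deciles[-1]
    if p10 <= 0:
        raise ValueError("10th-percentile depth must be positive")
    return p90 / p10


# Hypothetical per-SNP depths of read, made up purely for illustration.
narrow = [380, 400, 410, 420, 430, 450, 460, 480, 500, 520]
wide = [40, 90, 150, 250, 400, 600, 800, 1000, 1300, 1700]

print(depth_uniformity_ratio(narrow))  # close to 1: uniform depth of read
print(depth_uniformity_ratio(wide))    # much larger: uneven depth of read
```

A smaller ratio means fewer total reads are needed before the least-covered loci reach a usable depth.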
Experiment 10
In one experiment, four maternal plasma samples were prepared and amplified using a semi-nested 9,600-plex protocol. The details of Experiment 10 were very similar to those of Experiment 9, including the identity of the four samples, the exception being the nesting protocol. The ploidy calls for all 28 chromosomes in the set were made correctly with confidences above 99.7%. 7.6 million reads (97%) mapped to the genome, and 6.3 million reads (80%) mapped to the targeted SNPs. The average depth of read was 751, and the median depth of read was 396.
Experiment 11
In one experiment, three maternal plasma samples were each split into five equal portions, and each portion was amplified using either 2,400 multiplexed primers (four portions) or 1,200 multiplexed primers (one portion) with a semi-nested protocol, for a total of 10,800 primers. After amplification, the portions were pooled together for sequencing. The details of Experiment 11 were very similar to those of Experiment 9, the exceptions being the nesting protocol and the split-and-pool approach. The ploidy calls for all 21 chromosomes in the set were made correctly with confidences above 99.7%, except for one missed call, for which the confidence was 83%. 3.4 million reads mapped to targeted SNPs; the average depth of read was 404 and the median depth of read was 258.
Experiment 12
In one experiment, four maternal plasma samples were each split into four equal portions, and each portion was amplified using 2,400 multiplexed primers with a semi-nested protocol, for a total of 9,600 primers. After amplification, the portions were pooled together for sequencing. The details of Experiment 12 were very similar to those of Experiment 9, the exceptions being the nesting protocol and the split-and-pool approach. The ploidy calls for all 28 chromosomes in the set were made correctly with confidences above 97%, except for one missed call, for which the confidence was 78%. 4.5 million reads mapped to targeted SNPs; the average depth of read was 535 and the median depth of read was 412.
Experiment 13
In one experiment, four maternal plasma samples were prepared and amplified using a 9,600-plex triply hemi-nested protocol, for a total of 9,600 primers. The details of Experiment 13 were very similar to those of Experiment 9, the exception being the nesting protocol, which involved three rounds of amplification; the three rounds involved 15, 10 and 15 STA cycles, respectively. The ploidy calls for 27 of the 28 chromosomes in the set were made correctly with confidences above 99.9%, except for one that was called correctly with a confidence of 94.6%, and there was one missed call with a confidence of 80.8%. 3.5 million reads mapped to targeted SNPs; the average depth of read was 414 and the median depth of read was 249.
Experiment 14
In one experiment, 45 sets of cells were amplified using a 1,200-plex semi-nested protocol and sequenced, and ploidy determinations were made at three chromosomes. Note that this experiment is meant to simulate the conditions of performing preimplantation genetic diagnosis on single-cell biopsies from day 3 embryos or on trophectoderm biopsies from day 5 embryos. 15 individual single cells and 30 sets of three cells were placed in 45 individual reaction tubes, for a total of 45 reactions; each reaction contained cells from only one cell line, but different reactions contained cells from different cell lines. The cells were prepared in 5 μL washing buffer and lysed by adding 5 μL ARCTURUS PICOPURE lysis buffer (APPLIED BIOSYSTEMS) and incubating at 56 °C for 20 min and then at 95 °C for 10 min.
The DNA of the single cells or three cells was amplified with 25 cycles of STA (95 °C for 10 min for initial polymerase activation; then 25 cycles of 95 °C for 30 s, 72 °C for 10 s, 65 °C for 1 min, 60 °C for 8 min, 65 °C for 3 min and 72 °C for 30 s; and a final extension at 72 °C for 2 min) using 1,200 target-specific forward and tagged reverse primers at a primer concentration of 50 nM.
The semi-nested PCR protocol involved three parallel second amplifications of a dilution of the first STA product for 20 cycles of STA (95 °C for 10 min for initial polymerase activation; then 15 cycles of 95 °C for 30 s, 65 °C for 1 min, 60 °C for 5 min, 65 °C for 5 min and 72 °C for 30 s; and a final extension at 72 °C for 2 min) using a reverse tag-specific primer concentration of 1000 nM and a concentration of 60 nM for each of 400 target-specific nested forward primers. The three parallel 400-plex reactions thus amplified the total of 1,200 targets amplified in the first STA.
An aliquot of the STA products was then amplified by standard PCR for 15 cycles with 1 μM of tag-specific forward and barcoded reverse primers to generate barcoded sequencing libraries. An aliquot of each library was mixed with libraries carrying different barcodes and purified using a spin column.
In this way, 1,200 primers were used in the single-cell reactions; the primers were designed to target SNPs found on chromosomes 1, 21 and X. The amplicons were then sequenced using an ILLUMINA GAIIX sequencer. Per sample, approximately 3.9 million reads were generated by the sequencer, with 500,000 to 800,000 reads mapping to the genome (74% to 94% of all reads per sample).
The relevant maternal and paternal genomic DNA samples from cell lines were analyzed using the same semi-nested 1,200-plex assay pool and a similar protocol, with fewer cycles and a 1,200-plex second STA, and sequenced.
The sequencing data were analyzed using the informatics methods disclosed herein, and the ploidy state was called at the three chromosomes for the samples.
Figure 24 shows normalized depth of read ratios (vertical axis) for six samples at three chromosomes (1 = chromosome 1; 2 = chromosome 21; 3 = chromosome X). Each ratio was set equal to the number of reads mapping to that chromosome, normalized, and divided by the number of reads mapping to that chromosome averaged over three wells, each comprising three 46XY cells. The three sets of data points corresponding to the 46XY reactions are expected to have ratios of 1:1. The three sets of data points corresponding to the 47XX+21 cells are expected to have ratios of 1:1 for chromosome 1, 1.5:1 for chromosome 21, and 2:1 for chromosome X.
Figure 25 shows allele ratios plotted for three chromosomes (1, 21, X) for three reactions. The plot at the lower left shows a reaction on three 46XY cells. The left region contains the allele ratios for chromosome 1, the middle region the allele ratios for chromosome 21, and the right region the allele ratios for the X chromosome. For the 46XY cells, for chromosome 1 we expect to see ratios of 1, 0.5 and 0, corresponding to the AA, AB and BB SNP genotypes. For the 46XY cells, for chromosome 21 we expect to see ratios of 1, 0.5 and 0, corresponding to the AA, AB and BB SNP genotypes. For the 46XY cells, for the X chromosome we expect to see ratios of 1 and 0, corresponding to the A and B SNP genotypes. The plot at the lower right shows a reaction on three 47XX+21 cells. As in the lower-left plot, the allele ratios are separated by chromosome. For the 47XX+21 cells, for chromosome 1 we expect to see ratios of 1, 0.5 and 0, corresponding to the AA, AB and BB SNP genotypes. For the 47XX+21 cells, for chromosome 21 we expect to see ratios of 1, 0.67, 0.33 and 0, corresponding to the AAA, AAB, ABB and BBB SNP genotypes. For the 47XX+21 cells, for the X chromosome we expect to see ratios of 1, 0.5 and 0, corresponding to the AA, AB and BB SNP genotypes. The plot at the upper right was obtained from reactions comprising 1 ng of genomic DNA from the 47XX+21 cell line. Figure 26 shows the same graphs as Figure 25, but for reactions performed on only one cell. The left graph is for a reaction that contained a 47XX+21 cell, and the right graph is for a reaction that contained a 46XX cell.
From the graphs shown in Figure 25 and Figure 26, it is visually apparent that there are two clusters of points for chromosomes where we expect to see ratios of 1 and 0; three clusters of points for chromosomes where we expect to see ratios of 1, 0.5 and 0; and four clusters of points for chromosomes where we expect to see ratios of 1, 0.67, 0.33 and 0. The Parental Support algorithm was able to make correct calls on all three chromosomes for all 45 reactions.
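The expected cluster positions follow directly from the chromosome copy number: with n copies, a genotype can carry n, n-1, ..., 0 copies of allele A, giving n + 1 possible ratios. A minimal sketch of that arithmetic follows; the function name is an illustrative choice, and this is not the Parental Support algorithm itself.

```python
def expected_allele_ratios(copy_number: int) -> list[float]:
    """Expected fraction of A-allele reads for each genotype at a copy number."""
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    # A genotype with k copies of allele A out of n gives a ratio of k / n.
    return [round(k / copy_number, 2) for k in range(copy_number, -1, -1)]


print(expected_allele_ratios(1))  # one copy, e.g. X in 46XY cells
print(expected_allele_ratios(2))  # two copies, e.g. chromosome 1
print(expected_allele_ratios(3))  # three copies, e.g. chromosome 21 in 47XX+21
```

One, two and three copies give two, three and four clusters respectively, matching the ratios listed in the text.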
Experiment 15
In one experiment, maternal plasma samples were prepared and amplified using a hemi-nested 19,488-plex protocol. The samples were prepared in the following way: up to 20 mL of maternal blood was centrifuged to isolate the buffy coat and the plasma. Maternal genomic DNA was prepared from the buffy coat, and paternal DNA was prepared from a blood sample or saliva sample. Cell-free DNA in the maternal plasma was isolated using the QIAGEN CIRCULATING NUCLEIC ACID kit and eluted in 50 μL TE buffer according to the manufacturer's instructions. Universal ligation adaptors were appended to the ends of each molecule in 40 μL of purified plasma DNA, and libraries were amplified for 9 cycles using adaptor-specific primers. Libraries were purified with AGENCOURT AMPURE beads and eluted in 50 μL DNA suspension buffer.
6 μL of the DNA was amplified with 15 cycles of first-round STA (95 °C for 10 min for initial polymerase activation; then 15 cycles of 96 °C for 30 s, 65 °C for 1 min, 58 °C for 6 min, 60 °C for 8 min, 65 °C for 4 min and 72 °C for 30 s; and a final extension at 72 °C for 2 min) using 19,488 target-specific tagged reverse primers at a primer concentration of 7.5 nM and one library-adaptor-specific forward primer at 500 nM.
The hemi-nested PCR protocol involved a second amplification of a dilution of the first-round STA product for 15 cycles (second-round STA) (95 °C for 10 min for initial polymerase activation; then 15 cycles of 95 °C for 30 s, 65 °C for 1 min, 60 °C for 5 min, 65 °C for 5 min and 72 °C for 30 s; and a final extension at 72 °C for 2 min) using a reverse tag concentration of 1000 nM and a concentration of 20 nM for each of the 19,488 target-specific forward primers.
An aliquot of the second-round STA products was then amplified by standard PCR for 12 cycles with 1 μM of tag-specific forward and barcoded reverse primers to generate barcoded sequencing libraries. An aliquot of each library was mixed with libraries carrying different barcodes and purified using a spin column.
In this way, 19,488 primers were used in the single-well reactions; the primers were designed to target SNPs found on chromosomes 1, 2, 13, 18, 21, X and Y. The amplicons were then sequenced using an ILLUMINA GAIIX sequencer. Per plasma sample, approximately 10 million reads were generated by the sequencer, of which 9.4 to 9.6 million reads (94% to 96%) mapped to the genome; of those, 99.95% mapped to targeted SNPs, with an average depth of read of 460 and a median depth of read of 350. For comparison, a perfectly uniform distribution would be 10 million reads / 19,488 targets = 513 reads per target. As for primer dimers, 30,000 reads came from sequenced primer dimers (0.3% of the reads generated by the sequencer). For the genomic samples, 99.4% to 99.7% of the reads mapped to the genome; of those, 99.99% mapped to targeted SNPs, and 0.1% of the reads generated by the sequencer were primer dimers.
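The two comparison figures in this paragraph are simple ratios. A short worked sketch, using only the approximate counts stated above (the variable names are illustrative):

```python
total_reads = 10_000_000     # approximate reads generated per plasma sample
n_targets = 19_488           # number of targeted SNPs
primer_dimer_reads = 30_000  # reads attributed to sequenced primer dimers

# Depth of read per target if reads were spread perfectly evenly.
uniform_depth = total_reads / n_targets

# Share of all sequencer reads that were primer dimers.
dimer_fraction = primer_dimer_reads / total_reads

print(f"uniform depth: {uniform_depth:.0f} reads/target")
print(f"primer dimers: {dimer_fraction:.1%} of reads")
```

The observed average depth of 460 is close to this ideal of about 513 reads per target; the gap reflects reads that did not map to the genome or to a targeted SNP.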
For plasma samples with 10 million sequencing reads, typically at least 19,350 of the 19,488 targeted SNPs (99.3%) were amplified and sequenced. For DNA samples with 2 million sequencing reads, typically at least 19,000 targeted SNPs (97.5%) were amplified and sequenced. The lower number may be due to sampling noise, since the number of reads was lower and the sequencer missed some amplified products. If desired, the number of sequencing reads can be increased to increase the number of targeted SNPs that are amplified and sequenced.
The relevant maternal and paternal genomic DNA samples were amplified in the first-round STA using a semi-nested protocol with 19,488 outer forward primers and tagged reverse primers at 7.5 nM. The thermocycling conditions and composition of the second-round STA, and the barcoding PCR, were the same as for the hemi-nested protocol.
The average fetal fraction for the 407 samples was found to be 14.8%. The sequencing data were analyzed using the informatics methods disclosed herein, and the ploidy state was called at four chromosomes (13, 18, 21, Y) for the fetuses whose DNA was present in 378 of the 407 maternal plasma samples, and at chromosome X in 375 of the 407 maternal plasma samples. The ploidy calls for all 1,887 chromosomes in the set were made correctly with confidences above 90%. 1,882 of the 1,887 calls had confidences above 95%, and 1,862 of the 1,887 calls had confidences above 99%.
A control experiment using a protocol similar to the plasma PCR protocol was performed using water instead of DNA extracted from plasma. Based on six such experiments, 5% to 6% of the sequencing reads were primer dimers. The other sequencing reads were due to background noise. This experiment demonstrates that even in the absence of a nucleic acid sample containing target loci for the primers to hybridize to (so that primers could only hybridize to other primers and form amplified primer dimers), few primer dimers were formed.
Experiment 16
The following experiment illustrates an exemplary method for designing and selecting a library of primers that can be used in any of the multiplex PCR methods of the invention. The goal is to select, from an initial library of candidate primers, primers that can be used to simultaneously amplify a large number of target loci (or a subset of target loci) in a single reaction. For an initial set of candidate target loci, primers do not have to be designed or selected for every target locus. Preferably, primers are designed and selected for a large portion of the most desirable target loci.
Step 1
A set of candidate target loci (such as SNPs) was selected based on publicly available information about desired parameters for the target loci, such as the frequency of the SNPs within a target population or the heterozygosity rate of the SNPs (www.ncbi.nlm.nih.gov/projects/SNP/; Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001 Jan 1; 29(1):308-11, each of which is incorporated herein by reference in its entirety). For each candidate locus, one or more PCR primer pairs were designed using the Primer3 program (www.primer3.sourceforge.net; libprimer3 release 2.2.3, which is hereby incorporated by reference in its entirety). If there was no feasible PCR primer design for a particular target locus, that target locus was eliminated from further consideration.
If desired, a "target locus score" (a higher score representing higher desirability) can be calculated for most or all of the target loci, for example a target locus score calculated as a weighted average of various desired parameters for the target locus. The parameters may be assigned different weights based on their importance for the particular application in which the primers will be used. Exemplary parameters include the heterozygosity rate of the target locus, the disease prevalence associated with a sequence (e.g., a polymorphism) at the target locus, the disease penetrance associated with a sequence (e.g., a polymorphism) at the target locus, the specificity of the candidate primer(s) used to amplify the target locus, the size of the candidate primer(s) used to amplify the target locus, and the size of the target amplicon.
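As a minimal sketch of such a weighted average: the parameter names, weights and values below are hypothetical, and the sketch assumes each parameter has already been rescaled so that a larger value is more desirable (the text does not specify any scaling).

```python
def target_locus_score(parameters: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of locus parameters; a higher score is more desirable."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * parameters[name] for name in weights) / total_weight


# Hypothetical locus: every name and number here is made up for illustration.
locus_parameters = {"heterozygosity": 0.48, "primer_specificity": 0.90, "amplicon_fit": 0.70}
application_weights = {"heterozygosity": 3.0, "primer_specificity": 2.0, "amplicon_fit": 1.0}

print(round(target_locus_score(locus_parameters, application_weights), 3))
```

Changing the weights changes which loci rank highest, which is how the score can be tuned to a particular application.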
Step 2
A thermodynamic interaction score was calculated between each primer and all primers for all other target loci from Step 1 (see, e.g., Allawi, H.T. and SantaLucia, J., Jr. (1998), "Thermodynamics of Internal C-T Mismatches in DNA", Nucleic Acids Res. 26, 2694-2701; Peyret, N., Seneviratne, P.A., Allawi, H.T. and SantaLucia, J., Jr. (1999), "Nearest-Neighbor Thermodynamics and NMR of DNA Sequences with Internal A-A, C-C, G-G, and T-T Mismatches", Biochemistry 38, 3468-3477; Allawi, H.T. and SantaLucia, J., Jr. (1998), "Nearest-Neighbor Thermodynamics of Internal A-C Mismatches in DNA: Sequence Dependence and pH Effects", Biochemistry 37, 9435-9444; Allawi, H.T. and SantaLucia, J., Jr. (1998), "Nearest Neighbor Thermodynamic Parameters for Internal G-A Mismatches in DNA", Biochemistry 37, 2170-2179; Allawi, H.T. and SantaLucia, J., Jr. (1997), "Thermodynamics and NMR of Internal G-T Mismatches in DNA", Biochemistry 36, 10581-10594; and MultiPLX 2.1 (Kaplinski L, Andreson R, Puurand T, Remm M. MultiPLX: automatic grouping and evaluation of PCR primers. Bioinformatics. 2005 Apr 15; 21(8):1701-2), each of which is hereby incorporated by reference in its entirety). This step produced a 2D matrix of interaction scores. The interaction score predicts the likelihood of primer dimers involving the two interacting primers. The score was calculated as follows:
interaction_score = max(-ΔG_2, 0.8 * (-ΔG_1))
where
ΔG_2 = Gibbs energy (the energy required to break the dimer) for a dimer that is extensible by PCR on both ends (i.e., the 3' end of each primer anneals to the other primer); and
ΔG_1 = Gibbs energy for a dimer that is extensible by PCR on at least one end.
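The score formula can be written as a one-line function. This sketch assumes the two Gibbs energies have already been computed by a nearest-neighbor thermodynamic model (which is not reimplemented here) and are given in the same energy units, with more negative values meaning a more stable dimer; the function name and example values are illustrative only.

```python
def interaction_score(delta_g_2: float, delta_g_1: float) -> float:
    """Primer-dimer interaction score; a higher score means a more likely dimer.

    delta_g_2: Gibbs energy of the dimer extensible by PCR on both ends.
    delta_g_1: Gibbs energy of the dimer extensible by PCR on at least one end.
    """
    return max(-delta_g_2, 0.8 * (-delta_g_1))


# Hypothetical energies, chosen only to show both branches of the max().
print(interaction_score(-5.0, -8.0))  # the one-end dimer term dominates
print(interaction_score(-7.0, -8.0))  # the two-end dimer term dominates
```

The 0.8 factor down-weights dimers that are extensible on only one end relative to dimers that are extensible on both ends.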
Step 3:
For each target locus, if there was more than one primer pair design, the following method was used to select one design:
1. For each primer pair design for the locus, find the worst-case (highest) interaction score between the two primers in that design and all primers from all designs for all other target loci.
2. Select the design with the best (lowest) worst-case interaction score.
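The two sub-steps amount to a minimax selection: take the maximum score per design, then pick the design whose maximum is smallest. A minimal sketch follows, in which the primer names, the toy score table and the function names are all invented for illustration; a real run would pass in the thermodynamic interaction score from Step 2.

```python
def select_design(candidate_designs, other_primers, interaction_score):
    """Return the candidate primer pair with the lowest worst-case score.

    candidate_designs: list of (forward, reverse) primer tuples for one locus.
    other_primers: primers from all designs for all other target loci.
    interaction_score: callable (primer_a, primer_b) -> float, higher is worse.
    """
    def worst_case(design):
        return max(interaction_score(p, q) for p in design for q in other_primers)

    return min(candidate_designs, key=worst_case)


# Toy scores keyed by unordered primer pair; names and values are made up.
toy_scores = {
    frozenset(("F1a", "X1")): 9.0, frozenset(("R1a", "X1")): 2.0,
    frozenset(("F1b", "X1")): 4.0, frozenset(("R1b", "X1")): 3.0,
}


def toy_lookup(primer_a, primer_b):
    return toy_scores[frozenset((primer_a, primer_b))]


best = select_design([("F1a", "R1a"), ("F1b", "R1b")], ["X1"], toy_lookup)
print(best)
```

Here the second design is chosen because its worst interaction (4.0) is milder than the first design's worst interaction (9.0).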
Step 4
Build a graph in which each node represents one locus and its associated primer pair design (e.g., a maximal clique problem). Create one edge between every pair of nodes. Assign to each edge a weight equal to the worst-case (highest) interaction score between the primers associated with the two nodes connected by that edge.
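One way to hold such a complete weighted graph is a dictionary keyed by unordered node pairs. This is a minimal sketch under that assumption; the locus names, primer names and toy scores are hypothetical, and the document does not prescribe a data structure.

```python
from itertools import combinations


def build_graph(designs, interaction_score):
    """Build a complete weighted graph over loci.

    designs: dict mapping locus id -> tuple of primers in its selected design.
    Returns a dict mapping frozenset({locus_a, locus_b}) -> edge weight, where
    the weight is the highest interaction score between the two designs.
    """
    edges = {}
    for locus_a, locus_b in combinations(designs, 2):
        edges[frozenset((locus_a, locus_b))] = max(
            interaction_score(p, q)
            for p in designs[locus_a]
            for q in designs[locus_b]
        )
    return edges


# Toy inputs, invented for illustration only.
toy_designs = {"L1": ("F1", "R1"), "L2": ("F2", "R2"), "L3": ("F3", "R3")}
toy_pair_scores = {frozenset(("F1", "R2")): 7.5, frozenset(("R1", "F3")): 3.0}


def toy_score(primer_a, primer_b):
    # Pairs not listed in the toy table get a low default score.
    return toy_pair_scores.get(frozenset((primer_a, primer_b)), 1.0)


graph = build_graph(toy_designs, toy_score)
print(graph[frozenset(("L1", "L2"))])
```

With N loci the graph has N(N-1)/2 edges, so for thousands of loci the edge weights are best computed once and reused across threshold iterations.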
Step 5
If desired, for every pair of designs for two different target loci in which one primer from one design and one primer from the other design would anneal to overlapping target regions, add an additional edge between the nodes for the two designs. Set the weight of these edges equal to the highest weight assigned in Step 4. Step 5 thus prevents the library from containing primers that would anneal to overlapping target regions and therefore interfere with each other during a multiplex PCR reaction.
Step 6
Calculate an initial interaction score threshold as follows:
weight_threshold = max(edge_weight) - 0.05 * (max(edge_weight) - min(edge_weight))
where
max(edge_weight) is the maximum edge weight in the graph; and
min(edge_weight) is the minimum edge weight in the graph.
Set the initial bounds for the threshold as follows:
max_weight_threshold = max(edge_weight)
min_weight_threshold = min(edge_weight)
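The initial threshold therefore sits 5% of the way down from the largest edge weight toward the smallest. A minimal sketch of that initialization (the function name and the example weights are illustrative):

```python
def initial_threshold(edge_weights):
    """Return (weight_threshold, min_weight_threshold, max_weight_threshold)."""
    highest, lowest = max(edge_weights), min(edge_weights)
    weight_threshold = highest - 0.05 * (highest - lowest)
    return weight_threshold, lowest, highest


# Hypothetical edge weights spanning 2.0 to 12.0.
threshold, lower_bound, upper_bound = initial_threshold([2.0, 7.5, 12.0])
print(threshold, lower_bound, upper_bound)
```

Starting near the top of the range means only the strongest interactions are treated as conflicts on the first pass.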
Step 7
Construct a new graph consisting of the same set of nodes as the graph from Step 5 but containing only the edges whose weights exceed weight_threshold. This step thus ignores interactions with scores equal to or below weight_threshold.
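If the graph's edges are stored as a mapping from node pairs to weights (an assumption made for this sketch; the text does not specify a representation), the thresholded graph is a simple filter:

```python
def filter_edges(edges, weight_threshold):
    """Keep only the edges whose weight strictly exceeds the threshold."""
    return {pair: w for pair, w in edges.items() if w > weight_threshold}


# Hypothetical edge weights between three loci.
all_edges = {
    frozenset(("L1", "L2")): 7.5,
    frozenset(("L1", "L3")): 3.0,
    frozenset(("L2", "L3")): 5.0,
}
strong_edges = filter_edges(all_edges, 5.0)
print(sorted(sorted(pair) for pair in strong_edges))
```

The comparison is strict, so an edge whose weight equals the threshold is dropped, as the text specifies.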
Step 8
Remove nodes (and all edges connected to the removed nodes) from the graph of Step 7 until no edges are left. Nodes are removed by applying the following procedure repeatedly:
1. Find the node with the highest degree (the highest number of edges). If there is more than one, choose one arbitrarily.
2. Define the set of nodes consisting of the node chosen above and all nodes connected to it, but excluding any node whose degree is less than that of the node chosen above.
3. From the set of step 2, choose the node with the lowest target locus score (a lower score indicates lower desirability). Remove that node from the graph.
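The removal procedure can be sketched with an adjacency map. The names and toy inputs below are hypothetical, and ties for highest degree are broken by whatever `max` happens to return, which is one arbitrary choice among several the text allows.

```python
def prune_nodes(nodes, edges, locus_score):
    """Remove nodes until no edges remain; return the set of surviving nodes.

    nodes: iterable of node ids; edges: iterable of 2-element frozensets;
    locus_score: dict mapping node id -> target locus score (higher is better).
    """
    remaining = set(nodes)
    adjacency = {n: set() for n in remaining}
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)

    while any(adjacency[n] for n in remaining):
        # 1. Node with the highest degree (ties broken arbitrarily).
        top = max(remaining, key=lambda n: len(adjacency[n]))
        top_degree = len(adjacency[top])
        # 2. That node plus its neighbours, minus lower-degree neighbours.
        candidates = [top] + [n for n in adjacency[top]
                              if len(adjacency[n]) >= top_degree]
        # 3. Drop the least desirable candidate and its edges.
        victim = min(candidates, key=lambda n: locus_score[n])
        for neighbour in adjacency.pop(victim):
            adjacency[neighbour].discard(victim)
        remaining.discard(victim)
    return remaining


# Toy graph: B conflicts with both A and C; scores are made up.
survivors = prune_nodes(
    ["A", "B", "C"],
    [frozenset(("A", "B")), frozenset(("B", "C"))],
    {"A": 0.9, "B": 0.1, "C": 0.5},
)
print(sorted(survivors))
```

Considering only neighbours of equal degree lets a highly desirable, highly connected locus survive when an equally connected but less desirable neighbour can be removed instead.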
Step 9
If the number of nodes remaining in the graph meets the number of target loci required for the multiplex PCR pool (within tolerance), the method continues at step 10.
If too many or too few nodes remain in the graph, a binary search is performed to determine what threshold leaves the desired number of nodes in the graph. If there are too many nodes in the graph, the weight threshold bounds are adjusted as follows:
max_weight_threshold = weight_threshold
Otherwise (if there are too few nodes in the graph), the weight threshold bounds are adjusted as follows:
min_weight_threshold = weight_threshold
The weight threshold is then adjusted as follows:
weight_threshold = (max_weight_threshold + min_weight_threshold) / 2
Steps 7 to 9 are repeated.
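The threshold search of steps 6 and 9 can be sketched as below. The function name, the `count_remaining` callback (standing in for steps 7 and 8), and the `max_iter` safety cap are assumptions introduced for illustration; the text itself specifies only the bounds and the midpoint update.

```python
def search_threshold(edge_weights, count_remaining, target, tolerance=0, max_iter=50):
    """Sketch of the binary search over the weight threshold (steps 6 and 9).

    edge_weights:    all edge weights in the graph
    count_remaining: hypothetical callable, threshold -> number of nodes left
                     after steps 7 and 8 are applied at that threshold
    target:          desired number of target loci in the pool
    tolerance:       acceptable deviation from target
    max_iter:        safety cap on iterations (an assumption, not in the text)
    Returns (threshold, nodes_remaining).
    """
    max_bound = max(edge_weights)
    min_bound = min(edge_weights)
    # Step 6: initial threshold, 5% of the weight range below the maximum.
    threshold = max_bound - 0.05 * (max_bound - min_bound)

    for _ in range(max_iter):
        remaining = count_remaining(threshold)      # steps 7 and 8
        if abs(remaining - target) <= tolerance:    # step 9: within tolerance
            return threshold, remaining
        if remaining > target:
            max_bound = threshold   # too many nodes: lower the upper bound
        else:
            min_bound = threshold   # too few nodes: raise the lower bound
        threshold = (max_bound + min_bound) / 2
    return threshold, count_remaining(threshold)
```

The update direction reflects that a lower threshold keeps more edges, so more nodes are removed and fewer remain.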
Step 10
The primer pair designs associated with the nodes remaining in the graph are selected for the primer library. This primer library can be used in any of the methods of the invention.
If desired, this method of designing and selecting primers can be performed for a primer library in which only one primer (rather than a primer pair) is used to amplify each target locus. In that case, a node represents one primer (rather than a primer pair) for each target locus.
Experiment 17
Figure 27 is a graph comparing two primer libraries designed using the method of the invention. The graph shows the number of loci with a particular minor allele frequency that are targeted by each primer library. More primers were retained during selection of the "new pool" library. This library enables amplification of more target loci, especially target loci with a relatively large minor allele frequency (which are the more informative alleles for some methods of the invention, such as methods for detecting fetal chromosomal abnormalities).
These primer libraries were used in the following multiplex PCR method. Blood (20-40 mL) was collected from each subject into two to four Cell-Free DNA™ tubes (Streck). Plasma (at least 7 mL) was isolated from each sample via a double centrifugation protocol (2,000 g for 20 min, followed by 3,220 g for 30 min), with the supernatant transferred after the first spin. cfDNA was isolated from 7-20 mL of plasma using the QIAGEN QIAamp Circulating Nucleic Acid kit and eluted in 45 μL of TE buffer. Pure maternal genomic DNA was isolated from the buffy coat obtained after the first centrifugation, and pure paternal genomic DNA was prepared similarly from a blood, saliva, or buccal sample.
Maternal cfDNA, maternal genomic DNA, and paternal genomic DNA samples were pre-amplified for 15 cycles using 11,000 target-specific assays, and an aliquot was transferred to a second 15-cycle PCR reaction using nested primers. Finally, samples were prepared for sequencing by adding barcoded tags in a third-round, 12-cycle PCR. Thus, 11,000 targets were amplified in a single reaction; the targets included SNPs found on chromosomes 13, 18, 21, X, and Y. The amplicons were then sequenced using an Illumina GAIIx or HiSeq sequencer. Parental genotypes were sequenced at a lower read depth than the fetal genotype (about 20% of the cfDNA read depth).
Experiment 18
If desired, the size and quantity of the PCR products can be analyzed using standard methods, such as an Agilent Technologies 2100 Bioanalyzer (Figures 28A-M). For example, the direct PCR method without nesting described herein was used in 2,400-plex (Figures 28B-28G) and 19,488-plex (Figures 28H-28M) experiments. The primer concentration was 10 nM for Figures 28B-28D and 28H-28J, and 1 nM for Figures 28E-28G and 28K-28M. The amount of input DNA was 24 ng for Figures 28B, 28E, 28H, and 28K; 80 ng for Figures 28C, 28F, 28I, and 28L; and 250 ng for Figures 28D, 28G, 28J, and 28M. More input DNA produced a larger proportion of the desired 180 base pair product. The peak at 140 base pairs is primer dimer product.
Experiment 19
A proof-of-principle study demonstrated equally high detection accuracy across all chromosomes for T13, T18, T21, 45,X, and 47,XXY.
Patients
Pregnant couples were recruited at selected prenatal care centers under protocols approved by institutional review boards, in accordance with local law. Inclusion criteria were an age of at least 18 years, a gestational age of at least nine weeks, singleton pregnancy, and signed informed consent. Blood samples were drawn from the pregnant mothers, and blood or buccal samples were collected from the fathers. Samples from 2 T13 (Patau syndrome) pregnancies, 2 T18 (Edwards syndrome) pregnancies, 2 T21 (Down syndrome) pregnancies, 2 45,X pregnancies, 2 47,XXY pregnancies, and 90 normal pregnancies were selected to test the method, which was then tested for detection of chromosomal abnormalities in a cohort of about 500 women. Where postnatal child tissue was available, a normal fetal karyotype was confirmed by molecular karyotyping of the sample. Euploid samples were drawn from low-risk women before invasive testing. Aneuploid samples were drawn at least 7 days after invasive testing, and aneuploidy was confirmed by cytogenetic karyotyping or fluorescence in situ hybridization at an independent laboratory.
Sample preparation and multiplex PCR
For the data in Figures 30A-E, 30G, 30H, and 31A-31G, sample preparation and 19,488-plex PCR were performed as described in Experiment 15. For the data in Figure 30F, sample preparation and 11,000-plex PCR were performed as described in Experiment 17.
Method and data analysis
The algorithm takes into account parental genotypes and crossover frequency data (such as data from the HapMap database) to calculate expected allele distributions at 19,488 polymorphic loci for a very large number of possible fetal ploidy states and at various fetal cfDNA fractions (Figures 29A-29C). Unlike methods based on allele ratios, it also accounts for linkage disequilibrium and uses non-Gaussian data models to describe the expected distribution of allele measurements at a SNP given the observed platform characteristics and amplification biases. It then compares the various predicted allele distributions with the actual allele distribution measured in the cfDNA sample (Figure 29C) and calculates, from the sequencing data, the likelihood of each hypothesis (monosomy, disomy, or trisomy, each of which has multiple hypotheses based on the various potential crossovers). The algorithm sums the likelihoods of the individual monosomy, disomy, or trisomy hypotheses (Figure 29D) and calls the ploidy state with the greatest overall likelihood as the copy number and fetal fraction (Figure 29E). Although laboratory researchers were not blinded to sample karyotype, the algorithm called ploidy states blind to the truth and without human intervention.
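The final aggregation step, in which the likelihoods of the individual hypotheses are summed per copy number and the copy number with the greatest total is called, can be sketched as follows. The input format, the function name, and the normalized confidence value (which assumes equal prior weight for each copy number) are assumptions made for illustration; the data models that produce the per-hypothesis likelihoods are not reproduced here.

```python
import math
from collections import defaultdict


def call_ploidy(log_likelihoods):
    """Sketch: sum sub-hypothesis likelihoods per copy number and pick the best.

    log_likelihoods: dict mapping (copy_number, sub_hypothesis_label) to the
                     log-likelihood of the data under that hypothesis
                     (hypothetical input format).
    Returns (best_copy_number, confidence), where confidence is the share of
    the total likelihood held by the winning copy number.
    """
    grouped = defaultdict(list)
    for (copy_number, _label), ll in log_likelihoods.items():
        grouped[copy_number].append(ll)

    # Sum likelihoods within each copy number using log-sum-exp for stability.
    summed = {}
    for copy_number, lls in grouped.items():
        peak = max(lls)
        summed[copy_number] = peak + math.log(sum(math.exp(x - peak) for x in lls))

    best = max(summed, key=summed.get)
    peak = max(summed.values())
    total = sum(math.exp(v - peak) for v in summed.values())
    confidence = math.exp(summed[best] - peak) / total
    return best, confidence
```

Working in log space avoids underflow when likelihoods are multiplied across thousands of SNPs.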
Data interpretation
Generating graphical representations of the data
To determine the ploidy state of a chromosome of interest, the algorithm considers the distribution of sequence counts for each of the two possible alleles at 3,000 to 4,000 SNPs per chromosome. It is important to note that the algorithm makes ploidy calls using methods that do not lend themselves to visualization. Therefore, for purposes of illustration, the data are shown here in simplified form as ratios of the two most likely alleles, labeled A and B, so that the relevant trends can be visualized more easily. This simplified illustration does not capture some features of the algorithm. For example, two important aspects of the algorithm cannot be illustrated by a visualization that displays allele ratios: 1) the ability to leverage linkage disequilibrium, that is, the influence of a measurement at one SNP on the likely identity of a neighboring SNP; and 2) the use of non-Gaussian data models that describe the expected distribution of allele measurements at a SNP given platform characteristics and amplification biases. Note also that the algorithm considers only the two most common alleles at each SNP and ignores other possible alleles.
The plots in Figures 30A-30H include samples in which two, one, or three fetal chromosomes are present. In general, these plots indicate euploidy (Figures 30A-30C), monosomy (Figure 30D), and trisomy (Figures 30E-30H), respectively. In all plots, each point represents a single SNP, with the targeted SNPs of one chromosome plotted in order from left to right along the horizontal axis. The vertical axis indicates the number of A allele reads as a fraction of the total number of A and B allele reads at the SNP. Note that the measurements are made on total cfDNA isolated from maternal blood, which includes both maternal and fetal cfDNA; each point therefore represents the combined fetal and maternal DNA contribution at that SNP. Consequently, increasing the proportion of maternal cfDNA from 0% to 100% gradually shifts some points up or down within the plot, depending on the maternal and fetal genotypes. This is explained in more detail below with the corresponding figures.
To facilitate visualization, each point can be color-coded according to maternal genotype, because the maternal genotype contributes more to the position of each point and most trisomies are maternally inherited; this helps in visualizing the ploidy state. Specifically, SNPs for which the maternal genotype is AA can be shown in red, SNPs for which the maternal genotype is AB can be shown in green, and SNPs for which the maternal genotype is BB can be shown in blue.
In all cases, SNPs that are homozygous for the A allele (AA) in both mother and fetus are tightly associated with the upper limit of the plot, because the fraction of A allele reads is high when no B allele should be present. Conversely, SNPs that are homozygous for the B allele in both mother and fetus are tightly associated with the lower limit of the plot, because the fraction of A allele reads is low when only the B allele should be present. Points that are not tightly associated with the upper and lower limits of the plot represent SNPs at which the mother, the fetus, or both are heterozygous; these points are useful for identifying fetal ploidy and can also be informative for determining paternal versus maternal inheritance. These points segregate according to the maternal and fetal genotypes and the fetal fraction, so the exact position of each individual point along the y-axis depends on both stoichiometry and fetal fraction. For example, loci at which the mother is AA and the fetus is AB are expected to have a different fraction of A allele reads, and thus a different position along the y-axis, depending on the fetal fraction.
Two chromosomes present
Figures 30A-30C depict data indicating the presence of two chromosomes when the sample is entirely maternal (no fetal cfDNA present, Figure 30A), contains a moderate fetal cfDNA fraction (Figure 30B), or contains a high fetal cfDNA fraction (Figure 30C).
Figure 30A shows data obtained from cfDNA isolated from the blood of a non-pregnant woman. When no fetal cfDNA is present and the sample contains only maternal cfDNA, the plot represents only the euploid maternal genotype; the characteristic pattern comprises "clusters" of points: a red cluster tightly associated with the top of the plot (SNPs for which the maternal genotype is AA), a blue cluster tightly associated with the bottom of the plot (SNPs for which the maternal genotype is BB), and a single centered green cluster (SNPs for which the maternal genotype is AB) (colors not shown in the figures).
When fetal cfDNA is present, the positions of the points shift so that the clusters separate into discrete "bands". Note that for samples with a fetal fraction of 0%, groupings of points are referred to as "clusters" (as in Figure 30A), whereas for all samples with a fetal fraction > 0%, groupings of points are referred to as "bands" (as in Figures 30B-30J). If the fetal fraction is high enough, these discrete bands are readily visible. Specifically, Figures 30B and 30C illustrate the characteristic pattern associated with the presence of two fetal chromosomes at moderate and high fetal fractions, respectively. This pattern comprises three central green bands corresponding to SNPs that are heterozygous in the mother, and two "peripheral" bands each at the top (red) and bottom (blue) of the plot corresponding to SNPs that are homozygous in the mother (colors not shown in the figures).
Figure 30B shows data obtained from cfDNA isolated from a plasma sample of a woman carrying a euploid fetus, with a fetal cfDNA fraction of 12%. Here, the clusters of points tightly associated with the top and bottom of the plot have each separated into two discrete bands: one red and one blue outer peripheral band that remain tightly associated with the upper or lower limit of the plot, and one red and one blue inner peripheral band that have separated from the limits of the plot (colors not shown in the figures). These inner peripheral bands, centered at 0.92 and 0.08, represent SNPs for which the maternal genotype is AA and the fetal genotype is AB (shown in red), and SNPs for which the maternal genotype is BB and the fetal genotype is AB (shown in blue), respectively. The central cluster of green points has expanded, but at this fetal fraction the separation into distinct bands is not readily visible.
At high fetal cfDNA fractions, the typical pattern indicating the presence of two chromosomes (a set of three green bands plus two red and two blue peripheral bands) is clearly visible (colors not shown in the figures). Figure 30C shows data obtained from a plasma sample of a woman carrying a euploid fetus, with a fetal cfDNA fraction of 26%. Here, the peripheral bands have separated so that the inner bands have shifted toward the center of the plot, because the level of the B allele changes as the fetal cfDNA fraction increases. Notably, at this higher fetal fraction, the central green cluster is now clearly resolved into three distinct bands. In this case, the three central bands cluster around 0.37, 0.50, and 0.63, corresponding to SNPs for which the maternal genotype is AB and the fetal genotype is BB (bottom), AB (middle), and AA (top), respectively.
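The positions of the central bands in the two-chromosome case follow from simple mixing arithmetic. The sketch below assumes an idealized model with no amplification bias or noise, in which the expected A allele fraction is the average of the maternal and fetal genotypes weighted by fetal fraction; the function name and arguments are illustrative and are not taken from the source.

```python
def expected_a_fraction(maternal_a, fetal_a, fetal_fraction):
    """Expected fraction of A reads at a SNP when mother and fetus are disomic.

    maternal_a, fetal_a: number of A alleles (0, 1, or 2) in each genotype
    fetal_fraction:      share of cfDNA that is fetal, between 0 and 1
    Idealized model (an assumption): reads are proportional to allele copies.
    """
    maternal_part = (1 - fetal_fraction) * maternal_a
    fetal_part = fetal_fraction * fetal_a
    return (maternal_part + fetal_part) / 2


# Mother AB (one A allele), fetus AA, AB, or BB, at a 26% fetal fraction.
central_bands = [round(expected_a_fraction(1, fetal_a, 0.26), 2) for fetal_a in (2, 1, 0)]
print(central_bands)  # [0.63, 0.5, 0.37]
```

Under this model, a 26% fetal fraction reproduces the three central band positions of 0.63, 0.50, and 0.37 described above, and a SNP at which both genotypes are AA stays at 1.0 regardless of fetal fraction.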
This characteristic pattern of three green bands and four peripheral bands (two red and two blue) indicates the presence of two chromosomes, as in autosomal disomy or the X chromosomes of a female (XX) fetus.
One chromosome present
When the fetus inherits only a single chromosome, and therefore only a single allele, fetal heterozygosity is impossible. The only possible fetal SNP identities are therefore A or B. Thus, the characteristic pattern for a maternally inherited single chromosome has two central green bands representing SNPs for which the mother is heterozygous, and only a single peripheral red band and a single peripheral blue band representing SNPs for which the mother is homozygous, which remain tightly associated with the upper and lower limits of the plot (1 and 0), respectively (Figure 30D) (colors not shown in the figures). Note the absence of inner peripheral bands. This pattern indicates the presence of one chromosome, as in maternally inherited autosomal monosomy or the X chromosome of a male (XY) fetus.
Three chromosomes present
Autosomal trisomies show three characteristic patterns. The first pattern indicates a maternally inherited meiotic trisomy, resulting from a meiotic error in which the fetus inherits two homologous, non-identical chromosomes from the mother (Figure 30E); this pattern comprises two central green bands and two each of the peripheral red and blue bands (colors not shown in the figures). The second pattern indicates a paternally inherited meiotic trisomy, in which the fetus inherits two homologous, non-identical chromosomes from the father (Figure 30F); this pattern comprises four central green bands, three peripheral red bands, and three peripheral blue bands (colors not shown in the figures). The third pattern indicates a maternally (Figure 30G) or paternally (Figure 30H) inherited mitotic trisomy, resulting from a mitotic error in which the fetus inherits two identical chromosomes from the mother or the father; this pattern comprises four central green bands and two each of the peripheral red and blue bands. Maternally and paternally inherited mitotic trisomies can be distinguished by the positions of the peripheral red and blue bands: the inner red and blue peripheral bands (those not associated with the limits of the plot) lie closer to the center in paternally inherited mitotic trisomy (colors not shown in the figures). This results from the paternal contribution of identical chromosomes. Note that our previous results show that, at the blastomere stage, 66.7% of maternally inherited trisomies are meiotic, and only 10.2% of trisomies are paternally inherited.
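The band counts quoted for the one-, two-, and three-chromosome patterns can be checked by enumerating the fetal genotypes each inheritance scenario allows. The sketch below is an idealized illustration under stated assumptions: recombination and noise are ignored, the mother is disomic, each copy contributes reads in proportion to its cfDNA share, and the scenario table and function names are hypothetical.

```python
def maternal_single(maternal_a):
    """A-allele counts the fetus can receive when the mother passes one homolog."""
    return {2: {1}, 1: {0, 1}, 0: {0}}[maternal_a]


# scenario name -> (maternal contribution options, paternal options, fetal copies)
SCENARIOS = {
    "disomy": (maternal_single, {0, 1}, 2),
    "maternal monosomy": (maternal_single, {0}, 1),
    # Both maternal homologs are passed on, so the fetus carries her full genotype.
    "maternal meiotic trisomy": (lambda m: {m}, {0, 1}, 3),
    # Two different paternal homologs: 0, 1, or 2 A alleles from the father.
    "paternal meiotic trisomy": (maternal_single, {0, 1, 2}, 3),
    # One maternal homolog is duplicated.
    "maternal mitotic trisomy": (lambda m: {2 * a for a in maternal_single(m)}, {0, 1}, 3),
    # One paternal homolog is duplicated: 0 or 2 A alleles from the father.
    "paternal mitotic trisomy": (maternal_single, {0, 2}, 3),
}


def band_positions(scenario, maternal_a, fetal_fraction):
    """Expected A-read fractions (band centers) for one maternal genotype."""
    maternal_options, paternal_options, fetal_copies = SCENARIOS[scenario]
    fetal_counts = {m + p for m in maternal_options(maternal_a) for p in paternal_options}
    total = (1 - fetal_fraction) * 2 + fetal_fraction * fetal_copies
    return sorted(
        ((1 - fetal_fraction) * maternal_a + fetal_fraction * count) / total
        for count in fetal_counts
    )
```

In this model, a heterozygous mother (maternal_a = 1) gives the central green bands and a homozygous AA mother (maternal_a = 2) gives the red peripheral bands; the inner red band for a paternally inherited mitotic trisomy falls closer to the center than for a maternally inherited one, consistent with the description above.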
For the Y chromosome, the PS method considers a different set of hypotheses: zero, one, or two chromosomes present. Because there is no maternal contribution to the sequence reads at each locus, and because heterozygous loci are not possible (two Y chromosomes necessarily involve two identical chromosomes), the bands remain tightly associated with the top (A allele) or bottom (B allele) of the plot (data not shown), and the analysis relies on quantitative allele count data, which greatly simplifies it. Note that because the method interrogates SNPs, it uses homologous non-recombining SNPs from the Y chromosome, so that a single probe pair yields data on both X and Y.
Identifying aneuploidy
Given a sufficient fetal fraction, identifying autosomal aneuploidy with this plot-based visualization is straightforward and requires only identifying the abnormal number of chromosomes present in the plot, as described above. Combining knowledge of the copy numbers of the X and Y chromosomes identifies the presence or absence of sex chromosome aneuploidy. Specifically, a plot representing a fetus with a 47,XXX genotype will show the typical "three chromosome" pattern, while a plot representing a fetus with a 47,XXY genotype will show the typical "two chromosome" pattern for the X chromosome but will also have allele reads indicating the presence of one Y chromosome. The method can similarly call 47,XYY, in which a "one chromosome" pattern indicates a single X chromosome and the allele reads indicate the presence of two Y chromosomes. A fetus with a 45,X genotype will show the typical "one chromosome" pattern for the X chromosome together with data indicating zero Y chromosomes.
Effect of fetal fraction
As discussed above, the number of sequence reads originating from the fetus contributes to the exact position of each point along the y-axis of the plot. Because the fetal fraction affects the ratio of reads derived from the fetus and the mother, it also affects the position of each point. At high fractions of fetal cfDNA (generally above about 20%), as in Figures 30C-30E and Figures 30G and 30H, it is readily apparent that although the clusters of points are based primarily on the maternal genotype, the presence of fetal DNA carrying alleles that differ from the maternal genotype shifts the clusters into multiple distinct bands. As the fetal fraction decreases (as in Figures 30B and 30F), however, the points regress toward the limits and center of the plot, producing tighter clusters. Specifically, the peripheral red bands, for which the maternal genotype is AA, regress toward the top of the plot; the peripheral blue bands, for which the maternal genotype is BB, regress toward the bottom; and the central green bands, for which the mother is heterozygous, condense into a single cluster at the center of the plot (compare Figures 30B and 30C) (colors not shown in the figures). Although aneuploidy is not readily visible with this visualization technique at low fetal fractions, the algorithm can identify ploidy states at very low fetal fractions, such as a 3% fetal fraction. It can do so because its statistical techniques compare the observed data with highly precise data models that predict the allele distribution for a given set of sample parameters (including, for example, copy number, parental genotypes, and fetal fraction). Data model accuracy is critical at low fetal fractions because the differences between the allele distributions of different ploidy states are proportional to the fetal fraction. In addition, the algorithm can determine when a data set contains insufficient data to make a confident fetal ploidy determination.
Results
Sequencing reads that mapped to targeted SNPs were considered informative and were used by the algorithm. More than 95% of the target loci were observed in the sequencing results. Plots visualizing the key ploidy calls are depicted in Figures 31A-31G. Figure 31A represents a euploid sample. Here, chromosomes 13, 18, and 21 show the typical "two chromosome" pattern (as described herein), comprising a set of three central green bands plus two red and two blue peripheral bands. This, together with the two central green bands for the X chromosome and the presence of Y chromosome bands along the periphery of the plot, indicates a euploid XY genotype (colors not shown in the figures).
The plots in Figures 31B, 31C, and 31D indicate the most prevalent autosomal trisomies, T13, T18, and T21, respectively. Specifically, Figure 31B depicts a T13 sample. Here, chromosomes 18 and 21 show the typical "two chromosome" pattern, the X chromosome shows the typical "one chromosome" pattern, and reads from the Y chromosome are present. Taken together, this indicates disomy of chromosomes 18 and 21 and identifies an XY fetal genotype. Chromosome 13, however, shows the typical "three chromosome" pattern. Similarly, Figure 31C depicts a T18 sample and Figure 31D depicts a T21 sample.
The method can also detect sex chromosome aneuploidies, including 45,X (Figure 31E), 47,XXY (Figure 31F), and 47,XYY (Figure 31G). Note that the method calls copy number on chromosomes 13, 18, 21, X, and Y; the overall chromosome number is reported on the assumption that the remaining chromosomes are disomic. The X chromosome region of the plot depicting the 45,X sample reveals the presence of a single chromosome. The absence of reads from the Y chromosome, together with the "two chromosome" pattern for chromosomes 13, 18, and 21, indicates a 45,X genotype. By contrast, the plot produced from the 47,XXY sample reveals the presence of two X chromosomes, and the data further reveal allele reads from the Y chromosome. Together with the presence of two copies of chromosomes 13, 18, and 21, this indicates a 47,XXY genotype. The presence of the "one chromosome" pattern for the X chromosome, together with reads indicating two Y chromosomes, indicates a 47,XYY genotype.
Discussion
This method non-invasively detects T13, T18, T21, 45,X, 47,XXY, and 47,XYY from maternal blood. It interrogates cfDNA from maternal plasma by targeted multiplex PCR amplification and high-throughput sequencing of 19,488 SNPs. This, together with sophisticated informatics analysis that takes into account parental genotype information and many sample parameters (including fetal fraction and DNA quality), detects the fetal signal more robustly and makes highly accurate ploidy calls on all five chromosomes involved in the seven most common aneuploidies at birth (T13, T18, T21, 45,X, 47,XXX, 47,XXY, and 47,XYY). The method offers several clinical advantages over previous methods, most notably greater clinical coverage and sample-specific calculated accuracies (analogous to a personalized risk score).
Increased clinical coverage
Given its ability to accurately detect autosomal trisomies and sex chromosome aneuploidies, this method offers approximately twice the aneuploidy coverage of clinically available NIPT methods. The method presented here is the only non-invasive test that calls sex chromosome ploidy with high accuracy. Previous DNA mixing experiments and independent plasma samples analyzed in our laboratory indicate that the method will detect a larger set of sex chromosome abnormalities, including 47,XXX. The method presented here also detects aneuploidy of chromosomes 13, 18, and 21 with high sensitivity and specificity, and with appropriate primer design it is expected to detect copy number on the remaining chromosomes as well.
Sample-specific calculated accuracy
Notably, this method calculates a sample-specific accuracy for the ploidy call on each chromosome in each sample. By identifying and flagging individual samples with poor DNA quality or a low fetal fraction, which are likely to produce inaccurate test results, the accuracies calculated by this method are expected to significantly reduce the rate of incorrect calls. By contrast, methods based on massively parallel shotgun sequencing (MPSS) use a single-hypothesis rejection test to produce a positive or negative call, and their accuracy estimates are based on published study cohorts rather than on the characteristics of the individual sample, which is assumed to have the same accuracy as the cohort. However, the individual accuracies of samples with parameters in the tails of the cohort distribution can differ significantly. This is exacerbated at low fetal fractions, as in early gestation, and for samples with low DNA quality. Such samples are generally not identified and flagged for follow-up testing, which can result in missed calls. The method of the invention, however, takes into account multiple parameters, including fetal fraction and multiple DNA quality metrics, to make a copy number call for each chromosome and to calculate a sample-specific accuracy for that call. This allows the method to identify individual samples with low accuracy and flag them for follow-up testing. This is expected to nearly eliminate missed calls, especially in the first trimester, when the fetal fraction is usually lower. A no-call is considered far preferable to a missed call, because a no-call requires only a redraw and reanalysis.
Converting calculated accuracy into a traditional risk score
This method can provide an adjusted aneuploidy risk for high-risk pregnant women, in which the adjusted risk takes the a priori risk into account (Benn P, Cuckle H, Pergament E. Non-invasive prenatal diagnosis for Down syndrome: the paradigm will shift, but slowly. Ultrasound Obstet Gynecol 2012; 39:127-130, which is hereby incorporated by reference in its entirety). Although the method of the invention provides a calculated accuracy customized to each patient, for clinical use these accuracies can be converted into traditional risk scores, which also represent the risk of an aneuploid pregnancy but are expressed as a fraction. Traditional risk scores take into account various parameters, including maternal age-related risk and serum levels of biochemical markers, to provide a risk score above which the mother is considered high-risk and is counseled about follow-up invasive diagnostic procedures. This method significantly refines this risk score, reducing both the false positive rate and the false negative rate and providing a more accurate assessment of individual maternal risk. Calculated accuracy as used here is the likelihood that the ploidy call is correct and is expressed as a percentage, but the calculated accuracies used in Experiment 19 do not include age-related risk. Because the calculation of a risk score generally includes age-related risk, calculated accuracies and traditional risk scores are not interchangeable; the two must be combined to convert to a traditional risk score. The formula for combining age-related risk with calculated accuracy is:
where R1 is the risk score as calculated by the method of the invention and R2 is the risk score as calculated by first-trimester screening.
SNP-based method overcomes the amplification variation problem
An inherent drawback of the counting methods used by some other approaches is that they determine fetal ploidy state by measuring the ratio of the number of reads mapped to a chromosome of interest (for example, chromosome 21) to the number of reads mapped to a reference chromosome. Amplification of chromosomes with high or low GC content (including chromosomes 13, X, and Y) is highly variable. This can produce signal variation of a magnitude comparable to the fetal cfDNA signal, which can confound copy number calls by altering the ratio of reads from the chromosome of interest to reads from the reference chromosome. This can lead to low accuracy for chromosomes 13, X, and Y. Notably, the problem is exacerbated at low fetal cfDNA fractions, as is often the case in early gestation.
By contrast, the SNP-based method does not rely on consistent levels of amplification between chromosomes and is thus expected to provide equally accurate results across all chromosomes. Because the method of the invention works in part by examining the relative counts of different alleles at polymorphic loci, which by definition differ by only a single nucleotide, it does not require a reference chromosome; this eliminates the chromosome-to-chromosome amplification variation problem inherent in methods that rely on quantitative read counting. Unlike quantitative methods that require a euploid reference chromosome, the method of the invention is expected to be able to detect triploidy and copy-number-neutral abnormalities such as uniparental disomy.
The importance of early detection
Notably, the combined at-birth prevalence of sex chromosome aneuploidies is higher than that of the most common autosomal aneuploidies (Figure 32). However, there is currently no routine non-invasive screening method that reliably detects sex chromosome abnormalities. Sex chromosome abnormalities are therefore generally detected prenatally as a by-product of routine testing for Down syndrome or other autosomal aneuploidies, and most cases are missed entirely. Early and accurate detection is critical for many of these disorders, in which early therapeutic intervention improves clinical outcome. For example, Turner syndrome is often not diagnosed until adolescence, although its overall at-birth prevalence is 1 in 2,500 women. Growth hormone therapy is known to prevent the short stature caused by the disorder, but treatment is significantly more effective when started before 4 years of age. In addition, hormone replacement therapy can stimulate secondary sexual characteristics in patients with Turner syndrome, but treatment must begin before about ten years of age, again usually before the syndrome is detected. Taken together, this underscores the importance of early, routine, and safe detection of sex chromosome aneuploidies. This is the first method with the potential to serve as a routine screen for sex chromosome abnormalities.
Additional applications
Because this method uses targeted amplification, it is uniquely poised to detect submicroscopic abnormalities such as microdeletions and microduplications. Although non-targeted methods such as MPSS have been shown to detect DiGeorge microdeletion syndrome, doing so required a level of genome coverage high enough to make the approach infeasible. This is because the effectiveness of non-targeted amplification over submicroscopic regions falls by orders of magnitude, since only a tiny fraction of the sequencing reads will be informative. In addition, the fact that currently available methods have difficulty accurately identifying ploidy state at the level of whole chromosomes suggests that they would also run into problems with variable amplification over smaller chromosome segments.
Similarly, SNP-based methods can detect UPD disorders. These are copy-number-neutral abnormalities that would not be detected by current non-invasive methods that rely on counting, or by traditional invasive methods (such as amniocentesis and CVS) that rely on cytogenetic karyotyping and/or fluorescence in situ hybridization. This is because SNP-based methods can uniquely distinguish individual haplotypes, whereas clinically available MPSS-based and targeted methods amplify non-polymorphic loci and therefore cannot determine, for example, whether the chromosomes of interest are derived from the same parent. This means that these microdeletion/microduplication and UPD syndromes, including Prader-Willi syndrome, Angelman syndrome, and Beckwith-Wiedemann syndrome, generally cannot be diagnosed prenatally and are often initially misdiagnosed postnatally. This significantly delays therapeutic intervention. In addition, because this method targets SNPs, it will also facilitate parental haplotype reconstruction, allowing detection of fetal inheritance of individual disease-linked loci (Kitzman JO, Snyder MW, Ventura M, et al. Noninvasive whole-genome sequencing of a human fetus. Sci Transl Med 2012; 4:137ra76, which is hereby incorporated by reference in its entirety).
The results presented here confirm the expanded scope of this method for identifying prenatal aneuploidy. Specifically, by amplifying and sequencing 19,488 SNPs, the method can determine copy number at chromosomes 13, 18, 21, X, and Y, and it is expected to uniquely detect additional chromosomal abnormalities, such as triploidy and UPD, that are not detected by any other clinically available non-invasive method. The increased clinical coverage and the robust sample-specific calculated accuracies suggest that this method can provide a practicable adjunct to invasive testing for the detection of fetal chromosomal aneuploidies.
All patents, published applications, and published references cited herein are hereby incorporated by reference in their entirety. Although the methods of the invention have been described in connection with specific embodiments thereof, it will be understood that further modifications can be made. Furthermore, this application is intended to cover any variations, uses, or adaptations of the methods of the invention, including such departures from the present disclosure as come within known or customary practice in the art to which the methods of the invention pertain and as fall within the scope of the appended claims. For example, any of the methods disclosed herein for DNA can be readily adapted for RNA by including a reverse transcription step that converts the RNA into DNA. If desired, examples that illustrate the use of polymorphic loci can be readily adapted for the amplification of non-polymorphic loci.

Claims (102)

1. A method of amplifying target loci in a nucleic acid sample, the method comprising:
(a) contacting the nucleic acid sample with a library of test primers that simultaneously hybridize to at least 1,000 different target loci to produce a reaction mixture; and
(b) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that comprise target amplicons.
2. The method of claim 1, wherein at least 5,000 different target loci are amplified.
3. The method of claim 1, wherein at least 10,000 different target loci are amplified.
4. The method of claim 1, wherein at least 20,000 different target loci are amplified.
5. The method of claim 1, wherein at least 30,000 different target loci are amplified.
6. The method of claim 1, wherein at least 90% of the amplified products are target amplicons.
7. The method of claim 1, wherein at least 95% of the amplified products are target amplicons.
8. The method of claim 1, wherein at least 99% of the amplified products are target amplicons.
9. The method of claim 1, wherein at least 90% of the target loci are amplified.
10. The method of claim 1, wherein at least 95% of the target loci are amplified.
11. The method of claim 1, wherein at least 99% of the target loci are amplified.
12. The method of claim 1, wherein less than 20% of the amplified products are test primer dimers.
13. The method of claim 1, wherein less than 10% of the amplified products are test primer dimers.
14. The method of claim 1, wherein less than 1% of the amplified products are test primer dimers.
15. The method of claim 1, wherein less than 0.1% of the amplified products are test primer dimers.
16. The method of claim 1, wherein the test primers are selected from a library of candidate primers based on one or more parameters.
17. The method of claim 16, wherein the test primers are selected from the library of candidate primers based at least in part on the ability of the candidate primers to form primer dimers.
18. The method of claim 17, wherein the selection comprises
(i) calculating on a computer an undesirability score for most or all of the possible combinations of two candidate primers from the library, wherein each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers;
(ii) removing from the library of candidate primers the candidate primer with the highest undesirability score;
(iii) if the candidate primer removed in step (ii) is a member of a primer pair, removing the other member of the primer pair from the library of candidate primers; and
(iv) optionally repeating steps (ii) and (iii).
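The selection in steps (i) to (iv) can be pictured as a greedy pruning loop. The following is only a minimal sketch under stated assumptions, not the procedure used in the application: the `score` callable, the `pair_of` mapping, and the `target_size` stopping rule are hypothetical, and a primer's undesirability is taken here to be its worst remaining pairwise score, which the claim does not specify.

```python
from itertools import combinations


def prune_by_highest_score(primers, pair_of, score, target_size):
    """Greedy sketch of steps (i)-(iv): remove the worst primer until target_size remain.

    primers: iterable of primer ids (strings).
    pair_of: dict mapping a primer id to its pair partner (missing if unpaired).
    score: hypothetical callable (a, b) -> number; higher means a dimer is more likely.
    """
    pool = set(primers)
    # Step (i): score every two-primer combination once.
    pair_scores = {frozenset(c): score(*c) for c in combinations(sorted(pool), 2)}

    def worst(p):
        # Assumption: a primer is as undesirable as its worst remaining combination.
        return max((pair_scores[frozenset((p, q))] for q in pool if q != p), default=0.0)

    while len(pool) > target_size:
        victim = max(sorted(pool), key=worst)  # step (ii); ties resolved by sort order
        pool.discard(victim)
        pool.discard(pair_of.get(victim))      # step (iii): the partner goes as well
    return pool                                # step (iv) is the loop itself
```

Scoring all combinations once up front keeps each later pass cheap, since removing a primer only shrinks the set of combinations that have to be examined.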
19. The method of claim 17, wherein the selection comprises
(i) calculating on a computer an undesirability score for most or all of the possible combinations of two candidate primers from the library, wherein each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers;
(ii) removing from the library of candidate primers the candidate primer that is part of the greatest number of combinations of two candidate primers with an undesirability score above a first minimum threshold;
(iii) if the candidate primer removed in step (ii) is a member of a primer pair, removing the other member of the primer pair from the library of candidate primers; and
(iv) optionally repeating steps (ii) and (iii).
20. The method of claim 19, comprising further reducing the number of candidate primers remaining in the library by decreasing the first minimum threshold used in step (ii) to a lower second minimum threshold and repeating steps (ii) and (iii) until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the second minimum threshold, or until the number of candidate primers remaining in the library is reduced to a desired number.
21. The method of claim 19, comprising increasing the first minimum threshold used in step (ii) to a higher second minimum threshold and repeating steps (ii) and (iii) until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the second minimum threshold, or until the number of candidate primers remaining in the library is reduced to a desired number.
22. The method of claim 18 or 19, wherein the candidate primer to be removed from the library is selected, from a group of two or more candidate primers with equal undesirability scores, based on one or more other parameters.
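The threshold-based variant of claims 19 to 21 can be sketched in the same way. The code below is an illustration under assumptions rather than the application's implementation: `score`, `pair_of`, and the `thresholds` sequence are hypothetical names, and the optional stop at a desired library size is omitted for brevity.

```python
from itertools import combinations


def prune_by_threshold(primers, pair_of, score, thresholds):
    """For each threshold in turn, repeatedly remove the primer that takes part in the
    most above-threshold combinations until no remaining combination exceeds it."""
    pool = set(primers)
    pair_scores = {frozenset(c): score(*c) for c in combinations(sorted(pool), 2)}
    for threshold in thresholds:
        while True:
            # Count, per primer, the remaining combinations that score above the threshold.
            bad = {p: 0 for p in pool}
            for a, b in combinations(sorted(pool), 2):
                if pair_scores[frozenset((a, b))] > threshold:
                    bad[a] += 1
                    bad[b] += 1
            if not bad or max(bad.values()) == 0:
                break  # every remaining combination is at or below the threshold
            victim = max(sorted(pool), key=bad.get)
            pool.discard(victim)
            pool.discard(pair_of.get(victim))  # remove the pair partner, if any
    return pool
```

Passing `thresholds=[4, 2]` corresponds to lowering the first threshold to a second, stricter one as in claim 20; a single-element list corresponds to claim 19 alone.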
23. The method of claim 1, wherein the concentration of each test primer is less than 100 nM.
24. The method of claim 1, wherein the concentration of each test primer is less than 10 nM.
25. The method of claim 1, wherein the concentration of each test primer is less than 2 nM.
26. The method of claim 1, wherein the library of test primers comprises at least 1,000 test primer pairs that each comprise a forward test primer and a reverse test primer, wherein each pair of test primers hybridizes to a target locus.
27. The method of claim 1, wherein the library of test primers comprises at least 1,000 individual test primers that each hybridize to a different target locus.
28. The method of claim 1, wherein the GC content of the test primers is between 30% and 80%, inclusive.
29. The method of claim 1, wherein the range of GC content of the test primers is less than 20%.
30. The method of claim 1, wherein the melting temperature of the test primers is between 40 °C and 80 °C, inclusive.
31. The method of claim 1, wherein the range of melting temperatures of the test primers is less than 5 °C.
32. The method of claim 1, wherein the length of the test primers is between 17 and 35 nucleotides, inclusive.
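The numeric limits in claims 28 to 32 lend themselves to a simple screening check. The sketch below is illustrative only. The claims do not say how melting temperature is estimated, so the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C) is used here purely as an assumption, and all function names are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule (an assumption here)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def within_claimed_ranges(seq):
    """Check length 17-35 nt, GC 30-80 %, and Tm 40-80 degrees C, all inclusive."""
    return (17 <= len(seq) <= 35
            and 30.0 <= gc_percent(seq) <= 80.0
            and 40.0 <= wallace_tm(seq) <= 80.0)


def spread(values):
    """Range (max minus min) across a library, as used in claims 29 and 31."""
    values = list(values)
    return max(values) - min(values)
```

For a whole library, `spread(gc_percent(p) for p in library) < 20` and `spread(wallace_tm(p) for p in library) < 5` would express the range limits of claims 29 and 31 under the same assumed Tm estimate.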
33. The method of claim 1, wherein the test primers comprise a tag that is not target specific.
34. The method of claim 33, wherein the tag forms an internal loop structure.
35. The method of claim 35, wherein the tag is between two DNA binding regions.
36. The method of claim 34, wherein the test primers comprise a 5' region that is specific for a target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus.
37. The method of claim 36, wherein the length of the 3' region is at least 7 nucleotides.
38. The method of claim 37, wherein the length of the 3' region is between 7 and 20 nucleotides, inclusive.
39. The method of claim 33, wherein the test primers comprise a 5' region that is not specific for a target locus, followed by a region that is specific for the target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus.
40. The method of claim 1, wherein the range of the length of the target amplicons is less than 15 nucleotides.
41. The method of claim 1, wherein the primer extension reaction conditions are polymerase chain reaction (PCR) conditions.
42. The method of claim 41, wherein the length of the annealing step is greater than 10 minutes.
43. The method of claim 1, further comprising determining the presence or absence of at least one target amplicon.
44. The method of claim 1, further comprising determining the sequence of at least one target amplicon.
45. The method of claim 1, wherein the target loci are present on the same chromosome of interest.
46. The method of claim 1, wherein at least some of the target loci are present on different chromosomes of interest.
47. The method of claim 1, wherein the nucleic acid sample comprises fragmented or digested nucleic acids.
48. The method of claim 1, wherein the nucleic acid sample comprises genomic DNA, cDNA, or mRNA.
49. The method of claim 1, wherein the nucleic acid sample comprises DNA from a single cell.
50. The method of claim 1, wherein the nucleic acid sample comprises or is derived from blood, plasma, saliva, sperm, cell culture supernatant, mucus secretion, dental plaque, gastrointestinal tissue, stool, urine, or a forensic sample.
51. The method of claim 1, wherein the target loci are segments of human nucleic acids.
52. The method of claim 1, wherein the target loci comprise single nucleotide polymorphisms.
53. A method of selecting test primers from a library of candidate primers, the method comprising:
(a) calculating on a computer an undesirability score for most or all of the possible combinations of two candidate primers from the library, wherein each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers;
(b) removing from the library of candidate primers the candidate primer with the highest undesirability score;
(c) if the candidate primer removed in step (b) is a member of a primer pair, removing the other member of the primer pair from the library of candidate primers; and
(d) optionally repeating steps (b) and (c), thereby selecting a library of test primers.
54. A method of selecting test primers from a library of candidate primers, the method comprising:
(a) calculating on a computer an undesirability score for most or all of the possible combinations of two candidate primers from the library, wherein each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers;
(b) removing from the library of candidate primers the candidate primer that is part of the greatest number of combinations of two candidate primers with an undesirability score above a first minimum threshold;
(c) if the candidate primer removed in step (b) is a member of a primer pair, removing the other member of the primer pair from the library of candidate primers; and
(d) optionally repeating steps (b) and (c), thereby selecting a library of test primers.
55. The method of claim 54, comprising further reducing the number of candidate primers remaining in the library by decreasing the first minimum threshold used in step (b) to a lower second minimum threshold and repeating steps (b) and (c) until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the second minimum threshold, or until the number of candidate primers remaining in the library is reduced to a desired number.
56. The method of claim 54, comprising increasing the first minimum threshold used in step (b) to a higher second minimum threshold and repeating steps (b) and (c) until the undesirability scores for the candidate primer combinations remaining in the library are all equal to or below the second minimum threshold, or until the number of candidate primers remaining in the library is reduced to a desired number.
57. The method of claim 53 or 54, wherein the candidate primer to be removed from the library of candidate primers is selected, from a group of two or more candidate primers with equal undesirability scores, based on one or more other parameters.
58. The method of claim 53 or 54, wherein the undesirability score is based at least in part on one or more parameters selected from the group consisting of: the heterozygosity rate of the target locus, the disease prevalence associated with a polymorphism or mutation at the target locus, the disease penetrance associated with a polymorphism or mutation at the target locus, the specificity of the candidate primer for the target locus, the size of the candidate primer, the melting temperature of the target amplicon, the GC content of the target amplicon, the amplification efficiency of the target amplicon, and the size of the target amplicon.
59. The method of claim 58, wherein the undesirability score is based at least in part on one or more parameters selected from the group consisting of: the heterozygosity rate of the target locus, the specificity of the candidate primer for the target locus, the size of the candidate primer, the melting temperature of the target amplicon, the GC content of the target amplicon, the amplification efficiency of the target amplicon, and the size of the target amplicon; and wherein the test primers are used to simultaneously amplify at least 1,000 target loci in a sample comprising maternal DNA from the pregnant mother of a fetus and fetal DNA, in order to determine the presence or absence of a fetal chromosomal abnormality.
60. The method of claim 59, comprising ligating a universal primer binding site to the DNA molecules in the sample; amplifying the ligated DNA molecules using at least 1,000 specific primers and a universal primer to produce a first set of amplified products; and amplifying the first set of amplified products using at least 1,000 pairs of primers to produce a second set of amplified products.
61. The method of claim 58, wherein the undesirability score is based at least in part on one or more parameters selected from the group consisting of: the heterozygosity rate of the target locus, the specificity of the candidate primer for the target locus, the size of the candidate primer, the melting temperature of the target amplicon, the GC content of the target amplicon, the amplification efficiency of the target amplicon, and the size of the target amplicon; and wherein the test primers are used to simultaneously amplify at least 1,000 target loci in a sample comprising DNA from the alleged father of a fetus and to simultaneously amplify the target loci in a sample comprising maternal DNA from the pregnant mother of the fetus and fetal DNA, thereby determining whether the alleged father is the biological father of the fetus.
62. The method of claim 58, wherein the undesirability score is based at least in part on one or more parameters selected from the group consisting of: the heterozygosity rate of the target locus, the specificity of the candidate primer for the target locus, the size of the candidate primer, and the size of the target amplicon; and wherein the test primers are used to simultaneously amplify at least 1,000 target loci in a forensic nucleic acid sample using an annealing time of at least 5 minutes.
63. The method of claim 58, comprising using the test primers to simultaneously amplify at least 1,000 target loci in a control nucleic acid sample to produce a first set of target amplicons and to simultaneously amplify the target loci in a test nucleic acid sample to produce a second set of target amplicons; and comparing the first and second sets of target amplicons to determine whether a target locus is present in one sample but absent from the other, or whether a target locus is present at different levels in the control sample and the test sample.
64. The method of claim 63, wherein the test sample is from an individual suspected of having a disease or phenotype of interest, or an increased risk for a disease or phenotype of interest; and wherein one or more of the target loci comprise a polymorphism that is associated with the increased risk for the disease or phenotype of interest or with the disease or phenotype of interest.
65. The method of claim 58, comprising using the test primers to simultaneously amplify at least 1,000 target loci in a control sample comprising RNA to produce a first set of target amplicons and to simultaneously amplify the target loci in a test sample comprising RNA to produce a second set of target amplicons; and comparing the first and second sets of target amplicons to determine the presence or absence of a difference in RNA expression levels between the control sample and the test sample.
66. The method of claim 65, wherein the RNA is mRNA.
67. The method of claim 65, wherein the test sample is from an individual suspected of having cancer or an increased risk for cancer; and wherein one or more of the target loci comprise a polymorphism or other mutation that is associated with an increased risk for cancer or with cancer.
68. The method of claim 65, wherein the test sample is from an individual diagnosed with cancer; and wherein a difference in RNA expression level between the control sample and the test sample indicates that a target locus comprises a polymorphism or other mutation associated with an increased or decreased risk for cancer.
69. The method of claim 53 or 54, further comprising, prior to step (b), removing from the library primer pairs that produce a target amplicon that overlaps with a target amplicon produced by another primer pair.
70. The method of claim 53 or 54, wherein the candidate primers remaining in the library are capable of simultaneously amplifying at least 1,000 different target loci.
71. The method of claim 53 or 54, further comprising:
(e) contacting a nucleic acid sample comprising target loci with the candidate primers remaining in the library to produce a reaction mixture; and
(f) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that comprise target amplicons.
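The overlap filter of claim 69 amounts to an interval check. The sketch below is one possible reading under assumptions: amplicons are given as hypothetical `(chromosome, start, end)` tuples with half-open coordinates, and when two amplicons overlap the one that starts first is kept, a choice the claim leaves open.

```python
def drop_overlapping(amplicons):
    """Keep a set of primer pairs whose amplicons do not overlap.

    amplicons: dict mapping a primer-pair id to (chrom, start, end), end exclusive.
    Returns a dict with the retained primer pairs only.
    """
    kept = {}
    last_end = {}  # chromosome -> end of the most recently kept amplicon
    ordered = sorted(amplicons.items(), key=lambda kv: (kv[1][0], kv[1][1], kv[1][2]))
    for pair_id, (chrom, start, end) in ordered:
        # Kept amplicons never overlap, so the latest kept end is also the largest.
        if start >= last_end.get(chrom, float("-inf")):
            kept[pair_id] = (chrom, start, end)
            last_end[chrom] = end
    return kept
```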
72. A library of primers that simultaneously amplifies at least 1,000 different target loci such that less than 30% of the amplified products are primer dimers.
73. A library of primers that simultaneously amplifies at least 1,000 different target loci such that at least 80% of the amplified products are target amplicons.
74. A library of primers that simultaneously amplifies target loci such that at least 80% of the target loci out of at least 1,000 different target loci are amplified.
75. The library of any one of claims 72 to 74, wherein at least 5,000 different target loci are amplified.
76. The library of any one of claims 72 to 74, wherein at least 10,000 different target loci are amplified.
77. The library of any one of claims 72 to 74, wherein at least 20,000 different target loci are amplified.
78. The library of any one of claims 72 to 74, wherein at least 30,000 different target loci are amplified.
79. The library of any one of claims 72 to 74, wherein at least 90% of the amplified products are target amplicons.
80. The library of any one of claims 72 to 74, wherein at least 95% of the amplified products are target amplicons.
81. The library of any one of claims 72 to 74, wherein at least 99% of the amplified products are target amplicons.
82. The library of any one of claims 72 to 74, wherein at least 90% of the target loci are amplified.
83. The library of any one of claims 72 to 74, wherein at least 95% of the target loci are amplified.
84. The library of any one of claims 72 to 74, wherein at least 99% of the target loci are amplified.
85. The library of any one of claims 72 to 74, wherein less than 20% of the amplified products are primer dimers.
86. The library of any one of claims 72 to 74, wherein less than 10% of the amplified products are primer dimers.
87. The library of any one of claims 72 to 74, wherein less than 1% of the amplified products are primer dimers.
88. The library of any one of claims 72 to 74, wherein less than 0.1% of the amplified products are primer dimers.
89. The library of any one of claims 72 to 74, comprising at least 1,000 primer pairs, wherein each primer pair comprises a forward primer and a reverse primer that hybridize to a target locus.
90. The library of any one of claims 72 to 74, comprising at least 1,000 individual primers that each hybridize to a different target locus.
91. A kit for amplifying target loci in a nucleic acid sample, comprising (i) the library of any one of claims 72 to 74 and (ii) instructions for using the library to amplify the target loci.
92. A method for determining the ploidy state of a chromosome in a gestating fetus, the method comprising:
(a) contacting a nucleic acid sample with a library of primers that simultaneously hybridize to at least 1,000 different polymorphic loci to produce a reaction mixture; wherein the nucleic acid sample comprises maternal DNA from the mother of the fetus and fetal DNA from the fetus; and
(b) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products;
(c) measuring the amplified products with a high-throughput sequencer to produce sequencing data;
(d) calculating on a computer allele counts at the polymorphic loci based on the sequencing data;
(e) creating on a computer a plurality of ploidy hypotheses, each pertaining to a different possible ploidy state of the chromosome;
(f) building on a computer, for each ploidy hypothesis, a joint distribution model for the expected allele counts at the polymorphic loci on the chromosome;
(g) determining on a computer a relative probability of each of the ploidy hypotheses using the joint distribution model and the allele counts; and
(h) calling the ploidy state of the fetus by selecting the ploidy state corresponding to the hypothesis with the greatest probability.
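Steps (d) to (h) describe a maximum-likelihood comparison of hypotheses. The sketch below is a deliberately simplified illustration, not the model of the application: it assumes that loci are independent, that the joint distribution is a product of binomials, and that the expected allele fraction at each locus under each hypothesis is already known. In practice these fractions would depend on quantities such as parental genotypes and fetal fraction, which are not modeled here. All names are hypothetical.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # guard against log(0)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def call_ploidy(allele_counts, expected_fraction):
    """Return (best hypothesis, relative probabilities).

    allele_counts: list of (reads of allele A, total reads), one entry per locus.
    expected_fraction: dict hypothesis -> list of expected allele A fractions per locus.
    """
    # Steps (f)-(g): joint log-likelihood of the observed counts under each hypothesis.
    log_like = {
        h: sum(log_binom_pmf(a, n, p) for (a, n), p in zip(allele_counts, fracs))
        for h, fracs in expected_fraction.items()
    }
    top = max(log_like.values())
    weights = {h: math.exp(v - top) for h, v in log_like.items()}  # stable normalization
    total = sum(weights.values())
    probs = {h: w / total for h, w in weights.items()}
    # Step (h): select the hypothesis with the greatest relative probability.
    return max(probs, key=probs.get), probs
```

Working in log space and subtracting the largest log-likelihood before exponentiating avoids underflow when thousands of loci are combined.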
93. A method of testing for an abnormal distribution of a chromosome in a sample comprising a mixture of maternal and fetal DNA, the method comprising:
(a) contacting the sample with a library of primers that simultaneously hybridize to at least 1,000 different target loci to produce a reaction mixture; wherein the target loci are from a plurality of different chromosomes; and wherein the plurality of different chromosomes comprises at least one first chromosome suspected of having an abnormal distribution in the sample and at least one second chromosome presumed to be normally distributed in the sample;
(b) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products;
(c) sequencing the amplified products to obtain a plurality of sequence tags that align to the target loci; wherein the sequence tags are of sufficient length to be assigned to a specific target locus;
(d) assigning on a computer the plurality of sequence tags to their corresponding target loci;
(e) determining on a computer the number of sequence tags that align to the target loci of the first chromosome and the number of sequence tags that align to the target loci of the second chromosome; and
(f) comparing on a computer the numbers from step (e) to determine the presence or absence of an abnormal distribution of the first chromosome.
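Steps (d) to (f) reduce to counting tags per chromosome and comparing the counts. The following sketch shows one way to do that comparison, offered as an assumption rather than as the claimed procedure: tags are represented by the id of the locus they were assigned to, and counts are normalized by the number of loci on each chromosome, a design choice that the claim does not prescribe. All names are hypothetical.

```python
from collections import Counter


def chromosome_tag_ratio(tag_loci, locus_chrom, first_chrom, second_chrom):
    """Ratio of mean tags per locus on the first chromosome to that on the second.

    tag_loci: iterable of locus ids, one per sequence tag already assigned in step (d).
    locus_chrom: dict mapping each locus id to its chromosome name.
    """
    per_chrom = Counter(locus_chrom[locus] for locus in tag_loci)  # step (e)
    n_loci = Counter(locus_chrom.values())
    # Normalize by panel size so chromosomes with different locus counts are comparable.
    first = per_chrom[first_chrom] / n_loci[first_chrom]
    second = per_chrom[second_chrom] / n_loci[second_chrom]
    return first / second                                          # input to step (f)
```

A ratio near 1 is what a normally distributed first chromosome would be expected to give under this simplification. How far from 1 counts as abnormal would have to be set against reference samples, which this sketch does not attempt, and the function raises `ZeroDivisionError` if the second chromosome has no tags.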
94. A method for detecting the presence or absence of a fetal aneuploidy, the method comprising:
(a) contacting a sample comprising a mixture of maternal and fetal DNA with a library of primers that simultaneously hybridize to at least 1,000 different non-polymorphic target loci to produce a reaction mixture; wherein the target loci are from a plurality of different chromosomes;
(b) subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that comprise target amplicons;
(c) quantifying on a computer a relative frequency of the target amplicons from the first and second chromosomes of interest;
(d) comparing on a computer the relative frequency of the target amplicons from the first and second chromosomes of interest; and
(e) identifying the presence or absence of an aneuploidy based on the compared relative frequencies of the first and second chromosomes of interest.
95. A method for establishing whether an alleged father is the biological father of a fetus that is gestating in a pregnant mother, the method comprising:
(a) simultaneously amplifying a plurality of polymorphic loci, comprising at least 1,000 different polymorphic loci, on genetic material from the alleged father to produce a first set of amplified products;
(b) simultaneously amplifying the corresponding plurality of polymorphic loci on a mixed sample of DNA originating from a blood sample from the pregnant mother to produce a second set of amplified products; wherein the mixed sample of DNA comprises fetal DNA and maternal DNA;
(c) determining on a computer the probability that the alleged father is the biological father of the fetus using genotypic measurements based on the first and second sets of amplified products; and
(d) establishing whether the alleged father is the biological father of the fetus using the determined probability that the alleged father is the biological father of the fetus.
96. The method of claim 95, further comprising simultaneously amplifying the corresponding plurality of polymorphic loci on genetic material from the mother to produce a third set of amplified products; wherein the probability that the alleged father is the biological father of the fetus is determined using genotypic measurements based on the first, second, and third sets of amplified products.
97. 1 kinds of methods measuring the amount of two or more the target gene seats in nucleic acid samples, described method comprises:
A () uses pcr amplification to comprise the nucleic acid samples of the first standard gene seat, the second standard gene seat, first object locus and the second target gene seat to form amplified production; Wherein said first standard gene seat and described first object locus have the Nucleotide of equal amts but its sequence is different at one or more Nucleotide place; And wherein said second standard gene seat and described second target gene seat have the Nucleotide of equal amts but its sequence is different at one or more Nucleotide place;
B () is checked order to determine to compare the standard ratio of the first increased standard gene seat compared to the relative quantity of the second increased standard gene seat to described amplified production; The difference of amplification in PCR efficiency of the wherein said standard ratio described first standard gene seat of instruction and described second standard gene seat;
C () measures and compares the target rate of increased first object locus compared to the relative quantity of the second increased target gene seat; And
(d) based on the described standard ratio adjustment from step (b) from the described target rate of step (c) to determine the relative quantity of first object locus described in described sample and described second target gene seat.
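The adjustment in step (d) is simple arithmetic once the read counts are in hand. The sketch below assumes that the two standards were added at a known input ratio (1:1 by default) and that each standard amplifies with the same efficiency as its matched target. Both are assumptions made for illustration, and the function and parameter names are hypothetical.

```python
def corrected_target_ratio(std1_reads, std2_reads, tgt1_reads, tgt2_reads,
                           std_input_ratio=1.0):
    """Correct an observed target ratio for unequal amplification efficiency.

    std_input_ratio: assumed known ratio of standard 1 to standard 2 before PCR.
    """
    standard_ratio = std1_reads / std2_reads  # step (b): observed after amplification
    target_ratio = tgt1_reads / tgt2_reads    # step (c)
    bias = standard_ratio / std_input_ratio   # relative over-amplification of locus 1
    return target_ratio / bias                # step (d): estimated pre-PCR target ratio
```

For example, if equimolar standards come out at 2,000 and 1,000 reads, locus 1 is over-amplified twofold under these assumptions, so an observed target ratio of 3.0 corresponds to an estimated starting ratio of 1.5.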
98. The method according to claim 97, further comprising determining the absolute amounts of the first target locus and the second target locus in the sample.
99. The method according to claim 97, comprising determining the presence or absence of a target locus in the sample.
100. The method according to claim 97, comprising simultaneously amplifying at least 1,000 different target loci.
101. The method according to claim 97, wherein the target loci are present on the same chromosome of interest.
102. The method according to claim 97, wherein at least some of the target loci are present on different chromosomes of interest.
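As an illustration of the ratio adjustment recited in steps (b) through (d) of claim 97, the following is a minimal sketch and not the applicant's implementation. It assumes that sequencing read counts serve as a proxy for the amount of each amplified locus, that the two standard loci were added at a known input ratio (equimolar by default), and that the adjustment is a simple division by the observed amplification bias. The function names `read_ratio` and `adjusted_target_ratio` and the parameter `standard_input_ratio` are hypothetical and do not appear in the claims.

```python
def read_ratio(first_reads: int, second_reads: int) -> float:
    """Return the ratio of read counts for two amplified loci."""
    if first_reads <= 0 or second_reads <= 0:
        raise ValueError("read counts must be positive")
    return first_reads / second_reads


def adjusted_target_ratio(
    std1_reads: int,
    std2_reads: int,
    tgt1_reads: int,
    tgt2_reads: int,
    standard_input_ratio: float = 1.0,  # assumption: standards added equimolar
) -> float:
    """Correct the observed target ratio for locus-specific PCR bias.

    Step (b): the observed standard ratio, divided by the known input
    ratio of the standards, estimates the relative amplification bias.
    Step (c): the observed target ratio is computed from target reads.
    Step (d): dividing the target ratio by the bias estimates the
    relative amount of the two target loci in the original sample.
    """
    if standard_input_ratio <= 0:
        raise ValueError("standard_input_ratio must be positive")
    standard_ratio = read_ratio(std1_reads, std2_reads)
    target_ratio = read_ratio(tgt1_reads, tgt2_reads)
    bias = standard_ratio / standard_input_ratio
    return target_ratio / bias


if __name__ == "__main__":
    # Hypothetical read counts, chosen only to make the arithmetic easy.
    print(adjusted_target_ratio(2000, 1000, 3000, 1000))
```

With these hypothetical counts, the standards indicate that the first locus amplifies twice as efficiently as the second, so the observed target ratio of 3.0 is adjusted to 1.5. The division-based correction is one simple choice among several; it relies on each standard locus amplifying with the same efficiency as its matching target locus, which is why the claim requires the two to have the same length and to differ at only one or more nucleotides.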
CN201280075224.8A 2012-07-24 2012-11-21 Highly multiplex PCR methods and compositions Pending CN104685064A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201261675020P 2012-07-24 2012-07-24
US61/675,020 2012-07-24
PCT/US2012/066339 WO2014018080A1 (en) 2012-07-24 2012-11-21 Highly multiplex pcr methods and compositions
US13/683,604 US20130123120A1 (en) 2010-05-18 2012-11-21 Highly Multiplex PCR Methods and Compositions
US13/683,604 2012-11-21

Publications (1)

Publication Number Publication Date
CN104685064A true CN104685064A (en) 2015-06-03

Family

ID=49997695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280075224.8A Pending CN104685064A (en) 2012-07-24 2012-11-21 Highly multiplex PCR methods and compositions

Country Status (10)

Country Link
JP (10) JP6392222B2 (en)
KR (1) KR101890466B1 (en)
CN (1) CN104685064A (en)
AU (1) AU2012385961B9 (en)
CA (1) CA2877493C (en)
HK (1) HK1211058A1 (en)
IL (1) IL236435A0 (en)
RU (1) RU2650790C2 (en)
SG (1) SG11201408813VA (en)
WO (1) WO2014018080A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107406882A (en) * 2015-04-24 2017-11-28 基纳生物技术有限公司 Multiplexing method for identification and quantification of minor alleles and polymorphisms
CN108334745A (en) * 2018-03-19 2018-07-27 青岛理工大学 Nonlinear hybrid system modeling method in polymerase chain reaction process
CN109477138A (en) * 2016-04-15 2019-03-15 纳特拉公司 Lung cancer detection method
CN109790587A (en) * 2016-09-30 2019-05-21 富士胶片株式会社 Method for discriminating origin of human genomic DNA of 100pg or less, method for identifying individual, and method for analyzing degree of engraftment of hematopoietic stem cells
CN113979895A (en) * 2020-07-08 2022-01-28 中国科学技术大学 Self-degradable polymer with controllable precise sequence and preparation method and application thereof
US11667958B2 (en) 2011-05-19 2023-06-06 Agena Bioscience, Inc. Products and processes for multiplex nucleic acid identification

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11111543B2 (en) 2005-07-29 2021-09-07 Natera, Inc. System and method for cleaning noisy genetic data and determining chromosome copy number
US9424392B2 (en) 2005-11-26 2016-08-23 Natera, Inc. System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals
US10083273B2 (en) 2005-07-29 2018-09-25 Natera, Inc. System and method for cleaning noisy genetic data and determining chromosome copy number
US10081839B2 (en) 2005-07-29 2018-09-25 Natera, Inc System and method for cleaning noisy genetic data and determining chromosome copy number
US11111544B2 (en) 2005-07-29 2021-09-07 Natera, Inc. System and method for cleaning noisy genetic data and determining chromosome copy number
CA2731991C (en) 2008-08-04 2021-06-08 Gene Security Network, Inc. Methods for allele calling and ploidy calling
EP2473638B1 (en) 2009-09-30 2017-08-09 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US20190010543A1 (en) 2010-05-18 2019-01-10 Natera, Inc. Methods for simultaneous amplification of target loci
US11332785B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11332793B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for simultaneous amplification of target loci
US11408031B2 (en) 2010-05-18 2022-08-09 Natera, Inc. Methods for non-invasive prenatal paternity testing
WO2011146632A1 (en) 2010-05-18 2011-11-24 Gene Security Network Inc. Methods for non-invasive prenatal ploidy calling
US9677118B2 (en) 2014-04-21 2017-06-13 Natera, Inc. Methods for simultaneous amplification of target loci
US11326208B2 (en) 2010-05-18 2022-05-10 Natera, Inc. Methods for nested PCR amplification of cell-free DNA
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
US10316362B2 (en) 2010-05-18 2019-06-11 Natera, Inc. Methods for simultaneous amplification of target loci
US11339429B2 (en) 2010-05-18 2022-05-24 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11322224B2 (en) 2010-05-18 2022-05-03 Natera, Inc. Methods for non-invasive prenatal ploidy calling
JP6328934B2 (en) 2010-12-22 2018-05-23 ナテラ, インコーポレイテッド Noninvasive prenatal testing
AU2011358564B9 (en) 2011-02-09 2017-07-13 Natera, Inc Methods for non-invasive prenatal ploidy calling
KR101890466B1 (en) * 2012-07-24 2018-08-21 내테라, 인코포레이티드 Highly multiplex pcr methods and compositions
WO2015048535A1 (en) 2013-09-27 2015-04-02 Natera, Inc. Prenatal diagnostic resting standards
US10262755B2 (en) 2014-04-21 2019-04-16 Natera, Inc. Detecting cancer mutations and aneuploidy in chromosomal segments
US10577655B2 (en) 2013-09-27 2020-03-03 Natera, Inc. Cell free DNA diagnostic testing standards
WO2015164432A1 (en) * 2014-04-21 2015-10-29 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
CN113774132A (en) * 2014-04-21 2021-12-10 纳特拉公司 Detection of mutations and ploidy in chromosomal segments
HUE046643T2 (en) 2014-11-28 2021-11-29 Uniqure Ip Bv Dna impurities in a composition comprising a parvoviral virion
US20170349926A1 (en) * 2014-12-22 2017-12-07 DNAe Group Holdings LTD. Bubble primers
US11479812B2 (en) 2015-05-11 2022-10-25 Natera, Inc. Methods and compositions for determining ploidy
GB2539675B (en) * 2015-06-23 2017-11-22 Cs Genetics Ltd Libraries of multimeric barcoding reagents and kits thereof for labelling nucleic acids for sequencing
WO2017044843A1 (en) 2015-09-11 2017-03-16 The General Hospital Corporation Full interrogation of nuclease dsbs and sequencing (find-seq)
CN108699595B (en) 2016-02-25 2022-11-04 豪夫迈·罗氏有限公司 Elimination of primer-primer interaction during primer extension
EP3827812A1 (en) * 2016-07-29 2021-06-02 The Regents of the University of California Adeno-associated virus virions with variant capsid and methods of use thereof
WO2018067517A1 (en) 2016-10-04 2018-04-12 Natera, Inc. Methods for characterizing copy number variation using proximity-litigation sequencing
US10011870B2 (en) 2016-12-07 2018-07-03 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
CA3049139A1 (en) 2017-02-21 2018-08-30 Natera, Inc. Compositions, methods, and kits for isolating nucleic acids
WO2019010456A1 (en) * 2017-07-07 2019-01-10 Stephen Quake Noninvasive prenatal diagnosis of single-gene disorders using droplet digital pcr
KR101977976B1 (en) * 2017-08-10 2019-05-14 주식회사 엔젠바이오 Method for increasing read data analysis accuracy in amplicon based NGS by using primer remover
CA3073448A1 (en) 2017-08-23 2019-02-28 The General Hospital Corporation Engineered crispr-cas9 nucleases with altered pam specificity
EP3694993A4 (en) * 2017-10-11 2021-10-13 The General Hospital Corporation Methods for detecting site-specific and spurious genomic deamination induced by base editing technologies
AU2019256287A1 (en) 2018-04-17 2020-11-12 The General Hospital Corporation Sensitive in vitro assays for substrate preferences and sites of nucleic acid binding, modifying, and cleaving agents
US11525159B2 (en) 2018-07-03 2022-12-13 Natera, Inc. Methods for detection of donor-derived cell-free DNA
CA3107376A1 (en) 2018-08-08 2020-02-13 Inivata Ltd. Method of sequencing using variable replicate multiplex pcr
CN112080558B (en) * 2019-06-13 2024-03-12 杭州贝瑞和康基因诊断技术有限公司 Kit and method for simultaneously detecting HBA1/2 and HBB gene mutation
EP4004927A4 (en) * 2019-07-22 2023-08-02 Mission Bio, Inc. Using machine learning to optimize assays for single cell targeted dna sequencing
US20230018079A1 (en) * 2019-12-16 2023-01-19 Agilent Technologies, Inc. Genomic scarring assays and related methods
JP7320468B2 (en) 2020-03-10 2023-08-03 Ntn株式会社 HUB UNIT WITH STEERING FUNCTION AND VEHICLE INCLUDING THE SAME
WO2022076574A1 (en) * 2020-10-08 2022-04-14 Claret Bioscience, Llc Methods and compositions for analyzing nucleic acid
EP4292825A1 (en) 2021-03-18 2023-12-20 Canon Kabushiki Kaisha Liquid injection method, liquid injection device, and liquid cartridge
AU2022339791A1 (en) 2021-09-01 2024-03-14 Natera, Inc. Methods for non-invasive prenatal testing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001090419A2 (en) * 2000-05-23 2001-11-29 Variagenics, Inc. Methods for genetic analysis of dna to detect sequence variances
US6479235B1 (en) * 1994-09-30 2002-11-12 Promega Corporation Multiplex amplification of short tandem repeat loci
US20050227263A1 (en) * 2004-01-12 2005-10-13 Roland Green Method of performing PCR amplification on a microarray
US20060210997A1 (en) * 2005-03-16 2006-09-21 Joel Myerson Composition and method for array hybridization
US20080234142A1 (en) * 1999-08-13 2008-09-25 Eric Lietz Random Mutagenesis And Amplification Of Nucleic Acid
US20120122701A1 (en) * 2010-05-18 2012-05-17 Gene Security Network, Inc. Methods for Non-Invasive Prenatal Paternity Testing

Family Cites Families (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US300235A (en) 1884-06-10 Chaeles b
US7582420B2 (en) * 2001-07-12 2009-09-01 Illumina, Inc. Multiplex nucleic acid reactions
JP2003521252A (en) * 2000-02-07 2003-07-15 イルミナ インコーポレイテッド Nucleic acid detection method using universal priming
US6977162B2 (en) 2002-03-01 2005-12-20 Ravgen, Inc. Rapid analysis of variations in a genome
ES2338654T5 (en) * 2003-01-29 2017-12-11 454 Life Sciences Corporation Pearl emulsion nucleic acid amplification
EP1627075A4 (en) 2003-05-09 2006-09-20 Univ Tsinghua Methods and compositions for optimizing multiplex pcr primers
US8515679B2 (en) 2005-12-06 2013-08-20 Natera, Inc. System and method for cleaning noisy genetic data and determining chromosome copy number
US8532930B2 (en) 2005-11-26 2013-09-10 Natera, Inc. Method for determining the number of copies of a chromosome in the genome of a target individual using genetic data from genetically related individuals
US7888017B2 (en) 2006-02-02 2011-02-15 The Board Of Trustees Of The Leland Stanford Junior University Non-invasive fetal genetic screening by digital analysis
US20070231823A1 (en) * 2006-03-23 2007-10-04 Mckernan Kevin J Directed enrichment of genomic DNA for high-throughput sequencing
JP2008125471A (en) 2006-11-22 2008-06-05 Olympus Corp Multiplex method of nucleic acid amplification
EA015913B1 (en) * 2007-01-17 2011-12-30 Учреждение Российской Академии Наук Институт Молекулярной Биологии Им. В.А. Энгельгардта Ран (Имб Ран) Method for genetically identifying a person according to the analysis of the single nucleotide polymorphism of a human genome by means of a oligonucleotide biological microchip (biochip)
WO2008093098A2 (en) * 2007-02-02 2008-08-07 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
US20090023190A1 (en) 2007-06-20 2009-01-22 Kai Qin Lao Sequence amplification with loopable primers
WO2009032779A2 (en) * 2007-08-29 2009-03-12 Sequenom, Inc. Methods and compositions for the size-specific seperation of nucleic acid from a sample
FR2925480B1 (en) * 2007-12-21 2011-07-01 Gervais Danone Sa PROCESS FOR THE ENRICHMENT OF OXYGEN WATER BY ELECTROLYTIC, OXYGEN-ENRICHED WATER OR DRINK AND USES THEREOF
EP2077337A1 (en) 2007-12-26 2009-07-08 Eppendorf Array Technologies SA Amplification and detection composition, method and kit
US20110033862A1 (en) 2008-02-19 2011-02-10 Gene Security Network, Inc. Methods for cell genotyping
WO2009146335A1 (en) 2008-05-27 2009-12-03 Gene Security Network, Inc. Methods for embryo characterization and comparison
CA2731991C (en) 2008-08-04 2021-06-08 Gene Security Network, Inc. Methods for allele calling and ploidy calling
SI2334812T1 (en) 2008-09-20 2017-05-31 The Board of Trustees of the Leland Stanford Junior University Office of the General Counsel Building 170 Noninvasive diagnosis of fetal aneuploidy by sequencing
EP2473638B1 (en) 2009-09-30 2017-08-09 Natera, Inc. Methods for non-invasive prenatal ploidy calling
CA2779750C (en) * 2009-11-06 2019-03-19 The Board Of Trustees Of The Leland Stanford Junior University Non-invasive diagnosis of graft rejection in organ transplant patients
US20110312503A1 (en) * 2010-01-23 2011-12-22 Artemis Health, Inc. Methods of fetal abnormality detection
US8574832B2 (en) * 2010-02-03 2013-11-05 Massachusetts Institute Of Technology Methods for preparing sequencing libraries
WO2011146632A1 (en) 2010-05-18 2011-11-24 Gene Security Network Inc. Methods for non-invasive prenatal ploidy calling
WO2013052557A2 (en) 2011-10-03 2013-04-11 Natera, Inc. Methods for preimplantation genetic diagnosis by sequencing
EP2426217A1 (en) * 2010-09-03 2012-03-07 Centre National de la Recherche Scientifique (CNRS) Analytical methods for cell free nucleic acids and applications
CN103620055A (en) 2010-12-07 2014-03-05 利兰·斯坦福青年大学托管委员会 Non-invasive determination of fetal inheritance of parental haplotypes at the genome-wide scale
EP3564392B1 (en) * 2010-12-17 2021-11-24 Life Technologies Corporation Methods for nucleic acid amplification
AU2011352070A1 (en) * 2010-12-30 2013-07-18 Foundation Medicine, Inc. Optimization of multigene analysis of tumor samples
WO2012103031A2 (en) 2011-01-25 2012-08-02 Ariosa Diagnostics, Inc. Detection of genetic abnormalities
AU2011358564B9 (en) 2011-02-09 2017-07-13 Natera, Inc Methods for non-invasive prenatal ploidy calling
KR101890466B1 (en) 2012-07-24 2018-08-21 내테라, 인코포레이티드 Highly multiplex pcr methods and compositions

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11667958B2 (en) 2011-05-19 2023-06-06 Agena Bioscience, Inc. Products and processes for multiplex nucleic acid identification
CN107406882A (en) * 2015-04-24 2017-11-28 基纳生物技术有限公司 Multiplexing method for identification and quantification of minor alleles and polymorphisms
US10865439B2 (en) 2015-04-24 2020-12-15 Agena Bioscience, Inc. Multiplexed method for the identification and quantitation of minor alleles and polymorphisms
CN107406882B (en) * 2015-04-24 2022-03-01 基纳生物技术有限公司 Multiplexing method for identification and quantification of minor alleles and polymorphisms
US11680289B2 (en) 2015-04-24 2023-06-20 Agena Bioscience, Inc. Multiplexed method for the identification and quantitation of minor alleles and polymorphisms
CN109477138A (en) * 2016-04-15 2019-03-15 纳特拉公司 Lung cancer detection method
CN109790587A (en) * 2016-09-30 2019-05-21 富士胶片株式会社 Method for discriminating origin of human genomic DNA of 100pg or less, method for identifying individual, and method for analyzing degree of engraftment of hematopoietic stem cells
CN109790587B (en) * 2016-09-30 2023-06-13 富士胶片株式会社 Method for discriminating origin of human genomic DNA of 100pg or less, method for identifying individual, and method for analyzing degree of engraftment of hematopoietic stem cells
CN108334745A (en) * 2018-03-19 2018-07-27 青岛理工大学 Nonlinear hybrid system modeling method in polymerase chain reaction process
CN108334745B (en) * 2018-03-19 2022-02-08 青岛理工大学 Nonlinear hybrid system modeling method in polymerase chain reaction process
CN113979895A (en) * 2020-07-08 2022-01-28 中国科学技术大学 Self-degradable polymer with controllable precise sequence and preparation method and application thereof
CN113979895B (en) * 2020-07-08 2023-03-24 中国科学技术大学 Self-degradable polymer with controllable precise sequence and preparation method and application thereof

Also Published As

Publication number Publication date
AU2012385961B9 (en) 2017-05-18
JP2020054400A (en) 2020-04-09
KR20150038216A (en) 2015-04-08
JP7027468B2 (en) 2022-03-01
JP2022027975A (en) 2022-02-14
AU2012385961A1 (en) 2015-02-12
JP7348330B2 (en) 2023-09-20
AU2012385961B2 (en) 2017-04-13
JP2015526073A (en) 2015-09-10
RU2014152883A (en) 2016-09-10
SG11201408813VA (en) 2015-02-27
JP2022027971A (en) 2022-02-14
JP2022037145A (en) 2022-03-08
HK1211058A1 (en) 2016-05-13
CA2877493A1 (en) 2014-01-30
RU2650790C2 (en) 2018-04-17
JP6997813B2 (en) 2022-02-10
JP7343563B2 (en) 2023-09-12
IL236435A0 (en) 2015-02-26
KR101890466B1 (en) 2018-08-21
CA2877493C (en) 2020-08-25
JP2022051949A (en) 2022-04-01
JP2020054401A (en) 2020-04-09
JP2020054402A (en) 2020-04-09
JP6997814B2 (en) 2022-02-10
JP2018183189A (en) 2018-11-22
JP6916153B2 (en) 2021-08-11
WO2014018080A1 (en) 2014-01-30
JP6997815B2 (en) 2022-02-10
JP2020058388A (en) 2020-04-16
JP6392222B2 (en) 2018-09-19

Similar Documents

Publication Publication Date Title
US11286530B2 (en) Methods for simultaneous amplification of target loci
US11111545B2 (en) Methods for simultaneous amplification of target loci
CN104685064A (en) Highly multiplex PCR methods and compositions
US11332793B2 (en) Methods for simultaneous amplification of target loci
CN103608818B Non-invasive prenatal ploidy identification device
US20170051355A1 (en) Highly multiplex pcr methods and compositions
US20220307086A1 (en) Methods for simultaneous amplification of target loci
US20220356526A1 (en) Methods for simultaneous amplification of target loci
EP2847347B1 (en) Highly multiplex pcr methods and compositions
US20230383348A1 (en) Methods for simultaneous amplification of target loci
CN107988343A Non-invasive prenatal ploidy identification method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1211058

Country of ref document: HK

RJ01 Rejection of invention patent application after publication

Application publication date: 20150603

RJ01 Rejection of invention patent application after publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1211058

Country of ref document: HK