Sunday, 05/30/2004 2:56:51 AM

Here's more information related to Affymetrix's work involving AIMs and linkage disequilibrium. I have not added any analysis; I merely highlight the potential overlap with DNAP and the "coincidental" use of the term "ancestry informative markers". Note that DNAP's patent application COMPOSITIONS AND METHODS FOR INFERRING ANCESTRY was first filed on August 19, 2002.

Based on this I have already included the following question in the "Science" category in the list of shareholder questions:

"Is the company's intellectual property in any way affected by the US patent application number 20040072217 titled "Methods of analysis of linkage disequilibrium", which is assigned to Affymetrix, Inc.?"

United States Patent Application 20040072217
Title: Methods of analysis of linkage disequilibrium
Assignee Name and Address: Affymetrix, Inc.
Filed: June 17, 2003
Last Update: April 15, 2004

Abstract
Methods and kits for analyzing a collection of target sequences in a nucleic acid sample are provided. A sample is amplified under conditions that enrich for a subset of fragments that includes a collection of target sequences. Methods are also provided for analysis of the above sample by hybridization to an array, which may be specifically designed to interrogate the collection of target sequences for particular characteristics, such as, for example, the presence or absence of one or more polymorphisms. Methods of estimating the extent of linkage disequilibrium in a region or population by determination of the ancestral and non-ancestral alleles are also provided.

...

27. A method to identify at least one ancestry informative marker comprising: a. determining the allele frequency for each of a plurality of SNPs in each of two populations; b. calculating an F.sub.ST value for at least one SNP in said plurality of SNPs; and c. identifying at least one SNP whose F.sub.ST value is greater than 0.3.

...

FIELD OF THE INVENTION

[0002] The invention relates to methods for enrichment and amplification of sequences from a nucleic acid sample and highly parallel methods of determining the genotypes of SNPs. In many embodiments a generic method of complexity reduction is combined with an array of probes to a collection of SNPs to determine genotype. In one embodiment, the invention relates to determining regions of low or high linkage disequilibrium across the whole genome. In another embodiment the invention relates to identification of ancestral alleles of human polymorphisms. The methods may be used to identify human chromosomal regions of low linkage disequilibrium and to determine haplotype maps. The present invention relates to the fields of molecular biology and genetics.

SUMMARY OF THE INVENTION

[0003] In one embodiment a method for identifying the ancestral allele of a human single nucleotide polymorphism is provided. Genomic DNA samples from at least two higher primate species are amplified by a method to reduce complexity, for example amplification with a single primer, and hybridized to a nucleic acid array comprising allele specific probes to at least 5,000 human SNPs. Hybridization patterns for the primates are analyzed to identify at least one human SNP that is homozygous for the same allele in both of the higher primate species. The allele present in both higher primate species is assigned as the ancestral allele state of that human SNP.
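To make the rule in [0003] concrete, here is a minimal Python sketch. It is my own illustration, not code from the application. It assumes genotype calls arrive as two-letter strings such as "AA" or "AG"; the function name `ancestral_allele` and the SNP labels are hypothetical.

```python
from typing import Optional


def ancestral_allele(genotype_a: str, genotype_b: str) -> Optional[str]:
    """Return the shared allele if both primate genotypes are homozygous for it.

    Genotypes are assumed to be two-letter strings (an assumption of this
    sketch, not a format taken from the application).
    """
    def homozygous_allele(genotype: str) -> Optional[str]:
        # "AA" is homozygous for A; "AG" is heterozygous.
        if len(genotype) == 2 and genotype[0] == genotype[1]:
            return genotype[0]
        return None

    a = homozygous_allele(genotype_a)
    b = homozygous_allele(genotype_b)
    # Assign an ancestral state only when both species agree.
    if a is not None and a == b:
        return a
    return None


# Hypothetical calls for three human SNPs in two higher primate species.
calls = {
    "snp1": ("AA", "AA"),  # both homozygous A: ancestral allele is A
    "snp2": ("CC", "CT"),  # second species is heterozygous: no call
    "snp3": ("GG", "TT"),  # homozygous for different alleles: no call
}
ancestral = {snp: ancestral_allele(g1, g2) for snp, (g1, g2) in calls.items()}
print(ancestral)
```

SNPs for which the function returns `None` are simply left without an ancestral assignment in this sketch.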

[0004] In another embodiment the extent of linkage disequilibrium in the chromosomal region near at least one human SNP is estimated by determining the ancestral allele for a plurality of human SNPs, identifying at least one human SNP allele that is the ancestral allele; and predicting low linkage disequilibrium across the chromosomal region near the SNP allele that is the ancestral allele or predicting low linkage disequilibrium across the chromosomal region near a non-ancestral allele. The non-ancestral allele is any allele that is not the ancestral allele. The higher primate species may be, for example, chimpanzee, gorilla or orangutan. The region may include the region 1, 10, 100 or 200 kb from the SNP in either direction.

[0005] In another embodiment the extent of linkage disequilibrium in the chromosomal region near a plurality of human SNPs is estimated by determining the ancestral allele for a plurality of human SNPs according to the method of claim, identifying the ancestral and non-ancestral alleles for each human SNP and predicting low linkage disequilibrium across the chromosomal region near each ancestral allele and high linkage disequilibrium across the chromosomal region near each non-ancestral allele.

[0006] In another embodiment a pattern of regions of high linkage disequilibrium across a chromosome is established by identifying at least one non-ancestral allele in a chromosome of an individual, determining the ancestral state of the human SNP in a population of individuals, identifying at least one human SNP that is found to be non-ancestral in a population of individuals with a frequency greater than 0.3, predicting regions of high linkage disequilibrium near a frequent non-ancestral allele on a chromosome of the population; and establishing a pattern of regions of high linkage disequilibrium across a chromosome. In another embodiment a pattern of regions of high linkage disequilibrium is established across a plurality of chromosomes by establishing a pattern of regions of high linkage disequilibrium across one chromosome and repeating for at least one other SNP located on a second chromosome.
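The 0.3 frequency cut-off in [0006] can be illustrated with a few lines of Python. This is only a sketch under my own assumptions: genotypes are two-letter strings, the ancestral allele is already known, and the function name `non_ancestral_frequency` is hypothetical rather than anything named in the application.

```python
def non_ancestral_frequency(genotypes, ancestral):
    """Fraction of allele copies in a population sample that differ from the ancestral allele.

    genotypes: list of two-letter genotype strings, e.g. ["AG", "GG"].
    """
    alleles = "".join(genotypes)
    if not alleles:
        return 0.0
    non_ancestral = sum(1 for allele in alleles if allele != ancestral)
    return non_ancestral / len(alleles)


# Hypothetical population samples for two SNPs whose ancestral allele is "A".
common = non_ancestral_frequency(["AA", "AG", "GG", "AG"], "A")
rare = non_ancestral_frequency(["AA", "AA", "AA", "AG"], "A")

THRESHOLD = 0.3  # cut-off quoted in [0006]
print(common, common > THRESHOLD)
print(rare, rare > THRESHOLD)
```

Only the first SNP would count as carrying a "frequent" non-ancestral allele in this toy sample.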

[0007] In another embodiment a linkage disequilibrium map is established across human chromosomes by identifying at least one non-ancestral allele of a first human SNP, identifying chromosomal regions localized less than 200 kilobases from the at least one non-ancestral allele, identifying at least one other human SNP within this region; grouping the first SNP and the SNP or SNPs identified into blocks, and predicting high linkage disequilibrium between SNPs within these blocks. This may be used to establish a haplotype map. In another embodiment a computer and computer code are used to estimate haplotype diversity within blocks of SNPs and may be used to establish a haplotype map.
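A rough sketch of the grouping step in [0007] follows. It is my own reading of that paragraph, not the applicant's implementation. It assumes each SNP is given as a (label, position in base pairs, flag) tuple on a single chromosome, where the flag marks SNPs at which a non-ancestral allele was identified. The function name `ld_blocks` and the SNP labels are hypothetical.

```python
def ld_blocks(snps, window_bp=200_000):
    """Group SNPs lying less than window_bp from each non-ancestral SNP.

    snps: list of (snp_id, position_bp, is_non_ancestral) on one chromosome.
    Returns a dict mapping each anchor SNP to the members of its block.
    """
    blocks = {}
    for anchor_id, anchor_pos, is_non_ancestral in snps:
        if not is_non_ancestral:
            continue  # only non-ancestral alleles anchor a block
        neighbours = [
            snp_id
            for snp_id, pos, _ in snps
            if snp_id != anchor_id and abs(pos - anchor_pos) < window_bp
        ]
        blocks[anchor_id] = [anchor_id] + neighbours
    return blocks


# Hypothetical SNP positions on one chromosome.
snps = [
    ("snp_a", 100_000, True),
    ("snp_b", 250_000, False),
    ("snp_c", 450_000, False),
    ("snp_d", 900_000, True),
]
blocks = ld_blocks(snps)
print(blocks)
```

Each block is where the application would predict high linkage disequilibrium; estimating haplotype diversity within a block is a separate step not shown here.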

[0008] In another embodiment a haplotype map across a human chromosome is established by identifying human SNPs that are non-ancestral, identifying chromosomal regions localized less than 200 kb from the non-ancestral allele, identifying human SNPs within these regions, grouping the SNPs into blocks of high linkage disequilibrium, estimating haplotype diversity within these blocks via haplotype estimation software and establishing a haplotype map.

[0009] In another embodiment the haplotype map that is established is used to search for complex disease genes.

[0010] In another embodiment the extent of linkage disequilibrium in the human genome is estimated by determining the allele frequencies for a first plurality of SNPs in a population, determining the ancestral allele for a second plurality of SNPs contained in the first plurality of SNPs, comparing the allele frequency of a SNP to the frequency of the ancestral allele for that SNP for a plurality of SNPs from the second plurality of SNPs in the population sample to generate a correlation coefficient for the population; determining high frequency for an ancestral allele if the correlation coefficient is greater than 0.8, identifying a chromosomal region nearby a frequent ancestral allele; and inferring low linkage disequilibrium across the region.

[0011] Populations that may be compared include any distinct populations, for example, geographically distinct human populations. The methods may be used to determine which population is more ancient.

[0012] In another embodiment at least one ancestry informative marker is identified by determining the allele frequency for each of a plurality of SNPs in each of two populations, calculating an F.sub.ST value for at least one SNP in the plurality of SNPs and identifying at least one SNP whose F.sub.ST value is greater than 0.3 or greater than 0.4.

[0013] In another embodiment a haplotype is identified in a region of high linkage disequilibrium in a first population of individuals, wherein a haplotype comprises at least two linked SNP alleles, by identifying a non-ancestral allele of a first SNP in the first population wherein the non-ancestral allele is not present in a second population, genotyping at least one additional SNP that is within 100 kb of the first SNP; and determining which allele of the additional SNP is linked to the non-ancestral allele of the first SNP.

...

[0142] 3. Discovery of Ancestry-Informative Markers (AIMs)

[0143] In one embodiment the F.sub.ST statistic is calculated and used to identify SNPs that are ancestry-informative markers (AIMs). F.sub.ST is an estimate of the geographic structure between two populations, for each SNP. F.sub.ST values vary from 0 to 1; as allele frequency differences between populations become more pronounced, F.sub.ST values increase. When F.sub.ST is calculated for SNPs in African-American versus Caucasian, African-American versus Asian, and Caucasian versus Asian populations, the mean values (0.061, 0.094 and 0.065) are typically less than 0.1, indicating that the majority of markers show very small inter-population frequency differences. However, there is a subset of SNPs whose allele frequencies differ significantly in one population versus the other two. These SNPs, called ancestry-informative markers, or AIMs, can be used to map complex diseases using admixture-generated linkage disequilibrium, or MALD. See Collins-Schramm, H. et al., Am. J. Hum. Genet. 70, 737-750 (2002), Briscoe, D. et al., J. Hered. 85, 59-63 (1994), Parra, E. J. et al., Am. J. Hum. Genet. 63, 1839-1851 (1998), and McKeigue, P. M. et al., Ann. Hum. Genet. 64, 171-186 (2000) each of which is incorporated herein by reference in its entirety.
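For anyone who wants to see how an F.sub.ST screen of this kind might work, here is a minimal Python sketch. The application as quoted does not say which F.sub.ST estimator is used, so the formula below is my assumption: the simple heterozygosity-based form for one biallelic SNP and two equally weighted populations, F_ST = (H_T - H_S) / H_T. The function names and SNP labels are hypothetical.

```python
def fst(p1, p2):
    """F_ST for one biallelic SNP from its allele frequency in two populations.

    Assumes equal population weights and the form (H_T - H_S) / H_T; the
    application itself does not specify an estimator.
    """
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)                   # pooled expected heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    if h_total == 0:
        return 0.0  # SNP is fixed for the same allele in both populations
    return (h_total - h_within) / h_total


def find_aims(frequencies, threshold=0.3):
    """Return SNP labels whose F_ST exceeds the threshold (0.3 as in claim 27)."""
    return [snp for snp, (p1, p2) in frequencies.items() if fst(p1, p2) > threshold]


# Hypothetical allele frequencies in two populations.
frequencies = {
    "snp1": (0.90, 0.10),  # large frequency difference
    "snp2": (0.50, 0.45),  # nearly identical frequencies
}
print({snp: round(fst(p1, p2), 4) for snp, (p1, p2) in frequencies.items()})
print(find_aims(frequencies))
```

With these made-up frequencies only the first SNP clears the 0.3 cut-off, which matches the point in [0143] that most markers show small inter-population differences while a subset stands out.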