DNAPrint's technology from 2003 is still well ahead of the other methods available.
Thus, a potential weakness of the MHMM, compared with an HMM, is its requirement for richer genetic information on the ancestral populations. Fortunately, high-density SNP platforms are becoming more available and less expensive.
Ancestry inference, whether for mapping disease loci or for conducting gene-association studies, is a critical component of genetic analysis in an admixed population. LD between tightly linked markers within ancestral populations complicates such analyses. One option to circumvent the background LD problem is to eliminate markers that are in LD in each ancestral population. Toward this end, a panel of ancestry-informative markers (AIMs) has been developed for admixture mapping in African Americans. Such a map does not exist for other admixed populations but may become available in the near future. However, as Patterson et al. [5] recognize, admixture mapping cannot replace genotype- or haplotype-based association analyses. First, there is considerable risk in genotyping a large number of AIMs, which are tailored for one special design. The superiority of admixture mapping over conventional association approaches hinges on the assumption that the frequency of the risk allele differs greatly between ancestral populations. While this may sometimes be the case, genetic differentiation between ancestral populations will generally not be sufficiently large [14]. Furthermore, in the event that admixture mapping is not successful, the researchers cannot use the genotype data for conventional analyses, because the AIMs are chosen to eliminate background LD and thus are very far apart.
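To make the trade-off in that passage concrete, here is a minimal sketch of how an AIM panel might be thinned. It is not the procedure used for the published panel. The `Marker` fields, the `select_aims` helper, the thresholds, and the toy data are all hypothetical. It assumes a single chromosome, keeps markers whose allele-frequency difference between two ancestral populations is large, and then greedily drops any marker that sits too close to one already chosen, using spacing as a crude stand-in for "not in background LD".

```python
from typing import List, NamedTuple


class Marker(NamedTuple):
    name: str
    position_cm: float  # map position in centimorgans (toy values)
    freq_pop_a: float   # allele frequency in ancestral population A
    freq_pop_b: float   # allele frequency in ancestral population B


def delta(m: Marker) -> float:
    """Absolute allele-frequency difference between the two populations."""
    return abs(m.freq_pop_a - m.freq_pop_b)


def select_aims(markers: List[Marker],
                min_delta: float = 0.5,
                min_spacing_cm: float = 1.0) -> List[Marker]:
    """Greedy selection: most informative markers first, then enforce spacing."""
    candidates = sorted((m for m in markers if delta(m) >= min_delta),
                        key=delta, reverse=True)
    chosen: List[Marker] = []
    for m in candidates:
        # Keep the marker only if it is far enough from every marker already kept.
        if all(abs(m.position_cm - c.position_cm) >= min_spacing_cm for c in chosen):
            chosen.append(m)
    return sorted(chosen, key=lambda m: m.position_cm)


if __name__ == "__main__":
    toy = [
        Marker("m1", 0.0, 0.9, 0.1),
        Marker("m2", 0.3, 0.8, 0.1),  # informative, but too close to m1
        Marker("m3", 2.0, 0.6, 0.5),  # weakly differentiated, filtered out
        Marker("m4", 3.0, 0.2, 0.9),
    ]
    print([m.name for m in select_aims(toy)])  # ['m1', 'm4']
```

The surviving markers are sparse by construction, which is the reason the quoted authors give for why such a panel cannot be reused for conventional association analysis.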
-------------------------------------------------------------
The present invention is based on the identification of ancestry informative markers (AIMs) useful for inferring a level of population structure of an individual, which, in turn, allows an inference as to various traits of the individual. Further, the AIMs of the present invention are demonstrated to correlate with a trait, regardless of whether the marker is in linkage disequilibrium with a gene or locus known to be involved in the trait. As such, the AIMs of the present invention are distinguishable from previously described markers, which were considered useful only if they were linked with a trait, i.e., if the marker was physically close to a gene known to be involved in the trait as characterized, for example, by a low cross-over percentage with respect to a gene (or locus) known to be involved in (or associated with) the trait. In contrast, there is no requirement that the markers (AIMs) useful in the present methods be in linkage disequilibrium with a gene/trait and, in fact, AIMs that are disclosed herein as correlating with a trait can be located on different chromosomes from each other and from a gene/locus known to be associated with the trait.
Previously, efforts have been made to control the two sources of population structure, sampling effects and natural human demography, which were believed to confound efforts to identify markers of genes associated with particular traits. However, as disclosed herein, population structure is reflective of human demography, and markers that correlate with a trait value are useful as reporters of structure that correlate with trait value (rather than markers in LD with phenotypically active loci), and, therefore, provide a valuable tool that enables accurate classification in a cost-effective and practical manner. Alleles associated with a trait due to population structure are not linked to phenotypically active loci, but are merely correlated with trait value because they are enriched in branches of the human family tree for which the trait value is more common. As disclosed herein, the distribution of trait values among the various branches of the human family tree is such that accurate classification can be obtained only through an appreciation of that structure, rather than a full understanding of the biological mechanism of the trait, and, as a result, markers that were considered false positives when considered with respect to their use for identifying phenotypically active loci, in fact, can enable accurate classification analysis; i.e., they are true positives provided the structure from which they were derived is reflective of human demography rather than sampling effects. The present methods are based on correlation between markers and biogeographical ancestry (BGA), where BGA is itself on some level of complexity correlated with a trait value, not linkage or linkage disequilibrium.
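The claim that a marker can track a trait through ancestry alone, with no linkage, can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the patent. The function name and all numbers are hypothetical. It assumes two ancestral groups, a marker allele and a binary trait that are statistically independent within each group, and a pooled sample in which a fraction `w` comes from group A. Under those assumptions the pooled marker-trait covariance works out to `w * (1 - w) * (p_a - p_b) * (t_a - t_b)`: it is nonzero whenever both the allele frequency and the trait frequency differ between groups.

```python
def pooled_covariance(w: float, p_a: float, p_b: float,
                      t_a: float, t_b: float) -> float:
    """Covariance between carrying the marker allele and having the trait
    in a pooled sample, assuming independence within each group.

    w        : fraction of the pooled sample drawn from group A
    p_a, p_b : marker allele frequency in groups A and B
    t_a, t_b : trait frequency in groups A and B
    """
    # Within each group marker and trait are independent, so the joint
    # probability is the product of the two frequencies.
    joint = w * p_a * t_a + (1 - w) * p_b * t_b
    marker = w * p_a + (1 - w) * p_b
    trait = w * t_a + (1 - w) * t_b
    return joint - marker * trait


if __name__ == "__main__":
    # The marker differs strongly between groups, and so does the trait.
    print(pooled_covariance(0.5, 0.9, 0.1, 0.6, 0.2))
    # Same trait frequencies, but the marker does not differ between groups.
    print(pooled_covariance(0.5, 0.5, 0.5, 0.6, 0.2))
```

With the first set of toy numbers the covariance is about 0.08; with the second it is zero up to floating-point rounding. The marker carries information about the trait only to the extent that it carries information about group membership, which is the distinction the passage draws between correlation through ancestry and linkage.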