Sunday, August 5, 2012

Whole Genome Analysis of Kurds (930K SNPs) Part 1

Today I want to present my first approach to analyzing the whole genomes of 16 Kurdish participants and their genetic relationships, based on 930K SNPs. Others have analyzed Kurds too, but they focused on the SNPs that could be compared with raw data from scientific studies. I used all available SNPs from chromosomes 1-22. Additionally, I measured the distance between two individuals by counting the SNPs that are completely different for both alleles, e.g. AA vs CC or AA vs TT, but not AC vs AA. About 6.1% (5.7%-7.3%) of the SNPs between Kurds are completely different.
I visualized these results by using a rooted ReticulateNetwork (EqualAngle180) from SplitsTree.
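
The distance measure described above can be sketched in a few lines of Python. This is only an illustration, not the script I actually ran: it assumes each genome is a dictionary mapping a SNP ID to a two-letter genotype string, that no-calls such as "--" are skipped, and that the count is divided by the number of SNPs called in both individuals. The function name, the sample names and the toy genotypes are made up for the example.

```python
# Minimal sketch (assumptions: genotypes are two-letter strings such as "AA"
# or "AC"; anything else, e.g. "--", is treated as a no-call and skipped).
VALID_BASES = set("ACGT")


def is_called(genotype):
    """Return True if the genotype is two valid bases."""
    return len(genotype) == 2 and set(genotype) <= VALID_BASES


def completely_different_fraction(genome_a, genome_b):
    """Fraction of shared, called SNPs at which the two individuals have
    no allele in common (AA vs CC counts, AC vs AA does not)."""
    compared = 0
    different = 0
    for snp_id, gt_a in genome_a.items():
        gt_b = genome_b.get(snp_id)
        if gt_b is None or not is_called(gt_a) or not is_called(gt_b):
            continue  # SNP missing or not called in one of the two samples
        compared += 1
        if set(gt_a).isdisjoint(gt_b):
            different += 1
    return different / compared if compared else 0.0


# Toy data, invented for illustration only.
sample_a = {"rs1": "AA", "rs2": "AC", "rs3": "GG", "rs4": "TT", "rs5": "--"}
sample_b = {"rs1": "CC", "rs2": "AA", "rs3": "GG", "rs4": "CC", "rs5": "AG"}

print(completely_different_fraction(sample_a, sample_b))
```

In the toy data, rs5 is skipped as a no-call, and two of the four remaining SNPs (rs1 and rs4) share no allele, so the fraction is 0.5. Running this for every pair of individuals gives a distance matrix of the kind that a tool like SplitsTree takes as input.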


It is clearly visible that KD011/KD012 and KD010/KD014 are closely related. Interestingly, people from the Dersim region (Alevi Kurmanji and Zaza) are the closest to the root of the network of the Kurds.
As you can see, the other differences aren't that great, which is why I zoomed into the center (see below).
Most of the cross-connections show how these individuals are distantly related to each other.

Dodecad K12b visualization:

I used the same visualization method (ReticulateNetwork (EqualAngle180) from SplitsTree) for the adjusted distances of Dodecad K12b ADMIXTURE results. The goal is to compare the approach presented above with ADMIXTURE Euclidean distances.
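
For comparison, a plain Euclidean distance between two ADMIXTURE results can be sketched as follows. This is an illustration under assumptions: each individual is a list of component percentages in the same component order, and the "adjustment" mentioned above is not reproduced here. The function names and the three-component toy values are invented (K12b itself has 12 components).

```python
import math
from itertools import combinations


def euclidean_distance(p, q):
    """Euclidean distance between two vectors of admixture proportions."""
    if len(p) != len(q):
        raise ValueError("both individuals need the same number of components")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def pairwise_distances(results):
    """Map each unordered pair of sample names to its Euclidean distance."""
    return {
        (name_a, name_b): euclidean_distance(results[name_a], results[name_b])
        for name_a, name_b in combinations(sorted(results), 2)
    }


# Toy percentages with three components, invented for illustration only.
toy_results = {
    "sample_a": [50.0, 30.0, 20.0],
    "sample_b": [47.0, 34.0, 20.0],
}

print(pairwise_distances(toy_results))
```

Here the two toy samples differ by 3, 4 and 0 percentage points, so their distance is 5.0. Note that the input is only the component percentages, never the SNPs themselves.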


ADMIXTURE "annotates" and groups the SNPs into a defined number of components and then compares the components, not the SNPs themselves. Thus, ADMIXTURE cannot "see" shared DNA segments. As a result, it does not pick up that KD011 and KD012 are closely related, nor that KD010 and KD014 are closely related.
However, ADMIXTURE can "see" overall similarities in the genome and can group based on that. Again, Alevi Kurmanji and Zazas are the closest to the root of the network of the Kurds.

2 comments:

  1. What does
    "closest to the root of the network of the Kurds" mean?

    What exactly is the network of Kurds?
    The typical Aryan root that Kurds and Kurdish came from?

    Replies
    1. The genetic data are presented in a ReticulateNetwork. This is a mathematical method, like e.g. a convex hull:
      http://en.wikipedia.org/wiki/Convex_hull
      It has nothing to do with social networks etc.

      "Closest to the root of the network of the Kurds" means something like the "least common denominator" of all the tested individuals; it does not necessarily mean the ancestral root. I will visualize and explain it more in my next post.
