Thursday, June 27, 2013

The color of the eyes: 7 HERC2 variants in the Eurasian gene pool

In my previous post I briefly described the presence of 7 different HERC2 haplotypes in the Kurdish gene pool. Today, I want to show HERC2 haplotype data of populations from Eurasia. The blogger Davidski provided me with the raw data for HERC2 from the Human Genome Diversity Project (HGDP) and similar databases. I focused on Eurasian populations that had information for the following SNPs: rs12913832, rs7183877, rs11635884, rs11636232, rs8043281, rs6497284, rs8028689, rs9302376, rs16950960, rs8039195, rs16950987, and rs1667394. I added the Kurdish data that I collected.

In most Eurasian individuals I was able to explain the HERC2 genotype by combining 2 of the 7 HERC2 haplotypes. However, in a few individuals the data set was incomplete, and in a few others additional haplotypes must have been present but could not be determined. Those data are labeled as "not determined".
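
To make the idea of "explaining a genotype by combining 2 haplotypes" concrete, here is a minimal sketch in Python. The haplotype definitions below are hypothetical toy values over 4 SNPs, not the real 12-SNP HERC2 haplotypes, and the function name is made up for illustration. A genotype counts as explained if some pair of known haplotypes adds up to it; otherwise it would be labeled "not determined".

```python
from itertools import combinations_with_replacement

# Hypothetical toy haplotypes over 4 SNPs (1 = derived allele, 0 = ancestral).
# These are NOT the real HERC2 haplotype definitions used in this post.
TOY_HAPLOTYPES = {
    "ht1": (1, 0, 0, 1),
    "ht2": (1, 1, 0, 1),
    "ht3": (0, 0, 0, 1),
    "ht5": (0, 0, 1, 0),
}

def explain_genotype(genotype, haplotypes=TOY_HAPLOTYPES):
    """Return all haplotype pairs whose allele counts add up to the genotype.

    genotype: per-SNP count of derived alleles (0, 1 or 2), or None if the
    SNP call is missing. An empty result means "not determined".
    """
    if any(g is None for g in genotype):
        return []  # incomplete data set: treated as "not determined"
    pairs = []
    for a, b in combinations_with_replacement(sorted(haplotypes), 2):
        summed = tuple(x + y for x, y in zip(haplotypes[a], haplotypes[b]))
        if summed == tuple(genotype):
            pairs.append((a, b))
    return pairs

print(explain_genotype((2, 1, 0, 2)))     # one ht1 copy plus one ht2 copy
print(explain_genotype((0, 2, 2, 0)))     # no pair fits
print(explain_genotype((1, None, 0, 2)))  # missing SNP call
```

With real data, more than one pair can fit the same genotype, so a result with several pairs would need additional information to resolve.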

Please take the percentages with a grain of salt because the number (N) of tested individuals per population is low. For populations with only N=1, the data should be ignored.
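
To get a feeling for how uncertain such small-sample percentages are, one option is a Wilson score interval on the haplotype count. The sketch below assumes the percentages are counted over 2N chromosomes (for example, 64% in N=7 Belorussians would correspond to about 9 of 14 chromosomes); that reading and the function name are my assumptions for illustration.

```python
import math

def wilson_interval(count, n_chromosomes, z=1.96):
    """Approximate 95% Wilson score interval for a haplotype frequency.

    count: number of chromosomes carrying the haplotype
    n_chromosomes: total chromosomes sampled (assumed to be 2 * N individuals)
    """
    p = count / n_chromosomes
    denom = 1 + z**2 / n_chromosomes
    center = (p + z**2 / (2 * n_chromosomes)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / n_chromosomes + z**2 / (4 * n_chromosomes**2))
    ) / denom
    return center - half_width, center + half_width

# Assumed example: haplotype#2 on 9 of 14 chromosomes (N=7 individuals).
low, high = wilson_interval(9, 14)
print(f"{low:.0%} to {high:.0%}")
```

For a sample this small the interval spans tens of percentage points, which is why differences between small populations in the table should not be over-interpreted.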

Reminder: The phenotype of HERC2 haplotype#1 and #2 is light eye color.

The annotated data can be seen here.


Population N #1 #2 #3 #4 #5 #6 #7 Not Determined
Abhkasian 18 17% 6% 22% 14% 17% 22% 3% 0%
Adygei 14 25% 4% 14% 4% 21% 32% 0% 0%
Altaians 12 0% 0% 13% 0% 13% 17% 0% 58%
Armenian 27 20% 2% 17% 9% 2% 9% 4% 37%
Ashkenazy_Jews 21 31% 14% 10% 2% 12% 5% 2% 24%
Balkar 16 38% 6% 13% 9% 22% 9% 3% 0%
Balochi 15 10% 3% 10% 17% 23% 23% 7% 7%
Bedouin 40 10% 0% 19% 14% 1% 19% 15% 23%
Belorussians 7 36% 64% 0% 0% 0% 0% 0% 0%
Bengali 1 50% 0% 50% 0% 0% 0% 0% 0%
Brahmins_TN 9 6% 0% 11% 6% 6% 17% 11% 44%
Brahui 20 3% 0% 13% 10% 20% 13% 3% 40%
Bulgarian 13 12% 4% 12% 0% 0% 4% 0% 69%
Burusho 25 10% 8% 16% 0% 12% 14% 4% 36%
Buryats 15 7% 0% 10% 0% 57% 23% 3% 0%
Cambodian 10 0% 0% 25% 10% 30% 25% 0% 10%
Chamar 9 6% 0% 33% 11% 0% 11% 17% 22%
Chechen 18 31% 11% 14% 3% 19% 22% 0% 0%
Chenchus 4 0% 0% 0% 0% 0% 0% 0% 100%
Chukchis 11 0% 0% 41% 0% 45% 9% 5% 0%
Chuvash 16 28% 31% 19% 0% 13% 6% 3% 0%
Cochin_Jews 3 0% 0% 17% 0% 17% 33% 0% 33%
Cypriots 12 33% 0% 21% 13% 17% 13% 4% 0%
Dai 10 0% 0% 35% 0% 10% 55% 0% 0%
Daur 9 0% 0% 17% 0% 56% 28% 0% 0%
Dharkars 8 0% 0% 25% 6% 31% 19% 19% 0%
Dolgans 6 0% 0% 25% 0% 50% 25% 0% 0%
Druze 20 18% 0% 3% 3% 8% 13% 3% 55%
Dusadh 6 0% 0% 25% 0% 25% 8% 8% 33%
Egyptians 11 23% 0% 18% 18% 14% 18% 9% 0%
Erzya 9 39% 56% 0% 0% 6% 0% 0% 0%
Evenkis 11 0% 0% 32% 0% 45% 23% 0% 0%
French 27 31% 30% 19% 4% 6% 11% 0% 0%
French_Basque 21 14% 24% 19% 7% 26% 10% 0% 0%
Georgians 14 46% 0% 7% 4% 18% 21% 4% 0%
Gond 2 0% 0% 25% 0% 0% 25% 0% 50%
Hakkipikki 4 0% 0% 13% 0% 13% 38% 13% 25%
Hazara 17 6% 0% 26% 3% 29% 26% 3% 6%
Hezhen 7 0% 0% 43% 0% 36% 21% 0% 0%
Hungarians 18 17% 50% 14% 0% 14% 6% 0% 0%
Iranian_Jews 4 13% 0% 13% 0% 0% 63% 13% 0%
Iranians 16 3% 6% 28% 3% 13% 28% 13% 6%
Iraqi_Jews 9 17% 0% 28% 17% 11% 28% 0% 0%
Jordanians 19 13% 8% 29% 5% 13% 21% 11% 0%
Kanjars 7 0% 0% 14% 0% 14% 64% 7% 0%
Kargopol_Russian 25 36% 46% 8% 2% 4% 4% 0% 0%
Karitiana 6 33% 0% 17% 0% 33% 0% 0% 17%
Kol 14 0% 0% 36% 4% 21% 25% 14% 0%
Koryaks 10 0% 0% 40% 0% 50% 0% 10% 0%
Kshatriya 7 0% 0% 29% 0% 29% 43% 0% 0%
Kumyk 13 35% 12% 15% 0% 19% 12% 0% 8%
Kurd 26 19% 4% 19% 15% 21% 17% 4% 0%
Kurmi 1 0% 0% 0% 0% 0% 0% 100% 0%
Kurumba 3 33% 0% 17% 0% 33% 0% 17% 0%
Lambadi 1 0% 0% 50% 0% 0% 50% 0% 0%
Lebanese 4 13% 0% 38% 38% 0% 13% 0% 0%
Lebanese_Christian 24 31% 6% 15% 10% 4% 19% 6% 8%
Lebanese_Druze 23 17% 0% 28% 11% 13% 13% 13% 4%
Lebanese_Muslim 25 14% 8% 14% 12% 12% 32% 8% 0%
Lezgins 16 19% 3% 22% 6% 22% 25% 3% 0%
Lithuanians 10 35% 60% 0% 0% 0% 5% 0% 0%
Makrani 20 10% 3% 18% 8% 33% 20% 5% 5%
Malay 87 1% 0% 22% 13% 28% 29% 5% 2%
Miaozu 10 0% 0% 25% 0% 25% 50% 0% 0%
Moksha 5 30% 50% 10% 0% 0% 10% 0% 0%
Mongola 10 0% 0% 35% 0% 25% 30% 0% 10%
Mongolians 9 0% 0% 28% 0% 50% 11% 0% 11%
Mumbai_Jews 4 13% 0% 25% 0% 0% 0% 38% 25%
Muslim_India 5 10% 0% 20% 40% 0% 20% 10% 0%
NAN_Melanesian 10 0% 0% 20% 0% 0% 15% 65% 0%
Nganassans 9 0% 0% 28% 0% 56% 11% 6% 0%
Nihali 1 0% 0% 0% 0% 50% 50% 0% 0%
Nogay 14 39% 0% 21% 4% 21% 11% 4% 0%
North_Italian 13 42% 27% 8% 4% 8% 8% 4% 0%
North_Kannadi 4 25% 0% 25% 38% 0% 13% 0% 0%
North_Ossetian 13 31% 15% 35% 8% 4% 8% 0% 0%
Orcadian 11 27% 41% 5% 5% 18% 5% 0% 0%
Oroqen 9 0% 0% 17% 0% 61% 22% 0% 0%
Palestinian 31 23% 0% 13% 6% 8% 26% 15% 10%
Pathan 21 19% 2% 14% 7% 24% 17% 17% 0%
Piramalai 8 0% 0% 31% 6% 25% 19% 19% 0%
Pulliyar 1 0% 0% 0% 0% 100% 0% 0% 0%
Romanians 14 29% 25% 21% 4% 14% 7% 0% 0%
Russians 5 60% 20% 0% 0% 10% 10% 0% 0%
Sardinian 24 19% 4% 40% 6% 10% 17% 0% 4%
Saudis 19 3% 0% 26% 0% 8% 13% 24% 26%
Selkups 9 39% 44% 11% 0% 6% 0% 0% 0%
Sephardic_Jews 18 25% 11% 25% 3% 11% 14% 11% 0%
She 9 0% 0% 33% 0% 11% 56% 0% 0%
Sindhi 14 4% 0% 29% 7% 14% 32% 7% 7%
Singapore_Indian 78 8% 2% 26% 4% 17% 26% 12% 6%
Spanish 11 32% 5% 32% 9% 9% 5% 0% 9%
Surui 3 17% 0% 33% 0% 50% 0% 0% 0%
Syrians 15 30% 0% 13% 3% 23% 20% 3% 7%
Tadjik 14 7% 11% 18% 7% 36% 18% 4% 0%
Tamil_Nadu 1 0% 0% 50% 0% 50% 0% 0% 0%
Tharus 2 25% 0% 25% 0% 0% 25% 25% 0%
Tibeto-Burman_Burmese 14 0% 0% 32% 18% 18% 21% 11% 0%
Tibeto-Burman_Garo 2 0% 0% 25% 0% 50% 25% 0% 0%
Tu 8 0% 0% 13% 0% 31% 44% 0% 13%
Tujia 10 0% 0% 5% 5% 45% 35% 10% 0%
Turks 19 21% 8% 29% 8% 13% 16% 5% 0%
Tuscan 8 25% 19% 6% 0% 19% 31% 0% 0%
Tuvinians 13 0% 0% 42% 0% 35% 23% 0% 0%
Ukrainian 20 43% 38% 8% 0% 8% 5% 0% 0%
Uttar_Pradesh 5 0% 0% 60% 0% 20% 20% 0% 0%
Uygur 10 10% 0% 20% 0% 35% 35% 0% 0%
Uzbeks 15 3% 13% 27% 23% 13% 20% 0% 0%
Velamas 7 7% 0% 21% 0% 43% 0% 29% 0%
Xibo 9 0% 0% 11% 0% 39% 39% 0% 11%
Yakut 20 0% 0% 40% 0% 43% 18% 0% 0%
Yemenese 7 7% 0% 14% 7% 7% 14% 21% 29%
Yemenite_Jews 15 7% 0% 27% 3% 0% 27% 10% 27%
Yizu 10 0% 0% 35% 0% 15% 40% 0% 10%
Yukaghirs 4 0% 13% 38% 0% 38% 13% 0% 0%



Unfortunately, this data set does not include many Germanic-speaking populations (Austrians, Germans, Swiss). In the previous HERC2 analysis these populations showed peak frequencies for haplotype#1.

Haplotype#1 is ancestral to haplotype#2. Peak frequencies of haplotype#2 can be found in Belorussians, Lithuanians, and some Uralic language speakers from Russia (Moksha, Selkups). Interestingly, these populations show no or very little haplotype#3, the ancestral haplotype of #1 and #2.

Haplotype#3 peaks in populations of East-Siberia (Hezhen, Tuvinians, Chukchis, Koryaks, Yakut, Yukaghirs) and West-Asia (Lebanese, North Ossetians, Turks, Jordanians, Lebanese Druze, Iranians, Iraqi Jews). Interestingly, in East-Siberia haplotype#3 correlates with the presence of haplotype#5 and #6. The highest frequencies of haplotype#3 in Europe can be found in Sardinia and Spain.

Edit: July 02, 2013:

I got some questions about why a population can have, let's say, 20% branch/haplotype #1 and #2 while fewer than 20% of its people have light eye colors. The reason is that haplotypes 1 and 2 are recessive.
Thus, in order to get light eye colors not one but 2 copies/alleles are needed, one inherited from the father and one from the mother.
How to calculate the frequency of light eyes in a population from my presented tables (based on the Hardy-Weinberg principle):
Frequency of light eyes in a population = (%ht1 + %ht2)^2

Example 1: Germans have 46% ht1 and 33% ht2.
(46% + 33%)^2
= (0.46 + 0.33)^2
= 0.79^2
≈ 0.62
= 62%
About 62% of the Germans are expected to have light eye color based on the HERC2 genotype.
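
The same calculation as a small Python sketch, for anyone who wants to apply it to other rows of the table. The function name is made up for illustration, and the result is only an expectation under the Hardy-Weinberg assumptions (random mating, ht1 and ht2 fully recessive, no other genes involved).

```python
def expected_light_eye_fraction(ht1, ht2):
    """Expected fraction of people carrying two light-eye haplotypes.

    ht1, ht2: haplotype frequencies as fractions (0.46 means 46%).
    Under Hardy-Weinberg, the chance that both inherited copies are
    ht1 or ht2 is the square of their combined frequency.
    """
    light = ht1 + ht2
    return light ** 2

# Germans from the example above: 46% ht1 and 33% ht2.
print(round(expected_light_eye_fraction(0.46, 0.33), 2))  # 0.62

# Kurds from the table: 19% ht1 and 4% ht2.
print(round(expected_light_eye_fraction(0.19, 0.04), 2))  # 0.05
```

This also shows why a population with roughly 20% ht1 plus ht2 is expected to have only about 4% light-eyed individuals (0.20 squared), not 20%.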


Related:
The color of the eyes: 7 HERC2 variants in the Kurdish gene pool 
The color of the eyes: 7 HERC2 variants in the Eurasian gene pool 
The color of the eyes: at least 17 HERC2 variants in Human gene pool

14 comments:

  1. Thanks for the work, Palisto.

    In any case, don't you think that ht3, ht5 and ht6 are actually widespread through Eurasia rather than focusing on this or that peak?

    It would have been interesting to see African frequencies.

  2. ht3 frequencies are relatively high all over Asia but low in Europe, especially in Eastern Europe. Therefore, it is very likely that ht1 and its descendant ht2 did not originate in Eastern Europe.
    It is interesting that the ancestral ht3 % is inversely proportional to the ht2 % and ht1 % in Europe; this phenomenon cannot be seen in Asia. There must have been a strong selection for it in Europe.

    "It would have been interesting to see African frequencies."
    Most African samples would end up as "not determined". They cannot be explained with ht#1-7.

    1. Eastern Europe is one of the few exceptions. In fact the only populations I detected lacking in ht3, ht5 and/or ht6 are Eastern Europeans and some Indian populations, but all the rest have them (except Yemeni Jews, who lack ht5). If the data were sorted by region, patterns would be easier to spot.

      "Most African samples would end up as "not determined"."

      Aha. That's very very interesting to know. I wonder if all the three primary branches you determined in the previous post descend from a single OoA haplotype or several and how do they relate with the African HERC2 structure.

    2. I determined 6 new African-specific HERC2 haplotypes from homozygous African individuals. I called them ht8-13. Ht8 is by far the most frequent and widespread (Bantu origin?), also found in Puerto Ricans and African Americans. Previously determined haplotypes ht6 and ht7 are also pretty frequent in Africa. Ht9 is probably the most ancestral; I need to make a new tree based on the latest data.

      See all 13 haplotypes in spreadsheet:

      https://docs.google.com/spreadsheet/ccc?key=0AgVZU9mN1R6ldEo2aG5FZGM0bmlqcXRXa1h3UDJuN3c#gid=3

    3. Great! I must congratulate you again for your work on this matter, Palisto, amounting to true academic research. Thanks.

      I can see that ht10 is the haplotype ancestral to ht4 and ht5 (and therefore also to ht3, ht1 and ht2), while ht9 is ancestral to ht10 and ht6. Ht7 seems derived from ht12 instead. Let me guess: these haplotypes are most common in East Africa, right?

      You tell me, but on a quick preliminary look, these are my reconstructed "trees":

      ht12 → → ht11 → ht13 → → ht 8
      ht12 → ht7
      ht9 → ht10 (incl. ht4, ht5, ht3, ht1 and ht2)
      ht9 → ht6


      Ht9 and ht12 differ by just one mutational step. Still, in order to determine the root it'd be necessary to compare with an outgroup such as Neanderthals or Chimpanzees, because it could be anything in Africa (ht12, ht11, ht13, ht9, etc. or even the "missing link" between two of those).

  3. Someone should now make a map of Eurasia and North Africa showing the frequencies of...

    1) ht1

    2) ht2

    3) ht3

    4) other

    1. How would adding ht3, which is found almost everywhere, be so informative? I feel totally unmotivated to do that: it's just everywhere (except in Eastern Europeans and some Indians).

      I am already seeing the distribution patterns from Palisto's list (although sorting it by regions would help) and the interesting stuff, in my understanding, is not in the common haplotypes (at most where they are missing) but in the rare ones like ht4, ht7 and undetermined... and also diversity (which seems greater by far in India) and how all that connects to the African cradle, so far unknown.

      Whatever the case, I'm sure that "someone" can be you David: it's not difficult beginning with a spreadsheet and you will develop new artistic skills. I use GIMP (open source, free, pretty good) but you can do with almost whatever and all you really need is motivation and some time. I lack both right now, especially the motivation.

    2. Well, the fact that ht3 is basically missing from Eastern Europe is very important. It means that Eastern Europe was not the source of the modern European gene pool to any significant degree. Indeed, since ht1 and ht2 are most common there, then the natural conclusion is that Eastern Europe was largely populated from Central Europe.

    3. It looks like that indeed, although I would not rely on a single marker alone. When looking at the frequencies of ht1 and ht2, I also got the impression that, since ht1 (ancestral to ht2) dominates in Central Europe (notably Switzerland but Austria, North Italy, etc. also) and ht2 in North and Eastern Europe (but also proportionally more common in Western Europe), a reasonable speculation is that ht2 arose in Central Europe and expanded from there in all directions, including parts of Asia.

      But it is a bit uncertain because we can't rely just on IE expansion patterns to explain the current distribution of ht2 in Asia, which is relatively frequent in populations which were never IE-speakers or otherwise IE-influenced like Jordanians or the Burusho. These are however small populations (Jordanians are now millions but historically they were surely a much smaller population) allowing for localized founder effects. I guess it can be argued (with some uncertainty) that it was IE expansion which spread ht2 in Asia (but surely not in Europe).

      Ht1 is probably of West Asian origin IMO, even if it experienced its main expansion (and diversification to ht2?) with the colonization of Europe in the early Upper Paleolithic.

    4. Of great interest for a reconstruction of the origins of ht1 is that the Karitiana carry it at frequencies of 33% (n=6). The Karitiana are pretty much pure Native Americans, so this fact seems to evidence that ht1 existed already in the Aurignacian or Gravettian periods, when the Altai UP population which acted as patrilineal seed of Native Americans formed originally.

  4. Haplotype#3 peaks in populations of East-Siberia (Hezhen, Tuvinians, Chukchis, Koryaks, Yakut, Yukaghirs), West-Asia (Lebanese, North Ossetians, Turks, Jordanians, Lebanese Druze, Iranians, Iraqi Jews). Interestingly, in East-Siberia haplotype#3 correlates with the presence of haplotype#5 and #6. Highest frequencies of haplotype#3 in Europe can be found in Sardinia and Spain.

    According to these results, Siberian populations are generally high in ht3, and all Siberian populations high in ht3 lack ht1 and ht2 except Yukaghirs, which explains the general lack of light eyes in indigenous Siberians. So Siberia cannot be the place of origin of ht1 and ht2. West Asian populations in general, on the other hand, are not only high in ht3 but also have appreciable proportions of ht1 and/or ht2, making West Asia the most likely place of origin of ht1 and ht2 and hence light eyes. In my opinion, ht1 originated somewhere in the Fertile Crescent while ht2 originated somewhere in the northern parts of West Asia such as Asia Minor, the Armenian Highland, the Caucasus and Atropatene (BTW, the Caucasus is genetically a part of West Asia).

    1. By the way, the Yukaghir ht2 can be easily explained by the fact that some of the Yukaghir samples have significant recent Russian admixture. The ht2-carrying Yukaghir sample is most probably one of those samples.
