Tuesday, July 31, 2012

DNA tribes presents Kurdish data

DNATribes is a DNA testing company that I personally cannot recommend, because better testing options are available for free (Dodecad, Harappa, Eurogenes, gedmatch, Interpretome, etc.).
However, I was notified that DNATribes recently added Kurds to their reference samples, so it is worth taking a look at the data presented in this pdf. The image below summarizes the most important information about Kurds:

DNATribes calls the most prominent component in Kurds the "Persian" component (or "world region"), regardless of the fact that it peaks in Kurds (39.5%) and not in Iran (22.0%). Turkmen also have a high portion of this component (32.6%). The description of this component states: "Lake Urmia, Zagros and Elburz Mountains." Lake Urmia is not Persian, and the Zagros Mountains are not Persian, either. The name "Persian" is totally misleading for this component; it should be called "Iranian" to be fair. Some people still seem to have trouble distinguishing between "Iranian" and "Persian". This is as ignorant as putting Germans and Austrians into one group... oh wait, DNATribes actually did that as well.


Here are populations with more than 6% of the "Persian" component based on DNATribes:




Population                   "Persian" component
Kurdish                      39.50%
Turkmen                      32.60%
Persian Qatar                25.60%
Iran                         22.00%
Assyrian                     16.80%
Turkey                       16.10%
Jewish Iraq                  14.30%
Makrani Pakistan             12.90%
Lezgin Caucasus              12.70%
Jordan                       11.80%
Armenian                     11.80%
Balochi Pakistan             10.90%
Tajik                        10.90%
Lebanon                      10.00%
Syria                        9.70%
Palestinian Israel           9.60%
Kumyk                        9.40%
Druze Israel Carmel Israel   8.80%
Yemen                        7.70%
Georgia Caucasus             6.90%
Brahui Pakistan              6.80%
Adyghe                       6.70%
Bedouin Negev Desert         6.30%
Jewish Morocco               6.10%

The closest to Kurds at DNATribes (based on Euclidean distance) are:



1 Kurdish
2 Iran
3 Persian Qatar
4 Turkmen
5 Turkey
6 Jordan
7 Syria
8 Armenian
9 Assyrian
10 Lebanon
11 Palestinian Israel
12 Druze Israel Carmel Israel
13 Tajik
14 Jewish Iraq
15 Nogay
16 Sephardic Jewish Bulgaria Jewish
17 Jewish Morocco
18 Yemen
19 Ashkenazi Jewish Europe Jewish
20 Kumyk

Tuesday, July 24, 2012

Eurogenes presents Kurdish and Assyrian data

Davidski from the Eurogenes Genetic Ancestry Project was kind enough to focus on the Kurdish and Assyrian samples in his latest update of the project and compared them with available autosomal data of 23 populations. For this he used a program called SupportMix, which helps determine the ancestral origin of genomic segments.
 
For those who are interested in who is who, here is the key for most Kurds:

Kurd4=KD007 Kurmanji (Zakho)
Kurd5=KD023 Kurmanji (Dohuk)
Kurd6=KD013 Sorani (Koysinjaq)
Kurd7=KD012 Sorani (Sulaymaniyah and Darband)
Kurd8=KD009 Sorani (Sulaymaniyah)
Kurd9=KD008 Yezidi (Iraq)
Kurd10=KD024 Zaza (Dersim)
Kurd11=KD014 (Sulaymaniyah)

For more information about the individuals, see results of
HarappaWorld
Dodecad K12b
Eurogenes K12b
McDonald


Update:
The names given to the segments are somewhat surprising; anyway, here is the summary of the first 5 chromosomes. Note: the calculated percentages take into account the different sizes of the chromosomes.
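The size weighting mentioned in the note can be sketched roughly as follows. This is only an illustration of a length-weighted average: the chromosome lengths are rounded GRCh37 values in megabases (my assumption, not taken from the SupportMix run), and the per-chromosome percentages are made-up placeholders.

```python
# Rounded GRCh37 chromosome lengths in megabases (assumption for illustration).
CHROM_LENGTH_MB = {1: 249, 2: 243, 3: 198, 4: 191, 5: 181}


def weighted_percentage(per_chrom_pct, lengths=CHROM_LENGTH_MB):
    """Length-weighted mean of per-chromosome percentages."""
    total_length = sum(lengths[c] for c in per_chrom_pct)
    weighted_sum = sum(per_chrom_pct[c] * lengths[c] for c in per_chrom_pct)
    return weighted_sum / total_length


# Hypothetical share of segments assigned to one reference population,
# given per chromosome (placeholder numbers, not real results).
example = {1: 10.0, 2: 12.0, 3: 8.0, 4: 9.0, 5: 11.0}
print(round(weighted_percentage(example), 2))
```

A plain average of the five values would treat chromosome 5 the same as chromosome 1, although chromosome 1 covers noticeably more of the genome.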

I also sorted it by region:

Please note that Kurds show higher percentages for Iranians, Europeans and South Asians (mostly Indo-European). The Northeast Caucasian connection is stronger for Kurds, while the Northwest Caucasian connection is much stronger for Assyrians. Even though the signal for Northwest Caucasian is not strong, this is something I have seen in the Y-Chromosome, too.
Y chromosome of Kurds in Iran
Kurdish Y chromosomes of a Central Anatolian village

Update2:

Now all chromosomes have been analyzed. Here are the results, sorted by region:

Overall, Kurds and Assyrians are pretty similar, and the previously observed differences (see above) shrank. For example, the higher Kurdish percentages for Europeans shrank: the difference compared to Assyrians is only obvious for Ukrainians, while the values for "Tuscan" are nearly the same. The clearest difference can be seen for "Saudi_Arabian" and for South Asia. The Northeast-Caucasian signal is pretty high for both.

Thus, I decided to show the same analysis but sorted by the differences between Kurds and Assyrians.

Assyrians share more with "Saudi_Arabian", "Palestinian", "Abkhasian", "Druze", "West Asian Jewish", and "Moroccan", so Assyrians are closer to Semitic language speakers + NW-Caucasian speakers.

In contrast, Kurds share more with "Brahui_Balochi", "Chamar_Kannadi_Kol", "Iranian", "Pathan", "Chechen_Lezgin", "Ukrainian", "Brahmin-Kshatriya", "Buryat_Tuvan", "Chuvash", and "Tuscan", so besides the Altaic and Dravidian language speakers, Kurds are closer to Indo-European language speakers and NE-Caucasian speakers.

Interestingly, Assyrians are not closer to Armenians, Syrians and Cypriots as previously stated elsewhere, suggesting a complex ancestry of these populations without a clear Semitic domination in the past.

Summary:
Kurdish ancestors have seen more of the world.

Friday, July 20, 2012

Y chromosome of Kurds in Iran Part2

Today, I want to take a deeper look into the new data of Grugni et al., 2012.

E1b1b1a1a-M34:
The frequency of M34 in Iranian Kurds is striking (13.6%, 8 out of 59). None of the other ethnic groups in Iran has such a high frequency. We have one Kurdish participant from Turkey with this haplogroup in our project.

J2a3*-Page55:
This haplogroup and its subbranches (J2a3a-M47, J2a3b*-M67, J2a3b1-M92, J2a3h-M530) seem to play a major role in Kurdish ancestry. Unfortunately, 23andme does not test for some of these haplogroups, which is probably why 3 of the Kurdish DNA project participants are labeled as "J2" only.

J1-M267:
Grugni et al., 2012 observed this haplogroup in 1.7% (1 out of 59) of Iranian Kurds. The paper has some STR data that can be used for comparison, including 4 Iraqi Kurds.

  • Iranian Kurd: There is no perfect STR match for the Iranian Kurd (DYS19, DYS388, DYS389I, DYS389B, DYS390, DYS391, DYS392, DYS439:    13,15,12,17,25,12,11,12) of this study. The closest match in this study is an Iraqi Kurd (13,15,12,17,25,11,11,12) with one step distance. I went through my database and did not find a close match. 
  •  Iraqi Kurd#1 (13,15,12,17,25,11,11,12): this Kurd has a 1 step distance to the Iranian Kurd and no further close matches.
  • Iraqi Kurd#2  (14,13,12,16,24,10,11,12): this Kurd has two perfect matches in Eastern Turkey, most likely Kurds as well. No further close matches could be found. 
  • Iraqi Kurd#3 (14,13,13,16,23,10,11,13): One perfect match was found in Mazandaran. No further close matches could be found.
  • Iraqi Kurd#4 (14,13,14,16,23,10,11,11): No further close matches could be found.
  • We have one Kurdish Feyli participant with the J1-M267 haplogroup, however, we don't have his STR data to determine his haplotype.  
 Second look at the data (assuming correct calculations and translations): It seems like the DYS389I value does not have to be increased by 3. At least the results make much more sense when DYS389I is not increased:
  • Iranian Kurd: There is no perfect STR match for the Iranian Kurd (DYS19, DYS388, DYS389I, DYS389B, DYS390, DYS391, DYS392, DYS439:    13,15,12,17,25,12,11,12) of this study. The closest match in this study is an Iraqi Kurd (13,15,12,17,25,11,11,12) with one step distance. I went through my database and found a perfect match. It is another Kurd (Kurdish individual K9 from Tofanelli et al., 2009). 
  •  Iraqi Kurd#1 (13,15,12,17,25,11,11,12): this Kurd has a 1 step distance to the Iranian Kurd and the other Kurd described above (Kurdish individual K9 from Tofanelli et al., 2009).
  • Iraqi Kurd#2  (14,13,12,16,24,10,11,12): this Kurd has two perfect matches in Eastern Turkey, most likely Kurds as well. No further close matches could be found. 
  • Iraqi Kurd#3 (14,13,13,16,23,10,11,13): this haplotype seems to be closer to the center of the cluster. One perfect match was found in Mazandaran. I was also able to find several other perfect matches: 3 Tabasarans (D131, D148, D150: Tofanelli et al., 2009), one Central Italian (PICE15: Tofanelli et al., 2009), and one South Lebanese Shiite Muslim (Zalloua et al., 2008). Iraqi Kurd#3 has one-step distances to one Beirut Sunni Muslim (Zalloua et al., 2008), 7 Tabasarans (D133, D134, D135, D138, D142, D144, D152: Tofanelli et al., 2009), 4 Avars (D112, D098, D108, D109: Tofanelli et al., 2009), one individual from Region 7 in Turkey (Central Anatolia; Cinnioglu et al., 2004), one from Region 3 in Turkey (Northeast Anatolia), one Kubachi (D050: Tofanelli et al., 2009), and one Lak (D078: Tofanelli et al., 2009).
  • Iraqi Kurd#4 (14,13,14,16,23,10,11,11): this Kurd has several perfect matches, one Lak (D062: Tofanelli et al., 2009), one Kubachi (D044: Tofanelli et al., 2009), one from Northern Portugal (Port22: Tofanelli et al., 2009), and one South Lebanese Shiite Muslim (Zalloua et al., 2008). Iraqi Kurd#4 has one step distances to 6 Laks (D063, D065, D061, D070, D081, D091: Tofanelli et al., 2009), 5 Avars (D099, D100, D102, D103, D104: Tofanelli et al., 2009), 2 Tats (D028, D039: Tofanelli et al., 2009), 2 Tajik (Haber et al., 2012), one North Lebanese Sunni Muslim (Zalloua et al., 2008), one from Southern Portugal (Port15: Tofanelli et al., 2009), one Kubachi (D045: Tofanelli et al., 2009), one Tabasaran (D126: Tofanelli et al., 2009), two individuals from Region 3 in Turkey (Northeast Anatolia), and one individual from Region 4 in Turkey (Eastern Anatolia; Cinnioglu et al., 2004).
  • We have one Kurdish Feyli participant with the J1-M267 haplogroup, however, we don't have his STR data to determine his haplotype.  
 It is striking that the same J1-M267 haplotypes are found in Kurds, Tabasarans, Avars, and Laks. This is very similar to the previous findings with 'Dogukoy' Anatolian Kurds and their Y-chromosome J2a3b*-M67 haplotypes that are very similar and/or identical to the ones found in Northeast Caucasian language speakers.
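The step distances quoted above can be reproduced with a few lines of code. This is a minimal sketch that assumes "step distance" means the sum of absolute repeat-count differences over the eight markers; the haplotypes are copied from the list above, and the function name is my own.

```python
def step_distance(a, b):
    """Sum of absolute repeat-count differences across all markers."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


# Marker order: DYS19, DYS388, DYS389I, DYS389B, DYS390, DYS391, DYS392, DYS439
iranian_kurd = (13, 15, 12, 17, 25, 12, 11, 12)
iraqi_kurds = {
    "Iraqi Kurd#1": (13, 15, 12, 17, 25, 11, 11, 12),
    "Iraqi Kurd#2": (14, 13, 12, 16, 24, 10, 11, 12),
    "Iraqi Kurd#3": (14, 13, 13, 16, 23, 10, 11, 13),
    "Iraqi Kurd#4": (14, 13, 14, 16, 23, 10, 11, 11),
}

for name, haplotype in iraqi_kurds.items():
    print(name, step_distance(iranian_kurd, haplotype))
```

With this definition, Iraqi Kurd#1 is one step away from the Iranian Kurd, as stated above, while the other three Iraqi Kurds are clearly further away.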


J2a3b*-M67:
  • Kurd#1 (DYS19, DYS389I, DYS389B, DYS390, DYS391, DYS392, DYS393: 14,13,19,22,10,11,12): this Kurd has two perfect matches from Albania and one from Saudi-Arabia (C_061: Abu-Amero et al., 2009). Kurd#1 has one-step distances to one Tajik (Haber et al., 2012), 2 individuals from Central Italy, one from Central Portugal, one from South Portugal (all from Tofanelli et al., 2009), 2 Saudi Arabians (Abu-Amero et al., 2009), one individual from Dubai (Alshamali et al., 2009), and one Lenkoran [North Talysh] (Roewer et al., 2009).
  • Kurd#2 (15,13,20,23,10,11,12): this Kurd has three perfect matches, one from Caucasus, one from Crete, one from Cyprus. Kurd#2 has one step distance to one Iraqi Kurd (#56 from Sternersen et al.)
  •  Kurd#3 (15,14,22,23,10,11,12): this Kurd does not have perfect matches. Kurd#3 has a one step distance to an Iraqi Kurd (#56 from Sternersen et al.)
J2a-M92:
  • Kurd (DYS19, DYS389I, DYS389B, DYS390, DYS391, DYS392, DYS393   14, 13, 20, 22, 10, 11, 12): this Kurd has several perfect matches, two individuals from Crete, two from Greece, one from Iran/Isfahan, and two from Turkey. 
Q1a2-M25: 
The presence of Q1a2-M25 in Turkmen reminds me of the Gokcumen paper of Central Anatolian villages. Gokcumen et al observed 13.3% haplogroup Q in the village 'Gocmenkoy' (Residents of 'Gocmenkoy' identify themselves with the Afsar clan of the Oguz tribe), whereas haplogroup Q was rare or not observed in the other Central Anatolian villages. 

Thursday, July 19, 2012

Y chromosome analysis of Iran Part 1

A new publication came out very recently focusing on the Y-chromosome haplogroups of ethnic groups in Iran. I believe that this paper fits into the conversation that we had here (about the connection between Kurds and other Iranians). The problem before this paper was that we could hardly get any information about the ethnic backgrounds of the tested people from Iran, but this paper will help us.

A total of 59 Kurds were tested; the observed haplogroups are listed below. I highlighted the haplogroups that were confirmed in our KurdishDNA project. (Please note that 23andme does not test for some of the J2 subbranches, i.e. J2a*-M410 and J2a3*-Page55.)

1 x E1b1b1a1b-M78 (=1.7%)
8 x E1b1b1a1a-M34 (=13.6%)
3 x E1b1b1c-V13 (=5.1%)
2 x G1-M285 (=3.4%)
2 x G2* (=3.4%)
3 x G2a* (=5.1%)
1 x I2-M438 (=1.7%)
1 x J1-M267 (=1.7%)
2 x J1c3-P58 (=3.4%)
1 x J2a*-M410 (=1.7%)
3 x J2a3*-Page55 (=5.1%)
1 x J2a3a-M47 (=1.7%)
4 x J2a3b*-M67 (=6.8%)
1 x J2a3b1-M92 (=1.7%)
4 x J2a3h-M530 (=6.8%)
1 x L1-M76 (=1.7%)
2 x R2-M124 (=3.4%)
1 x R1*-M173 (=1.7%)
12 x R1a-M17 (=20.3%)
1 x R1b-M343 (=1.7%)
5 x T-M70 (=8.5%)
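The percentages in this list are simply the counts divided by the 59 tested Kurds. A small sketch to recompute them (the counts are copied from the list above; the combined J2a figure is my own addition, obtained by summing all labels that start with "J2a"):

```python
counts = {
    "E1b1b1a1b-M78": 1, "E1b1b1a1a-M34": 8, "E1b1b1c-V13": 3,
    "G1-M285": 2, "G2*": 2, "G2a*": 3, "I2-M438": 1,
    "J1-M267": 1, "J1c3-P58": 2, "J2a*-M410": 1, "J2a3*-Page55": 3,
    "J2a3a-M47": 1, "J2a3b*-M67": 4, "J2a3b1-M92": 1, "J2a3h-M530": 4,
    "L1-M76": 1, "R2-M124": 2, "R1*-M173": 1, "R1a-M17": 12,
    "R1b-M343": 1, "T-M70": 5,
}

total = sum(counts.values())
frequencies = {hg: round(100 * n / total, 1) for hg, n in counts.items()}

# Combined share of all J2a lineages (labels starting with "J2a").
j2a_count = sum(n for hg, n in counts.items() if hg.startswith("J2a"))
j2a_share = round(100 * j2a_count / total, 1)

print(total, frequencies["E1b1b1a1a-M34"], j2a_count, j2a_share)
```

Summing the J2a lineages this way gives 14 of 59 individuals, which puts the whole J2a group slightly ahead of R1a-M17 (12 of 59).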


One aspect that the reader Kurti mentioned is the African component in some Iranians. Now, we can see this African component in the Y-Chromosome repertoire of Afro-Iranians, 25% (3 out of 12) of them have the haplogroup E1b1a1, a haplogroup that was also observed in neighboring Gheshmi and in Balochistan at lower frequencies (1/49=2%; 1/24=4%).
The authors write:
"A clear African component is observed in Hormozgan where noteworthy is the presence of the sub-Saharan haplogroup E-M2 in the Afro-Iranian ethnic group."
Probably the most interesting finding of the paper is the discovery of haplogroup IJ in Iran (I am not convinced yet); it would be the missing link between the Middle Eastern haplogroup J and the European haplogroup I. Again, no STR data are provided for this haplogroup, so its actual diversity cannot be estimated.

Unfortunately, the authors failed to provide STR data for all observed haplogroups; they only give STR data for haplogroup J. The paper clearly focuses on haplogroups J1 and J2 and does not bother much about the other haplogroups. Strangely, the number of provided STRs varies from haplogroup to haplogroup; it is not clear to me why.
Additionally, quite a few SNPs were not tested, so the annotation of some of the observed "root haplogroups" might be off. This makes a detailed analysis and comparison impossible, but I will try to comment on some of the results, especially J1 and J2.



PLoS ONE 7(7): e41252. doi:10.1371/journal.pone.0041252

Ancient Migratory Events in the Middle East: New Clues from the Y-Chromosome Variation of Modern Iranians

Viola Grugni et al.


Knowledge of high resolution Y-chromosome haplogroup diversification within Iran provides important geographic context regarding the spread and compartmentalization of male lineages in the Middle East and southwestern Asia. At present, the Iranian population is characterized by an extraordinary mix of different ethnic groups speaking a variety of Indo-Iranian, Semitic and Turkic languages. Despite these features, only few studies have investigated the multiethnic components of the Iranian gene pool. In this survey 938 Iranian male DNAs belonging to 15 ethnic groups from 14 Iranian provinces were analyzed for 84 Y-chromosome biallelic markers and 10 STRs. The results show an autochthonous but non-homogeneous ancient background mainly composed by J2a sub-clades with different external contributions. The phylogeography of the main haplogroups allowed identifying post-glacial and Neolithic expansions toward western Eurasia but also recent movements towards the Iranian region from western Eurasia (R1b-L23), Central Asia (Q-M25), Asia Minor (J2a-M92) and southern Mesopotamia (J1-Page08). In spite of the presence of important geographic barriers (Zagros and Alborz mountain ranges, and the Dasht-e Kavir and Dash-e Lut deserts) which may have limited gene flow, AMOVA analysis revealed that language, in addition to geography, has played an important role in shaping the nowadays Iranian gene pool. Overall, this study provides a portrait of the Y-chromosomal variation in Iran, useful for depicting a more comprehensive history of the peoples of this area as well as for reconstructing ancient migration routes. In addition, our results evidence the important role of the Iranian plateau as source and recipient of gene flow between culturally and genetically distinct populations.

Wednesday, July 18, 2012

mtDNA haplogroups of Kurds: Behar et al., 2008

I found another source of mtDNA data for Kurds. Behar et al., 2008 has 12 Iraqi Kurds:

1 x F1b,
5 x H,
2 x J1c,
1 x K1,
1 x M30,
1 x T1a,
1 x T2

Tuesday, July 17, 2012

Some clarification for Xing et al data...

Kurti, one of the readers of this blog, mentioned that Kurds and Iranians are genetically different. As example/evidence, he presented this:

"Here is a something from Dienekes using Iraqi Kurdish samples from Xings et al.
http://dodecad.blogspot.de/2010/12/structure-in-west-asian-indo-european.html
"

I have heard this argument a lot. Let me briefly explain why this post of Dienekes is a bad example/evidence.

There is a very good reason why Dienekes never used the Xing et al dataset for comparison again. Kurti was referring to Dienekes' plot, which is very noisy because of the low number of SNPs compared.
This messed up the position of Kurds and of all other populations in the plot. For example, just take a look at the Turks merging with Armenians in the plot. We now know this is not correct. There are major differences, but the plot with the low K value could not resolve them with the relatively low number of SNPs. Dienekes never repeated any analysis with Xing et al because of that. However, Razib Khan, another science blogger, started to use this noisy plot of Dienekes for his simplistic argument that Turks are just assimilated Armenians.

The Xing et al data are based on 250,000 SNPs, but most of these SNPs are not part of other genetic studies, so a comparison is nearly impossible. Zack from the Harappa Ancestry project is trying to use Xing et al., but he has to reduce the number of SNPs down to 29,000!

Zack wrote:
"This dataset is valuable because it contains several South Asian, Central Asian, Southeast Asian and Caucasian groups. However, it does not have a good SNP overlap with 23andme and the other datasets. It has only about 29,000 SNPs in common with 23andme v2 data. Combining HapMap, HGDP, SGVP, Behar et al and Xing et al with 23andme data leaves us with 25,000 SNPs. Due to that, I'll be using Xing et al data for only a few analyses."
Zack also wrote:
"Do note that the Xing results were computed with a smaller number of SNPs and thus might be noisy."
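The shrinking SNP count that Zack describes is just a set intersection: only markers present in every dataset survive a merge. A toy sketch (the rsIDs and dataset contents below are invented placeholders, and the function name is my own; real merges are usually done with dedicated tools):

```python
def shared_snps(*datasets):
    """Return the SNP IDs that occur in every one of the given datasets."""
    sets = [set(d) for d in datasets]
    return set.intersection(*sets) if sets else set()


# Invented marker lists, only to show how the overlap shrinks.
dataset_a = ["rs1", "rs2", "rs3", "rs4", "rs5", "rs6"]
dataset_b = ["rs2", "rs3", "rs5", "rs7"]
dataset_c = ["rs3", "rs5", "rs8"]

print(len(shared_snps(dataset_a, dataset_b)))             # overlap of two sets
print(len(shared_snps(dataset_a, dataset_b, dataset_c)))  # overlap of all three
```

Every additional dataset can only keep or reduce the overlap, which is why combining many sources leaves so few usable SNPs.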

Monday, July 16, 2012

mtDNA Haplogroup H15

Today, I want to show some data about mtDNA haplogroup H15, because H15 is relatively rare and there are already two individuals with H15 in this Kurdish DNA project: one is H15a (Kurdistan-Iraq), the other is H15b (Kurdistan-Iran).

The current scientific database (Genbank) has 12 fully sequenced H15's, presented here. One of them is the Kurdish H15a individual.

H15a 11410 
1. AY495146(European) Coble
H15a1 (57G) 14953
2. AY713995(India) Palanichamy
3. EF657726 mtDNA545(Europe) Herrnstadt
4. FJ348171 Irene
5. HM852832(Iranian 36) Schoenberg  
6. JN807313(Iraq Kurdish) FTDNA      
H15b 3847
7. AY738960(Italy) Achill1         
8. EU600353(Druze) Shlush      
9. FJ384437 Fendt         
10. FJ384438 Fendt        
11. JF901940(Armenian) FTDNA
12. JN651417(Armenian) FTDNA   


I made a lineage tree of the fully sequenced H15 samples:


 H15a:
The closest individual to the Kurdish mtDNA H15a1 is from Iran.
The FJ348171 sample from South Tirol is considered to be H15a1 but it has a lot of additional mutations, so it is isolated from the other H15a1 individuals in this lineage tree.

 H15b:
Since the Kurdish H15b individual is not fully sequenced he cannot be inserted into the lineage tree (somewhere into the green area = H15b), however, based on the 23andme data some branches can be excluded for the Kurdish H15b:
The Kurdish H15b does not have the G10993A mutation of the Italian individual AY738960.
The Kurdish H15b does not have the T15115C mutation of the Armenian individual JF901940.

Other data sources:

The FTDNA Haplogroup H&HV mtGenome Project: H15 has 6 individuals with H15:

2 x H15 {Cluster B}
Kit numbers 167699 and 8202: one individual from Wuschewier, Markisch-Oderland, Brandenburg/Germany, and one individual from Shiraz, Iran

3 x H15b [Phylotree V14]
Kit numbers 200 (from Poland), 176696 (Armenian), and 189873 (Armenian)
These two Armenians at FTDNA are the same as the Armenians at Genbank (see lineage tree above)

1 x H15b1 [Phylotree V14]
Kit number 70730 (this individual is from the USA)

At 23andme there are at least 7 individuals with mtDNA H15:

1 x Sorani Kurd (H15a1 from Iraq)
1 x American (USA; H15a)
1 x unknown (H15a)
1 x Sorani Kurd (H15b from Iran)
1 x Lebanese Maronite (H15b)
 2 x Americans (USA; H15b)


Friday, July 13, 2012

Ashkenazi-Levites and Kurds

I was recently contacted by an individual who is very interested in the history and genetics of Ashkenazi-Levites, Iraq, and ancient Khazaria due to his ancestry. He read my post about Ashkenazi-Levites, but he thinks that the ancestry of R1a1a Levites cannot be pinpointed because this haplogroup is "everywhere, in Nepal, in Ireland".



This is why I decided to give some more details about my thought process.



Regarding the R1a1a Ashkenazi-Levites I recently did a new analysis here using more Y-STRs, a total of 67 STRs. Unfortunately, all Iranian samples and the Kurdish sample "H1483" don't have 67 STRs analyzed, so they are not included in this Y-STR67 analysis. I am pretty sure that the Iraqi Kurdish individual H1483 would end up as the closest to the R1a1a Ashkenazi-Levite cluster in a hypothetical STR67 analysis and I will explain it in detail.



Another analysis was done by Vladimir Semargl, comparing all R1a individuals that have 111 STRs analyzed (and a few with 67 STRs). Again, no Kurds or Iranians are included because of the low number of tested STRs.
 The blue area on the top of the lineage tree is the R1a1a Levite cluster.

So based on this Y-STR111 lineage tree, the closest one to the R1a1a Levite cluster is "Quassad_116213".
Now, compare the R1a1a Levite modal haplotype (MODE) with Quassad's results and the Iraqi Kurdish H1483 (just focus on the 43 STRs that were measured by all three):

2. C1. Z93+ L342+ L657- Ashkenazi-Levite (Type "A" / "AJ")
MIN 13 24 15 10 11-14 12 12 10 13 11 29 14 9-10 11 11 24 14 20 29 12-12-12-15 11 10 19-23 14 15 16 19 33-37 11 11 11 8 17-17 8 11 10 8 11 10 12 22-22 15 10 12 12 12 8 13 23 21 12 12 10 13 10 11 12 13 32 13 9 17 12 26 27 19 12 12 12 12 10 9 12 11 10 11 11 30 12 12 25 13 9 10 20 15 19 11 23 15 12 15 24 12 23 19 10 15 17 9 11 11
MAX 13 25 17 10 11-15 12 12 12 14 11 32 15 9-11 11 11 24 14 21 32 12-12-15-16 12 12 19-23 15 17 20 22 36-40 15 11 11 9 17-17 8 12 10 8 12 10 12 22-22 15 10 12 12 15 8 16 24 22 13 12 12 13 10 11 12 13 32 15 9 17 12 28 27 19 12 12 12 12 10 9 12 11 10 11 11 30 12 12 25 13 9 10 20 15 20 11 23 15 12 15 25 13 23 19 10 15 17 9 11 11
MODE 13 25 16 10 11-14 12 12 10 13 11 30 14 9-10 11 11 24 14 20 30 12-12-15-15 11 11 19-23 14 16 19 20 35-38 14 11 11 8 17-17 8 12 10 8 11 10 12 22-22 15 10 12 12 14 8 14 23 21 12 12 11 13 10 11 12 13 32 15 9 17 12 27 27 19 12 12 12 12 10 9 12 11 10 11 11 30 12 12 25 13 9 10 20 15 20 11 23 15 12 15 25 12 23 19 10 15 17 9 11 11
116213 KhalilQussad 13 24 16 11 11-14 12 12 10 13 11 30 16 9-10 11 11 24 14 20 30 12-12-15-15 11 11 19-23 17 16 19 16 34-37 14 11 11 8 17-17 8 11 10 8 11 10 12 22-22 15 10 12 12 13 8 15 24 21 13 12 11 13 11 11 12 13 30 15 9 15 12 27 27 19 13 13 13 12 10 9 12 10 10 11 11 30 11 14 25 13 9 10 19 15 19 11 23 16 12 15 24 12 23 19 10 14 17 9 11 11
H1483 Kurd 13 25 16 10 11-14 12 12 10 13 11 31 16 9-10 11 11 24 14 20 32 12-12-15-15 11 11 19-23 15 [...] 13 11 [...] 15 [...] 13 [...] 11 30 12 14 24 13 9 [...] 23 [...] 11
([...] marks markers that were not tested for H1483.)

Results:
Distance 116213 vs R1a1a Modal of Ashkenazi-Levite: 8 STR distance steps out of 43 STRs (19%)
Distance H1483 vs R1a1a Modal of Ashkenazi-Levite: 9 STR distance steps of 43 STRs (21%)

The Palestinian Khalil Qussad has a distance of 8/43, and the Iraqi Kurd shows slightly more distance steps (9/43) to the R1a1a Modal haplotype of Ashkenazi-Levites. But not only the number of differences is important but also where these differences are and how "untypical" they are for Ashkenazi-Levites:
Check if the differences are within the range of R1a1a Ashkenazi-Levites variation or not (See Minimum and Maximum values).

STR differences:
H1483:  6  STRs (orange) within Ashkenazi-Levite variation, 3 STRs (yellow) outside
116213: 3 STRs (orange) within Ashkenazi-Levite variation, 5 STRs (yellow) outside

So, even though the Kurd H1483 is more distance steps away from the Ashkenazi-Levite modal haplotype than "116213", he is still closer to the Ashkenazi-Levite modal haplotype because the distance steps are on more variable STRs.

The genetic distance should be determined by the STR distance steps and the variability of the STRs.  I tried to combine and automate both factors. This is what I was trying to explain in the post "How to read STR data".
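The "within or outside the Ashkenazi-Levite variation" check can be sketched in code. Two caveats: the sketch only uses the first 30 marker values of the rows above (the part of the H1483 row that is contiguous), with multi-copy markers flattened into separate entries, so its counts are not the 43-STR totals given above; and it is a simplified illustration of the idea, not the automated scoring I actually use.

```python
# First 30 values of the MIN, MAX, MODE, 116213 and H1483 rows above.
MIN = [13, 24, 15, 10, 11, 14, 12, 12, 10, 13, 11, 29, 14, 9, 10,
       11, 11, 24, 14, 20, 29, 12, 12, 12, 15, 11, 10, 19, 23, 14]
MAX = [13, 25, 17, 10, 11, 15, 12, 12, 12, 14, 11, 32, 15, 9, 11,
       11, 11, 24, 14, 21, 32, 12, 12, 15, 16, 12, 12, 19, 23, 15]
MODE = [13, 25, 16, 10, 11, 14, 12, 12, 10, 13, 11, 30, 14, 9, 10,
        11, 11, 24, 14, 20, 30, 12, 12, 15, 15, 11, 11, 19, 23, 14]
Q116213 = [13, 24, 16, 11, 11, 14, 12, 12, 10, 13, 11, 30, 16, 9, 10,
           11, 11, 24, 14, 20, 30, 12, 12, 15, 15, 11, 11, 19, 23, 17]
H1483 = [13, 25, 16, 10, 11, 14, 12, 12, 10, 13, 11, 31, 16, 9, 10,
         11, 11, 24, 14, 20, 32, 12, 12, 15, 15, 11, 11, 19, 23, 15]


def classify_differences(sample, mode, low, high):
    """Count markers that differ from the mode, split by whether the
    sample value still lies inside the observed [low, high] range."""
    within = outside = 0
    for value, m, lo, hi in zip(sample, mode, low, high):
        if value == m:
            continue  # identical to the modal value, no difference
        if lo <= value <= hi:
            within += 1
        else:
            outside += 1
    return within, outside


print("H1483 ", classify_differences(H1483, MODE, MIN, MAX))
print("116213", classify_differences(Q116213, MODE, MIN, MAX))
```

Even on this reduced marker set the pattern is the same as described above: most of the Kurd's differences stay inside the Levite range, while most of the differences of 116213 fall outside it.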

And finally, I am presenting an analysis using 43 STRs (all tested STRs of the Iraqi Kurd H1483) to find the best matches for the Ashkenazi-Levite modal haplotype. All L342+ individuals with more than 12 tested STRs were used, and the data are sorted by this new approach (see the second column, "compared STRs", in the table below). Please also note that some of the top matches only have 34 STRs compared, so these results are less accurate. The Kurd H1483 and Khalil Quassad (116213) are highlighted in bold and yellow.

Top30 matches for the Ashkenazi-Levite modal haplotype (including Ashkenazi-Levites):


Top30 matches for the Ashkenazi-Levite modal haplotype (excluding Ashkenazi-Levites):

Conclusions:
1. By now, the closest non-Jewish individual to the R1a1a Ashkenazi-Levite modal haplotype is the Iraqi Kurdish individual H1483.

2. The 111 STR lineage tree designed by Vladimir Semargl is great but it is only based on the distance steps (third color-coded column in tables above); it does not address the variability of the different STRs and it does not reflect the diversity. A much better tree by Marko Heinila is posted here.

Thursday, July 12, 2012

mtDNA Haplogroup U1a1


Today, I want to show some data about mtDNA haplogroup U1a1, because U1a1 is relatively rare and there are already two individuals with U1a1 in this Kurdish DNA project. To clarify, the focus of this post is U1a1, not U1a.


The current scientific literature has 9 fully sequenced U1a1's, presented here:
2. AY289073(Koraga/India) Ingman-Gyll
3. AY882396(Adygei = Circassian) Achilli-Rengo
4. DQ112932.2(Africa) Kivisild
5. EF556161 Behar2008 (5 out of 82 = 6.1% of Iranian Jews)
6. EF657399 mtDNA250(Europe) Herrnstadt
7. EU597497(Adygei = Circassian, Russia) Hartmann
9. GU218692(Greece) FTDNA
10. HM156682(India) Govindaraj
11. HQ615882(Italy) FTDNA 

I should note that one Indian sequence  [EU872049(India) Bhat] was removed from GenBank because the sequence could not be confirmed. So the number of fully sequenced U1a1 is actually 9 and not 10 as presented by Ian Logan on his website (#8 in Ian Logan's list).

I made a lineage tree of the fully sequenced U1a1 samples:


Other data sources:
 
The Armenian DNA project does not have any mtDNA U1a1.
The Arab DNA project does not have any mtDNA U1a1.
The Jewish DNA project does not have any mtDNA U1a1.
The Assyrian DNA project does not have any mtDNA U1a1.

The Finland DNA project has 2 individuals with U1a1:
142783     Britta Laurintytar Kruskopp, Kannus? Suomi
213047 Thors
 
The global mtDNA haplogroup U1 project has 3 individuals (the Finnish sample is the same as above):
1 x Barbados
1 x Finland (142783 Britta Laurintytar Kruskopp, Kannus? Suomi)
1 x unknown

At 23andme there are at least 8 individuals with mtDNA U1a1:
1 x Zaza (from Turkey)
1 x Sorani (from Iraq)
1 x Assyrian (from Hakkari/Turkey)
1 x Turk (with maternal Circassian ancestry from Malatya)
1 x from Bethlehem
3 x Americans (unknown ancestry; from US and Canada)

Kurdish mtDNA data VI

By combining the KurdishDNA data from this blog and the data from Quintana-Murci et al., 2004, we have the following numbers for Kurdish mtDNA (N=37):

1x C4b (Alevi Kurmanji)
1x G2a (Sorani)
2x H (Kurds from Iran; Quintana-Murci et al., 2004)
1x H5a1 (Sorani)
1x H13a2 (Alevi Kurmanji from Dersim)
1x H14 (Yezidi)
1x H15a1 (Sorani; mtDNA fully sequenced here and here)
1x H15b (Sorani)
2x HV* (Kurds from Iran; Quintana-Murci et al., 2004)
1x HV1 (Kurds from Iran; Quintana-Murci et al., 2004)
1x HV2 (Kurds from Iran; Quintana-Murci et al., 2004)
1x HV (Sorani)
1x HV (Kurmanji from Zakho)
1x I (Kurds from Iran; Quintana-Murci et al., 2004)
1x I5a (Zaza from Dersim)
1x J1 (Kurds from Iran; Quintana-Murci et al., 2004)
1x J1b (Sorani)
1x J1c (Alevi Kurmanji from Dersim)
1x J2a1a  (Kurd from Turkey)
1x J2b1 (Kurd from Iraq/Iran)
1x K (Kurds from Iran; Quintana-Murci et al., 2004)
1x N1b1 (Alevi Kurmanji from Dersim)
1x U1a1 (Zaza)
1x U1a1 (Sorani)
1x U5  (Kurds from Iran; Quintana-Murci et al., 2004)
1x U5a1 (Kurmanji from Dohuk)
4x U7 (Kurds from Iran; Quintana-Murci et al., 2004)
2x U8b (Kurds from Iran; Quintana-Murci et al., 2004)
1x U8b (Feyli)
2x W (Kurds from Iran; Quintana-Murci et al., 2004)

Since Quintana-Murci et al., 2004 did not resolve all of the mtDNA haplogroups reported by 23andme, it is interesting to summarize all data as if they were part of the original paper from 2004:


1x C (1/37 = 3%)

1x G (1/37 = 3%)
7x H  (7/37 = 19%)
4x HV*  (4/37 = 11%)
1x HV1  (1/37 = 3%)
1x HV2  (1/37 = 3%)
2x I (2/37 = 5%)
3x J1  (3/37 = 8%)
2x J2 (2/37 = 5%)
1x K  (1/37 = 3%)
1x N1b (1/37 = 3%)
2x U1a (2/37 = 5%)
2x U5 (2/37 = 5%)
4x U7 (4/37 = 11%)
3x U8b (3/37 = 8%)
2x W (2/37 = 5%)
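The summary above can be reproduced by collapsing each detailed haplogroup to the coarser label and recounting. The sketch below is my reading of how the two lists relate (for example, plain "HV" is counted under "HV*"); the prefix table and function name are written just for this illustration.

```python
from collections import Counter

# Detailed haplogroups and counts, copied from the combined list (N=37).
observed = {
    "C4b": 1, "G2a": 1, "H": 2, "H5a1": 1, "H13a2": 1, "H14": 1,
    "H15a1": 1, "H15b": 1, "HV*": 2, "HV1": 1, "HV2": 1, "HV": 2,
    "I": 1, "I5a": 1, "J1": 1, "J1b": 1, "J1c": 1, "J2a1a": 1,
    "J2b1": 1, "K": 1, "N1b1": 1, "U1a1": 2, "U5": 1, "U5a1": 1,
    "U7": 4, "U8b": 3, "W": 2,
}

# Coarse labels, ordered so that longer prefixes are tried first.
COARSE = ["HV1", "HV2", "HV", "H", "C", "G", "I", "J1", "J2",
          "K", "N1b", "U1a", "U5", "U7", "U8b", "W"]


def collapse(label):
    """Map a detailed haplogroup label to its coarse 2004-style label."""
    for prefix in COARSE:
        if label.startswith(prefix):
            return "HV*" if prefix == "HV" else prefix
    raise ValueError(f"no coarse label for {label}")


summary = Counter()
for label, n in observed.items():
    summary[collapse(label)] += n

total = sum(summary.values())
for label, n in summary.items():
    print(f"{n}x {label} ({n}/{total} = {round(100 * n / total)}%)")
```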

http://kurdishdna.blogspot.com/

Sunday, July 8, 2012

Harappa Ancestry Project presents Kurdish data

Zack from the Harappa Ancestry Project was kind enough to focus on the Kurdish samples in his monthly update of the project and compared them with available autosomal Kurdish data.

I used his results to calculate the Euclidean distances of the different Iranian groups in his data set.
Based on his analysis, the Kurds in the Harappa Ancestry Project are closest to Iranians.
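For readers who want to repeat this kind of ranking, here is a minimal sketch of the Euclidean distance calculation between admixture profiles. The component percentages and population names are made-up placeholders with only three components; they are not Zack's results.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two admixture profiles of equal length."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same components")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical admixture percentages over three invented components.
profiles = {
    "Kurd (average)": (50.0, 30.0, 20.0),
    "Population A": (47.0, 34.0, 19.0),
    "Population B": (20.0, 30.0, 50.0),
}

target = profiles["Kurd (average)"]
others = [name for name in profiles if name != "Kurd (average)"]
ranking = sorted(others, key=lambda name: euclidean(target, profiles[name]))

for name in ranking:
    print(name, round(euclidean(target, profiles[name]), 2))
```

The smaller the distance, the more similar the two admixture profiles; a Top30 list is simply the first 30 entries of such a ranking.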

Here are the TOP30 matches for Kurds in the Harappa Ancestry Project:


The biggest surprise to me is one Bulgarian participant (HRP0209) at #17 in the Top30 list of Harappa Kurds. I am not sure how to interpret his small distance to other Kurds/Iranians.

Top30 matches for HRP0209 are:

Update: Mystery solved. 
HRP0209 is not Bulgarian, but a Kurmanji Kurd from Turkey (KD006). For some reason, he prefers to be mislabeled in various projects (e.g. "BGTR1" on Eurogenes).

Saturday, July 7, 2012

R1a1a L657+

In my previous post I used my approach to look into Y-STR67 values of individuals of the FTDNA R1a1a and Subclades Y-DNA Project.

Today, I want to risk a deeper look into L657+. (2. D. Z93+ L342+ L657+ Central&Southwest Asia). Here are the Top30 matches of the L657+ Modal haplotype:

 
Obviously, the L657+ cluster is not narrow at all. The closest L657+ individuals to the  L657+ Modal haplotype are from the Arabian peninsula (one exception: N2358 Philip, Ayroor, Kerala, India).

L657+ shows high variance, a lot of L657+ are not in the Top30 matches of the modal haplotype. For some of those we have additional information about the location:

M6986 (#52),
M7443 (#61),
N102178 (#74),     Pakistan
N22414 (#80),       India
M6736 (#81),        Iran
M7417 (#82),
U2810 (#83),         Pakistan
112208 (#94),        Kazakhstan
M6740 (#95),        Saudi-Arabia
216619 (#104),      Pakistan
N12617 (#119),     India
U2321 (#128)        India

I want to discuss the bolded ones a bit.

N102178    (Mehrdad, Lahore/Pakistan).
N102178 is a clear outlier.


N22414    (Luddan Singh Ranu, 1800s, Manki, Punjab/India)
N22414 shows some similarities to U2321 who is also from Punjab/India.

 M6736    (Iran)
M6736 is similar to L657+ individuals from the Arabian peninsula (one exception: N2358 Philip, Ayroor, Kerala, India).

 U2810    (Tharn Bajwa 1890 Pakistan)
U2810 shows small similarities to L657+ individuals from the Arabian peninsula.

112208    (Babasan tribe from Northern Kazakhstan)
112208 is a clear outlier.


M6740    (Mecca/Saudi Arabia)
M6740 shows small similarities to L657+ individuals from the Arabian peninsula.

 216619    (K Khan Qureshi, c.1830-1900 Pakistan)
216619 is a clear outlier. Interestingly, he shows some similarities to one individual, 209438 from Bitlis close to Lake Van. Unfortunately 209438 did not test for L657.


N12617    (Jagarnath Dixit ca 1490-1550 India)
N12617 is a clear outlier.


U2321  (Amar Sandhu, Jalandhar, Punjab, India)
U2321 shows some similarities to N22414 who is also from Punjab/India (see above).

Conclusions:
1. The L657+ group does not show a clear narrow cluster when looking at Y-STR67 values.
2. Just the individuals from the Arabian peninsula show a cluster.
3. The L657+ individuals of the "Al Tamimi" and the "Al Rass" family (Saudi-Arabia) seem to be very closely related.
4. The high number of "Al Rass"/"Al Tamimi"  data in the FTDNA L657+ group (5/22) is shifting the modal haplotype towards Saudi Arabia. Thus, I am proposing an adjusted modal haplotype for L657+.

Changes are listed below:



Modal     DYS391  DYS389i  DYS389ii  Y-GATA-H4  DYS607  CDYa  CDYb  DYS534  DYS444
FTDNA     10      13       30        11         17      36    40    13      14
Adjusted  11      14       31        12         16      35    41    14      13
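To illustrate why a few closely related families can pull a modal haplotype in their direction, here is a small sketch. It shows one possible correction (take one modal per family first, then the modal across families); this is not necessarily how the adjusted values above were derived, and the three-marker haplotypes and family labels are invented toy data.

```python
from collections import Counter, defaultdict


def modal_haplotype(haplotypes):
    """Most frequent value per marker (ties go to the smaller value)."""
    modal = []
    for values in zip(*haplotypes):
        counts = Counter(values)
        best = max(counts.values())
        modal.append(min(v for v, c in counts.items() if c == best))
    return modal


def family_adjusted_modal(samples):
    """samples: list of (family_label, haplotype) pairs.
    Each family contributes a single modal haplotype to the final modal."""
    by_family = defaultdict(list)
    for family, haplotype in samples:
        by_family[family].append(haplotype)
    representatives = [modal_haplotype(h) for h in by_family.values()]
    return modal_haplotype(representatives)


# Toy data: one family tested three times, two unrelated individuals.
samples = [
    ("family_X", (12, 15, 20)),
    ("family_X", (12, 15, 20)),
    ("family_X", (12, 15, 20)),
    ("person_1", (13, 16, 21)),
    ("person_2", (13, 16, 21)),
]

print(modal_haplotype([h for _, h in samples]))  # dominated by family_X
print(family_adjusted_modal(samples))            # each family counts once
```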


 Here are the Top30 matches of the adjusted L657+ Modal haplotype:

Now, the closest to the modal haplotype is the L657+ individual N2358 Philip, Ayroor, Kerala, India, followed by L657+ individuals from the Arabian peninsula ("Al Rass"/"Al Tamimi" families).


Still, many of these L657+ individuals are not in the Top30 matches of the adjusted modal haplotype, although most of them moved up in the ranking (new rank after the arrow):

M6986 (#52==>#39),
M7443 (#61==>#5),
N102178 (#74==>#54),     Pakistan
N22414 (#80==>#9),       India
M6736 (#81==>#37),        Iran
M7417 (#82==>#59),
U2810 (#83==>#105),         Pakistan
112208 (#94==>#46),        Kazakhstan
M6740 (#95==>#62),        Saudi-Arabia
216619 (#104==>#110),      Pakistan
N12617 (#119==>#123),     India
U2321 (#128==>#67)        India