I. So what can we learn from the presented HarappaWorld results?
Only the components that
are higher than 1% and occur in all presented samples can be considered
as 'Kurdish' or 'Iranian'. Components that only occur in some samples
are more likely due to individual 'family history'.
This helps us exclude most components that are not 'Iranian', i.e.
SE-Asian,
Siberian,
NE-Asian,
Papuan,
American,
Beringian,
San,
E-African,
Pygmy, and
W-African.
The South-Asian component is a tricky one since it is completely
missing in one individual (Sorani Kurd5), but present in low amounts
(Average 2%) in the other individuals.
The typical admixture of KurdishDNA based on HarappaWorld consists of 5 components (average of all 14 individuals):
Caucasus (43%),
Baloch (26%),
SW-Asian (14%),
Mediterranean (7%),
NE-European (6%), and
Other (6%).
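The selection rule used above (keep a component only if it is higher than 1% in every sample, then average over all samples) can be sketched in a few lines of Python. This is only an illustration: the three samples and their percentages are made-up toy values, not the real HarappaWorld results, and the function names `shared_components` and `component_means` are hypothetical.

```python
# Toy admixture percentages for illustration only.
# These are NOT the real HarappaWorld values of the 14 samples.
samples = {
    "toy_1": {"Caucasus": 48.0, "Baloch": 22.0, "SW-Asian": 13.0,
              "Mediterranean": 8.0, "NE-European": 6.0,
              "South-Asian": 2.0, "Siberian": 1.0},
    "toy_2": {"Caucasus": 42.0, "Baloch": 27.0, "SW-Asian": 15.0,
              "Mediterranean": 7.0, "NE-European": 5.0,
              "South-Asian": 0.0, "Siberian": 4.0},
    "toy_3": {"Caucasus": 41.0, "Baloch": 27.0, "SW-Asian": 14.0,
              "Mediterranean": 6.0, "NE-European": 7.0,
              "South-Asian": 4.0, "Siberian": 1.0},
}


def all_component_names(samples):
    """Return every component name that appears in any sample, sorted."""
    return sorted({name for row in samples.values() for name in row})


def shared_components(samples, threshold=1.0):
    """Components that are strictly above `threshold` percent in ALL samples."""
    return [
        name
        for name in all_component_names(samples)
        if all(row.get(name, 0.0) > threshold for row in samples.values())
    ]


def component_means(samples):
    """Average percentage of each component over all samples."""
    n = len(samples)
    return {
        name: sum(row.get(name, 0.0) for row in samples.values()) / n
        for name in all_component_names(samples)
    }


shared = shared_components(samples)
means = component_means(samples)

for name in shared:
    print(f"{name}: {means[name]:.1f}%")
```

In the toy data the South-Asian component is dropped because it is missing in one sample, even though its average is above 1%, which is the same situation as described for Sorani Kurd5.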
II. What can we learn from the presented Dodecad K12b results?
Only the components that
are higher than 1% and occur in all presented samples can be considered
as 'Kurdish' or 'Iranian'. Components that only occur in some samples
are more likely due to individual 'family history'.
This helps us exclude most components that are not 'Iranian', i.e.
Siberian,
NW-African,
SE-Asian,
E-African,
E-Asian, and
Sub_Saharan.
Again, the South-Asian component is a tricky one since it is completely
missing in one individual (Sorani Kurd5), but present in low amounts
(Average 2%) in the other individuals.
The typical admixture of KurdishDNA based on Dodecad K12b consists of 5 components (average of all 14 individuals):
Caucasus (41%),
Gedrosia (27%),
SW-Asian (14%),
Atlantic-Mediterranean (7%),
N-European (6%), and
Other (5%).
HarappaWorld and Dodecad K12b give essentially the same outcome: the averages of the corresponding components differ by no more than 2 percentage points.
Last question for today: Do we see any gradient of these 5 components within the 14 samples?
Yes, we do: there is a gradient between the samples from the North and those from the South. The Northern samples have more of the 3 components Caucasus, Atlantic-Mediterranean/Mediterranean and N-European/NE-European, while the Southern samples have more of the 2 components Gedrosia/Baloch and SW-Asian.
Palisto,
I think you should conduct your own ADMIXTURE analyses instead of using ready-made calculators. There are lots of freely available databases. This will allow you to make direct and realistic comparisons between your own Kurdish samples and the Kurdish samples of the other databases and also the samples of the other ethnic groups in those databases.
Hi Onur,
I am always appreciating your comments at Dienekes' blog (in case you are the same Onur). Anyways, in order to do my own ADMIXTURE analyses, I have to invest much more time and I have to learn much more about it. Furthermore, I am not sure if the outcome would be any different/better. The results of Dienekes and Harappa for Kurds are essentially the same. Instead of running my own ADMIXTURE analyses, I tried to optimize the ready-made ADMIXTURE calculators by introducing an "adjusted Euclidean distance" based on Fst values. As far as I know, Zack from the Harappa project is the only science blogger who implemented this approach (he calls it "Weighted distance"), see here:
http://www.harappadna.org/2012/03/ref3-admixture-dendrograms/
"So let's try a dendrogram of all these populations' average admixture results. Instead of using regular Euclidean distance, I used some weighting based on Fst distances between admixture components, very similar to what Palisto did."
And here:
http://www.harappadna.org/2012/03/harappa-oracle/
"I am using Dienekes' code with a couple of changes. One of them is using weighted distance based on Fst divergences between ancestral components. Because of that it is several times slower than DodecadOracle. I plan to offer an option soon to switch between Euclidean distance and Fst-weighted distance."
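To show what weighting by Fst changes compared to regular Euclidean distance, here is a minimal Python sketch. Please note the assumptions: the exact formula used in my adjusted distance or in Zack's code is not given here, so the quadratic form below is just one possible way to do such a weighting, and the three components and their Fst values are hypothetical.

```python
import math

# Hypothetical pairwise Fst values between three ancestral components
# (illustration only; not taken from any real calculator).
# Components 0 and 1 are closely related, component 2 is distant from both.
FST = [
    [0.00, 0.02, 0.10],
    [0.02, 0.00, 0.11],
    [0.10, 0.11, 0.00],
]


def euclidean_distance(p, q):
    """Regular Euclidean distance between two admixture vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def fst_weighted_distance(p, q, fst):
    """One possible Fst weighting (an assumption, not the published method).

    Uses d^2 = -1/2 * sum_ij delta_i * delta_j * Fst_ij with delta = p - q.
    Because both vectors sum to the same total, delta sums to zero, and the
    expression is non-negative when the Fst matrix behaves like a matrix of
    squared distances. Tiny negative values from rounding are clipped to 0.
    """
    delta = [a - b for a, b in zip(p, q)]
    n = len(delta)
    total = sum(delta[i] * delta[j] * fst[i][j]
                for i in range(n) for j in range(n))
    return math.sqrt(max(0.0, -0.5 * total))


# Admixture proportions (fractions summing to 1) of a toy target and of
# two toy populations that each differ from it by 10% in two components.
target = [0.5, 0.3, 0.2]
pop_close = [0.4, 0.4, 0.2]  # 10% shifted between the two related components
pop_far = [0.4, 0.3, 0.3]    # 10% shifted to the distant component

print(euclidean_distance(target, pop_close), euclidean_distance(target, pop_far))
print(fst_weighted_distance(target, pop_close, FST),
      fst_weighted_distance(target, pop_far, FST))
```

With regular Euclidean distance both toy populations are equally far from the target, while the weighted version treats a shift between two closely related components as a smaller difference than a shift towards a distant component.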
In recent weeks I have focused more on Y-haplogroups (especially R1a1a) and tried to set up a STR/SNP-Y-Chromosome database for the Middle East and neighboring regions from scientific publications and elsewhere (6000 samples so far).
I am always appreciating your comments at Dienekes' blog (in case you are the same Onur).
Thanks. I know you from your comments in the Harappa project but have only very recently found this blog. I really appreciate your entrepreneurship and the depth of your research here on this blog.
Anyways, in order to do my own ADMIXTURE analyses, I have to invest much more time and I have to learn much more about it.
I see.
Furthermore, I am not sure if the outcome would be any different/better.
It would surely be better, as you would have the opportunity to directly compare your Kurdish samples to other Kurdish samples and other ethnic groups and thus you and all of us would better understand the genetics of your Kurdish samples and Kurds in general.
The results of Dienekes and Harappa for Kurds are essentially the same.
The two calculators are pretty similar in many ways. So I am not surprised at the results.
BTW, for interracial comparisons (e.g., Caucasoidness vs. Mongoloidness) I prefer low Ks (but not the lowest ones), while I prefer high Ks for intraracial comparisons (e.g., North Europeanness vs. West Asianness).
In case you don't know, Razib has a blog thread devoted to teaching how to use the ADMIXTURE software for complete amateurs:
http://blogs.discovermagazine.com/gnxp/2011/03/analyzing-ancestry-with-admixture-step-by-step/
Even Maju, who is less mathematically-oriented than you, did some comprehensive ADMIXTURE analyses after reading Razib's instructions on that thread.
I agree that low Ks for interracial comparisons and high Ks for intraracial comparisons are better, it just makes sense.
Thanks for the link, Onur. I will give it a try.
Palisto, what is your explanation for the Mongoloid components that are seen in Iranic and Slavic peoples in generally very small amounts but appreciably increase in amount from west to east? How did they end up in those populations in your opinion? The above-noise-level presence of the ASI-related South Asian component in Iranic peoples has a more straightforward explanation: the geographical, historical and linguistic connections with South Asia. But the Mongoloid components are more difficult to explain, primarily due to the higher number of probable sources.
I don't have a crisp answer for this.
To give a good answer for this question, one could look into autosomal information:
1. Locate the exact position of all "Mongoloid" segments in the Iranian and Slavic genepool, first just pick one segment of one individual.
2. Check the ancestry of all known individuals sharing the same "Mongoloid" segment with this one individual.
3. Repeat step 2 for all "Mongoloid" segments found in all available individuals of one ethnic group.
4. Hopefully, a pattern can be seen, e.g. Altaic connection and/or Finnic connection.
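The four steps above can be sketched in Python as a simple tally. Everything in this sketch is hypothetical: the segment IDs, the individuals, the group labels and the function name `tally_sharers` are placeholders, and it assumes that the segments have already been located and that the carriers of each segment are already known (steps 1 and 2 need real segment data and tools, which are not shown here).

```python
from collections import Counter

# Hypothetical input: for each located segment, the individuals known to
# carry it (illustration only, not real data).
segment_carriers = {
    "chr2:seg1": ["target_1", "ref_x1", "ref_y1"],
    "chr7:seg2": ["target_1", "target_2", "ref_x2"],
    "chr11:seg3": ["target_2", "ref_x1"],
    "chr15:seg4": ["ref_y1", "ref_y2"],  # not found in the target group
}

# Hypothetical group label of every individual.
group_of = {
    "target_1": "target_group",
    "target_2": "target_group",
    "ref_x1": "group_X",
    "ref_x2": "group_X",
    "ref_y1": "group_Y",
    "ref_y2": "group_Y",
}


def tally_sharers(segment_carriers, group_of, target_group):
    """Count, per outside group, how often its members share a segment
    that also occurs in at least one individual of `target_group`."""
    tally = Counter()
    for segment, carriers in segment_carriers.items():
        # Skip segments that do not occur in the target group at all.
        if not any(group_of[c] == target_group for c in carriers):
            continue
        # Steps 2 and 3: record the group of every outside sharer.
        for carrier in carriers:
            if group_of[carrier] != target_group:
                tally[group_of[carrier]] += 1
    return tally


# Step 4: look for a pattern in the tally.
tally = tally_sharers(segment_carriers, group_of, "target_group")
print(tally.most_common())
```

If one outside group dominates the tally for many segments, that would point towards a connection with that group; with real data one would also have to account for the very different sample sizes of the reference groups.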
mtDNA and Y-Chromosomal data might help, too. In this regard, Y-haplogroups N, C-M126, C3-M127, Q-M25 and O3 in Iranic and/or Slavic people seem to be worth analyzing. I also think that mtDNA D4j and U4 are interesting for that.
The most immediate source of data at our disposal to investigate this topic seems to be the haplogroup data. But autosomal chromosomal segments, too, can be examined with the currently available tools and data. Hopefully, as whole genome analyses become cheaper and more prevalent, we will have more definitive answers on this topic. Another source of data of vital importance is ancient DNA studies, which have significantly increased in number during the last few years and will likely keep their pace unabated if not accelerate more during the following years. We must be ready for surprises.
BTW, northern Arabs have Mongoloid component levels similar to those of Kurds, their northern and eastern neighbors.
My results showed 10% Baloch. Does that mean I have Baloch ancestry?