KurdishDNA: How to read STR data

I know that there is a lot of controversy about the use of Y-STR data, and I agree with most of the criticism. However, sometimes no other data (e.g. Y-SNP data) are available, and in those cases Y-STRs can help a little in understanding the pattern observed within one haplogroup subbranch.
Looking at STR databases (e.g. ysearch.org or semargl.me/en/dna/ydna/tools/asd-classic/) can be painful and of little use. The reason is that the relationship between two individuals is based solely on STR differences, and these differences are not "weighted" in any sense: the tools only report "Distance markers" and "Distance steps".

Everyone who has looked at one of the FTDNA projects quickly realizes that some Y-STRs are more variable than others. In the L342+ group of the FTDNA R1a1a and Subclades Y-DNA Project, the following order of variance can be observed in the first 37 Y-STRs (from low to high variability):

DYS388 DYS437 DYS392 DYS455 DYS448 DYS393 DYS454 DYS426 DYS447 DYS438 DYS390 YCAIIb DYS459a DYS385a DYS389ii CDYa DYS389i YCAIIa DYS464c DYS449 DYS19 DYS464d DYS464a DYS385b DYS459b CDYb DYS460 Y-GATA-H4 DYS391 DYS607 DYS439 DYS456 DYS570 DYS458 DYS576 DYS442 DYS464b
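
An ordering like the one above can be obtained by computing the variance of the repeat counts for each marker and sorting the markers by it. The following is only a minimal sketch: the sample names and repeat counts are invented toy values (not real project data), and the function names are my own for illustration.

```python
from statistics import pvariance

# Toy haplotypes with invented repeat counts (NOT real project data).
haplotypes = {
    "sample_A": {"DYS388": 12, "DYS391": 10, "DYS464b": 15},
    "sample_B": {"DYS388": 12, "DYS391": 11, "DYS464b": 12},
    "sample_C": {"DYS388": 12, "DYS391": 10, "DYS464b": 16},
    "sample_D": {"DYS388": 12, "DYS391": 11, "DYS464b": 13},
}


def marker_variances(haplotypes):
    """Return {marker: population variance of its repeat counts}."""
    # Only use markers that every individual has tested.
    markers = set.intersection(*(set(h) for h in haplotypes.values()))
    return {
        m: pvariance([h[m] for h in haplotypes.values()])
        for m in markers
    }


def order_by_variability(haplotypes):
    """List the markers from low to high variance (ties sorted by name)."""
    variances = marker_variances(haplotypes)
    return sorted(variances, key=lambda m: (variances[m], m))


ordered = order_by_variability(haplotypes)
print(ordered)
```

In this toy set DYS388 never changes, DYS391 moves by one repeat and DYS464b by several, so the markers come out in exactly that order.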

Differences in DYS464b are more common than differences in DYS388, so a difference in DYS464b is "less important" than a difference in DYS388. Any ranking should therefore be weighted according to the variability of the STRs. The variance of some Y-STRs has been calculated and published previously; YHRD lists them.

This additional information can be used to better rank the best matches for an individual. The fewer Y-STRs are available for a comparison, the more useful this approach is.
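
To illustrate the idea of a weighted ranking, here is a small sketch. The exact weighting is an assumption on my side: each repeat difference is divided by the variance of its marker, so a step in a stable marker costs more than a step in a fast one. The variances, the haplotypes, the `floor` parameter and the function names are all hypothetical toy choices.

```python
# Hypothetical per-marker variances (toy values, not published figures).
variances = {"DYS388": 0.05, "DYS391": 0.25, "DYS464b": 2.5}

# Invented repeat counts for one query individual and three candidates.
query = {"DYS388": 12, "DYS391": 10, "DYS464b": 15}
candidates = {
    "cand_1": {"DYS388": 13, "DYS391": 10, "DYS464b": 15},  # 1 step, stable marker
    "cand_2": {"DYS388": 12, "DYS391": 10, "DYS464b": 17},  # 2 steps, fast marker
    "cand_3": {"DYS388": 12, "DYS391": 11, "DYS464b": 15},  # 1 step, medium marker
}


def weighted_distance(a, b, variances, floor=0.01):
    """Sum of absolute repeat differences, each divided by the marker variance.

    `floor` avoids a division by zero for markers without any variation.
    """
    shared = set(a) & set(b) & set(variances)
    return sum(abs(a[m] - b[m]) / max(variances[m], floor) for m in shared)


def rank_matches(query, candidates, variances, top_n=30):
    """Return the top_n (distance, name) pairs, closest match first."""
    scored = [
        (weighted_distance(query, hap, variances), name)
        for name, hap in candidates.items()
    ]
    return sorted(scored)[:top_n]


ranking = rank_matches(query, candidates, variances)
names = [name for _, name in ranking]
print(names)
```

By plain step counting, cand_2 (two steps) would be the worst match. With the weighting it becomes the best one, because both steps are in the highly variable DYS464b. Note that the distances are only comparable if all individuals have the same markers tested, which is why the test below is restricted to individuals with the full marker set.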

Of course, I did a first ranking test using a Kurdish individual:
H1483 (Z93+, L342+, L657-) in the FTDNA R1a1a and Subclades Y-DNA Project (focusing on 34 STRs and individuals that have these 34 STRs tested).

Top30 matches:

Next, this approach was expanded using 67 Y-STRs of the Arabic modal haplotype (2. C6. Z93+ L342+ L657-, Arabic). Top30 matches are:

Evidently, the Arabic cluster is very narrow, with low variance, which implies that its common ancestor cannot be very old.
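
What "narrow" means can be made a bit more concrete by taking the modal haplotype of a cluster (the most common value at each marker) and measuring how far the members deviate from it on average. This is only a rough sketch with invented repeat counts. The function names are hypothetical, and the mean squared deviation is just one possible measure of spread. It is not an age estimate.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common repeat count per marker (ties: first value encountered)."""
    markers = set.intersection(*(set(h) for h in haplotypes))
    return {
        m: Counter(h[m] for h in haplotypes).most_common(1)[0][0]
        for m in markers
    }


def mean_squared_deviation(haplotypes, modal):
    """Average squared repeat difference from the modal value, over all
    individuals and markers."""
    total = sum((h[m] - modal[m]) ** 2 for h in haplotypes for m in modal)
    return total / (len(haplotypes) * len(modal))


# Invented toy clusters (NOT real project data).
narrow = [
    {"DYS390": 25, "DYS391": 10, "DYS439": 10},
    {"DYS390": 25, "DYS391": 10, "DYS439": 10},
    {"DYS390": 25, "DYS391": 10, "DYS439": 10},
    {"DYS390": 25, "DYS391": 11, "DYS439": 10},
]
broad = [
    {"DYS390": 25, "DYS391": 10, "DYS439": 10},
    {"DYS390": 24, "DYS391": 11, "DYS439": 10},
    {"DYS390": 25, "DYS391": 10, "DYS439": 12},
    {"DYS390": 26, "DYS391": 10, "DYS439": 11},
]

narrow_spread = mean_squared_deviation(narrow, modal_haplotype(narrow))
broad_spread = mean_squared_deviation(broad, modal_haplotype(broad))
print(narrow_spread, broad_spread)
```

The first toy cluster has a much smaller spread than the second one. All else being equal, fewer accumulated mutations point to a more recent common ancestor, although turning such a number into an actual age would require mutation rates that are not covered here.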

Then, the R1a1a Ashkenazi-Levite modal haplotype was tested:

Similar to the Arabic cluster, the Ashkenazi-Levite cluster is also pretty narrow, with low variance. It cannot be old either.

Friday, July 6, 2012