Thursday, February 28, 2013

Indo-European Swadesh Mapping #002 "YOU"

Today, I want to present a map for the word "YOU".

For more info, read my initial blog post.

How to see interactive maps:
1. Download map (gmp file).
2. Go to
3. Click "Project".
4. Upload map from your computer (the gmp file you just downloaded).
5. Click "Open".
6. Enjoy.

If you are interested in the names of the included languages, I made a spreadsheet.


One interesting thing I noted is that the well-conserved "tu" is followed by "v/w" in some languages. These languages are all found in Asia.

Tocharian B: tuwe
Classical Armenian: dow
Old Persian (West-Iranian): tuvam, θuvām
Sogdian (East-Iranian): tyw
Sariqoli (East-Iranian): tɛw
Kata/Kati (Nuristani): tʷi
Vedic Sanskrit (Indo-Aryan): tvám

In some Indo-Aryan languages the "v" changed into an "m":
Nepali: timi, ta
Assamese: ti, tay, tumi, āpuni
Bengali: tumi
Oriya: tumɔ, to, apɔṇɔ-nkɔ
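
As a rough sketch of the pattern above, a selection of the listed forms can be sorted by the first "v/w"-like or "m" character that follows the initial consonant. The `classify` and `group_by_pattern` helpers and the `GLIDES` character set are my own simplification for illustration (assumptions, not a phonological analysis).

```python
# Forms of "you" copied from the lists above (one form per language).
FORMS = {
    "Tocharian B": "tuwe",
    "Classical Armenian": "dow",
    "Old Persian": "tuvam",
    "Sariqoli": "tɛw",
    "Kata/Kati": "tʷi",
    "Vedic Sanskrit": "tvám",
    "Nepali": "timi",
    "Assamese": "tumi",
    "Bengali": "tumi",
    "Oriya": "tumɔ",
}

# Assumption: these characters stand for the "v/w" sound in the transcriptions.
GLIDES = set("vwʷ")


def classify(form):
    """Label a form by the first v/w-like or m character after its initial letter."""
    for ch in form[1:]:
        if ch in GLIDES:
            return "v/w"
        if ch == "m":
            return "m"
    return "other"


def group_by_pattern(forms):
    """Group language names under the label returned by classify()."""
    groups = {}
    for language, form in forms.items():
        groups.setdefault(classify(form), []).append(language)
    return groups


if __name__ == "__main__":
    for label, languages in group_by_pattern(FORMS).items():
        print(label + ":", ", ".join(languages))
```

Matching on single characters is crude, but it is enough to separate the "v/w" group from the Indo-Aryan "m" group in this small list.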

The Anthrogenica user linkus showed me that "v/w" is also present in some declensions of "you" in the Baltic languages; in the Slavic languages it is a "b". I also found the declension of "you" in reconstructed Proto-Indo-European, which shows both "w" and "b". I added German as well.

Case    Proto-Indo-European  Lithuanian   Latvian  Polish      Russian   German
Nom.    *túh₂                tu           tu       ty          ты        du
Gen.    *téwe ~ *toy         tavo/tavęs   tavs     ciebie/cię  тебя      deiner
Dat.    *tébʰye ~ *toy       tau          tev      tobie/ci    тебе      dir
Acc.    *twé ~ *te           tave         tevi     ciebie/cię  тебя      dich
Instr.  -                    tavimi       tevi     tobą        тобой     -
Loc.    -                    tavyje       tevī     tobie       (о)тебе   -

Another thing that caught my attention is the prefix "e" in Gorani dialects (e.g. "etu") and in Greek. Similar structures with "e":

Greek: εσύ/esi
Macho from Topzawa (Kurdistan-Iraq; Gorani): tu, etî
Shabaki from Qahrawa (Gorani): tu, etû
Bajalani from Arpaîi (Gorani): etci

Indo-European Swadesh Mapping #001 "I"

This is a new series about the Indo-European language family. Again, my approach is purely descriptive: I don't want to hypothesize about the origin and ancestral location of the Proto-Indo-European language (PIE). Instead, I thought it might be nice to have maps showing the worldwide variation of Indo-European languages. To avoid discussions, I put the location of PIE in the middle of the Black Sea...

Today, I want to present a map for the word "I".

The map shows just the word for "I", not the corresponding language. The words are extracted from various sources, mostly from the Indo-European lexical cognacy database. To see the interactive maps, download the gmp files I created and open them at

If you are interested in the names of the included languages, I made a spreadsheet.

I included living and extinct languages; my main focus is on the Swadesh words and the Indo-Iranian languages.

Monday, February 25, 2013

Haplogroup J1 (L222.2-) tree STR67

Today, I want to present the haplogroup J1 tree with STR67 data. Most of the tested individuals are from the Arabian Peninsula, belong to a narrow cluster, and have the haplogroup J1b2b1 (aka J1c3d2) L222.2+, as can be seen in my previous J1 tree. I excluded those L222.2+ individuals from the new tree, and Roberto Raciti helped me collect all J1 L222.2- individuals (L222.2- status confirmed or predicted) with STR67 data (N=1039).
I should note that STR trees do not necessarily branch along the SNPs, and this is even more true for trees based on "only" 67 STR values.
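
To illustrate what an STR tree is built from, here is a minimal sketch of a pairwise STR distance: the sum of absolute differences in repeat counts over shared markers. The sample names and repeat values are hypothetical placeholders (not taken from the J1 data set), only four markers are used instead of 67, and the metric is one common simple choice, not necessarily the one used for the trees in this post.

```python
from itertools import combinations

# Hypothetical haplotypes: marker name -> repeat count (placeholder values).
HAPLOTYPES = {
    "sample_A": {"DYS393": 12, "DYS390": 23, "DYS19": 14, "DYS391": 10},
    "sample_B": {"DYS393": 12, "DYS390": 24, "DYS19": 14, "DYS391": 11},
    "sample_C": {"DYS393": 13, "DYS390": 22, "DYS19": 15, "DYS391": 10},
}


def str_distance(a, b):
    """Sum of absolute repeat-count differences over markers present in both haplotypes."""
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)


def distance_table(haplotypes):
    """Return {(name1, name2): distance} for every unordered pair of samples."""
    return {
        (x, y): str_distance(haplotypes[x], haplotypes[y])
        for x, y in combinations(sorted(haplotypes), 2)
    }


if __name__ == "__main__":
    for pair, dist in distance_table(HAPLOTYPES).items():
        print(pair, dist)
```

A tree program then groups the samples using only these distances, which is why the resulting branches do not have to agree with the SNP-defined branches.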

In this tree we have 3 individuals with Zaza/Kurdish ancestry, two from the Northern part and one from the Central part. In the tree I highlighted them in red. 
E11334 (Suleyman Efendi 19th century Askale Erzurum, Turkey)
N91920 (Kurdish Serzer, 1805 - 1846, Turkey)
N88767 (Sulaymania, Iraq (Kurdistan) of sharif descent)

Rectangular tree of J1 (L222.2-) tree STR67 (pdf)

Polar tree of J1 (L222.2-) tree STR67 (pdf)

Pdf files are also available in the data sink.

Due to the large size of the tree I split the tree image into pieces (see below).

The oldest branch of J1 consists of the recently discovered Z2223 individuals (highlighted in yellow at the top). Interestingly, E11334 (originally from Erzurum) belongs to this very old J1 Z2223+ branch. Probably due to its age, this Z2223 branch has been found in various regions of the world:
1x Turkey (E11334 Suleyman Efendi 19th century Askale Erzurum)
2x Germany (18215, Daniel May, 1722-1821 and N50051, Konrad Francois Schnabel b. 1755 Martinhagen Germ.)
1x Ireland  (211228, Michael Brethour, 1774 - 1862)
1x Oman (M6600)
1x Qatar (208949, albloushi)
1x Unknown Origin (154731)

N91920 is not part of the J1c3* Jewish Cluster A but is present at the root of this Jewish Cluster.
The characteristic SNP of this Ashkenazi Jewish cluster is L817. Similar to the R1a1a Ashkenazi-Levite cluster, we now have a J1 Ashkenazi cluster that shows similarities to the STR data of a Kurdish individual. It would be interesting to see if N91920 is L817+.
Two samples from Germany are located at the root of this large and "oldest" Ashkenazi Jewish cluster as well:
21075 (Jacob Graber) and 187564 (Louis Graber).

Update April 18, 2013:
N91920 is now confirmed to be L817+ L818+ L816-, just like the Ashkenazi Jewish cluster.
21075 (Jacob Graber) is now confirmed to be L817+ L818+ L816- as well.
The Ashkenazi Jewish cluster is L817+ L818+ L816+, which means that L816 is younger than L817. The closest, and the only Middle Eastern, relative of the L816+ Ashkenazi Jewish cluster is Kurdish.
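
The reasoning behind "L816 is younger than L817" can be written as a small subset check: if everyone who carries L816 also carries L817, but not the other way round, L816 must sit downstream. The SNP calls below are the ones reported in this update; the helper names `carriers` and `is_downstream` are mine, and "Ashkenazi cluster" stands for the cluster as a whole, not a single kit.

```python
# SNP calls as reported in the update above (True = positive, False = negative).
RESULTS = {
    "N91920": {"L817": True, "L818": True, "L816": False},
    "21075": {"L817": True, "L818": True, "L816": False},
    "Ashkenazi cluster": {"L817": True, "L818": True, "L816": True},
}


def carriers(snp, results=RESULTS):
    """Return the set of samples that are positive for the given SNP."""
    return {sample for sample, calls in results.items() if calls[snp]}


def is_downstream(younger, older, results=RESULTS):
    """True if the carriers of `younger` are a proper subset of the carriers of `older`."""
    return carriers(younger, results) < carriers(older, results)


if __name__ == "__main__":
    print("L816 below L817:", is_downstream("L816", "L817"))
    print("L817 below L816:", is_downstream("L817", "L816"))
    print("L817 below L818:", is_downstream("L817", "L818"))
```

With only these three entries, L817 and L818 have the same carriers, so their relative order cannot be decided from this data.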

N88767 forms a "mini" cluster (highlighted in light blue) with N101140 from Italy (Alphonso Villani 1885-1961); however, the branch length of N88767 indicates that this calculated proximity is not strong.


Friday, February 8, 2013

Cultural Distance Calculator Part 3

Today, I want to present network clusters for cultures to compare them with each other.

It is interesting to note that...
1. Folk music traditions are more determined by the maternal lineages (correlation with mtDNA haplogroups).
2. "Mother" languages are more determined by the paternal lineages (correlation with Y-chromosome haplogroups).
3. Apparently, folk story traditions are more determined by the paternal lineages as well.
4. The resulting culture is a mix of these traditions (+ religious beliefs + the history).

1. Network based on Folk music traditions (Pamjav et al.):

2. Language families in Europe:

Y-haplogroups in Europe:

3. Cultural distance network for Folk tales (from Ross et al.)

Cultural distance network for Folk tales (based on Supplementary table S8 from Ross et al. and SplitsTree software):

It's great to know that the cluster networks can be regenerated from the raw distance matrix with SplitsTree. The only differences between the published network and the network I generated are the positions of the "Germans" and "Swedish in Finland", which appear to be more central in the cluster.
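
For anyone who wants to try this, here is a minimal sketch that writes a distance matrix as a NEXUS file, the text format SplitsTree reads. The population labels and distances are hypothetical placeholders (not the values from Ross et al.), `build_nexus` is my own helper name, and the `FORMAT` line reflects my understanding of the SplitsTree4 distances block, so it is worth checking against the SplitsTree manual.

```python
def build_nexus(labels, matrix):
    """Return a NEXUS document with a taxa block and a full square distances block."""
    n = len(labels)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square and match the number of labels")

    # Spaces in labels are replaced so that each label is a single token.
    names = [label.replace(" ", "_") for label in labels]

    lines = ["#NEXUS", "", "BEGIN taxa;", f"  DIMENSIONS ntax={n};", "  TAXLABELS"]
    lines += [f"    {name}" for name in names]
    lines += ["  ;", "END;", ""]
    lines += [
        "BEGIN distances;",
        f"  DIMENSIONS ntax={n};",
        "  FORMAT labels=left diagonal triangle=both;",
        "  MATRIX",
    ]
    for name, row in zip(names, matrix):
        lines.append("    " + name + " " + " ".join(f"{d:.3f}" for d in row))
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical placeholder data, for illustration only.
    labels = ["Pop A", "Pop B", "Pop C"]
    matrix = [
        [0.0, 0.2, 0.5],
        [0.2, 0.0, 0.4],
        [0.5, 0.4, 0.0],
    ]
    with open("folktales.nex", "w", encoding="utf-8") as handle:
        handle.write(build_nexus(labels, matrix))
```

The resulting file can be opened in SplitsTree, which then computes the network from the distances.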

4. Cultural distance network based on Geert Hofstede (Europe only):

Based on Hofstede's parameters for culture, Europe splits into a Northern and a Southern group, with some intermediates.

Cultural distance tree based on Geert Hofstede (Old world only):
Europe: red
Middle East and North Africa: green
Asia: green asparagus
New World "Latin America": brown
New World "English-speaking": grey 
Pacific: pink
Africa: black

