Monday, November 4, 2019

Kurdish mtDNA data XII

Just an update with fully sequenced mtDNA (N=209):


1x B4b1a (Kurd from Turkey) 
1x C (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x C4b (=C4b+A248G, A14566G, T16519C, A249A (not deleted)) (Alevi Kurmanji)
1x C5c1b (also called C5c-C16324T) with 315.1C, G7521A, C9431A, T11465C, C16093T (back mutation) (Derenko et al., 2013; Kurd from Iran)(KC911629 fully sequenced here and here)
2x D (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x D4 T16090C/T, T16223C, T16362C, A73G, A263G, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007)
2x D4 T16223C, T16362C, A73G, A263G, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x F1b1* T204C, 309.1C, 315.1C, 522-, 523-, C3533T, A8718G, G13759A, A15398G,  A16183-,  T16519C (Kurd from Iran; Derenko et al., 2013)(KC911336 fully sequenced here and here)
1x G2a (=G2a+T16172C) (Sorani)
8x H (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
6x H (Kurds from Georgia; Comas et al., 2000)
2x H with T16311C  (Kurds from Georgia; Comas et al., 2000)
1x H CRS (Kurds from Iran; Quintana-Murci et al., 2004)
1x H T16209C, 44.1C, T57C, A93G, A263G, 309.1C, 309.2C, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x H G16129A, C16248T, T195C, A263G, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x H CRS, A263G, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x H with C16218T (=H1ag1a or H1aq1 or H20) (Kurds from Georgia; Comas et al., 2000)
1x H with C16192R, C16261T (similar to JN415470 (Italy-LHON), Achilli, Haplogroup H, 19-AUG-2012) (Kurds from Georgia; Comas et al., 2000)
1x H5'36 (Kurd from Turkey)
1x H5  16051G, 16255A, 16304C, 16319A, 16327T, 263G, 315.1C, 456T (Kurds from Iraq; Al-Zahery et al., 2012)   
1x H5a1 (H5a1+T16304G, A3397G, G5471A) (G5471A usually in H5b) (Sorani)
1x H13a2b2 (Alevi Kurmanji from Dersim)
1x H13c1 (H13c1+T3398C, T10463C, T15394C) (Alevi Kurmanji from Sivas and Erzincan)
1x H13c2 with A153G, 309.1C, 309.2C, 315.1C, 522-, 523-, C15406T (Kurd from Iran; Derenko et al., 2013)(KC911276 fully sequenced here and here)
1x H14a with T16311C, C16256T, T16352C (=H14a +T16311C ) (Kurds from Georgia; Comas et al., 2000)
1x H14b T3197C (=H14b+C4086T, A16265T) (Yezidi) 
1x H15a1 (=H15a1+309.1C, 309.2C, 315.1C, A15316G) (Sorani; mtDNA fully sequenced here and here)
1x H15b (Sorani)
1x H15b T16086C (close to EU600353(Druze) Shlush)(Kurds from Iran; Quintana-Murci et al., 2004)
1x H20 (3/4 Zaza from Bingol, 1/4 (paternal grandmother) Kurmanji from Bitlis)
5x HV (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x HV* CRS (Kurds from Iran; Quintana-Murci et al., 2004)
1x HV* C16174T (Kurds from Iran; Quintana-Murci et al., 2004)
1x HV C16174T, C41T, A214G, A263G, 309.1C, 309.2C, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x HV A73G, T391C, G16153A (Kurmanji from Zakho)
1x HV1 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x HV1a1 C16067T, C16355T, C150T, A263G, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x HV1a1 C16067T, C16355T (=HV1a1) (Kurds from Iran; Quintana-Murci et al., 2004)
1x HV1b2 (=HV1b2+T3398C, G13368A, T16519C) (Yezidi Kurd from Georgia)
1x HV2 C16168T, T16189C, T16217C, C16287T (Kurds from Iran; Quintana-Murci et al., 2004)
1x HV2a1 (T246C, C16214T, A16335G =HV2a1; additional G13708A) (Kurmanji Sunni from Sirnak/Hasankeyf/Nisebin/Mardin/Kamishli)
1x HV14 (T480C, G4655A, T15115C =HV14) (Kurmanji from Diyarbakir (Amed)/Turkey)
1x HV14 T16311C, G4655A, T15115C (Sorani)
1x I (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x I with G16129A, C16223T (Kurds from Georgia; Comas et al., 2000)
1x I1 with 291.1A, 294.1T, 309.1C, 315.1C, T2244C, T7705C, T14757C, T16519C (Kurd from Turkey; Fernandes et al., 2012)(JQ245776 fully sequenced here and here)
1x I1a G16129A, C16168T, T16172C, 16173, C16223T (Kurds from Iran; Quintana-Murci et al., 2004)
1x I1a1d with G16129A, C16223T, T16172C, T16189C, C16083T, C16355T (=I1a1d+C16083T, C16355T) (Kurds from Georgia; Comas et al., 2000)
1x I5a pre-I5a3  because G5231A, A15052G = I5a3 but still C150C, T6278T = I5a; additional   C16301T (Zaza from Dersim)
1x I5a (Zaza from Baltas/Varto, Turkey)
1x I5a (Kurd from Turkey)
1x J* (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
2x J1b (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x J1b C16069T, T16126C, G16145A, C16222T, A16235G, C16261T, T16311C (Kurds from Iran; Quintana-Murci et al., 2004)
1x J1b1b1 T10410A, C16069T, T16126C, G16145A, C16261T, C16290T, A73G, A263G, C271T, C295T, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007)
2x J1b1b1 T10410A, C16069T, T16126C, G16145A, C16261T, A73G, A263G, C295T, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007)
2x J1b1b1 C16069T, T16126C, G16145A, C16261T, T16519C, A73G, A263G, C295T, 309.1C, 315.1C, C462T, T489C, 523DEL, 524DEL (very close to JF939049 (Armenian)) (Kurds from Iraq; Al-Zahery et al., 2012)
3x J1b2 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x J1b3 (Kurd from Turkey)
1x J1b3 T1822C, A8460G, T16311C  (=J1b3+T1822C+T16311C) (Sorani)
1x J1b3b A73G, A263G, C295T, T489C, A750G, A1438G, A2706G, G3010A, T4216C, A4769G, C7028T, G8269A, A8460G, A8860G, A10398G, A11251G, G11719A, A12612G, G13708A, C14766T, A15326G, C15452A, T15530C, C16069T, T16126C, G16145A, C16222T, A16235G, C16261T (1/2 Alevi Kurmanji paternally, 1/2 Sunni Kurmanji maternally from Bingol, Kighi, Turkey)
1x J1b3b ( Kurd from Dersim)
1x J1b4 C16069T, T16126C, G16145A, C16222T, C16261T, C16278T, C16287T (Kurds from Iran; Quintana-Murci et al., 2004)
1x J1c (=J1c+G185T, 4812A, C16290T, T16519C) (Alevi Kurmanji from Dersim)
1x J1c2m (old J1c2a)  C16069T, T16126C, 16148T, A73G, 185A, 228A, A263G, C295T, 315.1C, C462T, T489C, 523DEL, 524DEL  (close to JQ797801 from Romania and  JQ797802 from West-Siberia (Khanty))(Kurds from Iraq; Al-Zahery et al., 2012)  
1x J1d (=J1d+A15218G, T16519C) (Feyli, originally from Iran)
1x J1d (Kurd from Iraq)
1x J1d3b (=J1d3b+10742G, T11353C, 12425G, T16519C) (Kurmanji from Adıyaman and Gaziantep; now living in Konya area)
1x J2a1a1 C16069T, T16126C, G16145A, A16182C, A16183C, T16189C, 16193.1C, T16231C, C16261T, A73G, C150T, C152T, T195C, C198T, A215G, C295T, 315.1C, T319C (Kurds from Iran; Derenko et al., 2007)
1x J2a1a1 (=J2a1a1+A10044G, G11914A, C16264T)  (Kurd from Turkey)
3x J2b (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x J2b1 (Kurd from Iraq/Iran)
1x J2d C16069T, T16126C, C16193T, A73G, C152T, A263G, C295T, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x JT with T16126C, C16067T, T16311C (=JT) (Kurds from Georgia; Comas et al., 2000)
1x K (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x K 16129, T16224C, T16311C (= K1a11 or K2b2) (Kurds from Iran; Quintana-Murci et al., 2004)
1x K with T16224C, T16311C (Kurds from Georgia; Comas et al., 2000)
2x K with T16224C, T16311C, T16093C, C16260T (=K1a1+C16260T, or K1a17a+T16093C)(Kurds from Georgia; Comas et al., 2000)
1x K with T16224C, T16311C, A16240G (=K+A16240G) (Kurds from Georgia; Comas et al., 2000)
1x K with T16224C, T16311C, A16272G (=K+A16272G) (Kurds from Georgia; Comas et al., 2000)
1x K1a T16093C, T16224C, T16311C (Kurds from Iran; Quintana-Murci et al., 2004) 
1x K1a9 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011) 

1x K1b1c with 309.1C, 315.1C, T650C, A5811G, G8545A, T16519C (Kurd from Iran; Derenko et al., 2013)(KC911572 fully sequenced here and here)
1x L3e5 16037G, A16041G, C16223T, A73G, C150T, A263G, 315.1C, T398C, 523DEL, 524DEL (Kurds from Iraq; Al-Zahery et al., 2012)
1x M/C (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x M1a1 G16129A, 16182C, 16183C, T16189C, C16223T, T16249C, T16311C, T16359C, C16360T, T16519C, A73G, T195C, A263G, 309.1C, 309.2C, 315.1C, T489C (Kurds from Iraq; Al-Zahery et al., 2012)
5x N (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x N1b (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x N1b1 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x N1b1 (=N1b1+C16176G, C1703A, C3921A, G7337A, T16519C, probably N1b1a6 because JQ245756 (Kurd from Turkey) has same G7337A mutation, see below) (Alevi Kurmanji from Dersim)
1x N1b1 with C16223T, G16145A, C16176G (=N1b1) (Kurds from Georgia; Comas et al., 2000)
1x N1b1a6 with 309.1C, 315.1C, 522-, 523-, G7337A, G9133A, A14690G (Kurd from Turkey; Fernandes et al., 2012)(JQ245756 fully sequenced here and here)
1x N2a with C16223T, T16086C, G16153A, G16319A (=N2a+T16086C) (Kurds from Georgia; Comas et al., 2000)
1x R (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
2x R0 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x R0 16519C, 16524C, A263G, 315.1C (Kurds from Iraq; Al-Zahery et al., 2012)
1x R0 16368C, 16519C, A263G, 309.1C, 315.1C (Kurds from Iraq; Al-Zahery et al., 2012)
1x R0a(60.1T) 309.1C, 315.1C, 523-, 524-, 573.1C, 573.2C, 573.3C, 573.4C, 573.5C, A7364G, C9027T, G9489A, A10006G, A15505G, C16168T, C16264T, C16295T (Kurd from Turkey; Gandini unpublished)(KP407079 fully sequenced here and here)
1x R2 with C16071T, G16145A, C16234 (=R2 +G16145A+C16234) (Kurds from Georgia; Comas et al., 2000)
1x T* T16126C, T16189C, 16193.1C, T16249C, C16294T, T16304C, A73G, T146C, T195C, A263G, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007)
4x T1 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x T1 (Alevi Kurmanji from Dersim/Turkey) 
1x T1a1'3 T16126C, A16163G, C16186T, T16189C, C16294T, A73G, T152C, T195C, A263G, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x T1a2b with C12633T, G5460A, G11914A, T16311C, T16519C (Feyli)
1x T1a7 T16126C, A16163G, C16186T, T16189C, G16274A, C16294T, T16519C, A73G, A263G, 309.1C, 315.1C, A512G  (close to EU935435(Egypt) Kujanova and JQ798027 (Israel))(Kurds from Iraq; Al-Zahery et al., 2012)
1x T1b T16126C, A16163G, T16189C, T16243C, C16294T, A73G, A263G, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007) 
1x T1b T16126C, A16163G, T16189C, T16243C, A16247G, C16294T, T16519C, A73G, 152C, A263G, 309.1C, 315.1C, 524.1A, 524.2C (Kurds from Iraq; Al-Zahery et al., 2012)
1x T1b3 with A1888G (back mutation), T5492C, T16519C (Pereira et al. 2017; Kurd from Turkey)(KX440310 fully sequenced here and here)
3x T2 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x T2a1b2b with T16126C, C16294T, C16296T, C16256T, A16317G (=T2a1b2b +A16317G) (Kurds from Georgia; Comas et al., 2000)
1x T2b (Kurd from Turkey)
1x T2c1 (Kurd from Iran, originally from Amed)
1x U1a (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x U1a'c  A16182C, A16183C, T16189C, T16249C, A73G, T146C, A263G, C285T, 309.1C (similar to HM852844(Iranian 9) Schoenberg) (Kurds from Iran; Derenko et al., 2007)
1x U1a'c with A16182C, A16183C, T16189C, T16249C (=U1a'c) (Kurds from Georgia; Comas et al., 2000)
1x U1a1 A14070A, T16163C (Zaza)
1x U1a1 G16129A, A16183C, T16189C, 16193.1C, T16224C, T16249C, T16288C, C16295T, A73G, C150T, T195C, A263G, C285T, 309.1C, 309.2C, 315.1C, A385G (Kurds from Iran; Derenko et al., 2007)
1x U1a1 A16183-, 16193.1C, C16193T.2C, T16249C, T72C, A73G, T195C, A263G, C285T, 309.1C, 315.1C, A385G (Kurds from Iran; Derenko et al., 2007)
1x U1a1a (Sorani) with A11467G,  A12308G,  G12372A, C285T,  T12879C,  A13104G, A14070G, G15148A, A15954C, T16249C, C2218T, G14364A, T16189C, G4991A, G6026A, T7581C, A385G, 3158.1T, G3591A, A13422G, G9575A, C2836T, G4659A, 573.1C, 573.2C, A10283G, (309.1C), (315.1C), (523-), (524-), (16182C), (16183C), (16519C) (=U1a1a; shares C2836T, G4659A mutation with Indian samples HM156682(India) Govindaraj) (fully sequenced here and here)
1x U1b C16111T, 16214A, T16249C, G16319A, C16327T, T16519C, A73G, T146C, T152C, A263G, C285T, 315.1C, 572T (Kurds from Iraq; Al-Zahery et al., 2012)
1x U2 (Alevi with Zaza ancestry)
1x U2e1a with A16051G, T152C, A508G, A3720G, A5390G, T5426C, C6045T, T6152C, A10876, T13020C, T13734C, A15907G, G16129C, T16362C, C340T, C11197T, T11732C, G7337A, A15218G, T16311C, T16519C (Sorani from Sulaymaniyah/Iraq)
1x U3 with A16343G (=U3) (Kurds from Georgia; Comas et al., 2000)
1x U3a (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x U3a with C150T, A14139G, T15454C, A2294G, T4703C, G9266A, T6518A (flip!), A10506G, C13934T, G16390A, C2766A, 10790C, G16129A, 16257T, T16519C (Sorani from Sulaymaniyah/Iraq); (similar to HM852895 (Georgian45) Schoenberg, also with C2766A)
1x U3b (=U3b+G3591A, C15946T, A16203G, T16311C, T16356C, C4640T (flip!)) (Kurmanji Sunni from Hakkari/Mardin)
1x U3c with A16343G, C16193T, T16249C (=U3c) (Kurds from Georgia; Comas et al., 2000)
1x U3c C16193T, T16249C, A16343G, G16526A, A73G, C150T, A263G, 315.1C  (close to HM852797(Azeri34) and HM852803(Azeri42) Schoenberg)(Kurds from Iraq; Al-Zahery et al., 2012)   
1x U4 T16356C, T16519C, A73G, T195C, A263G, 309.1C, 315.1C, G499A, 524.1A, 524.2C (Kurds from Iraq; Al-Zahery et al., 2012)
1x U5 T16093C, T16189C, C16270T (Kurds from Iran; Quintana-Murci et al., 2004)
1x U5a1 (Kurmanji from Dohuk)
1x U5a1a (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x U5a1a1 (maternally Kurdish from Ardalan)
4x U7 (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x U7 (Zaza from Turkey)
1x U7 (Zaza from Dersim)
1x U7 (Kurd from Rojhelat, Iran)
1x U7 C16069T, A16227G, C16278T, A16318C, T16359C (very close to HM852853(Turk 187) Schoenberg) (Kurds from Iran; Quintana-Murci et al., 2004)
1x U7 C16192T, A16309G, A16318T (Kurds from Iran; Quintana-Murci et al., 2004)
1x U7 A16309G, A16318T (Kurds from Iran; Quintana-Murci et al., 2004)
1x U7 A16309G, A16318T (Kurds from Georgia; Comas et al., 2000)
1x U7a A16309G, A16318T, T16519C, A73G, C151T, T152C, A263G, 309.1C, 315.1C, 523DEL, 524DEL (Kurds from Iraq; Al-Zahery et al., 2012)
1x U7a2a1 with 315.1C, 522-, 523-, T7870C, T16519C (Kurd from Iran; Derenko et al., 2013)(KC911509 fully sequenced here and here)
1x U7a3a1e with A249G, 315.1C, 522-, 523-,  G16319A, T16519C (Kurd from Iran; Sahakyan et al., 2017)(KY824911 fully sequenced here and here)
1x U7a4 T16126C, C16148T, A16309G, A16318T, A73G, T146C, C150T, C152T, T195C, A263G, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x U7a4 T16126C, C16148T, A16318C, A73G, T146C, C151T, C152T, T195C, A263G, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x U7a4 T16126C, C16148T, A16318T, A73G, T146C, C151T, C152T, T195C, A263G, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x U7a'5 G16129A, A16318T, A73G, C151T, C152T, A263G, 309.1C, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x U7a13 with A16309G (Kurds from Iran; Quintana-Murci et al., 2004)
1x U8b C16111T, T16172C, A16183C, T16189C, T16311C (Kurds from Iran; Quintana-Murci et al., 2004)
1x U8b (Feyli)
1x U8b (Zaza from Sivas, originally from Dersim)
1x U8b1a1 A16066G, G16129A, A16183C, T16189C, C16234T, A73G, G94A, T195C, A263G, 309.1C, 309.2C, 315.1C (Kurds from Iran; Derenko et al., 2007)
1x U8b1a1 A16066G, G16129A, C16169T, A16183C, T16189C, C16234T, T16311C (Kurds from Iran; Quintana-Murci et al., 2004)
1x W (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)
1x W C16223T, C16292T (Kurds from Iran; Quintana-Murci et al., 2004)
1x W3 G16153A (=could be W3a1 because "KF450952(HGDP00243-Pathan) Lippold" and "KJ445931(HGDP00243-Pakistan) Zheng" have same mutation), C16223T, C16292T, C16294T, T16519C, A73G, T152C, A189G, C194T, T195C, T199C, T204C, G207A, A263G, 309.1C, 315.1C (Kurds from Iraq; Al-Zahery et al., 2012)
1x W3a1 with 309.1C, 315.1C, G1709A, C10845T, G16244A, T16368C, T16519C (Kurd from Turkey; Fernandes et al., 2012)(JQ245760 fully sequenced here and here)
1x W3b with 315.1C, T1413C, C1693T, A2258G, G8950A, A11467G, C12063T, C14557T, T14634C, T16519C (Kurd from Turkey; Fernandes et al., 2012)(JQ245757 fully sequenced here and here)
1x W3b T195C, T204C, G207A, T1243C, A3505G, G5460A,   G8251A, G8994A, A11947G, G15884C, C16292T, C194T, T1406C, T199C, G12923T,  315.1C, T4216C, T16519C (Kurdistan, Iran; Olivieri et al, 2013)(KF146279 fully sequenced here and here)
1x W4a C16223T, C16292T, C16286T (Kurds from Iran; Quintana-Murci et al., 2004)
1x W4d with T119C, T146C, 309.1C, 315.1C, 523-, 524-, A6977G, C13011T, G13145A, A15308G, T16519C (Kurd from Turkey; Fernandes et al., 2012)(JQ245758 fully sequenced here and here)
1x W6 with C16292T, C16192T, C16223T,  T16324C (=W6) (Kurds from Georgia; Comas et al., 2000)
1x W6c1a with 309.1C, 315.1C, C16292T, T16325C, T16519C (N91920     Kurdish Gule, ≈1850, Ahlat/Kurdistan-Turkey)
1x W6c1a with 309.1C, 315.1C, C8874T, T16519C (Kurd from Turkey (Tatvan); KF553923 fully sequenced, see here and here; shares 309.1C, 315.1C, C8874T, and T16519C with two Armenians: EU515252 and KX363871)
1x W9 with 309.1C, 315.1C, 522-, 523-, A5128G, G16129A, T16519C (Kurd from Turkey; Fernandes et al., 2012)(JQ245759 fully sequenced here and here)
1x X with T16189C, C16278T, C16186T (=X+C16186T) (Kurds from Georgia; Comas et al., 2000)
1x X2e T16124C, A16182C, A16183C, T16189C, 16193.1C, T16223C, C16278T, T16325C, A73G, A153G, T195C, A263G, 308.1A, 309.1C, 315.1C, C338T (Kurds from Iran; Derenko et al., 2007)
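
To check the sample total (N) or to summarize the list by major haplogroup, the "Nx haplogroup" prefix of each entry can be tallied automatically. The following is only a minimal sketch: the function name `tally` and the grouping rule (leading capital letters of the label, so HV1a1 counts as HV and J1b3 as J) are my own assumptions, and the four sample lines are copied from the list above purely for illustration.

```python
import re
from collections import Counter

# Leading "<count>x <label>" of a listing line, e.g. "2x D (Kurds from ...)".
ENTRY = re.compile(r"^(\d+)x\s+([A-Z][A-Za-z0-9'*/]*)")

def tally(lines):
    """Return (total sample count, Counter of counts per major haplogroup)."""
    per_group = Counter()
    for line in lines:
        m = ENTRY.match(line.strip())
        if not m:
            continue  # blank or wrapped lines carry no count of their own
        count, label = int(m.group(1)), m.group(2)
        # Assumed grouping rule: leading capitals only ("HV1a1" -> "HV", "J1b3" -> "J").
        major = re.match(r"[A-Z]+", label).group(0)
        per_group[major] += count
    return sum(per_group.values()), per_group

# Four entries copied from the list above, for illustration only.
sample = [
    "1x B4b1a (Kurd from Turkey)",
    "2x D (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)",
    "8x H (Kurds from Saqqez, Kurdistan-Iran; Farjadian et al., 2011)",
    "1x HV1a1 C16067T, C16355T (=HV1a1) (Kurds from Iran; Quintana-Murci et al., 2004)",
]
total, groups = tally(sample)
print(total, dict(groups))
```

Running it over the complete list gives the overall N and the per-haplogroup counts in one pass. Entries wrapped over two lines are counted once, through their first line.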

Monday, May 22, 2017

First Human Ancestor Came from Europe Not Africa

Today, a new paper came out questioning the OOA hypothesis.

PLOS ONE article:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0177127

Abstract

The split of our own clade from the Panini is undocumented in the fossil record. To fill this gap we investigated the dentognathic morphology of Graecopithecus freybergi from Pyrgos Vassilissis (Greece) and cf. Graecopithecus sp. from Azmaka (Bulgaria), using new μCT and 3D reconstructions of the two known specimens. Pyrgos Vassilissis and Azmaka are currently dated to the early Messinian at 7.175 Ma and 7.24 Ma. Mainly based on its external preservation and the previously vague dating, Graecopithecus is often referred to as nomen dubium. The examination of its previously unknown dental root and pulp canal morphology confirms the taxonomic distinction from the significantly older northern Greek hominine Ouranopithecus. Furthermore, it shows features that point to a possible phylogenetic affinity with hominins. G. freybergi uniquely shares p4 partial root fusion and a possible canine root reduction with this tribe and therefore, provides intriguing evidence of what could be the oldest known hominin.

...

In this study, we propose based on root morphology a new possible candidate for the hominin clade, Graecopithecus freybergi from Europe. Graecopithecus is known from a single mandible from Pyrgos Vassilissis Amalia (Athens, Greece) [38] and possibly from an isolated upper fourth premolar (P4) from Azmaka in Bulgaria [39] (Fig 1A and 1B). A new age model for the localities Pyrgos Vassilissis and Azmaka, as well as the investigations on the fauna of these localities [40] confirms that European hominids thrived in the early Messinian (Late Miocene, 7.25–6 Ma) and therefore existed in Europe ~ 1.5 Ma later than previously thought [39]. This, and recent discoveries from Çorakyerler (Turkey), and Maragheh (Iran) demonstrate the persistence of Miocene hominids into the Turolian (~8 Ma) in Europe, the eastern Mediterranean, and Western Asia [41, 42].

Newsweek article:
http://www.newsweek.com/first-hominin-europe-east-africa-human-evolution-613494

Monday, May 1, 2017

Lactase persistence gene: MCM6 of Denisovans found in East Asia

Today, I obtained genotype data comprising MCM6 SNPs from HapMap for 2436 individuals from 26 HapMap populations (ESN, GWD, LWK, MSL, YRI, ACB, ASW, CLM, MXL, PEL, PUR, CDX, CHB, CHS, FIN, GBR, IBS, TSI, CEU, BEB, ITU, JPT, KHV, PJL, STU, GIH). Haplotype annotation of each individual can be downloaded here. I focused on the SNPs that show meaningful variation (alternative base at a frequency above 1%). I excluded all SNPs that have varying numbers of bases (insertions/deletions), and I excluded rs4988274 and rs55809728 because they did not show a clear pattern (hypervariability?). I also excluded all haplotypes that were present in only two or fewer individuals. Using the remaining 203 SNPs, I was able to identify 76 haplogroups. Please note that there are many more haplotypes (especially in Africa); I just focused on the most common ones as stated above. Please also note that I changed the nomenclature at this stage for the following three reasons:
1. Some of the populations identified in Enattah et al., 2008 are not present in the HapMap dataset. Thus, some of the identified haplotypes in Enattah et al., 2008 are missing here.
2. Because this time I used more SNPs most of the haplotypes split into several haplotypes:
    ht1=> ht17-30 (blue)
    ht2=> ht31 (turquoise)
    ht3=> ht32 (turquoise)
    ht4=> ht59-73 (orange)
    ht5=> ht74-76 (red)
    ht6=> ht49-57 (green)
    ht13=> ht46-48 (brown)
    ht16=> ht45 (cayenne)

Additionally, I observed some other groups:

ht6-ht116 (violet): mostly present in Africa
ht35-ht38 (salmon): worldwide in low frequencies
ht40-ht44 (yellow): mostly present in Africa

3. I wanted to sort the haplotypes based on the phylogenetic tree.
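
The filtering steps described at the top of this post (keep SNPs with an alternative-allele frequency above 1%, drop insertions/deletions, drop manually excluded SNPs, then drop rare haplotypes) can be sketched roughly as follows. This is not the script I used, only a minimal illustration: the data layout (one 0/1 list per phased chromosome), the function names, and the toy site IDs are all assumptions, and rare haplotypes are counted per chromosome copy rather than per individual to keep the example short.

```python
from collections import Counter

def filter_sites(sites, chroms, min_alt_freq=0.01, excluded_ids=()):
    """Return indices of single-base SNPs whose alternative allele is above min_alt_freq."""
    keep = []
    for j, site in enumerate(sites):
        if site["id"] in excluded_ids:
            continue  # manually excluded (e.g. no clear pattern)
        if len(site["ref"]) != 1 or len(site["alt"]) != 1:
            continue  # insertion/deletion: alleles differ in length
        alt_freq = sum(c[j] for c in chroms) / len(chroms)
        if alt_freq > min_alt_freq:
            keep.append(j)
    return keep

def common_haplotypes(chroms, keep, min_copies=3):
    """Count haplotypes over the kept sites and drop those seen fewer than min_copies times."""
    counts = Counter(tuple(c[j] for j in keep) for c in chroms)
    return {hap: n for hap, n in counts.items() if n >= min_copies}

# Toy data with hypothetical site IDs (0 = reference base, 1 = alternative base).
sites = [
    {"id": "site1", "ref": "G", "alt": "A"},
    {"id": "site2", "ref": "CT", "alt": "C"},  # deletion, will be dropped
    {"id": "site3", "ref": "C", "alt": "T"},
    {"id": "site4", "ref": "A", "alt": "G"},   # excluded by ID below
]
chroms = [
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
keep = filter_sites(sites, chroms, excluded_ids={"site4"})
common = common_haplotypes(chroms, keep)
print(keep, common)
```

In the toy data, the haplotype seen on only one chromosome is discarded, just as haplotypes found in two or fewer individuals were discarded here.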

I also checked the data of Denisova and Neanderthals for these 203 SNPs, and predicted the missing SNPs.

Find below the phylogenetic tree of the 76 MCM6 haplotypes including Denisova and Neanderthals.

SNPs of 76 haplotypes including the Denisova and Neanderthal SNPs (predicted ones in grey) and frequencies of all 76 haplogroups can be downloaded here (with summary at bottom).


Distribution of the main branches of MCM6:





Every branch of the phylogenetic tree of MCM6 tells its migration story. Based on the generated data I will try to postulate the steps of these migrations.


Grey branch (Denisova/Neanderthals/previous and current ht5):
ht5 is derived from the Denisova genome. The frequency of ht5 suggests that the admixture between Denisovans and our ancestors occurred in Asia, which is in accordance with the current state of scientific knowledge.


Green branch (ht49-57; previously ht6):



Africa: ht50 is the ancestor of this green branch, and is only found in individuals with African ancestry (7-16%). ht50 was not part of OOA (Out of Africa). The alternative hypothesis would be that ht50 derived from ht52.
ht51 is derived from ht52 and is only present in Africans and Afro-Americans (2% in Gambians, 0.8% African Ancestry in Southwest US).
OOA: ht52 is derived from ht50 and can be found in Africa (1-4%) and all parts of Eurasia. It is especially common in East Asia (12-21%). ht52 was part of OOA. ht52 split into two groups, one that stayed in Eastern Eurasia and one that moved to Western Eurasia.
Eastern Eurasia: ht53 is derived from ht52 and cannot be found in Africa (no back migration to Africa), but it can be found in Eastern Eurasia, especially among South Asians (2-8%) and East Asians (4-5%), and it made it into Native Americans (11% in Peruvians). Given its very low frequencies in Southern Europe and its absence in Africa, I assume that it is very rare in the Middle East.
Western Eurasia: ht56 is also derived from ht52 and can be found in Western Eurasia. ht56 is the most successful ht among the green branch (ht6) reaching 2-5% in Africa, 2-15% in Americans (probably through European ancestry because lowest level found in Peruvians and highest in Colombians and Puerto Ricans), 1-18% in South Asia, and 10-30% in Europe. It is basically not present in East Asia (0-1%).
ht54, ht55, and ht57 are derived from ht56 and show a similar worldwide distribution, with ht57 being the most successful. They show no migration to Africa.
ht54 can be found in Europe (0-1%) and South Asia (0-1%).
ht55 can be found in Europe (1-2%), South Asia (0-4%), East Asia (0-1%), and Americans (0-1%; probably through European ancestry because found in Colombians and Puerto Ricans but not found in Peruvians)
ht57 can be found in Europe (2-5%), South Asia (0-2%), and Americans (1-5%; probably through European ancestry because lowest level found in Peruvians and highest in Colombians and Puerto Ricans).
ht49 emerged after a crossing-over event between ht56 and ht18, ht20-ht32 (most likely ht29). Thus, its position in the phylogenetic tree is misleading.
For Kurds, I expect ht56 and ht52 to be the most common haplotypes in this branch. If there is an unknown lactase persistence (LP) genotype among Kurds, it is probably a subbranch of ht52 or ht56. Unfortunately, 23andMe data do not help to distinguish between the haplotypes of the green branch (ht6).

Blue branch (ht17-30; previously ht1):

ht21 is the ancestor of this blue branch. ht21 can only be found in Africans (0-3%) and Afro-Americans (0-1%).
ht22 is derived from ht21. ht22 can only be found in Africans (0-3%) and Afro-Americans (0-1%).
ht23 is derived from ht22. ht23 can only be found in Africans (2-5%), Afro-Americans (2-4%), and Americans (0-1%), probably through African ancestry.
OOA: ht29 is derived from ht21. ht29 can be found in all parts of the world: in Africans (2-6%), in native Americans (3-10%; not through European ancestry because highest levels found in Peruvians and Mexicans), in East Asians (24-42%), in South Asians (4-15%), and in Europeans (1-4%).
Five subbranches are derived from ht29: a) ht24, b) ht25, c) ht26, d) ht28, and e) ht30.
a) ht24 is present in individuals from Southern Europe (0-1%) and Afro-Americans (0-1%).
b) ht25 is only present in South Asia (0-1%).
c) ht26 is present in Europe (0-1%) and South Asia (0-1%).
d) ht28 is only present in East Asia (3-12%).
e) ht30 is present in  native Americans (6-12%; probably not through European ancestry because the highest levels are found in Peruvians and Mexicans), in Europeans (1-13%; North-South gradient=British 1% and Toscana 13%), and in South Asians (3-9%).
ht18 emerged after a crossing-over event between ht29 and ht49-53, ht56, or ht57. Similar to ht49 in the green branch (ht6), its position in the phylogenetic tree is misleading.

Turquoise branch (ht31-ht32; previously ht2-ht3):
ht31 (ht2) is derived from ht30. ht31 is very rare but plays a key role for the European lactase persistence (ht3). As mentioned above, I showed that it is present in Kurds (2.5%), Iraqi Jews (4.7%), Pakistanis (2-6%), Kalash (14%), and Arabs from the Middle East (2.5%). Now, I could also confirm its presence in Americans (0-2%; through Lebanese ancestry?), Toscana (1%), and South Asians (1%).
ht32 (ht3) is derived from ht31, and is the most frequent MCM6 genotype reaching 71% in Northern Europe and only 8% in Toscana. ht32 is responsible for European lactase persistence but it can be found elsewhere, too: frequencies of 11-31% in Americans (through European ancestry because the lowest levels are found in Peruvians), and 5-25% in South Asians (Northwest/SouthEast gradient with Punjabis being the highest and Tamils being the lowest). As mentioned above, I showed that it is present in Kurds (5%), Iranians (5%), Ob-Ugrics (5%), Arabs from Iraq, Syria, Lebanon and Palestine (13%), Morocco (17%), Saharawi (23%), and Fulani Sudanese (33%).

Orange branch (ht59-73; previously ht4):

ht61 is the ancestor of the orange branch. ht61 can only be found in South Asia.
ht62 is derived from ht61. ht62 can be found in Africa (1-3%), in Europe (0-2%), and in South Asia (0-3%). In South Asia there is a North/South gradient; in Europe the distribution is less clear.
ht63 is derived from ht62 and is solely African (0-1%).
ht64 is derived from ht63 and is also solely African (1-6%).
ht69 is derived from ht62 and is basically African, too (1-4%). Three subbranches emerged from ht69: a) ht70-71, b) ht65-68, and c) ht74-76 (red branch ht5).
a) ht70 is rare in Africa (0-1%), high among native Americans (5-22%) and South Asians (8-16%). 1-5% of Europeans (North/South gradient) and 8-13% of East Asians have ht70.
ht71 is derived from ht70, and is only present in South Asia (3-5%).
b) ht65-68 are all solely African (together 6-12%).
c) ht74-76 is the red branch (ht5; see below)
ht60 emerged after a crossing-over event between ht18, ht28-30, or ht31 and ht62, hg36-40 or ht69. Similar to ht49 in the green branch (ht6) and ht18 in the blue branch (ht1), its position in the phylogenetic tree is misleading.

Red branch (ht74-76; previously ht5):
ht75 is the ancestor of the red branch and can be found in all parts of the world: 1-4% in Africans, 6-10% in  Americans, 7-10% East Asians, 3-14% in Europeans (North/South gradient), and 7-9% in South Asians. Two subbranches emerged from ht75: a) ht76 and b) ht74 -40
a) ht76 is derived from ht75 and is present especially in Native Americans 3-13% (not through European ancestry because the highest levels are found in Peruvians and Mexicans), it is also present in Europe 0-2%, and South Asians 0-3% but not in Africans and East Asians.
b) ht74 and hg40 are derived from ht75 and are basically only present in South Asians (1-6%) with a clear North/South gradient.  ht72 was a little bit more successful and made it to South Asians (1-3%), East Asians (0-3%), Europeans (1-3%), and especially in Native Americans 1-7% (not through European ancestry because the highest levels are found in Peruvians and Mexicans).

Brown branch (ht46-48; previously ht13):
hg28 is the ancestor of the brown branch and is solely African (0-2%) as well as all of its derived subbranches ht47, hg30, and ht48.
hg27 emerged after a crossing-over event between hg28 and hg20-21 or hg23-25. Similar to ht49 in the green branch (ht6), ht18 in the blue branch (ht1), and ht60 in the orange branch (ht4), its position in the phylogenetic tree is misleading.

The remaining branches (hg1-25) are mostly African. Exceptions are hg5 (see grey branch), hg18-19 (0-6% in East Asians), and hg13 (1-2% in South Asians, 0-1% in Europeans and East Asians).

Yellow branch (ht40-44):

ht43 is the ancestor of the yellow branch. The yellow branch is solely African.

Update May 7th, 2017:
Kurd from anthrogenica.com helped me collect some more data. Thanks!

Scythian Sarmatian: blue branch ht17-31 excluding ht24; however, rs4988186=T (usually in orange/yellow branch ht33-44)
Scythian Pazyr: blue branch ht24
Scythian Aldy: ht30-31 (blue/turquoise branch)
Scythian Volga: blue branch ht18, ht20-ht31, ht34

Sunday, December 11, 2016

Archery in Iran, Horse riding in Iran

Drawings recently discovered near Khomein show, among other engravings, a person on horseback shooting with a bow. According to Dutch archaeologists, some of the engravings are up to 40,000 years old.
Link1, link2.

Courtesy of AFP

Thursday, December 1, 2016

Enattah et al., 2008: Convergence of lactase persistence

Today, I want to dig a little bit more into the genetics of lactase persistence. The goal is still to identify unknown MCM6 haplotypes in Middle Eastern populations that could lead to lactose tolerance and to identify the root of the known MCM6 haplotypes that result in lactase persistence.
Previously, I postulated the following phylogenetic tree for the MCM6 gene (based on 23andMe data):






I used genetic data from Enattah et al., 2008 and identified 9 MCM6 SNPs that are also tested by 23andMe. Based on these 9 SNPs I identified ht1-6, the Saudi-Arabian lactase persistence (ht8), the East African lactase persistence (ht13), and several more. You can download the table here.


In the table above it becomes obvious that the Saudi-Arabian lactase persistence (ht8) is derived from ht5; both share "C" at rs4988243. The European lactase persistence (ht3) is derived from ht2, and the East African lactase persistence (ht15) is derived from ht1.

From the paper, Enattah et al., 2008:
The European T(-13910) and the earlier identified East African G(-13907) LP allele share the same ancestral background and most likely the same history, probably related to the same cattle domestication event. In contrast, the compound Arab allele shows a different, highly divergent ancestral haplotype, suggesting that these two major global LP alleles have arisen independently, the latter perhaps in response to camel milk consumption. These results support the convergent evolution of the LP in diverse populations, most probably reflecting different histories of adaptation to milk culture.
Previously, I showed that Kurds, Iraqi Jews and Pakistanis have ht2. I can now show that ht2 is also present in Arabs from the Middle East and in the Kalash. I determined the frequencies of these haplotypes based on data from Enattah et al., 2008.
I also added the Kurdish data to this table. You can download the table here:



In Eurasia, the vast majority are ht1 to ht6. In Africa, the MCM6 gene is more diverse.
Not surprisingly, Kurds show similarities with Iranians in their MCM6 repertoire and frequencies.

Since these haplotypes are based on just 9 SNPs, they are less reliable, so I decided to expand the number of SNPs gradually. To define further SNPs of the Saudi-Arabian lactase persistence, I asked several Saudi-Arabians to participate in a small survey. I could confirm that the Saudi-Arabian lactase persistence (ht8) is derived from ht5 and correlates with the expected phenotype (= lactose tolerant). I also observed ht13 in one individual and added both haplotypes (ht8 and ht13) to the expanded list of SNPs that can be checked on 23andme. The other haplotypes (ht9-12, ht14-18) are extracted from Enattah et al., 2008.

You can download the table here:

Related post: 
Lactose intolerance: Six MCM6 variants in the Kurdish gene pool / Why 22018A is strongly, but not completely related to lactase persistence 

Lactose Tolerance

Friday, June 12, 2015

Dodecad K12b values of Bronze Age individuals living in Armenia

A new paper in Nature (Allentoft et al., 2015) presenting the genomes of 101 ancient Eurasians is currently fascinating geneticists as well as linguists and historians. Some of the tested Bronze Age individuals lived in present-day Armenia.

Here, I want to use their Dodecad K12b values (shared by Mfa from Corduene.blogspot.com) to calculate the closest reference populations for these individuals. Most of the tested individuals from Armenia are best modeled as 50-75% Iranian (including Kurds) or Lezgin and 25-50% European. In most cases, Kurds and especially Zaza are among the top 10 reference populations for these ancient genomes; see the data below.
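The post does not describe how the Oracle tool computes its rankings, so the following is only a minimal Python sketch of one plausible approach. It assumes Euclidean distance between admixture vectors and a least-squares mixing weight for two-population fits. The population names (PopA, PopB, PopC) and all component values are hypothetical placeholders, not real K12b data.

```python
import math
from itertools import combinations

def distance(a, b):
    """Euclidean distance between two admixture vectors (in percent)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_single(target, refs):
    """Reference populations sorted by distance to the target."""
    return sorted((distance(target, vec), name) for name, vec in refs.items())

def best_weight(target, a, b):
    """Least-squares share w of `a` in the mix w*a + (1-w)*b, clipped to [0, 1]."""
    d = [x - y for x, y in zip(a, b)]
    dd = sum(x * x for x in d)
    if dd == 0:  # identical references: any weight gives the same mix
        return 0.0
    w = sum((t - y) * x for t, y, x in zip(target, b, d)) / dd
    return min(1.0, max(0.0, w))

def rank_two_way(target, refs):
    """All two-population mixes sorted by distance to the target."""
    results = []
    for (name_a, a), (name_b, b) in combinations(refs.items(), 2):
        w = best_weight(target, a, b)
        mix = [w * x + (1 - w) * y for x, y in zip(a, b)]
        results.append((distance(target, mix), w, name_a, name_b))
    return sorted(results)

# Hypothetical three-component vectors (real K12b vectors have 12 components).
refs = {
    "PopA": [60.0, 30.0, 10.0],
    "PopB": [10.0, 20.0, 70.0],
    "PopC": [30.0, 60.0, 10.0],
}
target = [45.0, 27.0, 28.0]  # constructed as 70% PopA + 30% PopB

for dist, name in rank_single(target, refs):
    print(f"{dist:8.4f}  {name}")
for dist, w, name_a, name_b in rank_two_way(target, refs):
    print(f"{dist:8.4f}  {100 * w:.1f}% {name_a} + {100 * (1 - w):.1f}% {name_b}")
```

A small distance in the two-way list only means that the mix reproduces the component proportions well; it does not by itself show that the target descends from those two populations.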

In terms of genetic composition, the Southern Caucasus was more Iranian-like and European-like than it is today. The shift from the Bronze Age to the present could be explained by several Semitic empires and three Semitic religions (Judaism, Christianity and Islam) that expanded from south to north and probably affected local populations of the northern Middle East.

Interestingly, when traveling through Kurdistan, a phenotypic gradient can be seen depending on altitude: populations from remote villages in the higher mountains appear to have lighter skin (and lighter eye color) than populations from towns in the valleys. Kurds also claim that in ancient times their people were more fair-skinned than they are today.

I believe the same pattern (more Iranian-like and European-like than today) will be seen in ancient genomes from Kurdistan. I also believe that the European-like character south of the Caucasus was introduced before the Neolithic expansion; hopefully, ancient DNA will prove this one day. As I mentioned earlier, I believe that R1a and R1b originated in or near Kurdistan; two of the Bronze Age samples from Armenia are R1b (RISE397 and RISE413).


RISE413 MBA (1906 BC): 
TOP reference populations based on Dodecad K12b:

1 19.6 Turkmens_Y
2 20.2 Turkish_Aydin_Ho
3 20.7 Zaza_P
4 21.0 Turkish_Istanbul_Ho
5 21.3 Turkish_Kayseri_Ho
6 21.4 Tajiks_Y
7 21.7 Iranians
8 21.9 Iranian_D
9 22.1 Kurd_D
10 22.5 Kurds_Y

Oracle K12B:

# Distance
1 7.2639 69.1% Iranians + 30.9% Orcadian
2 7.2701 69.2% Iranians + 30.8% Orkney_1KG
3 7.2770 69% Iranians + 31% Irish_D
4 7.3007 68% Iranians + 32% CEU30
5 7.3294 68.1% Iranians + 31.9% English_D
6 7.3327 67.9% Kurd_D___ + 32.1% Argyll_1KG
7 7.3436 66.4% Iranians + 33.6% Mixed_Germanic_D
8 7.3579 68.4% Iranians + 31.6% Cornwall_1KG
9 7.3697 67.9% Iranians + 32.1% Kent_1KG
10 7.4109 68.1% Iranian_D + 31.9% Argyll_1KG

RISE416 MBA (1643 BC):
TOP reference populations based on Dodecad K12b:
# Distance ID
1    16.8    O_Italian_D
2    18.4    TSI30
3    18.5    N_Italian_D
4    18.9    Tuscan
5    19.3    C_Italian_D
6    20.5    North_Italian
7    22.7    S_Italian_Sicilian_D
8    23.2    Greek_D
9    23.4    Sicilian_D
10    24.7    Baleares_1KG

Oracle K12B:

#    Distance   
1    7.2386    75.3% N_Italian_D + 24.7% Balochi
2    7.4475    74.6% N_Italian_D + 25.4% Makrani
3    7.6021    77% N_Italian_D + 23% Brahui
4    7.6188    77% TSI30 + 23% Brahui
5    7.6521    52.4% Lezgins__ + 47.6% Pais_Vasco_1KG
6    7.6804    55% Kurds_Y___ + 45% Pais_Vasco_1KG
7    7.6909    73.3% North_Italian + 26.7% Balochi
8    7.8920    75.5% TSI30 + 24.5% Balochi
9    7.9160    55.2% Iranian_D + 44.8% Pais_Vasco_1KG
10    7.9334    72.6% North_Italian + 27.4% Makrani


RISE423 MBA (1402 BC):
TOP reference populations based on Dodecad K12b:
# Distance ID
1    14.8    Kumyks_Y
2    15.5    Turkish_Istanbul_Ho
3    16.1    Lezgins__
4    16.5    Turkish_Aydin_Ho
5    16.8    Turkish_Kayseri_Ho
6    16.9    Zaza_P___
7    18.5    Turks___
8    18.9    Turkish_D
9    19.0    Kurds_Y___
10    19.1    Chechens_Y

Oracle K12B:
#    Distance   
1    4.6697    73.6% Kurds_Y___ + 26.4% Belorussian
2    4.6825    72.5% Kurds_Y___ + 27.5% Mixed_Slav_D
3    4.7566    72.7% Kurds_Y___ + 27.3% Polish_D
4    4.7988    71.3% Kurds_Y___ + 28.7% Ukranians_Y
5    4.8007    73.2% Kurds_Y___ + 26.8% Russian_D
6    4.9732    75.6% Kurds_Y___ + 24.4% Lithuanian_D
7    4.9897    73.2% Kurd_D___ + 26.8% Belorussian
8    5.0133    76.7% Kurds_Y___ + 23.3% Lithuanians
9    5.0365    70.9% Kurd_D___ + 29.1% Ukranians_Y
10    5.0987    72.1% Kurds_Y___ + 27.9% Mordovians_Y

RISE408 LBA (1209 BC):
TOP reference populations based on Dodecad K12b:
# Distance ID
1    18.0    Lezgins__
2    19.3    Kumyks_Y
3    21.4    Turkish_Istanbul_Ho
4    21.5    Turkish_Aydin_Ho
5    22.0    Chechens_Y
6    22.6    Turkish_Kayseri_Ho
7    22.7    Turkmens_Y
8    23.0    Zaza_P___
9    24.0    Tajiks_Y
10    24.3    Iranian_D

Oracle K12B:

# Distance

1    4.7378    79.2% Lezgins__ + 20.8% French_Basque
2    5.2651    72.7% Lezgins__ + 27.3% Cataluna_1KG
3    5.5218    73.7% Lezgins__ + 26.3% Valencia_1KG
4    5.5712    73.9% Lezgins__ + 26.1% Cantabria_1KG
5    5.5818    74.3% Lezgins__ + 25.7% Aragon_1KG
6    5.6722    73.8% Lezgins__ + 26.2% Castilla_La_Mancha_1KG
7    5.6816    73.3% Lezgins__ + 26.7% Spanish_D
8    5.7200    73% Lezgins__ + 27% Spaniards
9    5.8826    71.1% Lezgins__ + 28.9% French
10    5.9217    71.1% Lezgins__ + 28.9% French_D


RISE412 LBA (1193 BC):
TOP reference populations based on Dodecad K12b:
# Distance ID
1    9.6    Adygei
2    11.2    Balkars_Y
3    11.7    Kumyks_Y
4    11.8    North_Ossetians_Y
5    13.1    Chechens_Y
6    13.2    Turks___
7    13.2    Turkish_D
8    14.1    Armenians
9    14.7    Turkish_Istanbul_Ho
10    15.8    Armenians_15_Y
11    16.2    Armenian_D
12    16.4    Turkish_Kayseri_Ho
13    17.5    Lezgins__
14    18.3    Abhkasians_Y
15    18.5    Georgia_Jews
16    18.7    Zaza_P___

Oracle K12B:

#    Distance   
1    3.6433    77.2% Adygei + 22.8% Sephardic_Jews
2    4.0513    77.5% Adygei + 22.5% S_Italian_Sicilian_D
3    4.0932    76.4% Adygei + 23.6% Ashkenazi_D
4    4.1480    75.4% Adygei + 24.6% Ashkenazy_Jews
5    4.2166    77.9% Adygei + 22.1% Sicilian_D
6    4.2440    80% Adygei + 20% Morocco_Jews
7    4.3430    64.7% Chechens_Y + 35.3% Cypriots
8    4.6281    60.3% Adygei + 39.7% Turkish_D
9    4.8258    75% Adygei + 25% Druze
10    4.9047    75.9% Adygei + 24.1% Lebanese
11    4.9790    84.3% Armenians + 15.7% Lithuanians
12    5.0526    77.3% Adygei + 22.7% Greek_D
13    5.0618    64.2% Abhkasians_Y + 35.8% Bulgarian_D
14    5.0744    82% Armenians + 18% Russian_B
15    5.0939    83.6% Armenians + 16.4% Lithuanian_D
16    5.1290    78.9% Adygei + 21.1% Syrians
17    5.1671    81.7% Armenians + 18.3% Mordovians_Y
18    5.1701    70.9% Balkars_Y + 29.1% Cypriots
19    5.2275    65.7% Abhkasians_Y + 34.3% Romanians
20    5.2709    67.1% Adygei + 32.9% Turkish_Kayseri_Ho


RISE396 LBA (1192 BC):
TOP reference populations based on Dodecad K12b:
# Distance ID
1    12.0    Turkish_Istanbul_Ho
2    12.6    Kumyks_Y
3    13.5    Turkish_Kayseri_Ho
4    14.5    Turkish_Aydin_Ho
5    14.5    Turkish_D
6    14.6    Turks___
7    15.3    Zaza_P___
8    16.1    Lezgins__
9    17.1    Chechens_Y
10    18.1    Kurds_Y___

Oracle K12B:
#    Distance
1    4.2060    58.2% Lezgins__ + 41.8% Greek_D
2    4.4492    61.5% Lezgins__ + 38.5% Sicilian_D
3    4.5387    60.3% Lezgins__ + 39.7% Ashkenazi_D
4    4.7105    63.8% Lezgins__ + 36.2% C_Italian_D
5    4.7696    58.9% Lezgins__ + 41.1% Ashkenazy_Jews
6    5.2481    63.3% Lezgins__ + 36.7% O_Italian_D
7    5.4614    66.2% Lezgins__ + 33.8% Tuscan
8    5.5673    67% Lezgins__ + 33% TSI30
9    5.8915    62.9% Lezgins__ + 37.1% Sephardic_Jews
10    6.0393    58% Kurds_Y___ + 42% Bulgarian_D

RISE407 LBA (1115 BC):
TOP reference populations based on Dodecad K12b:
# Distance ID
1    19.1    Lezgins__
2    19.6    Turkmens_Y
3    20.0    Tajiks_Y
4    20.7    Kumyks_Y
5    21.3    Zaza_P___
6    21.6    Turkish_Aydin_Ho
7    21.7    Turkish_Istanbul_Ho
8    22.0    Iranian_D
9    22.5    Kurd_D___
10    22.5    Kurds_Y___

Oracle K12B:
#    Distance
1    9.0414    77.7% Lezgins__ + 22.3% Pais_Vasco_1KG
2    9.1657    70.8% Iranian_D + 29.2% Norwegian_D
3    9.4091    70.2% Kurd_D___ + 29.8% Swedish_D
4    9.4630    79.2% Lezgins__ + 20.8% French_Basque
5    9.5599    70.4% Kurd_D___ + 29.6% Norwegian_D
6    9.6359    70.1% Kurds_Y___ + 29.9% Swedish_D
7    9.6374    71.4% Lezgins__ + 28.6% Portuguese_D
8    9.6513    72.7% Lezgins__ + 27.3% Cataluna_1KG
9    9.6726    71.2% Lezgins__ + 28.8% Extremadura_1KG
10    9.6800    73.6% Lezgins__ + 26.4% Castilla_La_Mancha_1KG


Tuesday, February 24, 2015

Lactose intolerance: Six MCM6 variants in the Kurdish gene pool / Why 22018A is strongly, but not completely related to lactase persistence

In a previous post I discussed the possibility of unknown MCM6 haplotypes in Middle Eastern populations that could lead to lactose tolerance. I wrote:
Not all lactose tolerance SNPs are known, several SNPs resulting in lactose tolerance still need to be discovered.

Today, I want to present six haplotypes (ht1-6) that I found in the Kurdish gene pool.

ht3 represents the "European" haplotype, which correlates with lactose tolerance. Individuals with "CT" or "TT" at rs4988235 (also called C/T-13910) are genetically lactose tolerant; at 23andme the same genotypes appear as "AG" or "AA", because the MCM6 gene lies on the minus strand while 23andme reports alleles on the plus strand.
Interestingly, 13910T (rs4988235: "A" at 23andme) and 22018A (rs182549: "T" at 23andme) are strongly linked with one another, at least in ht3, the "European" haplotype. However, this is not always the case. There is some literature about 22018A, see here, here, here, and here.
ht3 itself is derived from ht2, which carries only the 22018A mutation. ht2 is fairly rare and does not provide lactase persistence, which explains why 22018A is strongly, but not completely, related to lactase persistence.
The origin of the "European" lactase persistence allele 13910T (rs4988235: "A" at 23andme) can be found in populations where the frequency of 22018A (ht2 + ht3) is higher than that of 13910T (ht3).
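The strand difference described above can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes only that converting between the two notations means replacing each base with its complement.

```python
# Base pairing used to translate a genotype from one strand to the other.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def to_opposite_strand(genotype):
    """Complement every base of an unphased genotype such as 'CT'.

    The result is sorted alphabetically so that 'CT' and 'TC' give the same answer.
    """
    return "".join(sorted(COMPLEMENT[base] for base in genotype.upper()))

def carries_13910T(genotype_23andme):
    """True if a 23andme-style rs4988235 genotype contains 'A' (= 13910T)."""
    return "A" in genotype_23andme.upper()

# Genotypes at rs4988235 in gene-strand notation (C/T-13910).
for gene_strand in ("CC", "CT", "TT"):
    reported = to_opposite_strand(gene_strand)
    print(gene_strand, "->", reported, "| carries 13910T:", carries_13910T(reported))
```

The same conversion applies to rs182549, where 22018A on the gene strand corresponds to "T" in the 23andme notation.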

In Kurds (N=20) the frequency of 22018A is 8% (ht2 + ht3) and the frequency of 13910T is 5% (ht3). Both ht2 and ht3 are derived from ht1, which has a frequency of 20% among Kurds.

Raz et al., 2013 found that Iraqi Jews (N=96) have a frequency of 8.3% for 22018A (ht2 + ht3) and 3.6% for 13910T (ht3), which fits quite well with the Kurdish results.

Find below the frequency of the six MCM6 haplotypes in the Kurdish gene pool:


ht1 ht2 ht3 ht4 ht5 ht6
20% 3% 5% 15% 10% 48%
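As a small illustration of how the allele frequencies quoted in this post follow from the haplotype frequencies, here is a Python sketch. The haplotype percentages are the ones listed above, and the allele states (23andme notation) are taken from the haplotype table in this post; the variable and function names are hypothetical.

```python
# Haplotype frequencies in Kurds, in percent, as listed in this post.
# They are rounded values, so they need not sum to exactly 100.
HAPLOTYPE_FREQ = {"ht1": 20, "ht2": 3, "ht3": 5, "ht4": 15, "ht5": 10, "ht6": 48}

# Allele carried by each haplotype at the two SNPs of interest (23andme notation).
ALLELES = {
    "rs182549":  {"ht1": "C", "ht2": "T", "ht3": "T", "ht4": "C", "ht5": "C", "ht6": "C"},
    "rs4988235": {"ht1": "G", "ht2": "G", "ht3": "A", "ht4": "G", "ht5": "G", "ht6": "G"},
}

def allele_frequency(snp, allele):
    """Sum the frequencies of all haplotypes that carry `allele` at `snp`."""
    return sum(freq for ht, freq in HAPLOTYPE_FREQ.items() if ALLELES[snp][ht] == allele)

freq_22018A = allele_frequency("rs182549", "T")   # 22018A is "T" at 23andme
freq_13910T = allele_frequency("rs4988235", "A")  # 13910T is "A" at 23andme
print("22018A:", freq_22018A, "%")
print("13910T:", freq_13910T, "%")
print("22018A without 13910T (ht2):", freq_22018A - freq_13910T, "%")
```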



Find below the six haplotypes (ht1 - ht6) identified in the Kurdish gene pool. ht7 was identified in the Denisova genome; the coverage of the Neanderthal genome was too low to identify the haplotype.

SNP ht1 ht2 ht3 ht4 ht5 ht6 ht7
rs4988283 C C C C C C C
rs4988262 C C C C C C T
rs2082730 T T T T T T T
rs4988243 T T T T C T T
rs4954492 A A A A A A A
rs41525747 G G G G G G G
rs4988236 G G G G G G G
rs4988235 G G A G G G G
rs41380347 A A A A A A A
rs4988234 C C C C C C C
rs4988233 G G G G G G G
rs2304369 G G G G G G G
rs4988232 G G G G G G G
rs4988226 G G G A A A A
rs309180 A A A G G G G
rs309181 G G G C C C C
rs3213871 C C C T T C C
rs4988203 C C C C C C T
rs182549 C T T C C C C
rs4988199 A A A A A A A
rs4988189 T T T T T T T
rs4988186 G G G G G G G
rs309176 C C C T T T T
rs3087343 T T T T T G T
rs4988177 T T T T T T T
rs3087353 C C C C C C T
rs2289049 G G G G G G A
rs2070068 G G G G G G A
rs1435577 C C C C C G C
rs3769001 A A A A A G A
rs1057031 G G G G G A A
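The relationships between the haplotypes can be read off the table above by counting the SNPs at which two haplotypes differ. The following Python sketch does this for the variable SNPs only (rows that are identical in all seven haplotypes are omitted). It is just a difference count, not the method used to draw the tree, and the names are hypothetical.

```python
from itertools import combinations

HAPLOTYPES = ["ht1", "ht2", "ht3", "ht4", "ht5", "ht6", "ht7"]

# Variable SNPs from the table above; each string lists the alleles of ht1..ht7.
VARIABLE_SNPS = {
    "rs4988262": "CCCCCCT",
    "rs4988243": "TTTTCTT",
    "rs4988235": "GGAGGGG",
    "rs4988226": "GGGAAAA",
    "rs309180":  "AAAGGGG",
    "rs309181":  "GGGCCCC",
    "rs3213871": "CCCTTCC",
    "rs4988203": "CCCCCCT",
    "rs182549":  "CTTCCCC",
    "rs309176":  "CCCTTTT",
    "rs3087343": "TTTTTGT",
    "rs3087353": "CCCCCCT",
    "rs2289049": "GGGGGGA",
    "rs2070068": "GGGGGGA",
    "rs1435577": "CCCCCGC",
    "rs3769001": "AAAAAGA",
    "rs1057031": "GGGGGAA",
}

def differing_snps(ht_a, ht_b):
    """Return the SNPs at which two haplotypes carry different alleles."""
    i, j = HAPLOTYPES.index(ht_a), HAPLOTYPES.index(ht_b)
    return [snp for snp, alleles in VARIABLE_SNPS.items() if alleles[i] != alleles[j]]

# Print the number of differences for every pair of haplotypes.
for ht_a, ht_b in combinations(HAPLOTYPES, 2):
    print(ht_a, ht_b, len(differing_snps(ht_a, ht_b)))
```

Pairs that differ at a single SNP, such as ht1/ht2 (rs182549) and ht2/ht3 (rs4988235), are the natural candidates for direct ancestor-descendant links.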

Find below the phylogenetic tree of MCM6 haplotypes.



Coelho et al., 2005 predicted that the 22018A mutation occurred prior to the 13910T mutation.

Bersaglieri et al., 2004 provided the frequency of both 22018A and 13910T in various populations showing that the frequency of 22018A (ht2 + ht3) is higher than 13910T (ht3) in Pakistan (excluding Kalash).
Summary:
ht2 (22018A only) is the ancestor of ht3 (22018A + 13910T). ht2 can be found among Kurds, Iraqi Jews and Pakistanis.

I suspect that lactase persistence is also present among a subgroup of ht6.