This post is part of a draft on South Siberian language homelands and Sprachbünde.
At least three major genetic changes have been described to date involving the Lena and Kolyma regions in East Siberia, and all are probably associated with some of the archaeological and linguistic developments that led to the known Early Modern distribution of languages in the Russian Far East and in North America.
The following is a tentative description of such intertwined linguistic-archaeological-genetic developments, based on the few available data from each field. For this guesswork, genetic-archaeological results are offered first, followed by plausible linguistic-archaeological-genetic associations.
- A Summary of NE Siberian Population History
- Dene-Yeniseian and Na-Dene
- Eskimo-Aleut
- Chukotko-Kamchatkan
PLEASE NOTE. Many of the Y-SNP calls from ancient samples referred to below have been analyzed by the FamilyTreeDNA Haplotree team formed by phylogeneticist Michael Sager and Göran Runfeldt from the R&D team. Those ancient samples with validated haplogroup inferences are marked by a hyperlink to the FTDNA Haplotree. Occasionally, though, such hyperlinks are also used in the text when discussing Y-SNP branches in general, without referring to specific ancient samples. For a quick reference of ancient samples, you can check out the Ancient DNA Dataset, also visually in an Online Web Map, in SNP Tracker, or in AncientDNA.info.
1. A Summary of NE Siberian Population History
1. The Ancient North Siberian (ANS) ancestry (represented by the 38,000-year-old Upper Palaeolithic Yana individuals) received gene flow from a post-LGM ancestry – whose best current proxy is 15,000-year-old Khaiyrgas-1 – forming the ancient Palaeo-Siberian (PS) ancestry found in a ca. 7800 BC individual from the Duvanny Yar site, Kolyma River, belonging to the Sunnagin cultural complex. This Kolyma1 sample, of Y-DNA hg. Q1a-pre-Z36017, is in turn the closest representative of northern Native American (NNA) ancestry, as found in the Alaskan Shuhá Káa individual from On Your Knees Cave, of hg. Q1b-L53.
NOTE. That gene flow forming Proto-American lineages was predicted in Sikora et al. (2019) as a Devil’s Gate-related influx over a mainly ANS-related population. This claim has been refined in the preprint by Ning et al. (2020) based on their new Houtaomuga transect, which shows that Early Neolithic populations (ca. 5600 BC) from the Amur River Basin (ARB) share a similar ancestry with Kolyma1, and that they contributed to the ancestry of a newly reported Upward Sun River 1 (USR1) individual from the Denali tradition. USR1 possibly represents a newly described Ancient Beringian ancestry group, which might complicate the previous discussion about ancient intrusions through the Bering Strait (see below).
Nevertheless, their tentative assessments of succeeding population (and associated linguistic) movements in North-East Siberia and North America – including Syalakh, Bel’kachi, and Ymyyakhtakh cultures – were based on two Magadan Bronze Age samples and later Iron Age / early medieval individuals. The more recent Circum-Baikal transect from Kılınç et al. (2021) offers much closer proxies for these population movements, and their interpretations supersede those from Ning’s preprint, which need to be reviewed accordingly (although Kılınç et al. could not take into account the findings of Ning et al.). The single reported Houtaomuga Early Neolithic Y-DNA hg., N1b1, also suggests that their relationship is not so close as inferred from formal stats alone.
2. A Yakutia-Lena Neolithic ancestry is represented by a ca. 4800 BC Syalakh-related Matta-1, and by a ca. 4200 BC Bel’kachi-related Onnyos-1. This ancestry is intermediate between Khaiyrgas-1 and the later Yakutia-Lena-Kolyma Bronze Age ancestry, and corresponds to the previously described Palaeo-Eskimo ancestry, found widely distributed among North American and Russian Far Eastern populations (see below). The most recent Neolithic representative of the two, Onnyos-1, shows a basal Y-DNA hg. Q1b-YP4010.
3. A Yakutia-Kolyma Bronze Age ancestry is represented first by two 2500-2000 BC Ymyyakhtakh samples recovered from the same grave at Kyordyughen 1: 40/50-year-old Ind 1, N4a1 (ca. 2600 BC), of basal Y-DNA hg. N-CTS9239 (probably N1a-pre-M2019); and the upper Ind 2, N4b2 (ca. 2250 BC), of a basal hg. N1a-L1026, splitting the previously defined N-L1026 trunk. They are currently the most ancient representatives of the so-called Neo-Siberian (NS) ancestry, which seems not particularly close to two previously reported female Magadan BA samples from Ol’skaya (ca. 1100 BC), who display a probable recent admixture with Palaeo-Siberians.
2. Dene-Yeniseian and Na-Dene
The Dene-Yeniseian hypothesis regards the Ket language spoken in the Yenisei River Basin as genetically related to the widespread Na-Dene language family in North America. Na-Dene comprises Tlingit and the recently extinct Eyak in Alaska, along with over thirty Athabaskan languages spoken from the western North American Subarctic to pockets in California (Hupa), Oregon (Tolowa) and the American Southwest (Navajo, Apache) (Krauss 1976). Pre-Proto-Na-Dene is believed to have spread from Alaska ca. 3000-2500 BC.
Sampled Ancient Athabaskans from Alaska (ca. AD 1200) show a 30-40% contribution from Palaeo-Eskimo ancestry – complementing the pre-existing ancestry of Northern First Peoples – in an admixture event estimated to have happened roughly during the formation of the Proto-Na-Dene community. To complicate things, these two Ancient Athabaskan samples (together with a 19th-century one of hg. Q1b-Y4276) suggest a Y-chromosome bottleneck under Q1b-FGC8436 lineages, in common with an ancient sample (ca. AD 880) from a likely Uto-Aztecan-speaking population from San Nicolas Island, in California.
Their closest patrilineal relatives are represented in ancient DNA by Eskimo-Aleut-speaking early medieval samples from Beringia (ca. AD 700-1000), of hg. Q1b-Z35703, under the same Q-Y4303 branch. Their common connection with parent Q1b-M3, the most widespread Proto-American lineage (and found almost exclusively in that continent), further dilutes any potential patrilineal connection of Ancient Athabaskans with the Syalakh and Bel’kachi cultures, traditionally believed to be the ultimate Siberian vectors of Pre-Proto-Na-Dene.
Still, the finding of Bel’kachi-related hg. Q1b-YP4010 in a 2,000-year-old North American sample from Lovelock Cave, Nevada, is possibly directly linked to the Southern Athabascan expansion, supporting that some Cis-Baikal LN patrilines survived among ancient Na-Dene speakers. Subclades of hg. Q1b-YP4010 shown by Onnyos-1 are later found widespread among Cis-Baikal Late Neolithic and Early Bronze Age individuals, most of them attributed to the Glazkovo culture. In fact, their ancestry is shared by Cis-Baikal LN/EBA individuals featuring – among others – hg. Q1b-Y11938, a haplogroup shared with the few sampled modern Kets.
The population movement represented by Palaeo-Eskimo ancestry is thus probably the most relevant for a hypothetical Dene-Yeniseian connection before the (Pre-)Proto-Na-Dene expansion and eventual admixture with North American First Peoples, since Baikal LN/EBA samples show both Y-DNA lineages – Q1b-Y11938 closely related to modern Kets, and Q1b-YP4010 linked to the Palaeo-Eskimo Syalakh/Bel’kachi-related expansions.
The ancestor of Common Yeniseian (dated earlier than ca. 1000 BC), Proto-Yeniseic, can be dated to a considerably earlier period (possibly ca. 3000-2000 BC), and Na-Dene to a roughly similar time (ca. 3500-2500 BC), which – based on the innovations of the latter – allows for a Dene-Yeniseian split ca. 7000-5000 BC (cf. Vajda in Flegontov et al. 2017). The Baikal LN/EBA-related split in population genomics is visible ca. 7,000 years ago, showing that a Na-Dene – Yeniseian connection is not far-fetched, either in terms of reconstructible languages or of a tight link in population genomics.
NOTE. For comparison, guesstimates for a reconstructible Indo-Anatolian are ca. 7,000-6,500 ybp, which based on the developments of Khvalynsk implies a potentially much earlier Early PIE achievable through internal reconstruction alone.
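As a rough arithmetic check on the Dene-Yeniseian chronology discussed above, the genomic split quoted in "years ago" can be put on the same BC scale as the linguistic guesstimate. The following is only a minimal sketch: the function names are hypothetical, and it assumes that "years ago" is counted from roughly AD 2000 and that the missing year zero can be ignored at this level of precision.

```python
def years_ago_to_bc(years_ago, reference_year_ad=2000):
    """Approximate BC date for a 'years ago' estimate.

    Assumption: the estimate is counted back from reference_year_ad,
    and the absence of a year zero is ignored (negligible here).
    """
    return years_ago - reference_year_ad


def within_bc_range(date_bc, oldest_bc, youngest_bc):
    """True if a BC date falls inside an inclusive BC range (oldest >= youngest)."""
    return youngest_bc <= date_bc <= oldest_bc


# Figures quoted in the text
genomic_split_bc = years_ago_to_bc(7000)      # Baikal LN/EBA-related split
linguistic_split_range_bc = (7000, 5000)      # Dene-Yeniseian split guesstimate

compatible = within_bc_range(genomic_split_bc, *linguistic_split_range_bc)
print(genomic_split_bc, compatible)
```

Under these assumptions the genomic estimate falls at the younger edge of the linguistic range, so the two are compatible only in a loose sense, which is all that such guesstimates allow.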
Despite the lack of direct samples from the relevant cultural groups, Vajda (in Flegontov et al. 2017, from the Reich Lab) believes that the Arctic Small Tool tradition (ASTt) is the most likely vector of Na-Dene, and that later steps of the linguistic expansion are likely connected to the spread of “Paleo-Eskimo” groups that brought other elements of North Asian material culture and folklore (Alekseenko 1995; Berezkin 2015), like the bow and arrow technology, thought to have been introduced into California 1,500 years ago by the ancestors of the Hupa and other Pacific Coast Athabaskans (Golla 2011:245).
Nevertheless, based on the shared ancestry among Northern Pacific groups and the highly variable linguistic guesstimates, it is still possible that the arrival of Proto-Na-Dene was linked to the formation of the Northern Archaic people, as previously proposed (e.g. Esdale 2008, Potter 2008). After all, their Northern Archaic tradition (ca. 5000-4500 BC) probably involved a mixture of Syalakh/Bel’kachi-related population with back-migrating peoples bringing Archaic Cultural Diffusion to Alaska, which would justify the presence of Q1b-M3 among early Athabascans. Further, the role of the recently described USR1-related Ancient Beringian population in these cultural and ethnolinguistic developments is unclear.
NOTE. Indeed, there is not sufficient data to discard new waves of Q1b-M3 from North-Eastern Siberia to North America. For an interesting but light and illustrated read on potential population movements through Alaska, check e.g. Tremayne (2019).
3. Eskimo-Aleut
Eskimo-Aleut consists of a branch containing the closely related Eskimoan languages (Yup’ik, Iñupiaq, etc.), which split probably ca. 500 BC, and the more divergent Aleut branch. The latter shows possible signs of substrate admixture, or at least of rapid phonological and morphological change (Fortescue 1998: 35-37), which could make the estimated separation from the proto-language trunk ca. 2000 BC (Krauss 1980:7) appear older than it actually is.
Based on those rough linguistic guesstimates, it has been proposed that the original Palaeo-Eskimo founding population spoke Proto-Eskimo-Aleut (Fortescue 2017). Traditionally, the ASTt has been considered closely related to the expansion of the ancestors of modern Eskimo people, due to the partial cultural continuity in archaeology from Palaeo-Eskimos to Neo-Eskimos. On the other hand, the genetic evidence is not so clear, and continuity in subsistence economy and culture is to be expected among Palaeo-Arctic hunter-fishers, as is commonly found in Northern Eurasia, too.
While there is no ASTt sample from Alaska, an ASTt individual from the Palaeo-Inuit Saqqaq cultural complex in Greenland shows affinities with Russian Far Eastern populations (Rasmussen et al. 2010), which can be more precisely described today as derived ca. 90% from the Yakutia-Lena Neolithic cluster (and 10% from West Eurasians). Furthermore, there seems to be a recent replacement event marked by the spread of the Northern Maritime tradition starting 2,000 years ago, which morphed into the Thule tradition ca. 1,000 years ago, spreading rapidly eastwards in the centuries after that.
Partially supporting the traditional picture of continuity from Palaeo-Eskimos to Neo-Eskimos, i.e. the back-and-forth population movements of distantly related ‘Proto-Palaeo-Eskimo’ populations despite the clear intrusion of a new ancestry from the west, is the presence of Q1a-Z36017 lineages derived from Kolyma1 in the ancient Saqqaq sample (ca. 2220-1650 BC), in multiple 2,000-1,000-year-old Beringia Iron Age and early medieval individuals, and also in 1,000-year-old Late Dorset and Thule samples from NE Canada.
Despite the closely related Palaeo-Eskimo and Neo-Eskimo ancestries, the spread of Palaeo-Eskimo ancestry could have accompanied different cultures, and indeed different populations sharing the same simplistically described cultural traits. It therefore cannot be ruled out that the Saqqaq individual merely represented the latest bottleneck in the stepped migration of Bel’kachi-related populations that – at least initially – spoke (Pre-)Proto-Na-Dene. In any case, the population directly ancestral to Yupik- and Inuit-speaking groups probably crossed the Bering Strait some 2,000 years ago.
4. Chukotko-Kamchatkan
The Chukotko-Kamchatkan family consists of two divergent branches, Chukotian and Itelmen (i.e. Kamchatkan), whose genetic relationship is generally accepted. Whereas Chukotian is easily reconstructible based on synchronic Chukotian languages (Kassian 2020), 18th and 19th c. data on extinct Eastern and Southern Itelmen languages are neither very reliable nor consistent. In any case, Proto-Chukotko-Kamchatkan phonological reconstruction is almost impossible (Kassian 2020).
Chukotko-Kamchatkan speakers show the closest affinity to Palaeo-Eskimos among modern populations, with the split from other present-day Siberian populations happening ca. 4300 BC, and from Saqqaq estimated ca. 4400-2400 BC. The split with Eskimo-Aleut is estimated to have occurred ca. 4200-2900 BC, due to their admixture with a group related to Northern Athabascans (Flegontov et al. Nature 2019).
Nevertheless, the genetic history of Chukotko-Kamchatkan involves a likely gene backflow from Neo-Eskimos who carried Palaeo-Eskimo and First Peoples ancestry (cf. Flegontov et al. Nature 2019, Ning et al. bioRxiv 2020).
In fact, ca. 1,500-year-old (Neo-Eskimo) Uelen IA and Ekven IA samples show marked Y-chromosome bottlenecks under Q1a-Z36017 lineages, also shown by the Saqqaq individual, probably all corresponding to a NE Siberian ‘refuge’ of groups ancestrally (patrilineally) related to Kolyma1 rather than Syalakh, Bel’kachi, or Ymyyakhtakh, which suggests that the spread of ‘Proto-Palaeo-Eskimo’ (PPE) ancestry (including the one found among modern Chukotko-Kamchatkans) – like that of EEF, Steppe, Iran_N, etc. – was later hijacked under bottlenecks of different “local” (Lena-Kolyma?) groups that spread as distinct ethnolinguistic communities.
Further difficulties assessing ethnolinguistic identities of ancient groups are offered by the potential ancestral relationship of Chukotko-Kamchatkan with Nivkh. Both are polysynthetic languages with verb structures that display typological affinities with certain languages of North America, which are nevertheless absent from the (geographically) intermediate Na-Dene and Eskimo-Aleut families. Fortescue (2017) has offered some preliminary evidence of lexical cognates, grammatical homologies and some potential systematic sound correspondences. More hypothesized contacts or genetic relationships with language isolates from North America have been proposed to date, without much success (see Vajda in Ning et al. 2020).
To complicate things even more, there is a potential Pre-Tanaina substrate formed by 13 words, for 8 of which Kassian (in Ning et al. 2020) finds a potential Chukotko-Kamchatkan phonetic (and sometimes semantic) match. The two most promising are miɬni, piɬni, vinɬni ‘water’ ~ Proto-Chukotian *mi-məl (partial reduplication), and (ǝ)ɬtʰuʁ ‘eye’ ~ Chukotko-Kamchatkan *lV ‘eye’, reduplicated pl. *lV-lV (→ Chukotian *lǝ-lä ‘eye’, Itelmen *lo- ‘eye’, pl. *lu-l-). In both cases, as in all other described ones, the evidence is scarce and unpromising, although the long chronological distance between Chukotko-Kamchatkan and this potentially related branch is akin to, in Kassian’s words, the separation of modern Finnish and Proto-Balto-Slavic.
NOTE. On the other hand, Kassian finds no reliable cognates of the hypothetical Pre-Tanaina substrate with Proto-Nivkh, which would question the relevance of typological assessments of Northern Pacific languages.
In any case, the most parsimonious explanation for the current genetic picture, combined with the available (admittedly scarce) linguistic and archaeological descriptions, is that the Ymyyakhtakh cultural horizon – marked by the expansion of Neo-Siberian ancestry – was the initial vector of (Pre-)Proto-Chukotko-Kamchatkan originating in the Circum-Baikal area, based on its noticeable impact among the scant Far East Siberian ancient and modern DNA samples.
Ymyyakhtakh cultural horizon
The large and long-lasting Ymyyakhtakh cultural horizon (ca. 2200-1300 BC) has its roots in the Cis-Baikal area along the Lena and Yenisei River basins, and is marked by characteristic round-bottomed ceramics with wafer and ridge prints, armour plates, as well as stone and bone arrowheads, spears, and harpoons.
The culture spread quickly through East Siberia up to the Chukotka peninsula in the east, but also westward through the Taymyr Peninsula, Bolshezemelskaya and Malozemelskaya tundra, reaching the Kola Peninsula in North-Eastern Europe. Despite its replacement to the west of the Lena during the LBA, similar cultural traits survived with slight changes in the Russian Far East until the first centuries AD.
As for the two Ymyyakhtakh samples from the grave at Kyordyughen 1 (see above), the majority of skeletal remains from the researched grave belong to Ind 1, whereas only incomplete remains of Ind 2 were found – including the replaced right femur of Ind 1. Ind 2 is thus interpreted as possibly dismembered in a sacrifice, a treatment typical of captives. Further dates from the burial of these hunter-gatherers – assuming they were roughly simultaneous – suggest that the true date might lie closer to the end of the 3rd millennium BC.
The reinforced shield and armor – consisting of plates made of antler – found with Ind 1 suggest that this was a warrior or military leader of the Late Neolithic, bearing witness to the development of military art in the second half of the 3rd millennium BC. The orientation of the body with feet directed downstream to the Lena River is interpreted by the archaeologists as reflecting a belief in the river as a path to the land of the dead, a tradition that was apparently also followed by the preceding Bel’kachi ritual, such as that seen in the Onnyos burial (Stepanov et al. 2012).
In the findings about the phylogeny of haplogroup N reported in Ilumäe et al. AJHG (2016), modern Chukchi and Koryaks show a strong Y-DNA bottleneck under hg. N-B202, a subclade not found in other populations (see maps of haplogroup N distribution). This subclade finds its closest relatives in the N-P89 spread with Avars and Mongols, and further upstream in the N-CTS2929/VL29 of the Iron Age Baltic region. Also, the ancestry of modern Chukotko-Kamchatkans lies in a cline formed between the Ymyyakhtakh individuals – close to the previously published Magadan BA – reaching up to Ancient Alaskan populations.
Further, the different ancestry and haplogroup found in the Yana Young individual (ca. AD 1190), a likely medieval Yukaghir or Yakut speaker from the Kolyma region, seem to support that Tundra (i.e. Lower Kolyma) Yukaghirs – radically different in ancestry from Taiga Yukaghirs – are representatives of acculturated ancient Chukotko-Kamchatkan speakers, or alternatively of a Yukaghir population heavily admixed with locals. The opposite example is found in the 18th century Chukotian-assimilated Chuvantsy, in turn likely representing ancestrally acculturated Chukotko-Kamchatkan speakers. All this bears witness to the complex population and linguistic replacement events among East Siberian populations, rarely amenable to simplistic interpretations, whether in terms of genome-wide ancestry or linguistic typology.
Interestingly, the same Yakutia Late Neolithic ancestry – and a haplogroup maybe closest to the ‘sacrificial’ Ind. 2 from Kyordyughen – is also found in a coeval individual, kra001 (ca. 2230 BC), from the Kansk-Rybinsk basin (read more about it). Its EBA group is closely connected to the Serovo cultural complex, in turn likely associated ultimately with the spread of Trans-Baikal N1a-rich populations that gave rise to the Ymyyakhtakh cultural complex, as is now supported by their tight Angara-Lena-Kolyma cluster. The sample shows a basal hg. N1a-CTS6967, immediately below FTDNA’s redefined N-L1026 branch.
NOTE. Another ancient DNA sample from Serovo published in Yu et al. (2020), STB001, is a cranium labelled “Zh. 1” that comes from a severely destroyed grave. It lacks a reliable archaeological context, and it shares ancestry and subclade with Cis-Baikal EN and LN samples, so it most likely belonged to other Q1b-rich EN/LN/EBA Cis-Baikal groups. Of course, it cannot be ruled out a priori that it is an outlier among Serovo-related individuals – or that Kansk-Rybinsk and Ymyyakhtakh are unrelated to the core Serovo cultural area – but it seems more likely that it is a mislabelled individual.
A similar ancestry profile and probably basal N1a-CTS3103 (one step downstream from kra001) is found in slightly later samples of the asbestos-mixed Lovozero ceramics from Bolshoy Olenyi Ostrov, featuring even-based arrowheads introduced in Lapland from ca. 1900 BC on (cf. Lamnidis et al. 2018). This confirms the quick expansion of Ymyyakhtakh-like populations west and east through the Tundra and Taiga, reaching first the Taymyr peninsula, where the related Vardøy ceramics (ca. 1600-1300 BC) are found.
This rapid EBA spread of Neo-Siberian ancestry most likely formed a genetic cline among populations from the North Eurasian Arctic and Hypoarctic, between the Proto-Chukotko-Kamchatkan-speaking NE Siberia and the Palaeo-Laplandic-speaking population of the Kola Peninsula, representing – together with their bottlenecked (and recently splitting) N-L1029 lineages – the ‘eastern’ part of the so-called “Siberian” or “Ket-Uralic” ancestry (cf. Flegontov et al. 2016). As the core Palaeo-Arctic ancestry, it represents not only the main component behind modern Chukotko-Kamchatkans, Taiga Yukaghirs, or Nganasans, but also a variable proportion of modern West Siberian and North-East European populations that admixed more recently with peoples of Palaeo-Arctic ancestry as they spread to the north.
NOTE. Nganasans in particular show a clearly divergent ancestry that most likely reflects an additional influx from a substratal population (or populations) from North Siberia or, alternatively, a (weirdly recent) marked genetic drift, representing in any case an isolated sink of previous migrations. To equate “Siberian” ancestry with “Nganasan” (whether intended as a true ancestral population, or as a proxy representing some imagined ‘pure’ Uralic-speaking ghost population) expanding beyond its historically attested borders is unjustified from any point of view – since they are clearly a recently acculturated Palaeo-Arctic group – and reveals a sloppy approach to ancient population genomics typical of a time when no ancient samples were available.
Despite the presence of N1a-rich populations among Cis-Baikal Early Neolithic (Kitoi) cemeteries like Shamanka and Lokomotiv, all pre-EBA subclades were under N-F4309, which suggests that these N-M2005-rich LN/EBA newcomers in the Cis-Baikal area have a recent ultimate patrilineal origin among earlier Trans-Baikal groups, such as those sampled from the Houtaomuga in the Amur River Basin, like M54A (ca. 5400 BC), of hg. N-Tat; and especially those recently published in Kılınç et al. Sci Adv (2021): from the Kuenga River basin, like brn008 (ca. 5400 BC), of basal hg. N-L839 (that splits the previous N-L708 trunk); or from the Kadalinka River Basin, like brn003 (ca. 4600 BC), of a basal lineage currently defining the N-M2005 trunk.
Therefore, and despite the poor sampling available, there seems to be a marked Y-chromosome bottleneck under a very recently split off N-M2005 haplogroup, supporting that the Cis-Baikal EBA population ancestral to Ymyyakhtakh and Serovo was part of a short-term ‘wedge’ introduced by Trans-Baikal migrants between the Q1b-rich populations found earlier – in Neolithic Kitoi – and later groups – in Bronze Age Glazkovo.
N1a-Tat subclades are also found later spreading westward throughout the North Eurasian pre-Taiga and Taiga from South Siberia and Altai-Sayan-TianShan populations, accompanying Late Bronze Age, Iron Age, and medieval nomads. It is therefore unclear whether there were different westward waves of similar N-rich populations along the Yenisei River Basin directly connected to the Serovo cultural complex – and distinct from the Ymyyakhtakh-related one – but archaeological connections from east to west, as well as Yeniseic developments (see below), suggest that the ultimate origin of the Lovozero population lies in East Siberia.