A recent article in Science uses sophisticated methods to study the origin of the Indo-European languages, including Albanian alongside Armenian and Greek. It confirms that Albanian is one of the oldest Indo-European languages still spoken.
Furthermore, the research, conducted primarily through computational linguistic methods, confirms the antiquity and originality of the Albanian language.
Greek researchers have also conducted DNA studies and have verified that the Albanian population is indigenous and very ancient in these lands.
The two studies complement and reinforce each other, ultimately proving scientifically that both the Albanian people and the Albanian language are at least 6,000 years old and indigenous to these lands.
Linguistic tree diagrams with ancestral languages support a hybrid model for the origin of Indo-European languages.
In summary, the study notes that Indo-European languages are spoken by nearly half of the world's population, but that their origins and spread remain debated.
Heggarty et al. present a time-calibrated database of 109 modern and 52 historical Indo-European languages, which they analyzed using Bayesian phylogenetic models.
Their results suggest an emergence of Indo-European languages around 8,000 years ago. This date marks a much deeper origin than previously believed, and it aligns with an initial homeland in the southern Caucasus followed by a branch expanding northward to the steppe region.
These findings lead to a "hybrid" scenario that reconciles current linguistic and ancient DNA evidence, pointing to both the Anatolian Peninsula of the Middle East (as the primary source) and the steppe (as a secondary homeland). (Sacha Vignieri)
The Science article highlights that almost half of the world's population speaks a language belonging to the Indo-European family. However, it remains unclear where the homeland of the family's common ancestral language (Proto-Indo-European) was, and when and why it spread throughout Eurasia.
The "Steppe Hypothesis" proposes an expansion from the Pontic-Caspian steppe no earlier than 6,500 years ago (c. 4,500 BCE), driven mainly by horse-based pastoralism from around 5,000 years ago.
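The dates in this article appear both as "years ago" or "years BP" (before present) and as BCE. A minimal sketch of the conversion follows, assuming the radiocarbon convention that "present" means 1950 CE and that there is no year zero. Whether the study anchors BP at exactly 1950 is an assumption here, and the function name is illustrative only; at the precision of these estimates the choice makes little difference.

```python
def bp_to_calendar(years_bp, reference_year=1950):
    """Convert 'years before present' to a (year, era) pair.

    Assumption: 'present' is 1950 CE (the radiocarbon convention) and
    the calendar has no year zero between 1 BCE and 1 CE.
    """
    offset = reference_year - years_bp
    if offset > 0:
        return offset, "CE"
    # offset 0 corresponds to 1 BCE, offset -1 to 2 BCE, and so on.
    return 1 - offset, "BCE"


# Dates mentioned in the article, in years BP.
for bp in (9000, 8120, 6500, 5000):
    year, era = bp_to_calendar(bp)
    print(f"{bp} BP is roughly {year} {era}")
```

Under these assumptions, 6,500 years BP falls in the mid-fifth millennium BCE, which is what the rounded "c. 4,500 BCE" expresses.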
Ancient DNA is now offering valuable new perspectives, but these remain only indirect evidence for the prehistory of languages.
The study instead tests the time-depth predictions of the Anatolian and Steppe hypotheses directly against linguistic data.
It reports a new framework for the chronology and sequence of Indo-European divergence, using Bayesian phylogenetic methods applied to an extensive new dataset of basic vocabulary from 161 Indo-European languages.
Because previous phylogenetic analyses of language data produced contradictory results, the authors diagnosed and resolved the causes of this mismatch, notably two:
Firstly, the datasets used had limited language sampling and widespread coding inconsistencies.
Secondly, some analyses assumed that modern spoken languages derive directly from ancient written languages and not from parallel spoken varieties.
Together, these methodological issues distorted the branch length estimates and the conclusions about dates. The study presents a new dataset of cognates (shared word origins) across Indo-European.
This dataset eliminates past discrepancies and offers a more comprehensive and balanced linguistic sample, including 52 non-modern languages for a denser calibration.
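To make the idea of a cognate dataset concrete, the sketch below shows one common way such data can be prepared for phylogenetic analysis: each (meaning, cognate class) pair becomes a 0/1 column recording whether a language has a word in that class. The languages, meanings, and class labels are invented for illustration and do not come from the study, and the function names are hypothetical.

```python
# Toy, invented data: language -> meaning -> set of cognate-class labels.
# The labels "A" and "B" are arbitrary and are not taken from the study.
toy_cognates = {
    "Lang1": {"water": {"A"}, "fire": {"A"}},
    "Lang2": {"water": {"A"}, "fire": {"B"}},
    "Lang3": {"water": {"B"}, "fire": {"B"}},
}


def encode_binary(cognates):
    """Return (columns, matrix) with one 0/1 column per (meaning, class).

    Simplification: a meaning with no recorded word is coded as 0 here,
    whereas a real analysis would usually treat it as missing data.
    """
    columns = sorted(
        {(meaning, cls)
         for by_meaning in cognates.values()
         for meaning, classes in by_meaning.items()
         for cls in classes}
    )
    matrix = {
        lang: [1 if cls in by_meaning.get(meaning, set()) else 0
               for meaning, cls in columns]
        for lang, by_meaning in cognates.items()
    }
    return columns, matrix


def shared_fraction(row_a, row_b):
    """Share of cognate classes found in either language that both have."""
    both = sum(a and b for a, b in zip(row_a, row_b))
    either = sum(a or b for a, b in zip(row_a, row_b))
    return both / either if either else 0.0


columns, matrix = encode_binary(toy_cognates)
for lang, row in matrix.items():
    print(lang, row)
```

In this toy example Lang1 and Lang2 share one cognate class while Lang1 and Lang3 share none. Differences of that kind are what a tree-building method works from, although a Bayesian analysis models the gain and loss of cognates over time rather than simply counting overlaps.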
The researchers applied an ancestry-enabled Bayesian phylogenetic analysis in order to test, rather than impose, assumptions of direct ancestry.
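Why the direct-ancestry assumption matters for dating can be seen in a toy calculation. The sketch below assumes a strict clock (a constant rate of lexical change), which is a deliberate simplification and not the model used in the study, and every number in it is invented. It compares the rate of change implied when a modern language is forced to descend from an ancient written language with the rate implied when the two are instead sisters that share an earlier common ancestor.

```python
def implied_rate(changes, path_years):
    """Changes per year under a strict clock along a path of given length."""
    return changes / path_years


# All numbers below are invented for illustration only.
changes_observed = 30   # lexical differences between the two languages
ancient_age = 2000      # age of the written ancient language, in years BP
split_age = 2500        # hypothetical age of their common ancestor, years BP

# Scenario 1: the modern language is forced to descend from the written
# one, so every change must fit into the time since the ancient text.
rate_forced = implied_rate(changes_observed, ancient_age)

# Scenario 2: the two are sisters. The path between them runs up from the
# ancient language to the common ancestor, then down to the present day.
sister_path = (split_age - ancient_age) + split_age
rate_sister = implied_rate(changes_observed, sister_path)

print(f"forced ancestry: {rate_forced:.4f} changes per year")
print(f"sister lineages: {rate_sister:.4f} changes per year")
```

With these made-up figures the forced-ancestry scenario implies a faster rate of change. A faster assumed rate needs less time to explain the same amount of diversity, so it tends to pull estimated dates toward the present, which is one way an untested ancestry assumption could distort a chronology.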
The results of this study do not fully align with either the Steppe hypothesis or the Anatolian hypothesis.
Recent DNA evidence suggests that the Anatolian branch cannot be placed in the Steppe but rather to the south of the Caucasus. For other branches, candidate expansions out of the Yamnaya culture are detectable in ancient DNA, but some had only limited genetic impact.
The results show that these expansions, from around 5,000 years BP onward, came too late to match the linguistic chronology of Indo-European divergence. They are consistent, however, with an ultimate homeland south of the Caucasus and a later branch northward onto the Steppe, which served as a secondary homeland for some Indo-European branches that entered Europe with the later expansions associated with the Corded Ware culture.
Thus, linguistic phylogenetics and DNA are combined to suggest that the solution to the 200-year-old Indo-European enigma lies in a hybrid of the Anatolian and Steppe hypotheses.
In conclusion, the origin of the Indo-European language family remains highly debated. Bayesian phylogenetic analyses of basic vocabulary have produced contradictory results: some support an agricultural expansion out of Anatolia around 9,000 years BP, while others support a spread with horse-based pastoralism from the Pontic-Caspian Steppe around 6,000 years BP. The authors present an extended database of Indo-European basic vocabulary that eliminates past inconsistencies in cognate coding.
An ancestry-enabled phylogenetic analysis of this dataset indicates that few ancient languages are direct ancestors of modern clades and yields a root age of about 8,120 years BP for the family. Although this date does not match the Steppe hypothesis, it does not exclude an original homeland south of the Caucasus, with a later branch northward onto the Steppe and then across Europe. The authors reconcile this hybrid hypothesis with recently published ancient DNA evidence from the Steppe and the northern Middle East.