The Indo-European Cognate Relationships dataset
Abstract
The Indo-European Cognate Relationships (IE-CoR) dataset is an open-access relational dataset showing how related, inherited words (‘cognates’) pattern across 160 languages of the Indo-European family. IE-CoR is intended as a benchmark dataset for computational research into the evolution of the Indo-European languages. It is structured around 170 reference meanings in core lexicon, and contains 25731 lexeme entries, analysed into 4981 cognate sets. Novel, dedicated structures are used to code all known cases of horizontal transfer. All 13 main documented clades of Indo-European, and their main subclades, are well represented. Time calibration data for each language are also included, as are relevant geographical and social metadata. Data collection was performed by an expert consortium of 89 linguists drawing on 355 cited sources. The dataset is extendable to further languages and meanings and follows the Cross-Linguistic Data Format (CLDF) protocols for linguistic data. It is designed to be interoperable with other cross-linguistic datasets and catalogues, and provides a reference framework for similar initiatives for other language families.
Authors (91)
Cormac Anderson
Matthew Scarborough
Lechosław Jocz
Martin Joachim Kümmel
Thomas Jügel
Britta Irslinger
Roland Pooth
Henrik Liljegren
Richard F. Strand
Geoffrey Haig
Ulrich Geupel
Martin Macak
Ronald I. Kim
Erik Anonby
Tijmen Pronk
Oleg Belyaev
Tonya Kim Dewey-Findell
Matthew Boutilier
Cassandra Freiberg
Robert Tegethoff
Matilde Serangeli
Krzysztof Stroński
Alexander Falileyev
Nikos Liosis
Kim Schulte
Ganesh Kumar Gupta
Raheleh Izadifar
Patrycja Markus
Nicholas Williams
Simone Loi
Nicholas Sims-Williams
Martin Findell
Shirin Adibifar
Giovanni Abete
Petar Atanasov
Esther Baiwir
Maria-Reina Bastardas
Adam Benkato
Lisa Shugert Bevevino
Éva Buchi
Giorgio Cadorini
Chundra Cathcart
Loïc Cheveau
Charalambos Christodoulou
Jérémie Delorme
Steven N. Dworkin
Deniz Ekici
Shervin Farridnejad
Mojtaba Gheitasi
Harald Hammarström
Steve Hewitt
Afsar Ali Khan
Muhammad Kamal Khan
Liudmila Khokhlova
Deborah Kim
Christopher Lewin
Borana Lushaj
Parvin Mahmoudveysi
Masoud Mahommadirad
Sam Mersch
Baydaa Mustafa
Fatemeh Nemati
Maryam Nourzaei
Peadar Ó Muircheartaigh
Virginia Oogjen
Muhammed Ourang
Heather Pagan
Timothy S. Palmer
Steve Pepper
Mandar Purandare
Khwaja Rehman
Guto Rhys
Unn Røyneland
Muhammad Zaman Sagar
Jade Jørgen Sandstedt
Lars Steensland
Mortaza Taheri-Ardali
Mahnaz Talebi-Dastenaei
Sabine Tittel
Tiago Tresoldi
Michiel de Vaan
Annemarie Verkerk
Arjen Versloot
Paul Videsott
Nikola Vuletić
Manuel Widmer
Arash Zeini
Hans-Jörg Bibiko
Fiona Runge
Russell D. Gray
Paul Heggarty
Quick Access
- Publication year: 2025
- Source database: DOAJ
- DOI: 10.1038/s41597-025-05445-3
- Access: Open Access