DOAJ Open Access 2025

The Indo-European Cognate Relationships dataset

Cormac Anderson, Matthew Scarborough, Lechosław Jocz, Martin Joachim Kümmel, Thomas Jügel, +86 others

Abstract

The Indo-European Cognate Relationships (IE-CoR) dataset is an open-access relational dataset showing how related, inherited words (‘cognates’) pattern across 160 languages of the Indo-European family. IE-CoR is intended as a benchmark dataset for computational research into the evolution of the Indo-European languages. It is structured around 170 reference meanings in core lexicon, and contains 25731 lexeme entries, analysed into 4981 cognate sets. Novel, dedicated structures are used to code all known cases of horizontal transfer. All 13 main documented clades of Indo-European, and their main subclades, are well represented. Time calibration data for each language are also included, as are relevant geographical and social metadata. Data collection was performed by an expert consortium of 89 linguists drawing on 355 cited sources. The dataset is extendable to further languages and meanings and follows the Cross-Linguistic Data Format (CLDF) protocols for linguistic data. It is designed to be interoperable with other cross-linguistic datasets and catalogues, and provides a reference framework for similar initiatives for other language families.
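
Because the abstract describes a relational, CLDF-formatted dataset in which lexemes are grouped into cognate sets, a small sketch may help show what that grouping looks like in practice. The example below assumes the default CLDF column names for a form table (`ID`, `Language_ID`, `Parameter_ID`, `Form`) and a cognate table (`ID`, `Form_ID`, `Cognateset_ID`); whether the IE-CoR release uses exactly these columns has not been checked here. All rows, language identifiers, and forms are invented placeholders, and the function name `cognate_sets` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Toy rows invented for illustration only; they are NOT taken from IE-CoR.
# Column names follow the CLDF defaults (an assumption for this dataset).
FORMS_CSV = """ID,Language_ID,Parameter_ID,Form
f1,lang_a,FIRE,form1
f2,lang_b,FIRE,form2
f3,lang_c,FIRE,form3
f4,lang_a,WATER,form4
"""

COGNATES_CSV = """ID,Form_ID,Cognateset_ID
c1,f1,set1
c2,f2,set1
c3,f3,set2
c4,f4,set3
"""


def cognate_sets(forms_csv, cognates_csv):
    """Group (language, form) pairs by cognate set ID."""
    # Index the form table by its primary key.
    forms = {row["ID"]: row for row in csv.DictReader(io.StringIO(forms_csv))}
    sets = defaultdict(list)
    # Each cognate row links one form to one cognate set.
    for row in csv.DictReader(io.StringIO(cognates_csv)):
        form = forms[row["Form_ID"]]
        sets[row["Cognateset_ID"]].append((form["Language_ID"], form["Form"]))
    return dict(sets)


sets = cognate_sets(FORMS_CSV, COGNATES_CSV)
for set_id, members in sorted(sets.items()):
    print(set_id, members)
```

In this toy data, the two placeholder forms for FIRE in `lang_a` and `lang_b` share one cognate set, while the form in `lang_c` falls into a different set. Real files would be read from disk rather than from in-memory strings.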

Authors (91)

Cormac Anderson
Matthew Scarborough
Lechosław Jocz
Martin Joachim Kümmel
Thomas Jügel
Britta Irslinger
Roland Pooth
Henrik Liljegren
Richard F. Strand
Geoffrey Haig
Ulrich Geupel
Martin Macak
Ronald I. Kim
Erik Anonby
Tijmen Pronk
Oleg Belyaev
Tonya Kim Dewey-Findell
Matthew Boutilier
Cassandra Freiberg
Robert Tegethoff
Matilde Serangeli
Krzysztof Stroński
Alexander Falileyev
Nikos Liosis
Kim Schulte
Ganesh Kumar Gupta
Raheleh Izadifar
Patrycja Markus
Nicholas Williams
Simone Loi
Nicholas Sims-Williams
Martin Findell
Shirin Adibifar
Giovanni Abete
Petar Atanasov
Esther Baiwir
Maria-Reina Bastardas
Adam Benkato
Lisa Shugert Bevevino
Éva Buchi
Giorgio Cadorini
Chundra Cathcart
Loïc Cheveau
Charalambos Christodoulou
Jérémie Delorme
Steven N. Dworkin
Deniz Ekici
Shervin Farridnejad
Mojtaba Gheitasi
Harald Hammarström
Steve Hewitt
Afsar Ali Khan
Muhammad Kamal Khan
Liudmila Khokhlova
Deborah Kim
Christopher Lewin
Borana Lushaj
Parvin Mahmoudveysi
Masoud Mahommadirad
Sam Mersch
Baydaa Mustafa
Fatemeh Nemati
Maryam Nourzaei
Peadar Ó Muircheartaigh
Virginia Oogjen
Muhammed Ourang
Heather Pagan
Timothy S. Palmer
Steve Pepper
Mandar Purandare
Khwaja Rehman
Guto Rhys
Unn Røyneland
Muhammad Zaman Sagar
Jade Jørgen Sandstedt
Lars Steensland
Mortaza Taheri-Ardali
Mahnaz Talebi-Dastenaei
Sabine Tittel
Tiago Tresoldi
Michiel de Vaan
Annemarie Verkerk
Arjen Versloot
Paul Videsott
Nikola Vuletić
Manuel Widmer
Arash Zeini
Hans-Jörg Bibiko
Fiona Runge
Russell D. Gray
Paul Heggarty

Citation Format

Anderson, C., Scarborough, M., Jocz, L., Kümmel, M.J., Jügel, T., Irslinger, B. et al. (2025). The Indo-European Cognate Relationships dataset. https://doi.org/10.1038/s41597-025-05445-3

Journal Information

Year of publication: 2025
Source database: DOAJ
DOI: 10.1038/s41597-025-05445-3
Access: Open Access