Semantic Scholar, Open Access, 2019, 86 citations

On classification trees and random forests in corpus linguistics: Some words of caution and suggestions for improvement

Stefan Th Gries

Abstract

This paper discusses methodological problems that (can) arise when multifactorial data are analyzed with tree-based or forest-based classifiers in (corpus) linguistics. I showcase a data set that highlights where such methods can fail to provide optimal results and then discuss solutions to this problem as well as the interpretation of random forests more generally.

Authors (1)

Stefan Th Gries

Citation Format

Gries, S.T. (2019). On classification trees and random forests in corpus linguistics: Some words of caution and suggestions for improvement. https://doi.org/10.1515/CLLT-2018-0078

Journal Information

Year Published: 2019
Language: English (en)
Total Citations: 86
Source Database: Semantic Scholar
DOI: 10.1515/CLLT-2018-0078
Access: Open Access