Semantic Scholar
Open Access
2019
86 citations
On classification trees and random forests in corpus linguistics: Some words of caution and suggestions for improvement
Stefan Th. Gries
Abstract
This paper is a discussion of methodological problems that (can) arise in the analysis of multifactorial data analyzed with tree-based or forest-based classifiers in (corpus) linguistics. I showcase a data set that highlights where such methods can fail at providing optimal results and then discuss solutions to this problem as well as the interpretation of random forests more generally.
Authors (1)
Stefan Th. Gries
Journal Information
- Publication Year: 2019
- Language: en
- Total Citations: 86
- Source Database: Semantic Scholar
- DOI: 10.1515/CLLT-2018-0078
- Access: Open Access