DOAJ Open Access 2022

Developing a discipline-specific corpus and high-frequency word list for science and engineering students in graduate school

Suwako Uehara, Hibiya Haraki, Stuart McLean

Abstract

Japanese graduate students in science and engineering need to read academic research in their second language (L2), and such tasks can be challenging. Research has shown a strong correlation (0.78) between vocabulary size and reading comprehension (McLean et al., 2020), so providing high-frequency word lists could enhance comprehension. In this work-in-progress, 1.35 million tokens of professor-recommended reading materials were used to investigate three questions: what method produces a vocabulary list that would benefit science majors in graduate school, what procedures create a corpus and a high-frequency word list efficiently, and what steps are required to create a cleaner corpus. This paper outlines three corresponding findings. First, a systematic, literature-informed method that includes input from professors in the field can be used to build the list. Second, the combined use of a tailored MATLAB script and AntConc (Anthony, 2022) generated the corpus and the high-frequency words efficiently. Third, repeatedly comparing the original PDFs with the matching text files, then adding MATLAB script to deal with the specific issues found, produced a cleaner text. The proposed method can be applied in other contexts to enhance the generation of high-frequency word lists.
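The general idea of cleaning extracted text and then counting word frequencies can be sketched as follows. This is a minimal illustration in Python, not the authors' MATLAB and AntConc pipeline; the function names, the sample sentences, and the specific cleaning rule (rejoining words hyphenated across line breaks, a common PDF-to-text artifact) are assumptions chosen for the example and are not taken from the paper.

```python
import re
from collections import Counter


def clean_text(raw: str) -> str:
    """Apply one illustrative cleaning rule to PDF-extracted text.

    Assumption: words split by a hyphen at a line break should be rejoined.
    The paper does not list the specific issues its scripts handle.
    """
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse remaining line breaks and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", joined).strip()


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens (optionally with an apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def high_frequency_words(texts: list[str], top_n: int = 10) -> list[tuple[str, int]]:
    """Count tokens across all texts and return the top_n most frequent."""
    counts: Counter[str] = Counter()
    for raw in texts:
        counts.update(tokenize(clean_text(raw)))
    return counts.most_common(top_n)


# Hypothetical stand-ins for text extracted from two PDFs.
sample = [
    "The elec-\ntron beam heats the sample.",
    "The beam is focused on the sample surface.",
]

for word, freq in high_frequency_words(sample, top_n=3):
    print(word, freq)
```

A rule like this also joins genuine hyphenated compounds that happen to fall at a line break, which is one reason a comparison against the original PDFs, as the abstract describes, remains necessary.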


Citation

Uehara, S., Haraki, H., & McLean, S. (2022). Developing a discipline-specific corpus and high-frequency word list for science and engineering students in graduate school. https://doi.org/10.7820/vli.v11.2.uehara

Journal Information
Year Published: 2022
Source Database: DOAJ
DOI: 10.7820/vli.v11.2.uehara
Access: Open Access