Semantic Scholar, Open Access, 2018, 380 citations

hep-th

Yang-Hui He Vishnu Jejjala B. Nelson

Abstract

We apply techniques in natural language processing, computational linguistics, and machine learning to investigate papers in hep-th and four related sections of the arXiv: hep-ph, hep-lat, gr-qc, and math-ph. All of the titles of papers in each of these sections, from the inception of the arXiv until the end of 2017, are extracted and treated as a corpus which we use to train the neural network Word2Vec. A comparative study of common n-grams, linear syntactical identities, word clouds, and word similarities is carried out. We find notable scientific and sociological differences between the fields. In conjunction with support vector machines, we also show that the syntactic structure of the titles in different sub-fields of high energy and mathematical physics is sufficiently different that a neural network can perform a binary classification of formal versus phenomenological sections with 87.1% accuracy, and can perform a finer five-fold classification across all sections with 65.1% accuracy.
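
The n-gram comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the tokenizer, the function names `tokenize` and `ngram_counts`, and the three toy titles are all assumptions made up for illustration, and only the Python standard library is used.

```python
from collections import Counter
import re


def tokenize(title):
    """Lowercase a title and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", title.lower())


def ngram_counts(titles, n=2):
    """Count contiguous n-grams across a list of titles."""
    counts = Counter()
    for title in titles:
        tokens = tokenize(title)
        # Slide a window of width n over the token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Made-up placeholder titles, NOT taken from the arXiv corpus.
toy_titles = [
    "Black hole entropy in string theory",
    "String theory and black hole microstates",
    "Lattice study of the black hole analogue",
]

bigrams = ngram_counts(toy_titles, n=2)
print(bigrams.most_common(3))
```

Running the same count separately on the titles of each arXiv section and comparing the top-ranked n-grams would give a rough version of the comparison the abstract describes. Real titles would likely need more careful handling of LaTeX fragments, hyphenated terms, and stop words than this tokenizer provides.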

Authors (3)

Yang-Hui He

Vishnu Jejjala

B. Nelson

Citation Format

He, Y., Jejjala, V., Nelson, B. (2018). hep-th. https://www.semanticscholar.org/paper/4e3f4fbdbffaee9c488b8551e5aee0ba8f574a0a

Journal Information

Year Published: 2018
Language: en
Total Citations: 380
Source Database: Semantic Scholar
Access: Open Access