arXiv Open Access 2021

Looking Through Glass: Knowledge Discovery from Materials Science Literature using Natural Language Processing

Vineeth Venugopal, Sourav Sahoo, Mohd Zaki, Manish Agarwal, Nitya Nand Gosvami, N. M. Anoop Krishnan

Abstract

Most of the knowledge in the materials science literature is in the form of unstructured data such as text and images. Here, we present a framework employing natural language processing that automates text and image comprehension and precision knowledge extraction from the literature on inorganic glasses. The abstracts are automatically categorized using latent Dirichlet allocation (LDA), providing a way to classify and search semantically linked publications. Similarly, a comprehensive summary of images and plots is presented using the 'Caption Cluster Plot' (CCP), which provides direct access to the images buried in the papers. Finally, we combine the LDA and CCP with the chemical elements occurring in the manuscript to present an 'Elemental map', a topical and image-wise distribution of chemical elements in the literature. Overall, the framework presented here can be a generic and powerful tool to extract and disseminate material-specific information on composition-structure-processing-property dataspaces, allowing insights into fundamental problems relevant to the materials science community and accelerated materials discovery.
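The 'Elemental map' idea, a per-topic distribution of chemical elements, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `extract_elements` and `elemental_map`, the small `ELEMENTS` subset, the regular expressions, and the example sentences are all assumptions made for illustration, and the topic labels are supplied by hand rather than produced by LDA.

```python
import re
from collections import Counter, defaultdict

# Illustrative subset of element symbols (assumption: a real run would use
# the full periodic table).
ELEMENTS = {"Si", "O", "Na", "B", "Al", "Ca", "Ge", "Se", "As", "S", "P", "Li", "K", "Mg"}

# Formula-like tokens: two or more "Symbol + optional count" units (SiO2, CaO),
# or a single symbol followed by a count (O2).
FORMULA = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b|\b[A-Z][a-z]?\d+\b")
SYMBOL = re.compile(r"[A-Z][a-z]?")


def extract_elements(text):
    """Count element symbols appearing in formula-like tokens of `text`.

    Each symbol is counted once per occurrence; stoichiometric
    subscripts are ignored.
    """
    counts = Counter()
    for token in FORMULA.findall(text):
        symbols = SYMBOL.findall(token)
        # Keep the token only if every parsed symbol is a known element,
        # which filters out acronyms such as "LDA".
        if all(sym in ELEMENTS for sym in symbols):
            counts.update(symbols)
    return counts


def elemental_map(labelled_docs):
    """Aggregate element counts per topic label.

    `labelled_docs` is an iterable of (topic_label, text) pairs.
    """
    by_topic = defaultdict(Counter)
    for topic, text in labelled_docs:
        by_topic[topic].update(extract_elements(text))
    return dict(by_topic)


docs = [
    ("silicate", "Na2O-SiO2 glasses and CaO additions"),
    ("chalcogenide", "GeSe2 and As2S3 films"),
]
for topic, counts in elemental_map(docs).items():
    print(topic, dict(counts))
```

With a full element table, acronyms made entirely of valid symbols would slip through the filter, so a production pipeline would likely need a stricter formula parser or a curated stop list.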

Authors (6)

Vineeth Venugopal
Sourav Sahoo
Mohd Zaki
Manish Agarwal
Nitya Nand Gosvami
N. M. Anoop Krishnan

Citation

Venugopal, V., Sahoo, S., Zaki, M., Agarwal, M., Gosvami, N.N., Krishnan, N.M.A. (2021). Looking Through Glass: Knowledge Discovery from Materials Science Literature using Natural Language Processing. https://arxiv.org/abs/2101.01508

Journal Information
Publication year: 2021
Language: en
Source database: arXiv
Access: Open Access