Semantic Scholar, Open Access, 2020, 17 citations

Text mining and analysis of treatise on febrile diseases based on natural language processing

K. Zhao, Na Shi, Zheng Sa, Hua-Xing Wang, Chun-Hua Lu, Xiaoying Xu

Abstract

Objective: This paper applies natural language processing (NLP) to the text of the “Treatise on Febrile Diseases (TFDs)” to extract important information, as an attempt to bring NLP to text mining of traditional Chinese medicine (TCM) literature. Materials and Methods: The experiment was implemented in Python and used NLP toolkits such as Jieba, nltk, gensim, and sklearn, together with Excel and Word. The text of “TFDs” was cleaned, segmented, and stripped of stop words in sequence; word frequency statistics and analysis, keyword extraction, named entity recognition (NER), and other operations were then performed, and text similarity was calculated last. Results: Jieba accurately identified the herbal names in “TFDs.” Word frequency statistics based on the word segmentation showed that “warm therapy” is an important treatment in “TFDs” and that Guizhi decoction is the main prescription; five core decoctions were also identified. Keyword extraction based on the term frequency-inverse document frequency (TF-IDF) algorithm gave ideal results. The accuracy of NER on “TFDs” was about 86%. When similarity was calculated with a latent semantic indexing (LSI) model, “Understanding of Synopsis of Golden Chamber (SGC)” was much more similar to “SGC” than to “TFDs.” These results met expectations. Conclusions: This work lays a research foundation for applying NLP to text mining of unstructured TCM literature. Combined with deep learning, NLP, as an important branch of artificial intelligence, will have broader application prospects in text mining of TCM literature, the construction of TCM knowledge graphs, and TCM knowledge services.
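The processing order described in the abstract (stop-word removal, word frequency statistics, TF-IDF keyword extraction, then text similarity) can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: it uses the Python standard library instead of Jieba, gensim, or sklearn, it assumes the text has already been segmented into tokens, and it compares documents with plain TF-IDF cosine similarity rather than the LSI model the paper reports. The toy documents, the stop-word list, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical, pre-segmented toy documents (NOT taken from the TFDs text).
DOCS = {
    "doc_a": "guizhi decoction treats wind cold and guizhi warms the exterior".split(),
    "doc_b": "guizhi decoction warms the channels and treats cold".split(),
    "doc_c": "the commentary discusses pulse diagnosis and the pulse patterns".split(),
}
STOP_WORDS = {"the", "and"}  # illustrative stop-word list


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return {doc_name: {term: weight}} using a smoothed IDF."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document
    weights = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[name] = {
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return weights


def top_keywords(term_weights, k=3):
    """Return the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(term_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Pipeline: clean -> word frequency -> TF-IDF weights.
cleaned = {name: remove_stop_words(tokens) for name, tokens in DOCS.items()}
word_freq = Counter(t for tokens in cleaned.values() for t in tokens)
weights = tf_idf(cleaned)
```

Calling `top_keywords(weights["doc_a"])` ranks the terms of one document, and `cosine(weights["doc_a"], weights["doc_b"])` compares two documents. In a real setting the whitespace split would be replaced by a Chinese word segmenter, and the similarity step by a topic model such as LSI.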

Authors (6)

K. Zhao

Na Shi

Zheng Sa

Hua-Xing Wang

Chun-Hua Lu

Xiaoying Xu

Citation Format

Zhao, K., Shi, N., Sa, Z., Wang, H., Lu, C., Xu, X. (2020). Text mining and analysis of treatise on febrile diseases based on natural language processing. https://doi.org/10.4103/wjtcm.wjtcm_28_19

Quick Access

View at source: doi.org/10.4103/wjtcm.wjtcm_28_19

Journal Information

Year of publication: 2020
Language: en
Total citations: 17
Source database: Semantic Scholar
DOI: 10.4103/wjtcm.wjtcm_28_19
Access: Open Access