Chinese Named Entity Recognition Method Based on ALBERT-BGRU-CRF
Abstract
Named Entity Recognition (NER) is an important basis for upper-level natural language processing tasks such as knowledge graph construction, search engines, and recommendation systems. Chinese NER labels and classifies proper nouns or specific named entities in a text sequence. To address the inability of existing Chinese NER methods to extract long-distance semantic information effectively or to handle polysemy, this study proposes a Chinese NER method based on the ALBERT pre-trained language model, a Bidirectional Gated Recurrent Unit (BGRU), and a Conditional Random Field (CRF), called the ALBERT-BGRU-CRF model. First, the ALBERT pre-trained language model performs word embedding on the input text to obtain dynamic word vectors, which can effectively solve the polysemy problem. Second, the BGRU extracts contextual semantic features to further understand semantics and obtain semantic features between long-distance words. Finally, the concatenated vector is input to the CRF layer and decoded using the Viterbi algorithm, which reduces the probability of mislabeled output; this yields the entity annotation information and completes the Chinese NER task. The experimental results show that the Chinese NER accuracy and recall rate of the ALBERT-BGRU-CRF model on the MSRA corpus reach 95.16% and 94.58%, respectively. Compared with the fragment neural network model and the CNN-BiLSTM-CRF model, the F1 value of the ALBERT-BGRU-CRF model is higher by 4.43 and 3.78 percentage points, respectively.
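The final decoding step mentioned in the abstract can be illustrated with a minimal sketch of Viterbi decoding over a linear-chain CRF. This is not the authors' implementation: the tag set, the emission scores (which in the described model would come from the BGRU layer), and the transition scores (which a CRF would learn) are all made-up values chosen for illustration, and the function name `viterbi_decode` is hypothetical.

```python
def viterbi_decode(emissions, transitions):
    """Return (best tag-index path, its score) for one sentence.

    emissions[t][j]   : score of tag j at token t (assumed to come from the encoder)
    transitions[i][j] : score of moving from tag i to tag j (assumed learned by the CRF)
    """
    num_tags = len(transitions)
    score = list(emissions[0])   # best score of any path ending in each tag
    backpointers = []            # one list of best predecessors per later token

    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag i for reaching tag j at this token.
            best_i = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final tag.
    best_last = max(range(num_tags), key=lambda i: score[i])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]


# Illustrative (assumed) tag set and scores, not taken from the paper.
TAGS = ["O", "B-PER", "I-PER"]
transitions = [
    [0.0, 0.0, -1e4],  # O -> I-PER is heavily penalized (invalid in BIO tagging)
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
emissions = [
    [2.0, 0.0, 0.0],   # token 1: "O" scores highest
    [0.0, 1.0, 1.5],   # token 2: "I-PER" scores highest on its own
    [0.0, 0.0, 2.0],   # token 3: "I-PER" scores highest
]

path, best_score = viterbi_decode(emissions, transitions)
print([TAGS[i] for i in path])  # ['O', 'B-PER', 'I-PER']
```

Picking the highest-scoring tag per token independently would give O, I-PER, I-PER, which is an invalid BIO sequence. Scoring whole sequences with transition terms steers the decoder to O, B-PER, I-PER instead, which is the sense in which CRF decoding reduces the chance of wrongly labelled output.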
Authors
LI Junhuai, CHEN Miaomiao, WANG Huaijun, CUI Ying'an, ZHANG Aihua
Quick Access
- Publication Year
- 2022
- Source Database
- DOAJ
- DOI
- 10.19678/j.issn.1000-3428.0061630
- Access
- Open Access ✓