arXiv Open Access 2013

Generation, Implementation and Appraisal of an N-gram based Stemming Algorithm

B. P. Pande, Pawan Tamta, H. S. Dhami

Abstract

A language-independent stemmer has long been sought. Single N-gram tokenization works well, but it often generates stems that start with intermediate characters rather than initial ones. We present a novel technique that takes N-gram stemming one step further and compare our method with an established algorithm in the field, Porter's stemmer. Results indicate that our N-gram stemmer is not inferior to Porter's linguistic stemmer.
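To illustrate the limitation the abstract mentions, here is a minimal sketch of plain character N-gram tokenization. It is not the authors' algorithm; the function name `char_ngrams` and the example word are assumptions chosen for illustration. It shows that, of all the N-grams a word yields, only the first begins at the word's initial character.

```python
def char_ngrams(word: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of `word`, left to right.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # A word shorter than n yields no n-grams (the range is empty).
    return [word[i:i + n] for i in range(len(word) - n + 1)]


word = "stemming"
grams = char_ngrams(word, 4)

# Keep only the n-grams that are a prefix of the word, i.e. those that
# start at the initial character rather than an intermediate one.
initial = [g for g in grams if word.startswith(g)]

print(grams)
print(initial)
```

Every N-gram after the first starts at an intermediate character, so a stemmer that picks a stem from this set without further constraints can return a fragment from the middle of the word.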

Citation Format

Pande, B.P., Tamta, P., Dhami, H.S. (2013). Generation, Implementation and Appraisal of an N-gram based Stemming Algorithm. https://arxiv.org/abs/1312.4824

Journal Information
Year of publication: 2013
Language: en
Source database: arXiv
Access: Open Access