Semantic Scholar Open Access 2022 9 citations

Machine Learning based Sentiment Analysis of Hindi Data with TF-IDF and Count Vectorization

Ashwani Gupta U. Sharma

Abstract

Sentiment refers to emotions. Sentiment analysis, also known as opinion mining, is the technique of identifying and extracting subjective information from pre-web and post-web reviews using text analytics, computational linguistics, and natural language processing. Hindi is an Indian language used by many Indians. Owing to the phenomenal growth of online product reviews in Hindi, post-web Hindi reviews are also increasing rapidly. This paper presents a machine learning based method for analyzing post-web text data. The method is divided into four steps. First, an annotated Hindi review dataset is developed from post-web sources. In the second step, features are extracted from the annotated Hindi review dataset using the Term Frequency-Inverse Document Frequency (TF-IDF) and count vectorization techniques. In the third step, the extracted features are passed to a classifier so that it can make predictions. In the last step, the annotated dataset is translated into English, and the second and third steps are repeated on the annotated English dataset. The results are reported using a range of evaluation criteria, including precision, recall, and F1-score. In both cases, the support vector machine produced the best results.
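The feature-extraction step and the evaluation criteria named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the three Hindi reviews are made-up toy examples, the whitespace tokenizer and the `ln(N / df)` form of IDF are assumptions (the abstract does not specify either), and all function names are hypothetical. The classifier step (for example, a support vector machine) is left out.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: plain whitespace tokenization; the abstract names no tokenizer.
    return text.split()

def build_vocabulary(docs):
    # Sorted list of the unique tokens found across all documents.
    return sorted({tok for doc in docs for tok in tokenize(doc)})

def count_vectorize(docs, vocab):
    # One row per document; each entry is the raw count of a vocabulary term.
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocab])
    return rows

def tfidf_vectorize(docs, vocab):
    # Assumption: tf is the raw count and idf is ln(N / df), one common variant.
    counts = count_vectorize(docs, vocab)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    return [[row[j] * idf[j] for j in range(len(vocab))] for row in counts]

def precision_recall_f1(y_true, y_pred, positive):
    # Metrics for a single class treated as the positive label.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical toy reviews: "phone is good", "phone is bad", "camera is good".
docs = ["फोन अच्छा है", "फोन खराब है", "कैमरा अच्छा है"]
vocab = build_vocabulary(docs)
counts = count_vectorize(docs, vocab)
tfidf = tfidf_vectorize(docs, vocab)
```

In this toy corpus the word "है" occurs in every review, so its TF-IDF weight is zero, whereas its raw count is 1 in each row. That difference is what separates the two feature representations compared in the paper. Either matrix could then be passed to a classifier, and its predictions scored with `precision_recall_f1`.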

Authors (2)

Ashwani Gupta

U. Sharma

Citation Format

Gupta, A., Sharma, U. (2022). Machine Learning based Sentiment Analysis of Hindi Data with TF-IDF and Count Vectorization. https://doi.org/10.1109/ICCCS55188.2022.10079323

Journal Information
Publication Year
2022
Language
en
Total Citations
9
Source Database
Semantic Scholar
DOI
10.1109/ICCCS55188.2022.10079323
Access
Open Access ✓