An Empirical Evaluation of Machine Learning Methods and Text Classifiers for Sentiment Analysis of Online Consumer Reviews
Abstract
This study aims to identify the best predictive model for analysing online product reviews (OPRs) in the electronics industry, with a secondary focus on leveraging unstructured customer feedback to support product improvement. Using a dataset of 9,675 Oppo mobile phone reviews, the study employs three classification models, namely Random Forest, Support Vector Machine (SVM), and Logistic Regression, each paired with either Term Frequency-Inverse Document Frequency (TF-IDF) or Bidirectional Encoder Representations from Transformers (BERT) as the embedding model, to analyse customer sentiment and derive actionable insights. The methodology features an analysis pipeline comprising text preprocessing with the Natural Language Toolkit (NLTK), feature extraction using TF-IDF vectorization and BERT embeddings, and sentiment prediction with each classifier. The results indicated that Random Forest paired with TF-IDF was the most effective, achieving the highest accuracy, precision, recall, and F1-score. This superior performance stems from Random Forest's ability to handle high-dimensional, sparse data and to make effective use of the weighted word importance provided by TF-IDF, which makes it particularly well suited to sentiment classification tasks involving structured text representations. The study contributes to the field by providing an effective framework for analysing online reviews. Such a framework can help businesses understand customer needs and refine product offerings, and it lays the groundwork for future applications across other product categories.
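To make the "weighted word importance" that TF-IDF provides concrete, the following is a minimal sketch using only the Python standard library. It is not the study's implementation: the toy reviews, the whitespace tokenizer, and the function names `tokenize` and `tfidf` are assumptions made for illustration, and the formula is the textbook unsmoothed variant (term frequency times the log of inverse document frequency).

```python
import math
from collections import Counter

# Hypothetical toy reviews for illustration; not taken from the study's dataset.
reviews = [
    "battery life is great",
    "battery drains fast",
    "great camera and great screen",
]


def tokenize(text):
    # Assumption: lowercase and split on whitespace.
    # The study reports NLTK-based preprocessing, which is not reproduced here.
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document (textbook, unsmoothed TF-IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Relative term frequency scaled by log inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


vectors = tfidf(reviews)
for review, vec in zip(reviews, vectors):
    print(review, {term: round(weight, 3) for term, weight in vec.items()})
```

In this sketch, a term that occurs in fewer reviews (such as "life") receives a higher weight than an equally frequent term that occurs in more reviews (such as "battery"). Production vectorizers typically apply smoothing and normalization on top of this basic form, so their weights will differ from the values printed here.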
Authors (3)
Pei Qin Lo
Sew Lai Ng
Li-xian Jiao
Quick Access
- Publication Year: 2026
- Source Database: DOAJ
- DOI: 10.33093/jiwe.2026.5.1.13
- Access: Open Access ✓