Review Helpfulness to Support Business: Identifying Fake Reviews from User-Generated Content Using Random Forest

Authors

  • Syed Imran Abbas Qazmi, Lincoln University College, Malaysia (Corresponding Author)
  • Midhun Chakkaravarthy, Faculty of Computer Science and Multimedia, Lincoln University College, Malaysia
  • Syed Hassan Raza, School of Media and Communication, Taylor’s University, 47500 Subang Jaya, Selangor, Malaysia
  • Farrah Aslam, Department of Information Sciences, University of Education Lahore, Jauharabad Campus, Pakistan
  • Shahbaz Aslam
  • Moneeba Iftikhar

DOI:

https://doi.org/10.47654/v30y2026i2p114-156

Keywords:

Text Similarity, BERT Embeddings, Fake Reviews, Machine Learning, Review Helpfulness, Random Forest Classifier

Abstract

Purpose: Valid and helpful reviews on an e-commerce platform provide important information about customers' perception of a product, which is crucial to the survival and growth of any business. Fake reviews, which are fraudulently created as spam to tarnish a product's image, remain a significant challenge for all e-commerce platforms. A second challenge is identifying the helpful review content that can significantly alter a customer's opinion of a product. The increasing prevalence of fake and unhelpful reviews compromises the credibility of online reviews, causing information overload and misleading consumers' decision-making. Motivated by this challenge, this study aims to develop an automated system that retains only applicable and valid reviews to support the identification of customer needs, which is a valuable area of research.

Design/methodology/approach: This study involves three main tasks: helpfulness classification, fake review detection, and topic identification across various categories of the Amazon dataset. For fake review detection, the model used a feature set comprising the sentiment polarity of the review, word count as a measure of feedback length, word diversity, a part-of-speech analysis reflecting the review's grammatical structure and complexity, and authenticity metrics. For helpful review classification, the features included review and product metadata, a review content informativeness score encoded with Sentence Bidirectional Encoder Representations from Transformers (SBERT), and reviewer attributes. A topic extraction model that leverages Gemini was implemented to perform sentiment-based topic analysis of the reviews.
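To make the surface-level features named above concrete, the following is a minimal sketch of how word count, word diversity, and sentiment polarity might be computed for a single review. It is an illustration under stated assumptions, not the study's implementation: the function name `review_features`, the tokenization rule, the type-token ratio as the diversity measure, and the tiny sentiment lexicon are all hypothetical choices.

```python
import re

# Toy sentiment lexicon: an assumption for illustration only,
# not the resource used in the study.
POSITIVE = {"great", "good", "love", "excellent", "sturdy"}
NEGATIVE = {"bad", "broke", "poor", "terrible", "cheap"}


def review_features(text):
    """Return simple surface features for one review (hypothetical helper)."""
    # Lowercase the text and keep alphabetic tokens (apostrophes allowed).
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return {
        # Length of the feedback in words.
        "word_count": n,
        # Type-token ratio: distinct words divided by total words.
        "word_diversity": len(set(tokens)) / n if n else 0.0,
        # Polarity in [-1, 1]; 0.0 when no lexicon word occurs.
        "polarity": (pos - neg) / (pos + neg) if (pos + neg) else 0.0,
    }


features = review_features("Great toy, my kids love it. Great value.")
print(features)
```

In a pipeline of the kind described, such per-review values would be combined with the remaining features (part-of-speech statistics, metadata, SBERT-based scores) before being passed to a classifier.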

Findings: Using a Random Forest classifier (RFC), the study classifies helpful reviews across six Amazon categories with 94% accuracy, precision, and F1-score, 93% recall, and an AUC score of 98%. The Gradient Boosting classifier yielded comparable performance, with an AUC score of 98% and 94% accuracy, precision, recall, and F1-score. For fake review detection in the Toys and Games category, the RFC achieved 85% accuracy, 86% precision, 97% recall, a 91% F1-score, and a 79% AUC score. The findings indicate that combining textual, semantic, reviewer, and product-level features can improve the reliability of review quality assessment. Finally, to support business decision-making, a topic extraction model based on Gemini was used to extract significant topics from valid and helpful reviews, grouped separately for negative and positive reviews, to provide nuanced insight into customer feedback.
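As a quick consistency check on the reported fake review metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below uses the rounded percentages from the abstract as inputs; the function name `f1_score` is a hypothetical helper, not code from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported for fake review detection (Toys and Games).
fake_f1 = f1_score(0.86, 0.97)
print(round(fake_f1, 2))
```

With 86% precision and 97% recall, the result rounds to 0.91, which agrees with the 91% F1-score reported for the RFC.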

Originality/value: Unlike prior studies that examine either review helpfulness or fake review detection in isolation, this study moves beyond single-task and small-sample approaches. The proposed framework offers a comprehensive analysis of patterns in reviews across e-commerce platforms, thereby enhancing brands' ability to integrate customer needs and expectations into future marketing communications and advertising campaigns. This study contributes to Decision Sciences by proposing a data-driven two-stage framework that retains only helpful and valid reviews to enhance content quality, thereby practically supporting better decision-making through content moderation, reduced information overload, and improved consumer trust in reviews.

Author Biographies

  • Syed Imran Abbas Qazmi, Lincoln University College, Malaysia

    Syed Imran Abbas Qazmi is a PhD scholar in IT at Lincoln University College, Malaysia, with 15 years of teaching experience.

  • Midhun Chakkaravarthy, Faculty of Computer Science and Multimedia, Lincoln University College, Malaysia

    Midhun Chakkaravarthy is an Associate Professor at Lincoln University College, Malaysia, and the Head of the Department at his university.

  • Syed Hassan Raza, School of Media and Communication, Taylor’s University, 47500 Subang Jaya, Selangor, Malaysia

    Syed Hassan Raza completed his PhD in China and served as a Professor of Communication Studies at the University of Sargodha, Pakistan. He is currently an Associate Professor at Taylor’s University, Malaysia.

  • Farrah Aslam, Department of Information Sciences, University of Education Lahore, Jauharabad Campus, Pakistan

    Farrah Aslam holds a Master’s in Computer Engineering and serves as a Lecturer at the University of Education, Lahore, Pakistan.

Published

2026-03-28

How to Cite

Qazmi, S. I. A., Chakkaravarthy, M., Raza, S. H., Aslam, F., Aslam, S., & Iftikhar, M. (2026). Review Helpfulness to Support Business: Identifying Fake Reviews from User-Generated Content Using Random Forest. Advances in Decision Sciences, 30(2), 114-156. https://doi.org/10.47654/v30y2026i2p114-156