Forecasting ROA and ROE for Retail Companies in Vietnam by Using Machine Learning Techniques
DOI:
https://doi.org/10.47654/v29y2025i4p63-93
Keywords:
Financial performance, forecasting, ROA, ROE, retail enterprises, Vietnam
Abstract
Purpose: This study forecasts the financial performance of Vietnamese retail companies by predicting Return on Assets (ROA) and Return on Equity (ROE) with artificial intelligence (AI) models, with the aim of improving predictive decision-making in the retail sector.
Design/methodology/approach: Financial statements from publicly listed retail firms for the period 2010 to 2024 are combined with macroeconomic variables (CPI, exchange rate, gold price, VN-Index, oil prices). Three machine learning algorithms (Random Forest, XGBoost, and Multilayer Perceptron) are trained and tested using an 80/20 split. Model accuracy is assessed with RMSE, MAE, MAPE, Pearson’s R, and Theil’s U.
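As a rough illustration of the evaluation step, the five accuracy measures and the 80/20 split can be sketched in plain Python. This is a minimal sketch, not the authors' code. The abstract does not say which variant of Theil's U is used, so the U1 form (bounded between 0 and 1) is assumed here. The function names, the unshuffled split, and the sample values are all hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (undefined if any actual value is 0)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def pearson_r(actual, forecast):
    """Pearson correlation between actual and forecast values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return cov / math.sqrt(var_a * var_f)


def theil_u1(actual, forecast):
    """Theil's U1 (assumed variant): RMSE scaled by the root mean squares of both series."""
    n = len(actual)
    scale = math.sqrt(sum(f * f for f in forecast) / n) + math.sqrt(sum(a * a for a in actual) / n)
    return rmse(actual, forecast) / scale


def split_80_20(rows):
    """Assumed split: first 80% of rows for training, last 20% for testing, no shuffling."""
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]


# Made-up ROA values for illustration only; these are not results from the study.
actual = [0.10, 0.20, 0.30, 0.40]
forecast = [0.12, 0.18, 0.33, 0.37]
print(rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
print(pearson_r(actual, forecast), theil_u1(actual, forecast))
```

Lower RMSE, MAE, MAPE, and Theil's U indicate better forecasts, while Pearson's R closer to 1 indicates that forecasts move with the actual values.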
Findings: Random Forest achieved the lowest RMSE (0.0926) and delivered the highest forecasting accuracy for both ROA and ROE, followed by XGBoost; the Multilayer Perceptron (MLP) was the least accurate. In this context, decision tree–based ensemble models capture non-linear relationships in financial data better than neural networks.
Research limitations/implications: The study covers only listed retail firms and annual data, omitting private firms and intra-year fluctuations. The models identify associations, not causality.
Practical implications: From a Decision-Sciences perspective, the proposed AI-driven forecasting framework operates as a data-rich decision-support system that transforms multi-source information into actionable guidance on capital allocation, risk mitigation, and performance control.
Originality/value: This is the first study to jointly forecast both ROA and ROE for Vietnamese retailers using three complementary ML families, integrating firm-level and macro-financial variables, and providing reproducible benchmarks for AI-driven financial forecasting in emerging markets.
License
Copyright (c) 2025 Advances in Decision Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Scientific and Business World