Document Type: Research Paper
Authors
1 Ph.D. Candidate of Information Technology Management, Allameh Tabataba’i University, Tehran, Iran. Corresponding Author: AbbasBagherian@yahoo.com
2 Associate Professor, Allameh Tabataba’i University, Tehran, Iran
3 Professor, Department of Industrial Management, Faculty of Management and Accounting, Allameh Tabataba’i University, Tehran, Iran
4 Associate Professor, University of Gävle, Gävle, Sweden
Abstract
Most traditional fraud detection systems rely on financial criteria to identify financial fraud and overlook the possibility that fraudulent companies also engage in non-financial misconduct. Recent studies have largely treated financial data as the sole indicator of fraud and have not explored non-financial, or Environmental, Social, and Governance (ESG), metrics as supplementary predictors. This research aims to improve fraud prediction by integrating financial and ESG data in machine learning and deep learning models. It examines the effectiveness of supervised machine learning and deep learning algorithms in detecting financial fraud over a 10-year period ending in 1401 (Solar Hijri calendar). On the first research question, the results indicate that, among the machine learning and deep learning algorithms tested, the bagging classification algorithm performed best. On the second research question, the dataset containing all features, financial and non-financial together, outperformed the datasets limited to either financial or non-financial data alone. The study thus shows that a hybrid model combining financial and non-financial criteria predicts financial fraud more accurately than models based solely on financial data, and that adopting the proposed model significantly improves the accuracy and effectiveness of fraud detection systems.
Introduction
In an era marked by rapid advancements in data analytics and increasing corporate accountability, the detection of financial fraud has become a priority for stakeholders across the global business landscape. Traditional fraud detection systems have primarily analyzed financial data and have often overlooked non-financial metrics that may equally signal fraudulent activity. This oversight matters given the growing evidence that non-financial indicators, particularly Environmental, Social, and Governance (ESG) metrics, can provide critical insights into the operational integrity of organizations.
Literature Review
Recent scholarly works and industry reports have highlighted a significant shift towards integrating ESG metrics with financial data to enhance the predictive accuracy of fraud detection systems. This integration reflects an expanded understanding of what constitutes corporate transparency and accountability, extending beyond mere financial disclosures to include broader sustainability and governance factors. Indeed, the integration of these diverse data sources promises a more holistic approach to fraud detection, aligning with contemporary demands for corporate responsibility and ethical business practices. The research presented in this paper builds on this foundation by employing advanced machine learning (ML) and deep learning (DL) algorithms to analyze a combination of financial and non-financial metrics. The study's innovative approach leverages a decade's worth of data from over 6000 public companies, utilizing a variety of ML and DL models to explore the efficacy of integrated datasets in predicting fraudulent activities more effectively than traditional methods. The findings aim to contribute not only to academic discourse but also to practical applications in corporate governance, offering valuable insights for regulators, investors, and policymakers committed to upholding the highest standards of corporate ethics and governance. By synthesizing complex data sets and applying sophisticated analytical techniques, this research underscores the potential of ML and DL models to revolutionize fraud detection, setting a new standard for both the scope and depth of fraud analysis.
Objective
The primary goal of this research is to improve financial fraud detection in public enterprises by integrating Environmental, Social, and Governance (ESG) metrics with traditional financial data, using machine learning (ML) and deep learning (DL) techniques. This approach addresses the limitations of traditional systems that focus mainly on financial indicators, often missing non-financial signs of fraud. This study rigorously tests various ML and DL models trained on ESG-enriched datasets against those using only financial data, exploring whether a holistic approach can enhance fraud predictiveness. The research aims to offer a broader view of company operations, in line with sustainable practices, potentially shifting how data science is applied in fraud detection. Ultimately, this study seeks to enrich discussions on integrating financial and non-financial data in fraud detection, influencing future corporate risk and governance strategies, and improving fraud prediction accuracy in line with emerging standards of corporate accountability and transparency.
Method
This study uses machine learning (ML) and deep learning (DL) to improve financial fraud detection, drawing on a dataset that includes both traditional financial indicators and Environmental, Social, and Governance (ESG) metrics from over 6000 public companies worldwide. These metrics are sourced from databases such as Thomson Reuters ASSET4. Data preprocessing includes handling missing values, normalizing data, and encoding categorical variables. Because fraudulent cases are rare, the dataset is balanced using oversampling techniques to counter class imbalance and improve model generalization.
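The balancing step described above can be illustrated with a minimal sketch of random oversampling. The source does not say which oversampling technique was used, so the function `random_oversample`, the toy rows, and the choice of simple random duplication are assumptions made only for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data, invented for illustration: 1 = fraud (rare), 0 = non-fraud.
X = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6], [0.9, 0.1]]
y = [0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In practice, oversampling is normally applied to the training partition only, so that duplicated fraud cases do not leak into the data used for evaluation.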
The research evaluates a range of ML and DL models, including Decision Trees, Naive Bayes, SVM, CNN, and LSTM, as well as ensemble methods such as Bagging, Extra Trees, and Random Forests. The models are trained and tested on separate partitions of the data, and their effectiveness is assessed using accuracy, precision, recall, F1-score, and the Matthews Correlation Coefficient (MCC). Cross-validation is used to check the stability of the results and to guard against overfitting. Model performance is compared with baseline models that use only financial data, which shows the benefit of adding ESG metrics for fraud prediction.
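As a minimal sketch of the evaluation metrics named above, the following computes accuracy, precision, recall, F1-score, and MCC from the four confusion-matrix counts using their standard definitions. The function names and the toy labels are hypothetical and are not taken from the study.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Standard threshold metrics; undefined ratios fall back to 0.0."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Toy labels, invented for illustration: 3 fraud cases among 10 companies.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(classification_metrics(y_true, y_pred))
```

Unlike accuracy, MCC uses all four confusion-matrix counts, which is one reason it is often preferred for imbalanced problems such as fraud detection.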
Results
This study evaluates the integration of Environmental, Social, and Governance (ESG) metrics with traditional financial data in detecting financial fraud using various machine learning (ML) and deep learning (DL) algorithms. The results show that fraud detection models perform better when they use financial and ESG metrics together. Notably, the Extra Trees classifier and the bagging algorithms excelled, particularly on balanced datasets that included both types of metrics. Oversampling proved crucial in improving detection rates for rare fraudulent cases, because it balances the dataset and reduces bias toward the majority class.
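To make the idea of bagging concrete, here is a minimal sketch in plain Python: each base learner is trained on a bootstrap sample of the data, and predictions are combined by majority vote. One-feature decision stumps stand in for the full decision trees a bagging ensemble would typically use, and the two feature columns (a financial ratio and an ESG score) and their values are invented. None of this reproduces the study's actual configuration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Pick the (feature, threshold, sign) rule with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[j] - thr) > 0 else 0 for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, thr, sign)
    return best[1:]


def predict_stump(stump, row):
    j, thr, sign = stump
    return 1 if sign * (row[j] - thr) > 0 else 0


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over all base learners."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Invented toy rows: [hypothetical financial ratio, hypothetical ESG score].
X = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3], [0.4, 0.2],
     [0.8, 0.9], [0.7, 0.8], [0.9, 0.7], [0.6, 0.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = fraud, 0 = non-fraud

model = fit_bagging(X, y)
print(predict_bagging(model, [0.95, 0.9]), predict_bagging(model, [0.05, 0.1]))
```

Averaging many learners trained on resampled data reduces the variance of any single learner, which is the usual motivation for bagging with unstable base models such as decision trees.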
Models integrating both financial and ESG data consistently outperformed those using only one data type, enhancing accuracy, precision, recall, and F1 score. This underscores the value of a multidimensional approach in fraud detection. Advanced metrics like the Matthews Correlation Coefficient (MCC) and the Area Under the ROC Curve (AUC) provided a nuanced assessment of model performance, with higher MCC and AUC values indicating greater effectiveness in identifying fraudulent activities. The integration of ESG metrics was particularly effective in identifying potential fraud in companies that might appear financially sound but engage in unethical practices.
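The AUC mentioned above can be read as the probability that a randomly chosen fraudulent case receives a higher model score than a randomly chosen non-fraudulent one. The sketch below computes it directly from that pairwise definition. The function name `roc_auc` and the toy scores are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy fraud scores, invented for illustration.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(round(roc_auc(y_true, scores), 4))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so AUC complements threshold-based metrics such as MCC by assessing the ranking of cases independently of any decision cutoff.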
The findings suggest that companies, regulatory bodies, and technology developers should adopt integrated approaches that draw on both financial and ESG data to improve fraud detection. Future research could focus on real-time data integration and more complex models, such as hybrid deep learning frameworks, to further improve detection. Overall, the study demonstrates that using ESG metrics alongside financial data with advanced ML techniques significantly improves the accuracy and reliability of fraud detection systems, in line with sustainable business practices.
Conclusion
This research advances the use of machine learning (ML) and deep learning (DL) in detecting financial fraud by integrating Environmental, Social, and Governance (ESG) metrics with traditional financial data to enrich datasets and improve predictive power. Models trained on datasets combining financial and ESG metrics show superior accuracy, precision, recall, and F1 score, improving anomaly detection and fraud prediction. Oversampling addresses class imbalance, increasing sensitivity to rare fraudulent cases and improving the performance of ensemble methods such as the Extra Trees classifier.
The findings highlight the role of ESG metrics in strengthening corporate governance and risk management: they provide insight into non-financial behaviors that indicate potential risks, which supports more informed decision-making and greater transparency. Future research should investigate real-time fraud detection systems and the use of unsupervised and semi-supervised models to adapt to evolving fraud tactics. Practitioners are encouraged to adopt ML and DL techniques that incorporate ESG metrics to improve the accuracy and reliability of fraud detection systems, in line with sustainable business practices.
Keywords: Fraud Detection Intelligent Systems, Deep Learning, Machine Learning, Financial Metrics, Non-Financial Metrics (ESG).