Document Type : Research Paper

Authors

1 Department of Information Technology Management, Faculty of Management and Economics, Islamic Azad University, Science and Research Branch, Tehran, Iran

2 Industrial Management, Islamic Azad University, Karaj Branch (Corresponding Author: poorebrahimi@gmail.com)

3 Department of Information Technology Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran.

4 Faculty Member (Associate Professor), Management & Economic Faculty, Tarbiat Modares University, Tehran, Iran

Abstract

Fraud cases have increased in recent years, especially in sensitive financial and insurance fields, and dealing with them requires measures beyond traditional inspection methods. Because of its nature and wide coverage, agricultural insurance is not exempt from this threat, and every year a large amount of money is spent on paying fake damage claims. This research aims to provide a model for discovering unrealistic damage claims in agricultural insurance using data mining and machine learning techniques; a deep learning model was built for this purpose. The data were obtained from the Agricultural Insurance Fund and relate to irrigated and rainfed wheat insurance policies of Khuzestan province for which compensation was paid in the 2018-2019 crop year. After the data were prepared and preprocessed, deep learning was used to discover unusual cases, and the results were evaluated by experts of the Agricultural Insurance Fund. The analysis showed that 1% of the damages paid were related to unrealistic claims and that more care should be taken in paying damages. The accuracy of the model in detecting unusual cases was 53.53 percent for irrigated wheat and 63.37 percent for rainfed wheat. The review of the results also showed that five categories of unusual behavior led to the payment of unrealistic damages, of which failure to provide damage documentation was the most frequent.

Introduction

Insurance fraud refers to the immoral act of abusing an insurance policy with the intention of obtaining illegal profit from an insurance company. In general, insurance exists to protect the assets and business of individuals or organizations against financial loss, and fraud may be committed at any stage of the insurance process by anyone, such as customers or fraudulent agents (Al-Hashedi & Magalingam, 2021). Insurance fraud not only reduces the insurer's profit and leads to major losses, but also affects its pricing strategy and socio-economic benefits in the long term (Yaram, 2016). Every year, significant sums are defrauded from the insurance industry, and not all of these frauds are discovered. According to statistics published by the Coalition Against Insurance Fraud, fraud adds about eighty billion dollars to customers' expenses in the United States, which customers must cover by paying higher insurance premiums in the following year (Fraud stats, 2020). In Iran, there is no accurate estimate of the compensation paid for unrealistic damage claims or any other fraud, and one goal of this research is to estimate the amount of fraud in wheat crop insurance using deep learning.

Research Question(s)

This research seeks to find answers to these questions: In rainfed and irrigated wheat crop insurance, what percentage of the paid compensations are related to unrealistic and fictitious damage claims, and what is the accuracy of deep learning detection for this purpose?

Literature Review

Newlands et al. (2019) investigated the use of deep learning for predicting agricultural yield across time and space under unstable weather conditions. They compared the performance of machine learning, used alongside weather station data, with that of conventional methods. Their findings showed that deep learning provided the highest prediction accuracy among the compared approaches. It can also be inferred from this result that deep learning can help reduce agricultural insurance costs by providing accurate measures of crop yield (Newlands et al., 2019). Gomes et al. (2021) presented a new deep learning method for gaining pragmatic insight into the behavior of an insured individual using unsupervised variable importance. Their method can be used in pension insurance, investment, and other broader areas of the insurance industry. It enables autoencoders and variational autoencoders to be used in semi-supervised and unsupervised variable importance analysis to identify fraudulent agents (Gomes et al., 2021). Xia et al. (2022) proposed a deep learning model for detecting car insurance fraud that combines a convolutional neural network, long short-term memory (LSTM), and a deep neural network. Their method extracts more abstract features automatically, which helps experts with the complex feature extraction process that is critical in traditional machine learning algorithms. The experimental results showed that the method can effectively improve the accuracy of car insurance fraud detection.

Methodology

In terms of its objective, the current research is applied, and in terms of its nature, it is data-driven. For machine learning modeling, the standard CRISP-DM process was used, which includes the stages of data collection, data preparation and preprocessing, modeling, model evaluation, and obtaining results. Figure 1 shows the general process of anomaly detection and analysis.
Figure 1. Anomaly detection process framework
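
The modeling and evaluation stages of such an anomaly detection process can be illustrated with a toy example. The sketch below is not the model used in this research: it assumes a one-unit linear autoencoder with tied weights, trained by plain gradient descent on made-up two-feature data, and flags the records with the largest reconstruction error. The function names, the data, and the flagged fraction are all hypothetical.

```python
def train_linear_autoencoder(rows, epochs=500, lr=0.05):
    """Fit a one-unit, tied-weight linear autoencoder by full-batch gradient descent."""
    dim = len(rows[0])
    w = [0.5] * dim  # fixed starting weights keep the toy example deterministic
    for _ in range(epochs):
        grad = [0.0] * dim
        for x in rows:
            z = sum(wi * xi for wi, xi in zip(w, x))        # encode to one number
            r = [xi - wi * z for xi, wi in zip(x, w)]       # residual: x minus decoded z
            wr = sum(wi * ri for wi, ri in zip(w, r))
            for j in range(dim):
                grad[j] += z * r[j] + wr * x[j]             # descent direction for squared error
        w = [wi + lr * 2.0 * g / len(rows) for wi, g in zip(w, grad)]
    return w


def reconstruction_error(w, x):
    """Squared distance between a record and its reconstruction."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return sum((xi - wi * z) ** 2 for xi, wi in zip(x, w))


def flag_anomalies(rows, fraction):
    """Return indices of the records with the largest reconstruction error."""
    w = train_linear_autoencoder(rows)
    errors = [reconstruction_error(w, x) for x in rows]
    k = max(1, round(fraction * len(rows)))
    ranked = sorted(range(len(rows)), key=lambda i: errors[i], reverse=True)
    return ranked[:k], errors


# Made-up data: 20 records whose two scaled features move together,
# plus one record (index 20) that breaks the pattern.
rows = [(i / 19, i / 19) for i in range(20)] + [(0.1, 0.9)]
flagged, errors = flag_anomalies(rows, fraction=0.05)
print(flagged)
```

In this sketch the flagged records would then go to human experts for review, mirroring the evaluation step in which the fund's experts assessed the detected cases.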
 
In this research, data on irrigated and rainfed wheat for one crop year were obtained from the Agricultural Insurance Fund. The national code of the insured was removed from the data set to maintain confidentiality. The extracted data relate to the crop insurance policies of irrigated and rainfed wheat in Khuzestan province for the 2018-2019 crop year. In this crop year, compensation was paid for these policies according to the damage claimed; in other words, the data set includes those irrigated and rainfed wheat policies whose crop was damaged and compensated. The data were obtained from the comprehensive system of the insurance fund as a CSV report. The resulting data set had 23 features.
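
As a rough illustration of this preparation step, the sketch below reads a CSV report, drops the identifying column, and rescales each numeric feature to the range 0 to 1. The column names and values are invented for the example (the real report has 23 features whose names are not listed here), and min-max scaling is an assumed preprocessing choice rather than one stated in the text.

```python
import csv
import io

# Made-up sample standing in for the fund's CSV report.
# All column names and values are hypothetical.
SAMPLE_CSV = """national_code,insured_area_ha,claimed_damage_pct,compensation
0000000001,12.5,40,1800
0000000002,3.0,90,950
0000000003,7.5,65,1300
"""


def load_policies(handle, id_column="national_code"):
    """Read policies from a CSV handle, dropping the identifying column."""
    rows = []
    for record in csv.DictReader(handle):
        record.pop(id_column, None)  # remove the identifier for confidentiality
        rows.append({name: float(value) for name, value in record.items()})
    return rows


def min_max_scale(rows):
    """Rescale every feature to [0, 1]; constant features become 0.0."""
    columns = list(rows[0].keys())
    lo = {c: min(r[c] for r in rows) for c in columns}
    hi = {c: max(r[c] for r in rows) for c in columns}
    return [
        {c: (r[c] - lo[c]) / (hi[c] - lo[c]) if hi[c] > lo[c] else 0.0
         for c in columns}
        for r in rows
    ]


policies = load_policies(io.StringIO(SAMPLE_CSV))
scaled = min_max_scale(policies)
print(scaled[0])
```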

Conclusion

The results show that in wheat insurance about 1% of the compensation paid goes to unrealistic claims, so claims need to be investigated further by experts before payment. This share was close to the estimate of the insurance fund's inspection experts, who stated that about 1.5% of claims are unrealistic. The results also identified five categories of behavior or methods by which beneficiaries received compensation for unrealistic claims, listed below:

Lack of sufficient documentation to prove the damage: the documents that should be uploaded to the system according to the implementation procedures are missing, or only some of them have been uploaded. Payment of compensation without documents showing that damage occurred can result from negligence, or from collusion between the appraiser or broker and the insured.
The documents are not in accordance with the declared damage: the documents uploaded to the system under the relevant instructions do not show that the registered type of damage occurred. For example, the storm wind speed is declared as 50 km/h, whereas the meteorological documents show 15 km/h.
The damage documentation is not true: for example, in some files the expert form records drought as the risk factor, but the submitted picture shows flood damage. This is probably due to negligence. Another possibility is that an image of damaged farmland was submitted in place of the healthy insured land.
Non-observance of the damage notification period: according to the executive instructions of the insurance fund, the time limit from the declaration of damage to the payment of compensation is one month; anything outside that period is against the instructions. In some cases, the damage had been declared before the accident occurred.
The date of damage does not match the date of its announcement: according to the executive instructions of the insurance fund, agricultural damage must be inspected within one week of its occurrence, so that the type and amount of damage can be checked carefully before its effects are removed. In some cases, the announcement date was recorded one month after the damage occurred. Once the effects of damage have been removed, the payment of compensation appears suspicious, because there may not have been any damage at all.
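
The two timing-related categories above lend themselves to simple rule checks. The following sketch is one possible reading of those rules, not the fund's actual procedure: it assumes that "one month" means 30 days, that the inspection visit must fall within 7 days of the damage, and that each claim carries the four dates shown. The `Claim` fields, the flag labels, and the example dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Claim:
    damage_date: date    # when the damage is said to have occurred
    declared_date: date  # when the insured announced the damage
    visit_date: date     # when the appraiser inspected the field
    paid_date: date      # when compensation was paid


def timing_flags(claim,
                 visit_limit=timedelta(days=7),
                 payment_limit=timedelta(days=30)):
    """Return labels for the timing rules that a claim appears to violate."""
    flags = []
    if claim.declared_date < claim.damage_date:
        flags.append("declared before damage")
    if claim.visit_date - claim.damage_date > visit_limit:
        flags.append("late inspection visit")
    if claim.paid_date - claim.declared_date > payment_limit:
        flags.append("outside declaration-to-payment window")
    return flags


# Made-up claims for illustration only.
early = Claim(date(2019, 3, 10), date(2019, 3, 5), date(2019, 3, 12), date(2019, 3, 25))
late = Claim(date(2019, 3, 10), date(2019, 4, 12), date(2019, 4, 14), date(2019, 4, 20))
ok = Claim(date(2019, 3, 10), date(2019, 3, 11), date(2019, 3, 13), date(2019, 4, 1))

for claim in (early, late, ok):
    print(timing_flags(claim))
```

Checks of this kind could complement a learned anomaly score, since they state explicitly which instruction a flagged claim appears to break.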

Keywords: Anomaly Detection, Crop Insurance, Deep Learning, Auto Encoder.
 


  1. Al-Hashedi, K. G., & Magalingam, P. (2021). Financial fraud detection applying data mining techniques: A comprehensive review from 2009 to 2019. Computer Science Review, 40, 100402. https://doi.org/10.1016/j.cosrev.2021.100402
  2. Bisong, E. (2019). Optimization for Machine Learning: Gradient Descent. In E. Bisong (Ed.), Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners (pp. 203-207). Apress. https://doi.org/10.1007/978-1-4842-4470-8_16
  3. Brownlee, J. (2020). Data preparation for machine learning: data cleaning, feature selection, and data transforms in Python. Machine Learning Mastery. https://www.google.com/books/edition/Data_Preparation_for_Machine_Learning/uAPuDwAAQBAJ?hl=en&gbpv=1&dq=Data%20Preparation%20for%20Machine%20Learning&pg=PP1&printsec=frontcover
  4. Brownlee, J. (2021). How to choose an activation function for deep learning. Machine Learning Mastery. https://machinelearningmastery.com/choose-an-activation-function-for-deep-learning/
  5. Chalapathy, R., & Chawla, S. (2019). Deep learning for anomaly detection: A survey. arXiv preprint arXiv:1901.03407. https://doi.org/10.48550/arXiv.1901.03407
  6. Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3), 1-58. https://doi.org/10.1145/1541880.1541882
  7. Crop Insurance Statistics. (2022). Cropinsurance.org. Retrieved July 22, 2022, from https://cropinsurance.org/wp-content/uploads/2021/02/2020-Crop-Insurance-Myths-v-Facts-Improper-Payment-Rate.pdf
  8. Debener, J., Heinke, V., & Kriebel, J. (2023). Detecting insurance fraud using supervised and unsupervised machine learning. Journal of Risk and Insurance. https://doi.org/10.1111/jori.12427
  9. Ekin, T., Lakomski, G., & Musal, R. M. (2019). An unsupervised Bayesian hierarchical method for medical fraud assessment. Statistical Analysis and Data Mining: The ASA Data Science Journal, 12(2), 116-124. https://doi.org/10.1002/sam.11408
  10. Finke, T., Krämer, M., Morandini, A., Mück, A., & Oleksiyuk, I. (2021). Autoencoders for unsupervised anomaly detection in high energy physics. Journal of High Energy Physics, 2021(6), 1-32. https://doi.org/10.1007/JHEP06(2021)161
  11. Fraud stats. (2020). Retrieved from https://insurancefraud.org/fraud-stats/
  12. (2006). Crop insurance: More needs to be done to reduce program's vulnerability to fraud, waste, and abuse. Retrieved from https://www.gao.gov/assets/gao-06-878t.pdf
  13. Gomes, C., Jin, Z., & Yang, H. (2021). Insurance fraud detection with unsupervised deep learning. Journal of Risk and Insurance, 88(3), 591-624. https://doi.org/10.1111/jori.12359
  14. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. https://www.google.com/books/edition/Deep_Learning/omivDQAAQBAJ?hl=en&gbpv=1&dq=deep+learning+goodfellow&pg=PR5&printsec=frontcover
  15. Hilal, W., Gadsden, S. A., & Yawney, J. (2022). Financial Fraud: A Review of Anomaly Detection Techniques and Recent Advances. Expert Systems with Applications, 193, 116429. https://doi.org/10.1016/j.eswa.2021.116429
  16. Kim, D., Lee, S., & Lee, J. (2020). An ensemble-based approach to anomaly detection in marine engine sensor streams for efficient condition monitoring and analysis. Sensors, 20(24), 7285. https://doi.org/10.3390/s20247285
  17. Kirlidog, M., & Asuk, C. (2012). A fraud detection approach with data mining in health insurance. Procedia-Social and Behavioral Sciences, 62, 989-994. https://doi.org/10.1016/j.sbspro.2012.09.168
  18. Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., & Alsaadi, F. E. (2017). A survey of deep neural network architectures and their applications. Neurocomputing, 234, 11-26. https://doi.org/10.1016/j.neucom.2016.12.038
  19. Marzen, C. G. (2013). Crop Insurance Fraud and Misrepresentations: Contemporary Issues and Potential Remedies. SSRN Electronic Journal, 675-707.
  20. Miao, J., & Niu, L. (2016). A Survey on Feature Selection. Procedia Computer Science, 91, 919-926. https://doi.org/10.1016/j.procs.2016.07.111
  21. Newlands, N., Ghahari, A., Gel, Y. R., Lyubchich, V., & Mahdi, T. (2019). Deep learning for improved agricultural risk management. https://scholarspace.manoa.hawaii.edu/bitstream/10125/59543/1/0103.pdf
  22. Nian, K., Zhang, H., Tayal, A., Coleman, T., & Li, Y. (2016). Auto insurance fraud detection using unsupervised spectral ranking for anomaly. The Journal of Finance and Data Science, 2(1), 58-75. https://doi.org/10.1016/j.jfds.2016.03.001
  23. Raschka, S., & Mirjalili, V. (2019). Python machine learning: Machine learning and deep learning with Python, scikit-learn, and TensorFlow 2. Packt Publishing Ltd. https://www.igi-global.com/pdf.aspx?tid%3D267132%26ptid%3D254262%26ctid%3D17%26t%3Dpython+machine+learning%3A+machine+learning+and+deep+learning+with+python%2C+scikit-learn%2C+and+tensorflow+2%2C+third+edition%26isxn%3D
  24. Rezapour, M. (2019). Anomaly detection using unsupervised methods: credit card fraud case study. International Journal of Advanced Computer Science and Applications, 10(11).
  25. Sarker, I. H. (2021). Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Computer Science, 2(3), 160. https://doi.org/10.1007/s42979-021-00592-x
  26. Yaram, S. (2016, August 23-25). Machine learning algorithms for document clustering and fraud detection. Paper presented at the 2016 International Conference on Data Science and Engineering (ICDSE). https://doi.org/10.1109/ICDSE.2016.7823950
  27. Zamini, M., & Montazer, G. (2018). Credit Card Fraud Detection using autoencoder based clustering. Paper presented at the 2018 9th International Symposium on Telecommunications (IST). https://doi.org/10.1109/ISTEL.2018.8661129
  28. Zhang, C., Liu, J., Chen, W., Shi, J., Yao, M., Yan, X., ... Chen, D. (2021). Unsupervised Anomaly Detection Based on Deep Autoencoding and Clustering. Security and Communication Networks, 2021, 7389943. https://doi.org/10.1155/2021/7389943
  29. Zhao, Y., Nasrullah, Z., & Li, Z. (2019). PyOD: A Python toolbox for scalable outlier detection. arXiv preprint arXiv:1901.01588.
  30. Ghobakhloo, M., Rajabzadeh, A., Toloie, A., & Alborzi, M. (2022). Designing a Banking Personalized Recommender System Using Sentiment Analysis in Social Media. Journal of Business Intelligence Management Studies, 10(39), 257-289. https://doi.org/10.22054/ims.2021.59775.1932 [In Persian]