Document Type: Research Paper
Authors
1 MSc., Computer Science, Islamic Azad University, Malayer Branch, Young and Elite Researchers Club, Malayer, Iran. Corresponding Author: Rezvaneyaghobi2050@gmail.com
2 MSc., Information Technology, Islamic Azad University, Malayer Branch, Iran.
3 Professor, Department of Computer Science, Bu Ali Sina University, Hamadan, Iran.
Abstract
Plagiarism is the act of taking the ideas or words of others and presenting them under one's own name. With the growth of the Internet and the proliferation of online articles, plagiarism has become easier. Many systems have been developed to detect it, and most of them rely on lexical structure and string-matching algorithms. As a result, they struggle to detect plagiarism that rewords the source text or substitutes synonyms. This paper presents a plagiarism detection method based on semantic role labeling and cellular learning automata. Semantic role labeling specifies the role of each word in a sentence, and cellular learning automata are used to locate the processed words. Comparison operations are performed for all sentences of the original text and the suspicious text. Experiments on the PAN-PC-11 corpus show that the proposed method improves recall, precision and F-measure compared with previous approaches to plagiarism detection.
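The sentence-by-sentence comparison and the evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the input format (each sentence as a mapping from a role label to a set of words), the Jaccard-based `role_similarity` score, the `threshold` value and all function names are assumptions made for illustration, and the cellular learning automata step is omitted entirely. The scoring function is a simplified sentence-pair version of precision, recall and F-measure, not the official scoring of the PAN-PC-11 corpus.

```python
from itertools import product

# Hypothetical input format: each sentence is a dict that maps a semantic
# role label (e.g. "ARG0", "V", "ARG1") to the set of words filling that role.
# Producing these labels requires an SRL tool, which is outside this sketch.

def role_similarity(sent_a, sent_b):
    """Average Jaccard overlap over the roles that both sentences contain."""
    shared_roles = set(sent_a) & set(sent_b)
    if not shared_roles:
        return 0.0
    scores = []
    for role in shared_roles:
        words_a, words_b = sent_a[role], sent_b[role]
        union = words_a | words_b
        scores.append(len(words_a & words_b) / len(union) if union else 0.0)
    return sum(scores) / len(scores)

def detect(original, suspicious, threshold=0.5):
    """Compare every suspicious sentence with every original sentence.

    Returns the set of (suspicious_index, original_index) pairs whose
    role-wise similarity reaches the (assumed) threshold.
    """
    return {
        (i, j)
        for (i, susp), (j, orig) in product(enumerate(suspicious), enumerate(original))
        if role_similarity(susp, orig) >= threshold
    }

def precision_recall_f(detected, gold):
    """Pair-level precision, recall and F-measure (harmonic mean)."""
    true_positives = len(detected & gold)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Toy data: the suspicious sentence swaps the verb for a synonym.
original = [
    {"ARG0": {"researchers"}, "V": {"developed"}, "ARG1": {"method"}},
    {"ARG0": {"students"}, "V": {"read"}, "ARG1": {"books"}},
]
suspicious = [
    {"ARG0": {"researchers"}, "V": {"created"}, "ARG1": {"method"}},
]

detected = detect(original, suspicious)
print(detected)
print(precision_recall_f(detected, gold={(0, 0)}))
```

Because the score is computed per role, a sentence that keeps the same agent and object but replaces the verb still matches on two of three roles, which is the kind of case that plain string matching tends to miss.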
Cite this article: Yaghobi, Rezvan, Yaghobi, Mehdi, Khotanlou, Hassan. (1400 SH / 2021). A new approach for plagiarism detection using cellular learning automata and semantic role labeling. Business Intelligence Management Studies, 9(36), 183-208. DOI: 10.22054/IMS.2021.49415.1661
Journal of Business Intelligence Management Studies is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.