Business Intelligence Management Studies

Data science, intelligence and future analysis

Classification of User Comments on Virtual Reality Technology by Topic Modeling

Fariba Karimi; Ameneh Khadivar; Fatemeh Abbasi

Volume 12, Issue 47, November 2024, Pages 1-43

https://doi.org/10.22054/ims.2023.74147.2342

Abstract

In recent years, the rapid growth of cyberspace has led people to spend more of their time there, especially on social networks. This can be attributed to features such as faster information exchange, easy and free access to information, and a wide variety of topics. As a result, the volume of opinions that users post on social networks has grown steadily and become very important, and extracting the opinions and sentiments they express helps businesses make better-informed decisions. Meanwhile, virtual reality technology has undergone technical changes over the past few decades that have improved immersion and the sense of telepresence. The technology is used in fields such as education, tourism, health, sports, entertainment, architecture, and construction. Its continuing progress has drawn many businesses into the field, but because of constant market changes and the need for timely information, companies must pursue differentiation and growth strategies. To do so, they need to gather users' opinions and use them to grow and improve their business. Because users' comments are textual, reading and summarizing them is time-consuming and difficult. The aim of this research was therefore to categorize comments about virtual reality technology using machine learning methods and a dictionary-based approach.
To this end, about one million tweets about virtual reality technology were collected with a web crawler; after preprocessing, 480,432 samples remained. Latent Dirichlet Allocation (LDA) topic modeling was then applied to the data. LDA separates topics by examining the distribution of words in the tweets, so tweets with similar word distributions are assigned to the same topic. The number of topics with the highest coherence score was selected; nine topics gave the highest coherence, so LDA was run again with the number of topics set to nine, which grouped the tweets into nine topics. Because the model yields a probability distribution, it was evaluated with perplexity, which had a value of -9.44. The coherence score, which measures the semantic similarity of the words within a topic and the separation between topics, was 0.47. The lower the perplexity and the higher the coherence score, the better the model. Using the keyword weights produced by LDA and an examination of at least five tweets from each topic, nine topics related to virtual reality technology were identified: "New Technology", "Creation and Make", "Technological Business", "Education", "Virtual Games", "Progress", "Gadget", "Metaverse", and "Indiegame". The topics were then analyzed with the help of several graphs.
We found that for topics such as "New Technology" and "Metaverse", neutral comments outnumber positive and negative ones, which indicates that users lack sufficient information about these technologies or have not used them; businesses in this field need to make a greater effort here. Likewise, the graphs for "Virtual Games" and "Technological Business" change in almost the same proportion from year to year, which means the two are related. Businesses should keep in mind that the same factors affect both topics, but users pay more attention to "Virtual Games". Consequently, if those behind "Technological Business" focus specifically on "Virtual Games", they will grow more because of that greater attention. Game creators should also note that "Virtual Games" attracts more attention than "Indiegame". For "Education" and "Gadget", users lost interest in these topics within virtual reality over time and turned their attention to other topics, so businesses operating in these areas would do well to advertise and attract users, or to change their target area if there is no growth.

Introduction

Constant changes in the market and the need for timely information force companies to use differentiation and growth strategies suited to customers' needs (Sánchez, Folgado-Fernández, & Sánchez, 2022). Companies can monitor and analyze their customers' opinions through microblogging sites (Facebook, Twitter, etc.) and ultimately improve their products or services (Ahmad, Aftab, Bashir, & Hameed, 2018). Today, users express their opinions and feelings and review products on online social networks. User comments and their analysis have therefore become a valuable resource for businesses (Kim et al., 2015; Loureiro et al., 2019).
Virtual reality and augmented reality have undergone technical developments over the past few decades that have improved immersion and the sense of telepresence. Applications of these techniques can be found in stores, the tourism industry, hotels, restaurants, and elsewhere (Loureiro, Guerreiro, & Ali, 2020). Because of constant market changes and the need for timely information, companies should use differentiation and growth strategies. With the rapid evolution of the Internet, companies no longer need to collect opinions through time-consuming and expensive methods such as questionnaires and interviews, because users express them on social networks. This is very useful for business development: companies can gauge customers' feelings toward products and services, understand users' needs, and make appropriate decisions to support growth. To use this content properly, however, text mining and sentiment analysis techniques are needed, and this has not been researched in Iran so far. Analyzing users' opinions and feelings about virtual reality technology can help businesses operating in the metaverse, virtual game production, virtual education, virtual tourism, and similar areas make better decisions and plans.

Literature Review

Social media generates a large volume of real-time social signals that can provide new insights into human behavior and emotions, and people around the world are constantly engaged with it (Al-Samarraie, Sarsam, & Alzahrani, 2023). At the same time, the amount of data is increasing day by day. Almost all institutions, organizations, and industries store their data electronically. A huge amount of text circulates on the Internet in digital libraries, repositories, and other sources such as blogs, social media networks, and emails (Sagayam, Srinivasan, & Roshni, 2012).
Topic modeling is one of the most powerful text mining techniques for discovering hidden structure in data and finding relationships among data and text documents (Jelodar et al., 2017). The technological advances of the last century have confronted societies with new realities that have indisputably improved daily life, making it more convenient and interesting. In recent decades, virtual reality and wearable devices have had a significant impact on education, tourism, health, sports, entertainment, architecture, construction, and other fields (Kosti et al., 2023). Virtual reality is a technology that lets a user interact with a computer-simulated environment, whether that environment simulates the real world or an imaginary one. With virtual reality, we can experience the most frightening and overwhelming situations safely, through play and with a view to learning (Mandal, 2013). Most people are curious about the possibilities and future of new technologies, given the applications they are expected to offer, such as virtual meetings and learning environments. However, there are also concerns about potential negative effects, because real-world signals can be transmitted in the virtual world. People express their feelings on these matters in different social networks (Bhattacharyya et al., 2023).

Methodology

The main goal of the research is to classify comments about virtual reality technology using machine learning methods and a dictionary-based approach. To this end, about one million tweets about virtual reality technology were collected with a web crawler; after preprocessing, 480,432 samples remained, and Latent Dirichlet Allocation (LDA) topic modeling was applied to the data.
LDA separates topics by examining the distribution of words in the tweets, so tweets with similar word distributions are assigned to the same topic. The number of topics with the highest coherence score was selected; nine topics gave the highest coherence, so LDA was run again with nine topics, which grouped the tweets into nine topics. Because the model yields a probability distribution, perplexity was used to evaluate it. The lower the perplexity and the higher the coherence score, the better the model. Using the keyword weights produced by LDA and an examination of at least five tweets from each topic, nine topics related to virtual reality technology were identified and named "New Technology", "Creation and Make", "Technological Business", "Education", "Virtual Games", "Progress", "Gadget", "Metaverse", and "Indiegame".

Discussion and Conclusion

Examining the topics across years, we observed that "Progress" was the most popular topic among users from 2017 to the end of 2021. In early 2022 it gave way to "Metaverse", which is currently one of the topics users discuss most. Businesses in the field of virtual reality should work to make their "Metaverse" offerings attractive and draw in users.
Likewise, the "Virtual Games" and "Technological Business" graphs change in almost the same proportion from year to year, which means they are related. Businesses should keep in mind that the same factors affect both topics but have a stronger effect on "Virtual Games"; if "Technological Business" ventures focus specifically on virtual games, they will grow more because of users' greater attention. "Indiegame" went through a series of changes, then declined in recent years and leveled off, so the creators of these games should look into the causes; in general, "Virtual Games" is a more interesting topic to users than "Indiegame". "Education" and "Gadget" have been declining since the beginning of 2017, which shows that users lost interest in these topics within virtual reality over time and turned their attention to other topics. Businesses active in these areas would therefore do well to advertise and attract users, or to change their target area if there is no growth.

Keywords: Data Mining, Text Mining, Virtual Reality Technology, Topic Modeling, Latent Dirichlet Allocation.
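The topic-count selection described in the abstract above (fit candidate models, score each by coherence, keep the best) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: it uses the UMass coherence measure as a stand-in, since the abstract does not say which coherence measure was used, and the toy documents, the candidate topic word lists, and the function names `umass_coherence` and `pick_topic_count` are all hypothetical.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    Sums log((D(later, earlier) + 1) / D(earlier)) over ordered pairs of
    the topic's top words, where D counts documents containing the words.
    Assumes every top word occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_count(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for earlier, later in combinations(top_words, 2):
        score += math.log((doc_count(earlier, later) + 1) / doc_count(earlier))
    return score


def pick_topic_count(candidates, documents):
    """Return the topic count whose topics have the highest mean coherence."""
    def mean_coherence(topics):
        return sum(umass_coherence(t, documents) for t in topics) / len(topics)

    return max(candidates, key=lambda k: mean_coherence(candidates[k]))


# Hypothetical tokenized tweets (illustrative only, not the study's data).
documents = [
    ["vr", "headset", "game"],
    ["vr", "game", "play"],
    ["metaverse", "business", "vr"],
    ["metaverse", "business", "market"],
]

# Hypothetical top words per topic for two candidate topic counts, standing
# in for the output of a topic model fitted once per candidate count.
candidates = {
    2: [["vr", "game"], ["metaverse", "business"]],
    3: [["vr", "market"], ["game", "business"], ["metaverse", "play"]],
}

best_k = pick_topic_count(candidates, documents)
print("selected number of topics:", best_k)
```

In a real run, the word lists for each candidate count would come from a fitted LDA model rather than being written by hand; the selection step itself stays the same.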

Data science, intelligence and future analysis

Predicting students' performance using machine learning algorithms and educational data mining (a case study of Shahed University)

Mozhdeh Salari; Reza Radfar; Mahdi Faghihi

Volume 12, Issue 47, November 2024, Pages 315-366

https://doi.org/10.22054/ims.2023.75523.2375

Abstract

The purpose of this research is to investigate the factors that are effective in predicting the academic performance of undergraduate students in a four-class classification. To achieve this goal, the study follows the CRISP data mining method. The data set was extracted from the NAD educational system for bachelor's degree students at Shahed University who entered between 2011 and 2021; 1468 records were used in data mining. First, the features that affect students' academic performance were extracted. Modeling was done with RapidMiner 9.9. To improve classification performance and reach satisfactory prediction accuracy, we combine principal component analysis with machine learning algorithms, feature selection techniques, and optimization algorithms. The performance of the prediction models is verified using 10-fold cross-validation. The results showed that the decision tree algorithm is the best at predicting students' performance, with an accuracy of 84.71%. Based on the final GPA, this algorithm correctly predicted the graduation of 77.88% of excellent students, 85.26% of good students, 84.69% of medium students, and 85.96% of weak students.

Introduction

The main problem in this research is to identify the factors that are effective in predicting the academic performance of undergraduate students at Shahed University. A second issue is choosing the best machine learning algorithm for predicting academic performance from among different modeling methods, based on validation and evaluation of the models.
The purpose of this research is to investigate the factors that are effective in predicting the academic performance of undergraduate students at Shahed University, using educational data mining based on classification models.

Research questions

The main question in this research is: what factors affect the prediction of undergraduate students' performance and the improvement of their performance?

Sub questions

1- Which modeling algorithms have better results in predicting student performance?
2- What methods have been used to predict students' performance?
3- What is the validity of the developed model for Shahed University students?

2- Research background

2-1- Theoretical foundations

Educational data mining

The processing of educational data improves the prediction of student behavior and supports new approaches to educational policies (Capuano & Toti, 2019; Viberg et al., 2018).

Academic performance

Academic performance of students means the extent to which they achieve educational goals (Banik & Kumar, 2019).

2-2- Review of past studies

The highlighted cells in Table 1 show, for each past study, the classification algorithms that had the highest accuracy and effectiveness in predicting students' performance. The decision tree algorithm has been used most often in previous research, followed by the NB algorithm. The RF and ANN algorithms come next, and after them the SVM and KNN algorithms.

Table 1. The results of the research literature based on the use of classification algorithms

Data mining algorithm: DT, RF, NB, KNN, SVM, ANN, Line RLLR; Accuracy

(Batool et al., 2023)  * *
(Marjan et al., 2023)  ******
(Abdelmagid & Qahmash, 2023)  * ** *
(Manoharan et al., 2023)  ** * * *
(Alghamdi & Rahman, 2023)  ***  99.34%
(Alboaneen et al., 2022)  * ****
(Yağcı, 2022)  * *** *  70-75%
(Dabhade et al., 2021)  * * *  83.44%
(Najafi et al., 2021)  *  95%
(Soltani et al., 2021)  * **
(Cruz-Jesus et al., 2020)  * ** *  50-81%
(Sokkhey & Okazaki, 2020)  *** *
(Rebai et al., 2020)  **
(Jayaprakash et al., 2020)  ***
(Zulfiker et al., 2020)  ** *
(Musso et al., 2020)  *
(Waheed et al., 2020)  *  85%
(Salal & Abdullaev, 2019)  * ****
(Turabieh, 2019)  * ** *
(Xu et al., 2019)  * **
(Ghodoosi et al., 2019)  * *
(Fadavi et al., 2019)  *  95.84%
(Ajibade et al., 2019)  * ***  91.5%
(Ahmad & Shahzadi, 2018)  *  85%
(Hasani & Bazrafshan, 2018)  * *
(Hussain et al., 2018)  *** *
(Umer et al., 2017)  **** *
(Khasanah, 2017)  * *
(Asif et al., 2017)  *
(Hoffait & Schyns, 2017)  * * *  92.34%
(Khosravi et al., 2017)  * *
(Mueen et al., 2016)  * * *  86%
(Amrieh et al., 2015)  * **
(Yehuala, 2015)  * *  92.34%
(Zahedi et al., 2015)  * * *
(Punlumjeak & Rachburee, 2015)  *
(Osmanbegović et al., 2014)  **  71%
(Shamloo et al., 2014)  *
(Asadi et al., 2013)  *
(Kabakchieva, 2013)  * **  60-75%
(Oskouei & Askari, 2014)  *** *  96%
(Nghe et al., 2007)  * *
Present research  ******  94.17%

3- Method

This study follows the popular data mining method CRISP. The data set was extracted from the NAD educational system for bachelor's degree students in non-medical fields at Shahed University and covers 2011 to 2021. We used the Label Encoder technique to encode the features. The C4.5 and ID3 decision tree classification algorithms, random forest, Naïve Bayes, k-nearest neighbor, artificial neural network, and gradient boosted trees were used to analyze and classify students and predict the final GPA. Modeling was done using RapidMiner 9.9.
To improve the classification performance and reduce misclassification, we combine principal component analysis with feature selection techniques and optimization algorithms. Prediction accuracy was evaluated using 10-fold cross-validation for all algorithms. The algorithms were also compared descriptively and analytically on the evaluation criteria, and the best prediction model was identified.

4- Data analysis

4-1 Introduction

The best model is the one that has the best values for the selected performance measures (Lever et al., 2016). Figure 1 compares the accuracy of the algorithms used in this research.

Figure 1. Comparative chart of the accuracy of the algorithms

According to Table 2, the DT C4.5 algorithm correctly predicts the class of 1235 objects out of 1458, which gives it an accuracy of 84.71%.

Table 2. Confusion matrix of the DT C4.5-GI&OSE research model

Columns give the actual class (students with excellent, good, average, or poor performance); rows give the predicted class.

Row | Excellent | Good | Average | Poor | Precision
Prediction 1 | 81 | 22 | 0 | 0 | 78.64%
Prediction 2 | 22 | 295 | 49 | 9 | 78.67%
Prediction 3 | 1 | 27 | 498 | 50 | 86.46%
Prediction 4 | 0 | 2 | 41 | 361 | 89.36%
Recall | 77.88% | 85.26% | 84.69% | 85.95% |

4-2 Important features

The prioritization of predictive variables based on their weight is as follows:

Diploma GPA: 0.262
Semester 1 GPA: 0.201
Semester 2 GPA: 0.197
Number of honors semesters: 0.122
Conditional number: 0.114
Year of entry: 0.104

4-3 The results of the implementation of the student performance prediction model

The results of the prediction model are shown in Table 3:

Table 3. The results of the DT C4.5-GI&OSE model implementation

5- Discussion

In the main method of the research, DT C4.5-GI&OSE, in four-class classification, the diploma GPA has the greatest effect on the prediction of student performance.
In response to the research sub-question, the best algorithm in four-class classification is Decision Tree C4.5-GI&OSE, with a prediction accuracy of 84.71%. This model showed 84.17% accuracy, 83.42% sensitivity, and a kappa of 0.780. The DT C4.5-GI&OSE technique correctly predicted the graduation of 77.88% of excellent students, 85.26% of good students, 84.69% of average students, and 85.96% of poor students.

6- Conclusion

The results show that there is a relationship between students' social and academic characteristics and their academic performance. The DT C4.5-GI&OSE algorithm was the best algorithm for predicting students' final GPA at the end of their studies, with a prediction accuracy of 84.71%. In this model, the diploma grade point average has the greatest effect on the prediction. Using machine learning models as a decision support tool improves the academic level of students and reduces the number of students likely to fail or drop out. This study was carried out at the undergraduate level, and the approach can be used in future research at the master's and doctoral levels.

Keywords: student performance prediction, data mining, machine learning, modeling, improving the quality of education
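The per-class precision and recall and the overall accuracy reported with Table 2 above can be recomputed from the confusion matrix counts. The sketch below assumes one particular reading of the table: rows are the predicted classes, columns are the actual classes, and both run in the order excellent, good, average, poor. That reading is an assumption, supported by the fact that it reproduces the reported percentages; the names `MATRIX`, `precision`, `recall`, and `accuracy` are illustrative.

```python
# Confusion matrix counts from Table 2, under the assumed reading:
# rows = predicted class, columns = actual class,
# both ordered excellent, good, average, poor.
CLASSES = ["excellent", "good", "average", "poor"]
MATRIX = [
    [81, 22, 0, 0],
    [22, 295, 49, 9],
    [1, 27, 498, 50],
    [0, 2, 41, 361],
]


def precision(matrix, i):
    """Share of predictions of class i that were correct (row-wise)."""
    return matrix[i][i] / sum(matrix[i])


def recall(matrix, i):
    """Share of actual members of class i that were predicted as i (column-wise)."""
    return matrix[i][i] / sum(row[i] for row in matrix)


def accuracy(matrix):
    """Share of all objects on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


for i, name in enumerate(CLASSES):
    print(f"{name}: precision {precision(MATRIX, i):.2%}, "
          f"recall {recall(MATRIX, i):.2%}")
print(f"accuracy {accuracy(MATRIX):.2%}")
```

With these counts the diagonal sums to 1235 of 1458 objects, giving the 84.71% accuracy stated in the text. The recomputed recall for the poor-performance class is 361/420, which rounds to the 85.95% shown in Table 2 rather than the 85.96% quoted in the prose.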

Branch Client Behavior Analysis Using RFM Method

Fateme Rahimi; Mohammad Vahid Sebt; Nasim Ghanbar Tehrani

Volume 9, Issue 36, August 2021, Pages 189-209

https://doi.org/10.22054/ims.2021.50853.1697

Abstract

In today's competitive world, applying new techniques to business development has a great impact, and the restaurant industry is no exception. Therefore, in this research, customer data from a restaurant chain are investigated using new methods of knowledge discovery and data mining. The purpose of this study ...

A Model for Learners Segmentation and Educational Performance Improvement Using Data Mining Algorithms

Sina Raeesi Vanani; Iman Raeesi Vanani; Mohammad Taghi Taghavifard

Volume 9, Issue 33, December 2020, Pages 5-38

https://doi.org/10.22054/IMS.2019.7824.1092

Abstract

Educational performance measurement through the identification and analysis of data extracted from learners’ activities can effectively improve educational performance. In this article, data on international learners were analyzed based on design science methodology and using ...

The Impact of E-CRM on Customer Loyalty Using Data Mining Techniques

Hassan Rangriz; Zahra Bayrami Shahrivar

Volume 7, Issue 27, July 2019, Pages 175-205

https://doi.org/10.22054/ims.2019.9987

Abstract

With the expansion of the Internet, various tools have been used to communicate with customers in organizations, and organizations use different E-CRM methods to create competitive advantages. Since customer loyalty is critical to achieving competitive advantage and profitability for organizations, one ...

A New Method to Cluster HTML Documents Using Mixed Algorithms

Maryam Shoar; Ali Asghar Salarnezhad

Volume 6, Issue 24, May 2018, Pages 37-62

https://doi.org/10.22054/ims.2018.8891

Abstract

Given the high volume of web information, more attention has been paid to automatic data extraction systems. One of the most important methods of data extraction is clustering. Today, many clustering methods are available, most of which are based on vector models. In these models, each document ...

Optimal Feature Selection in order to Bank Customer Credit Risk Determination

Mojtaba Salehi; Alireza Korde Katooli

Volume 6, Issue 22, May 2018, Pages 129-154

https://doi.org/10.22054/ims.2018.8523

Abstract

Credit risk, defined as the probability that a customer will not repay obligations by the due date, is considered one of the causes of bankruptcy among financial institutions. For this purpose, data mining techniques such as neural networks, decision trees, Bayesian networks, and support vector machines are used for customer ...

A New Approach for Exceptional Phenomena Knowledge Detection and Analysis by Data Mining

Masoud Abessi; Elahe Hajigol Yazdi; Hassan Hoseini Nasab; Mohammad Bagher Fakhrzad

Volume 3, Issue 12, September 2015, Pages 1-20

Abstract

Learning the logic of exceptions is a considerable challenge in data mining and knowledge discovery. Exceptions are rare phenomena with unusually positive behavior in a database. Creating an efficient framework to increase reliability in the detection of exceptions in knowledge and learning is ...

Resistance and Refusal to Mobile Payment: Analysis of the Iranian Characteristics

Jamshid Salehisadegheiani; Samaneh Sorournejad; Reza Ebrahimi Atani; Maryam Akhavan Kharazian; Mousa Rezvani Chamazamin

Volume 1, Issue 2, December 2013, Pages 147-162

Abstract

In recent years, a number of new payment solutions have been introduced in mobile commerce, although with limited success. The existence of standardized and widely accepted mobile payment (also known as MP) procedures is crucial for successful business-to-customer mobile commerce. On the other hand, non-acceptance ...

Keywords = Data Mining
