Management approaches in the field of smart
Seyed Mohsen Safavi Koohsareh; Seyed Amin Hosseini Sano; Amirhossein Mohajerzadeh
Abstract
The primary objective of mobile network operators is arguably to maximize their efficiency. Beyond operational and investment costs, maximizing the utilization of available resources can help them achieve this goal. To this end, operators offer discounted data plans during off-peak hours to encourage users to utilize the network during these times. These data plans are typically based on the average traffic load across the entire network at different times of the day. However, they often overlook the fact that traffic patterns can vary significantly across different population areas within a city at various times. In this paper, different population areas are automatically identified using clustering based on traffic patterns. By identifying these areas and considering the traffic patterns specific to each area, the allocation of appropriate data plans for users, based on the regions they frequent, is analyzed and discussed. Additionally, other potential applications of this clustering method for offering various services are presented, followed by a conclusion.
Introduction
The number of cellular network users and their required bandwidth are continuously increasing (Ericsson, 2022). However, limited wireless frequency bands constrain network capacity, prompting operators to deploy dense base stations to reuse radio frequencies in smaller coverage areas, thereby enhancing capacity. Operators plan for peak usage, leading to base station layouts that often remain underutilized for extended periods, resulting in inefficient use of capital (equipment) and operational (energy and maintenance) costs (Liu et al., 2023). To address this, operators offer discounted plans during low-traffic periods but overlook the varying traffic patterns across urban areas, which could enable tailored offers for different regions.
This paper proposes a hierarchical clustering-based method to identify and segment urban areas, design region-specific traffic-based plans, and target appropriate users. The main contribution is improving efficiency by maximizing the utilization of existing cellular networks without expanding capacity, benefiting both operators through increased revenue and users through enhanced satisfaction.
Methodology
The best approach to evaluate proposed solutions in cellular networks is to use real-world datasets from mobile operators. Cellular network logs are vast, contain sensitive user and network information, and require algorithms capable of handling large-scale data. In this study, we use a publicly available dataset (Barlacchi et al., 2015) containing telecommunication, weather, news, social media, and power grid data from Milan and Trentino, Italy, spanning November 1, 2013, to June 1, 2014. Our focus is on telecommunication data, specifically Call Detail Records (CDRs), to evaluate the proposed method.
The dataset is processed and analyzed using Python and libraries such as NumPy, Pandas, Scikit-learn, and Matplotlib. The proposed method involves clustering base stations based on traffic patterns, designing region-specific data plans, and targeting users during low-traffic periods.
3.1. Traffic Pattern-Based Region Identification
As mentioned earlier, traffic patterns of cellular base stations vary across urban areas. These patterns are heavily influenced by the stations' locations. For example, base stations in residential areas exhibit different traffic patterns compared to those in commercial, transportation, or recreational zones (Xu et al., 2017).
Figure 1: Traffic Patterns of Base Stations in Three Different Population Zones
Figure 1 illustrates the traffic patterns of base stations in three different population zones over a week. Zone 3 likely corresponds to recreational areas like amusement parks, with higher traffic on weekends. Zone 1 may represent office areas, with reduced traffic on weekends, while Zone 2 could be industrial or transit areas with consistent traffic throughout the week.
To separate these zones, hierarchical clustering is employed (Abubakar et al., 2022). Instead of using the geographic (Euclidean) distance between stations, which fails to distinguish adjacent zones with different traffic patterns, we use the traffic time series as the clustering criterion. The chosen algorithm is agglomerative hierarchical clustering (Kassambara, 2017), as shown in Figure 2. Base stations first remove noise from their data and send average traffic data to a central node every x minutes. At the central node, the Euclidean distance between traffic time series is used to measure traffic similarity between stations, reducing dimensionality from two dimensions (time series traffic volume) to one (distance between clusters). Over 80% of time series similarity studies use this metric, though some employ deep learning for feature extraction to improve clustering.
The Euclidean distance between two base stations' traffic time series Q and C is calculated as:
$$D(Q, C) = \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2} \qquad (1)$$
where q_i and c_i are the i-th samples of Q and C, and n is the length of the series. To mitigate sensitivity to variations, preprocessing steps include removing outliers, adjusting offsets, and smoothing noise using moving averages (Keogh & Pazzani, 1998).
The optimal number of clusters is determined from the hierarchical clustering dendrogram (Figure 2) by identifying the best cut-off line. Two strategies are proposed:
Predefine the number of clusters based on comprehensive traffic pattern analysis and use k-means clustering.
Use silhouette scoring to dynamically determine the optimal number of clusters based on traffic similarity.
We adopt the second approach, using average silhouette scores (Almeida et al., 2015) to select the optimal number of clusters. This method eliminates the need for predefined cluster counts and provides precise cluster identification.
Once clusters are identified, data plans are designed for each cluster based on their traffic patterns.
3.2. Designing Data Plans
For each cluster, the average traffic profile is calculated, and data plans are designed inversely proportional to traffic volume. The number of offers q in time interval t is determined by Equation (2), where A and B are the traffic range bounds, S is the current traffic, N is the maximum number of offers, and C is the minimum (0).
Alternative models, such as linear, exponential growth/decay, and logarithmic growth/decay, are also explored (Safavi et al., 2024), as shown in Figures 5 and 6.
3.3. Targeting Users
Users with higher overlap with low-traffic periods are prioritized for data plan offers. A user’s average monthly presence in low-traffic intervals is used to rank them. The longest data plans are assigned to users with the highest presence in low-traffic periods, ensuring efficient resource allocation.
Results
Simulations show that the method proposed in this paper utilizes 100% of the network's bandwidth capacity. The results demonstrate the optimal utilization of existing equipment and resources, which directly correlates with increased operator profitability. They also show that the proposed method improves resource efficiency by approximately 40%, representing the highest possible improvement in network resource utilization.
In conclusion, the proposed method significantly enhances resource utilization and operator profitability by fully leveraging network capacity. While other scenarios improve resource usage to some extent, only the proposed method achieves 100% utilization, highlighting its effectiveness in optimizing network performance and operational efficiency.
Keywords: Mobile Network Operator, Maximizing the Utilization, Cellular Data Plan, Clustering, Traffic Pattern.
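The region-identification step of Section 3.1 can be illustrated with a small sketch. It is not the authors' implementation: the synthetic weekly profiles, the function names `smooth` and `cluster_stations`, the average-linkage choice, and the candidate range of cluster counts are all assumptions made for illustration. It offsets and smooths each station's series, runs agglomerative clustering on Euclidean distances between series, and keeps the cluster count with the highest average silhouette score.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score


def smooth(series, window=3):
    """Moving-average smoothing along the time axis (one row per station)."""
    kernel = np.ones(window) / window
    return np.array([np.convolve(row, kernel, mode="same") for row in series])


def cluster_stations(traffic, k_range=range(2, 7)):
    """Return (k, score, labels) for the k with the highest average silhouette."""
    # Remove each station's offset so the comparison focuses on profile shape.
    centered = traffic - traffic.mean(axis=1, keepdims=True)
    data = smooth(centered)
    best = None
    for k in k_range:
        # Default distance is Euclidean, computed between whole time series.
        model = AgglomerativeClustering(n_clusters=k, linkage="average")
        labels = model.fit_predict(data)
        score = silhouette_score(data, labels, metric="euclidean")
        if best is None or score > best[1]:
            best = (k, score, labels)
    return best


# Hypothetical data: 30 stations, one week of hourly samples (168 points).
rng = np.random.default_rng(0)
hours = np.arange(168)
weekend = (hours // 24 >= 5).astype(float)
office = 10 * (1 - weekend) + 2      # busy on weekdays
recreation = 10 * weekend + 2        # busy on weekends
steady = np.full(168, 6.0)           # flat all week
traffic = np.vstack([
    profile + rng.normal(0, 0.3, 168)
    for profile in (office, recreation, steady)
    for _ in range(10)
])

k, score, labels = cluster_stations(traffic)
print(k, round(score, 3))
```

On real Call Detail Records the series would first be aggregated per base station and cleaned of outliers; the synthetic profiles here only mimic the office, recreational, and steady zones described for Figure 1.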
Data, information and knowledge management in the field of smart business
Mohammad Kazemi; Mohammad Ali Keramati; Mehrzad Minooie
Abstract
This article addresses one of the main problems in banking, which is closely tied to information technology. Combining the management view of this problem with information technology makes it an important topic in information technology management. The main purpose of this article is the clustering of bank customers.
First, the characteristics of 900,000 randomly selected customers were extracted from the bank's database to serve as input to the proposed method. From all extracted characteristics, 10 (in addition to the four characteristics of the LRFM method) were shortlisted using expert opinion. The proposed method must choose, among these 10 features, those that yield better-separated clusters. The selected features are then placed next to the four LRFM features to improve the performance of LRFM. Because the number of possible combinations is too large to examine manually, the proposed method explores different configurations and provides a separate clustering pattern for the customers of each bank. The proposed method also solves the problem of choosing the right number of clusters in the K-means method. The results show that it performs better than the basic RFM and LRFM methods.
Introduction
Today, the Achilles heel of all customer-oriented businesses is customer satisfaction and providing services tailored to each customer's situation. Indeed, any organization that disregards customer satisfaction will face failure (Otto et al., 2019). One of the main current challenges for customer-oriented organizations is understanding the differences between customers and ranking them in order to allocate resources optimally. This is very important for managing the customer relationship correctly. Banks are among the main customer-oriented institutions in the country (Morzdashti et al., 2022). The bank does not perform any proper clustering to know its customers and plan future goals. More precisely, it lacks information about the total number of customers and their distribution, which wastes time and money. As far as this research has found, the clustering that currently exists for customers is not sufficiently dynamic: customers are clustered based on a few characteristics such as transaction amounts, occupation, or other general characteristics.
The LRFM model is a method used to cluster customers in customer relationship management. In this model, customers are clustered based on four characteristics: length of the customer relationship (L), recency of exchange (R), frequency of exchange (F), and monetary value exchanged (M). In effect, the customer relationship length was added to the RFM model to create the LRFM model, because the RFM model was not able to identify loyal customers (Moslehi et al., 2013).
The model proposed in this article attempts to provide a dynamic way of combining additional variables with the LRFM method, so that different clusterings can be produced depending on the time of use. This makes the proposed clustering method conform more closely to reality.
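As a rough illustration of the four LRFM characteristics, the following sketch computes them from a list of transactions. The input layout `(customer_id, date, amount)`, the function name `lrfm`, and the exact definitions (L as days between first and last transaction, R as days since the last transaction) are assumptions; the source does not specify how each characteristic is measured.

```python
from collections import defaultdict
from datetime import date


def lrfm(transactions, today):
    """Return {customer_id: (L, R, F, M)} from (customer_id, date, amount) rows."""
    by_customer = defaultdict(list)
    for customer_id, when, amount in transactions:
        by_customer[customer_id].append((when, amount))

    features = {}
    for customer_id, rows in by_customer.items():
        dates = [when for when, _ in rows]
        length = (max(dates) - min(dates)).days        # L: first to last transaction
        recency = (today - max(dates)).days            # R: days since last transaction
        frequency = len(rows)                          # F: number of transactions
        monetary = sum(amount for _, amount in rows)   # M: total amount exchanged
        features[customer_id] = (length, recency, frequency, monetary)
    return features


# Hypothetical sample data.
sample = [
    ("a", date(2024, 1, 1), 100.0),
    ("a", date(2024, 3, 1), 50.0),
    ("b", date(2024, 2, 15), 20.0),
]
result = lrfm(sample, today=date(2024, 3, 31))
print(result)
```

The resulting four-column table is what a clustering algorithm such as K-means would take as input, usually after scaling the columns to comparable ranges.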
Research Question(s)
What methodology is used to follow the process of presenting the proposed model?
What features can be placed next to the LRFM model to provide appropriate results?
What will be the structure of the particle swarm algorithm?
What similarity measure or clustering method would be suitable for customers?
How can the LRFM model be improved by the particle swarm algorithm and the creation of different clusters based on the K-means method?
Literature Review
Shrahi and Ali Qoli implemented a clustering method for the customers of one of the Sepah Bank branches in Tehran (Shrahi and Ali Qoli, 2015). This model is based on the K-means clustering algorithm. The method attempts to identify sixty companies loyal to the bank from among all corporate (legal-entity) customers. However, the K-means algorithm has some problems (Bagatini et al., 2019; Santini, 2016):
The optimal value for the number of clusters must be determined.
The initial points, which are chosen randomly at the beginning of the algorithm, have a great impact on the final result.
The order in which the data are entered and processed affects the final result.
Ayoubi tried to cluster bank customers using Kohonen neural networks (Ayoubi, 2016). In this method, a neural network is first trained on the training data, after which new customers can be assigned to clusters.
Yousefizad and Sorayai have also used the RFM model to cluster customers in order to design a model for providing services to customers, which consists of two stages (Yosefizad and Sorayai, 2017).
Proposed Method
This section describes the proposed method of the article in detail.
Methodology
This part describes how the LRFM method is improved by combining the particle swarm algorithm with the K-means method. All the steps of the particle swarm algorithm are followed, and its functions and parameters are specified. The steps of the proposed method are as follows:
Initialization: The schematic of the initial population matrix is shown in Figure (2). Each particle consists of two parts. The first part has one element that proposes the number of clusters for the K-means method, and the second part has 10 binary elements (one per candidate feature).
Calculating the fitness of each particle: Using the fitness function, the fitness level is determined for each particle in the population. This fitness level is based on clustering with the K-means method. The quality of the resulting clustering is measured by the intraclass variance criterion, which serves as the fitness value of each particle (Ahmar et al., 2018).
Update of particle values: The values in the particles are updated using two parameters, the local optimum (LBEST) and the global optimum (GBEST). LBEST is the best value that the i-th particle has reached so far (the best-fit value for the i-th particle). GBEST is the value with the best fit across all particles up to iteration T. These two values are used to update the values of the other particles.
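The three steps above can be sketched as follows. This is an illustrative sketch, not the article's implementation: the inertia weight `w`, the coefficients `c1` and `c2`, the bounds on the number of clusters, and the sigmoid rule for the binary part (the standard binary-PSO variant, swapped in here because the source does not give its update equations) are all assumptions, as are the function names.

```python
import numpy as np
from sklearn.cluster import KMeans

N_EXTRA = 10  # candidate features placed next to the four LRFM features


def decode(particle, k_min=2, k_max=10):
    """Split a particle into (number of clusters, boolean feature mask)."""
    k = int(np.clip(round(float(particle[0])), k_min, k_max))
    mask = particle[1:] > 0.5
    return k, mask


def fitness(particle, lrfm_data, extra_data):
    """Mean within-cluster variance of K-means on LRFM plus selected features.

    Lower is better. This raw value shrinks as k grows, so a practical
    fitness would likely need some normalization (not stated in the source).
    """
    k, mask = decode(particle)
    data = np.hstack([lrfm_data, extra_data[:, mask]])
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
    return model.inertia_ / len(data)


def update(position, velocity, lbest, gbest, rng, w=0.7, c1=1.5, c2=1.5):
    """One PSO step: pull each particle toward LBEST and GBEST."""
    r1 = rng.random(position.shape)
    r2 = rng.random(position.shape)
    velocity = w * velocity + c1 * r1 * (lbest - position) + c2 * r2 * (gbest - position)
    position = position + velocity
    # Binary part: a sigmoid of the velocity gives the probability of a 1 bit.
    prob = 1.0 / (1.0 + np.exp(-velocity[:, 1:]))
    position[:, 1:] = (rng.random(prob.shape) < prob).astype(float)
    return position, velocity


# Hypothetical data: 40 customers, 4 LRFM columns, 10 candidate columns.
rng = np.random.default_rng(0)
lrfm_data = rng.random((40, 4))
extra_data = rng.random((40, N_EXTRA))

# Initialization: 5 particles, first element is k, the rest are feature bits.
position = np.hstack([
    rng.integers(2, 6, size=(5, 1)).astype(float),
    rng.integers(0, 2, size=(5, N_EXTRA)).astype(float),
])
velocity = np.zeros_like(position)
scores = np.array([fitness(p, lrfm_data, extra_data) for p in position])
lbest = position.copy()
gbest = position[scores.argmin()].copy()

new_position, new_velocity = update(position, velocity, lbest, gbest, rng)
```

In a full run, the fitness evaluation and update would repeat for T iterations, refreshing LBEST and GBEST whenever a particle reaches a lower fitness value.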
Conclusion
This article tries to provide a dynamic method for clustering bank customers in order to improve the service they receive. The LRFM method has four features that are important in banking, but its weakness is a lack of dynamics. More precisely, other characteristics such as financial, occupational, or daily transaction characteristics could be added to the four LRFM characteristics to improve the performance of this method. Among all the features that can be placed next to the four LRFM features, the appropriate ones should be selected depending on the customer data. This choice is the responsibility of the particle swarm algorithm, which tries to place appropriate features next to the four LRFM features, depending on the data conditions and customer information, to obtain a better clustering result. The algorithm also helps in finding the number of clusters for the K-means method.
It is also possible to replace the particle swarm algorithm with other meta-heuristic methods and compare their results with the results in this article.
Keywords: Relationship Management with Bank Customers, Clustering, RFM model, LRFM Model, Particle Swarm Algorithm, K-Means Method.
Mohammad Kazemi; Mohammad Ali Keramati; Mehrzad Minooie
Abstract
Clustering is a common method for analyzing various kinds of data and is used in many fields, including statistical pattern recognition, machine learning, data mining, image analysis, and bioinformatics. Clustering is the process of grouping similar objects into different groups, or more precisely, partitioning a set of data into separate subsets; its key characteristic is that the number of classes is not specified in advance. One of its most widely used applications is on data whose clustering is otherwise performed according to the subjective judgment of experts. Bank customer clustering has been a challenge from the beginning, and it has been difficult to find consensus among experts on which feature to use for grouping.
This dissertation seeks to provide a solution for dynamic clustering of bank customers. This clustering will be based on a genetic algorithm and will decide on the number of categories, the members of each category, and the similarity criteria used. The dynamics of the method are based on improving the LRFM method using the genetic algorithm. In other words, the genetic algorithm will search the different information fields about the bank's customers in the database, place the right fields next to the features used in the LRFM method, and obtain better results for clustering the bank's customers. This process determines the criterion of similarity between one customer and another and the degree of similarity between them.
Ehsan Kashi; Mehri Shahriari
Abstract
News and rumors about the prevalence of the coronavirus on social media have a significant impact on people. The aim of this study is to examine the topics discussed by people about the coronavirus disease on social media from the beginning of the outbreak to the present day. The research data were collected from people’s comments on posts related to coronavirus news on Instagram and analyzed using text mining and clustering. Based on the results of the research, the topics discussed by citizens were divided into 10 clusters, which are: lack of sanitary equipment, lack of attention to quarantine, news and rumors, mental condition, information about symptoms, prevention, control and treatment, government and public actions, lack of personal hygiene, death rate in patients and burial, closure of educational activities and economic problems. These were then compared with the issues raised in December and January, when new issues such as access to vaccines, hourly traffic restrictions, and the mutated virus were added to people's concerns, and some earlier issues had been addressed by government measures.
Sina Raeesi Vanani; Iman Raeesi Vanani; Mohammad Taghi Taghavifard
Abstract
Measuring educational performance by identifying and analyzing data extracted from learners’ activities can effectively improve educational performance. In this article, data of international learners were analyzed based on design science methodology and using data mining methods. In this regard, domestic and international research from the past decade was reviewed, and the academic and non-academic data of students were clustered into three categories: family, supportive, and academic behavior. After validating the algorithms' outputs and determining the optimal number of clusters in each category, the clusters were labeled and analyzed. Analysis of the labels reveals the students' experience of success or failure and the roots of effective performance in each cluster. The proposed labeling method is new and applicable in most learning centers for segmenting and formulating educational performance.
Mohammad Ali KhatamiFirouzabadi; MohammadTaghi TaghaviFard; Khalil Sajjadi; Jahanyar Bamdad Soufi
Abstract
Knowing customer behavior patterns, clustering customers, and providing them with proper services are among the most important issues for banks. In this research, 5 criteria for each customer, namely Recency, Frequency, Monetary, Loan, and Deferred, were extracted from a bank database over a fiscal year, and customers were then clustered using the K-Means algorithm. Next, a multi-objective model of bank service allocation was designed for each of the clusters. The purpose of the designed model was to increase customer satisfaction, reduce costs, and reduce the risk of allocating services. Given that the problem does not have an optimal solution and that each client feature has a probability distribution function, simulation was used to solve the models. To determine the optimal solution, the Simulated Annealing algorithm was used to create neighboring solutions, and a simulation model was implemented accordingly. The results showed a significant improvement over the current situation. In this research, we used Weka and R-Studio software for data mining and Arena for simulation and optimization.
Mohammad’reza Gholamian; Azimeh Mozafari
Abstract
Managing and evaluating valuable customers is one of the most important factors for banks in reducing costs and increasing profitability. In recent decades, many researchers have analyzed customer attributes to evaluate customer value using data mining techniques, and the decision tree is one of the most widely used data mining algorithms in this field. Because this algorithm, when building the tree, tests only one attribute at each node and ignores the dependencies between attributes, the maximum memory it requires increases. To solve this problem, this research proposes a method that improves the decision tree by using a neural network to explore the dependencies between attributes, with the goal of reducing the maximum memory required; the method is applied, based on the RFM model, to predict customer values. The results show that the proposed method, by using the dependencies between attributes, predicts new customer values with less maximum memory than the basic method.