Utilization of Data Mining on MSMEs using FP-Growth Algorithm for Menu Recommendations



INTRODUCTION
Current technological developments have a major impact on human life, and one contributing factor is how humans use data [1]. Data has become a significant factor in many decisions, such as how to increase sales, how to understand consumer spending patterns and preferences, and how to follow current market trends [2]. Because data is now stored as soft copies and in the cloud, it is easier to process by computer [3]. As a result, the data stored on servers or in the cloud keeps accumulating, grows in size, and becomes big data [4]. In addition, technological developments have made big data technology significant in delivering results because it has been integrated with social media [5].
A cafe is comparable to a micro restaurant that sells a variety of snacks and drinks, served in an attractive place where customers can relax and spend time chatting about personal and non-personal matters. In general, cafes provide various kinds of coffee and non-coffee drinks as well as food dishes that match the cafe theme, served dine-in. However, several cafes also provide takeaway and delivery services [6]. Over Limit Cafe is a reasonably popular cafe in its area. At first, Over Limit Cafe was a small cafe with few employees; it now has approximately four employees. Over Limit Cafe has a problem determining menu recommendations: it has not yet established an optimal way of presenting menu recommendations to customers. For this reason, proper menu recommendations need to be determined so that customers do not spend too much time choosing from the available menu items.
A transaction at Over Limit Cafe proceeds as follows: the customer comes to the cashier, and the cashier records what the customer wants to buy. The recorded data is stored as a sales transaction memorandum but is not used optimally; it is only kept as transaction history. The large amount of data available at Over Limit Cafe contains sales patterns that recur across transactions, and these patterns can be exploited through data mining. Data mining is useful in exploratory analysis scenarios where there is no predetermined idea of what an interesting outcome would be. It searches large datasets for valuable, non-trivial information from which conclusions can be drawn, balancing human knowledge in describing problems and specific goals with the search capabilities of computers [7].
This study uses association rules as its data mining method. Association rules identify directed associations between the items in a dataset in order to characterize the correlation or relationship between one item and other items [8]. To derive these rules, the dataset is processed using one of several association rule algorithms, namely FP-Growth. This study uses FP-Growth to process the sales transaction data so that the cafe can more easily determine menu item recommendations for customers [9].
This study, like related previous research, provides a solution that uses FP-Growth, CRISP-DM, and association rules to recommend menus to customers by utilizing up-to-date sales transaction data at MSME Over Limit Cafe; the approach can later be applied to other types of MSMEs to make menu recommendations based on historical transaction data. Previous research implemented a web-based system that uses the CRISP-DM method to recommend products [10]. Researchers have also built web-based applications that process data with the FP-Growth algorithm, resulting in an analysis of consumer buying patterns in motorcycle spare parts sales transactions [11]. Applying association rule mining to sales transaction data using FP-Growth has been reported to produce more accurate results than the Apriori algorithm [12]. Furthermore, research conducted by Setyo to determine frequently sold products used the FP-Growth algorithm to generate retail item data at CV. Cahaya Setya [13]. In contrast, the research conducted by Gunadi found that the FP-Growth algorithm produced lower or less significant results than the Apriori algorithm on sales transactions [14]. The novelty of this research is that it provides a solution using FP-Growth, CRISP-DM, and association rules to recommend menus to customers by utilizing up-to-date sales transaction data at Over Limit Cafe. This research aims to make it easier for MSMEs to determine customer menu recommendations.

RESEARCH METHOD
This study used a combination of FP-Growth, association rules, data mining, and CRISP-DM. Figure 1 shows the flow of the research stages. Frequent pattern growth, better known as the FP-Growth algorithm, is a development of the Apriori algorithm for finding the frequent itemsets in a dataset. FP-Growth does not require candidate generation because it builds a tree (FP-Tree) to obtain the frequent itemsets. As a result, it finds frequent itemsets more efficiently than the Apriori algorithm [15]. Association rules are a data mining technique whose main purpose is to find patterns of relationships between the items in a dataset; where items are related, association rules are useful for discovering the rules that govern how or why these items are often purchased together [16]. Data mining is useful in exploratory analysis scenarios where there is no predetermined idea of what an interesting outcome would be. It searches large datasets for valuable, non-trivial information from which conclusions can be drawn, balancing human knowledge in describing problems and specific goals with the search capabilities of computers [17]. CRISP-DM stands for Cross-Industry Standard Process for Data Mining, a model that describes the life cycle of a data mining project in six stages [18].

Data Preparation
At this stage, the researcher reviews journal articles by previous researchers to establish the novelty of this research.

Association Rule
At this stage, the researcher reviews journal articles by previous researchers to establish the novelty of this research.

State Transition Diagram
The application design in this study uses a State Transition Diagram based on the Harel model. State Transition Diagrams illustrate the processing in a system as a set of states connected by transitions [19].

Testing
The test results describe in detail the outcome of the data testing that has been completed and processed by the system. They are the results of calculations using the FP-Growth algorithm to form sales transaction patterns at MSME Over Limit Cafe and to search for food and drink menu recommendations at Over Limit Cafe.
ISSN: 2476-9843

RESULT AND ANALYSIS
Application design begins with data preparation, proceeds to association rule generation, and ends with the design of the state transition diagrams.

Data Preparation
This stage focuses on forming the Cafe Over Limit dataset, which consists of food and beverage menu sales transactions from March 2021 to August 2021. The data is then processed with the FP-Growth algorithm to find association rules based on their support and confidence values.

Association Rule
The FP-Growth algorithm forms association rules through the following stages:

Minimum Support Value Stage
The initial stage in forming association rules is to analyze how dominant each item is across the entire transaction history, which is expressed as its support value. This stage determines whether an item occurs often enough to justify computing its confidence value. Hasan's study suggests that a minimum support value in the range of 10% to 20% is significant, so the current study samples from that range and sets the minimum support to 10% [20].
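As a minimal sketch of this stage, support can be computed as the fraction of transactions that contain an itemset and then compared with the 10% minimum. The transactions and the shortened item labels below are hypothetical and are not the cafe's data; the function names are illustrative only.

```python
# Hypothetical transactions for illustration; not the cafe's real data.
TRANSACTIONS = [
    {"kopsus", "tahu"},
    {"kopsus", "tahu", "mariam"},
    {"kopsus"},
    {"tahu"},
    {"kopsus", "mariam"},
]

MIN_SUPPORT = 0.10  # 10%, the minimum support stated in the text


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def meets_min_support(itemset, transactions, min_support=MIN_SUPPORT):
    """True when the itemset's support reaches the minimum support."""
    return support(itemset, transactions) >= min_support


print(support({"kopsus", "tahu"}, TRANSACTIONS))
print(meets_min_support({"mariam"}, TRANSACTIONS))
```

Expressing support as a fraction rather than a raw count keeps the threshold independent of how many transactions the dataset holds.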

Minimum Confidence Value Stage
Next, a minimum confidence value is set alongside the support value. The minimum confidence measures the strength of the relationship between an item and other items in sales transactions within the scope of the association rule [16]. A minimum confidence in the range of 50% to 60% is considered suitable for decision making [21], so this study samples from that range and sets the minimum confidence to 50%.
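The confidence of a rule "if A then B" is conventionally the number of transactions containing both A and B divided by the number containing A. The sketch below assumes that definition and uses hypothetical transactions with shortened item labels; it is not the study's implementation.

```python
# Hypothetical transactions for illustration; not the cafe's real data.
TRANSACTIONS = [
    {"kopsus", "tahu"},
    {"kopsus", "tahu", "mariam"},
    {"kopsus"},
    {"tahu"},
    {"kopsus", "mariam"},
]

MIN_CONFIDENCE = 0.50  # 50%, the minimum confidence stated in the text


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(antecedent, consequent, transactions):
    """Share of antecedent transactions that also contain the consequent."""
    both = count_containing(set(antecedent) | set(consequent), transactions)
    base = count_containing(antecedent, transactions)
    return both / base if base else 0.0


conf = confidence({"tahu"}, {"kopsus"}, TRANSACTIONS)
print(conf >= MIN_CONFIDENCE)
```

Confidence is directional, so "tahu then kopsus" and "kopsus then tahu" generally receive different values even though both rules share the same support.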

Generate Frequent Itemset Stage
Next, the generate frequent itemset stage searches the transaction data for items that meet the minimum support value. The results of this stage for Cafe Over Limit are shown in Table 1.
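One way to sketch this stage is to count every item once per transaction and keep the items whose support reaches the threshold. The transactions, the item "es_teh", and the 40% threshold in the usage line are hypothetical choices made so that the toy data shows an item being filtered out; the study itself uses 10%.

```python
from collections import Counter

# Hypothetical transactions for illustration; not the cafe's real data.
TRANSACTIONS = [
    ["tahu", "kopsus"],
    ["mariam", "tahu", "kopsus"],
    ["kopsus"],
    ["es_teh", "tahu"],
    ["mariam", "kopsus"],
]


def frequent_items(transactions, min_support):
    """Return {item: count} for items whose support is >= min_support."""
    counts = Counter(item for t in transactions for item in set(t))
    n = len(transactions)
    return {item: c for item, c in counts.items() if c / n >= min_support}


# A 40% threshold is used here only so the small example drops one item.
print(frequent_items(TRANSACTIONS, min_support=0.4))
```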

Ordered Itemset Stage
The ordered itemset stage rearranges the items in each transaction according to the results of the generate frequent itemset stage. Each transaction keeps only the items that appear in those results, ordered accordingly, while the transactions themselves remain in the order of the original transaction data.
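In the usual formulation of FP-Growth, each transaction is sorted by descending item frequency after infrequent items are dropped. The sketch below assumes that convention, with alphabetical tie-breaking added only to make the output deterministic; the data and the minimum count of 2 are hypothetical.

```python
from collections import Counter

# Hypothetical transactions for illustration; not the cafe's real data.
TRANSACTIONS = [
    ["tahu", "kopsus"],
    ["mariam", "tahu", "kopsus"],
    ["kopsus"],
    ["es_teh", "tahu"],
    ["mariam", "kopsus"],
]


def order_transactions(transactions, min_count):
    """Drop infrequent items and sort the rest by descending frequency."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_count}

    def rank(item):
        # Higher count first; alphabetical order breaks ties.
        return (-frequent[item], item)

    return [
        sorted({item for item in t if item in frequent}, key=rank)
        for t in transactions
    ]


ORDERED = order_transactions(TRANSACTIONS, min_count=2)
print(ORDERED)
```

Sorting every transaction by the same global order is what later allows transactions with common items to share a prefix in the tree.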

FP-Tree Formation Stage
This stage visualizes the transactions for each frequent item by placing the items they share on a common FP-Tree path. FP-Tree paths can overlap with one another; the more frequent items the transactions have in common, the more effective the FP-Tree processing will be.
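A minimal FP-Tree can be sketched as a prefix tree whose nodes carry counts: each ordered transaction is inserted from the root, and shared prefixes increment existing nodes instead of creating new ones. The class and function names are illustrative, and the ordered transactions are hypothetical.

```python
class FPNode:
    """One node of the tree: an item, its count, and its child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    """Insert each ordered transaction, sharing nodes along common prefixes."""
    root = FPNode(None)
    for transaction in ordered_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


# Hypothetical transactions, already filtered and ordered by frequency.
ORDERED = [
    ["kopsus", "tahu"],
    ["kopsus", "tahu", "mariam"],
    ["kopsus"],
    ["tahu"],
    ["kopsus", "mariam"],
]

tree = build_fp_tree(ORDERED)
print(tree.children["kopsus"].count)
```

In this toy tree four transactions share the "kopsus" node, which illustrates the overlap between paths described above.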

Conditional Pattern Base Generation Stage
The conditional pattern base stage builds a sub-database for each suffix pattern from its prefix paths. This stage requires an FP-Tree: if the FP-Tree has not been formed, the conditional pattern base cannot be generated. Because the FP-Tree was formed in the previous stage, the conditional pattern base can be generated, as shown in Table 2.
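The sketch below illustrates what a conditional pattern base contains: for a given suffix item, every prefix path that precedes it, together with how often that path occurs. To keep the example short it reads the prefixes directly from hypothetical ordered transactions rather than walking an FP-Tree; this shortcut is an assumption of the sketch and yields the same prefix paths and counts that the tree would store.

```python
from collections import Counter

# Hypothetical transactions, already filtered and ordered by frequency.
ORDERED = [
    ["kopsus", "tahu"],
    ["kopsus", "tahu", "mariam"],
    ["kopsus"],
    ["tahu"],
    ["kopsus", "mariam"],
]


def conditional_pattern_base(ordered_transactions, suffix_item):
    """Map each non-empty prefix path before `suffix_item` to its count."""
    base = Counter()
    for transaction in ordered_transactions:
        if suffix_item in transaction:
            prefix = tuple(transaction[: transaction.index(suffix_item)])
            if prefix:  # an item directly under the root has no prefix path
                base[prefix] += 1
    return dict(base)


print(conditional_pattern_base(ORDERED, "mariam"))
```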

Conditional FP-Tree Generation Stage
The conditional FP-Tree stage sums the support counts of each item across its conditional pattern base. Items whose summed support count is greater than or equal to (>=) the minimum support are carried into the conditional FP-Tree, as shown in Table 3.
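The summing step can be sketched as follows: add up the counts of every item across the prefix paths of one suffix item, then keep the items that reach the minimum count. The pattern base and the minimum count of 2 are hypothetical, and the sketch returns only the surviving item counts rather than building the full conditional tree.

```python
from collections import Counter


def conditional_item_counts(pattern_base, min_count):
    """Sum counts per item over all prefix paths; keep items >= min_count."""
    totals = Counter()
    for prefix, count in pattern_base.items():
        for item in prefix:
            totals[item] += count
    return {item: c for item, c in totals.items() if c >= min_count}


# Hypothetical conditional pattern base for the suffix item "mariam".
MARIAM_BASE = {("kopsus", "tahu"): 1, ("kopsus",): 1}

print(conditional_item_counts(MARIAM_BASE, min_count=2))
```

Here "kopsus" reaches a summed count of 2 and is kept, while "tahu" reaches only 1 and is dropped.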

Frequent Itemset Search Stage
The frequent itemset search stage checks whether the conditional FP-Tree is a single path. If it is, the frequent itemsets are produced by combining the items in each conditional FP-Tree. If the conditional FP-Tree is not a single path, FP-Growth is applied recursively. The results of this stage are shown in Table 4.
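For the single-path case, every non-empty combination of the items on the path, joined with the suffix item, is a frequent itemset, and its support count is the smallest count among the chosen path nodes. The sketch below assumes the path is given as (item, count) pairs from the root downward; the path and suffix are hypothetical.

```python
from itertools import combinations


def itemsets_from_single_path(path, suffix):
    """Return {itemset: support_count} for a single-path conditional tree.

    `path` is a list of (item, count) pairs from the root downward and
    `suffix` is a tuple of the suffix items the tree was conditioned on.
    """
    result = {}
    for size in range(1, len(path) + 1):
        for combo in combinations(path, size):
            items = frozenset(item for item, _ in combo) | frozenset(suffix)
            result[items] = min(count for _, count in combo)
    return result


# Hypothetical single path conditioned on the suffix item "mariam".
PATH = [("kopsus", 3), ("tahu", 2)]
ITEMSETS = itemsets_from_single_path(PATH, ("mariam",))
for itemset, count in ITEMSETS.items():
    print(sorted(itemset), count)
```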

Modelling Stage
At the modeling stage, the frequent itemsets produced by the FP-Growth process are output as a list array, which then enters a new stage, namely the formation of association rules.
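A hedged sketch of rule formation from frequent itemsets: each itemset with at least two items is split into every antecedent and consequent pair, and a rule is kept when its confidence reaches the minimum. The itemset counts, the transaction total, and the tuple layout of a rule are assumptions made for illustration.

```python
from itertools import combinations


def generate_rules(itemset_counts, n_transactions, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples.

    `itemset_counts` maps frozensets to support counts and is assumed to
    contain every subset of each frequent itemset.
    """
    rules = []
    for itemset, count in itemset_counts.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                consequent = itemset - antecedent
                conf = count / itemset_counts[antecedent]
                if conf >= min_confidence:
                    support = count / n_transactions
                    rules.append((antecedent, consequent, support, conf))
    return rules


# Hypothetical frequent itemsets with support counts out of 5 transactions.
ITEMSET_COUNTS = {
    frozenset({"kopsus"}): 4,
    frozenset({"tahu"}): 3,
    frozenset({"mariam"}): 2,
    frozenset({"kopsus", "tahu"}): 2,
    frozenset({"kopsus", "mariam"}): 2,
}

RULES = generate_rules(ITEMSET_COUNTS, n_transactions=5, min_confidence=0.5)
for antecedent, consequent, support, conf in RULES:
    print(sorted(antecedent), "->", sorted(consequent), support, round(conf, 2))
```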

State Transition Diagram Design
The STD for the FP-Growth page describes how a dataset is processed with the FP-Growth algorithm. Figure 2 shows that, on the FP-Growth page, the user opens a calculation form, selects an existing transaction date, and enters the minimum support and minimum confidence values. The State Transition Diagram of the Calculation Results page describes how association rules are formed with the FP-Growth algorithm; each rule formed can serve as a recommendation for food and beverage menus, as shown in Figure 3.

Testing
This test assesses the accuracy of each transaction pattern generated as an association rule. The lift ratio is used as the reference for the validity of the transaction patterns in this study. If the lift ratio is less than one (< 1), item A is categorized as negatively correlated with item B, so the items do not have a significant relationship. If the lift ratio is greater than one (> 1), the relationship between item A and item B is categorized as positively correlated. If the lift ratio is equal to one (= 1), item A and item B are categorized as independent (see Table 5).
Table 5 shows the results for several rules. The rule "If a customer orders Mariam Chocolate Cheese Milk, then the customer will order Kopsus Overlimit," with a support value of 10.79%, a confidence value of 54.19%, and a lift ratio of 0.93, is categorized as an invalid rule or negatively correlated because the lift ratio is < 1. The next rule, "If a customer orders Kopsus Overlimit, then the customer will order Tahu Rumah Nenek," with a support value of 34.69%, a confidence value of 59.76%, and a lift ratio of 1.15, is categorized as a valid rule or positively correlated because the lift ratio is > 1. The last rule, "If a customer orders Tahu Rumah Nenek, then the customer will order Kopsus Overlimit," with a support value of 34.69%, a confidence value of 66.7%, and a lift ratio of 1.15, is also categorized as a valid rule or positively correlated because the lift ratio is > 1.
Therefore, of the three rules formed, only two can be categorized as valid and used as a reference for food and drink menu recommendations at MSME Over Limit Cafe.
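The lift ratio is commonly defined as the support of A and B together divided by the product of the individual supports of A and B. The sketch below applies that definition to the values reported for Kopsus Overlimit and Tahu Rumah Nenek. The individual item supports are not stated in the text, so they are derived here from the reported support and confidence values; that derivation is an assumption of this sketch.

```python
def lift_ratio(support_ab, support_a, support_b):
    """Lift of the rule A -> B from joint and individual supports."""
    return support_ab / (support_a * support_b)


def interpret_lift(lift, tolerance=1e-9):
    """Classify a lift value using the thresholds described in the text."""
    if abs(lift - 1.0) <= tolerance:
        return "independent"
    if lift > 1.0:
        return "positively correlated"
    return "negatively correlated"


# Values reported in the text for Kopsus Overlimit (K) and Tahu Rumah Nenek (T).
SUPPORT_KT = 0.3469
CONF_K_TO_T = 0.5976
CONF_T_TO_K = 0.667

# Derived, not reported: confidence(A -> B) = support(A and B) / support(A).
support_k = SUPPORT_KT / CONF_K_TO_T
support_t = SUPPORT_KT / CONF_T_TO_K

lift_kt = lift_ratio(SUPPORT_KT, support_k, support_t)
print(round(lift_kt, 2), interpret_lift(lift_kt))
```

Under these assumptions the computed lift rounds to 1.15, in line with the value reported for both directions of the rule. Lift is symmetric in A and B, which is why the two rules share the same lift ratio despite having different confidence values.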
MSMEs can compare the menu recommendations they currently offer to customers with menu recommendations obtained through data mining with the Apriori algorithm on transaction history data. Furthermore, based on this research, MSMEs can find it easier to determine customer menu recommendations using the Apriori algorithm. However, they need a system that can do this. Finally, this research can prompt further initiatives that use transaction data to determine menu recommendations. This will encourage other MSMEs to use transaction data to increase business value and become more competitive.

CONCLUSION
Implementing the FP-Growth algorithm yields meaningful output for identifying transaction patterns in purchases of food and beverage menu items. This implementation can also determine suitable food and beverage menu recommendations based on recurring purchase patterns at MSME Over Limit Cafe. Of the three purchase transaction patterns or rules found at MSME Over Limit Cafe, two are significant. Based on the rules formed, it can be concluded that only two rules can be categorized as valid and used as a reference for food and beverage menu recommendations at MSME Over Limit Cafe. The decisions drawn from the two significant rules rest on the assumption that the higher the support value used, the higher the confidence and lift ratio values of the resulting rule, and the better the rule. Further research on data mining for MSME menu recommendations can use other methods, such as Apriori or market basket analysis, to compare the level of accuracy obtained. Furthermore, future researchers can add more variables so that more relationships can be derived from the resulting association rules and more detailed knowledge can be formed.