Data Mining Techniques and Preference Learning in Recommender Systems

The importance of recommender systems has increased significantly during the last decade. The majority of available recommender systems do not let clients make selections based on their own choices or desires. This has motivated the development of a web-based recommender system that recommends products to users and customers. The new system is an extension of an online application previously developed for online shopping under constraints and preferences. In this work, the system is enhanced with a learning component that learns user preferences and suggests products based on them. More precisely, the new component learns from other customers’ preferences and makes a set of recommendations using data mining techniques including classification, association rules and cluster analysis. The results of experimental tests, conducted to evaluate the performance of this component when compiling a list of recommendations, are very promising.


Introduction
Designing an appropriate recommender system that meets the business needs of clients is the foremost consideration of this research. A recommender system for online shopping, based on preference learning, is a potential tool for business development and marketing. In this paper, an online shopping system based on preference elicitation (Alanazi, Mouhoub, & Mohammed, 2012; Mouhoub, Mohammed, & Alanazi, 2012) is extended to recommend products based on customer suggestions. The use of recommender systems has increased significantly in the past decade. Preference learning in a recommender system is considered one of the most popular and significant techniques from Information Filtering (Eaton & Wagstaff, 2006; Gemmis, Iaquinta, Lops, Musto, Narducci, & Semeraro, 2009). Information filtering assists in the removal of insignificant information and content that does not need to be stored in a customer profile. When a recommender system is applied, for instance, to learn the interests of users (Eaton & Wagstaff, 2006; Gemmis et al., 2009), it studies some of the user's behavioural aspects in order to generate and recommend a list of products (Eaton & Wagstaff, 2006; Gemmis et al., 2009). Learning the user's preferences is one technique for discovering the best items to recommend (Eaton & Wagstaff, 2006; Gemmis et al., 2009).
Currently, it is important for clients to be assisted with their choices due to the exponential increase in existing data (Gemmis et al., 2009). Adaptive tools, algorithms, and user profiles (Gemmis et al., 2009) are the three most significant components for designing and managing personalized recommendations. The popular recommender system approaches are Content-Based, Collaborative, Demographic, Knowledge-Based and Hybrid (Suguna & Sharmila, 2013; Tran, Phung, & Venkatesh, 2012; Gemmis et al., 2009). There are many techniques for learning user profiles, including probabilistic approaches, neural networks, decision trees and association rules (Deshpande & Karypis, 2004; Gemmis et al., 2009).
The idea of preference learning is easy to understand but challenging to implement. This work investigates two questions: "Can we learn the preferences of users, especially when there are missing data?" and "Are there any application platforms or recommender systems for online shopping based on learning representation?" The information can be processed with machine learning techniques in order to learn user preferences for use in the recommendation procedure (Eaton & Wagstaff, 2006; Gemmis et al., 2009).

Data Mining Techniques
Data mining is the field of extracting valuable information and knowledge from large amounts of data stored in databases. It is the process of discovering previously unknown, useful and valuable patterns in a large amount of data stored in a database (Kaur & Aggarwal, 2010; Tan, Steinbach, & Kumar, 2005; Han & Kamber, 2006).
Database mining deals with the data stored in a database management system. The tools and techniques for data mining identify business trends which may occur in the future. They also answer many business questions concerning the time required for decision making (Kaur & Aggarwal, 2010). There are two significant reasons why data mining has attracted a lot of attention in the last few years (Kaur & Aggarwal, 2010; Tan, Steinbach, & Kumar, 2005). The first is the capability to collect and store large amounts of data, a volume that grows quickly every day. As a result of improvements in processing power, a large amount of relevant data can be stored and processed at any time. The second, and most significant, reason is the need to transform data into useful and valuable knowledge and information (Kaur & Aggarwal, 2010; Tan, Steinbach, & Kumar, 2005; Han & Kamber, 2006). Data mining examines databases in order to discover hidden patterns and valuable information that experts may sometimes not observe because it lies outside their expectations (Kaur & Aggarwal, 2010; Tan, Steinbach, & Kumar, 2005). The discovered patterns are accessible to the user and can be stored as new information in the information database (Kaur & Aggarwal, 2010; Han & Kamber, 2006; Han, Kamber, & Pei, 2011).

Data Mining Association Rules
Association rule mining (Kaur & Aggarwal, 2010; Tan, Steinbach, & Kumar, 2005; Han & Kamber, 2006) is a data mining task for discovering hidden associations between items in a transaction. It is a well-known technique for discovering interesting relationships between variables and items in large databases (Kaur & Aggarwal, 2010; Han & Kamber, 2006). This method relies on algorithms such as FP-tree, Apriori and AprioriTid to generate the appropriate association rules between items in a transaction (Kaur & Aggarwal, 2010; Han & Kamber, 2006; Han, Kamber, & Pei, 2011). More precisely, it is based on evaluating association rules with measures such as support and confidence. Support (s) defines how frequently a rule is applicable to a particular data set, whereas confidence (c) defines how often items in set B appear in transactions containing set A (Tan, Steinbach, & Kumar, 2005). The next two equations are the formal definitions of support (s) and confidence (c) (Tan, Steinbach, & Kumar, 2005):

$$s(A \rightarrow B) = \frac{\sigma(A \cup B)}{|T|}$$

$$c(A \rightarrow B) = \frac{\sigma(A \cup B)}{\sigma(A)}$$

where s is the support and c is the confidence, A and B are itemsets, T is the set of transactions, $\sigma(A \cup B)$ is the support count of $A \cup B$ (the number of transactions that contain every item of A and B), and $\sigma(A)$ is the support count of A.

Association rules are utilized in several areas, such as "medical diagnosis and research, website navigation analysis, churn analysis and prevention, market basket analysis, and retail data analysis" (Kaur & Aggarwal, 2010; Han & Kamber, 2006; Han, Kamber, & Pei, 2011). A classic example is market basket analysis, where retailers analyze what customers prefer to purchase in order to find associations between the items that customers have purchased. Retailers can identify items that are frequently purchased across customers to help plan item placement, advertising and inventory management (Kaur & Aggarwal, 2010; Tan, Steinbach, & Kumar, 2005; Han & Kamber, 2006). There are many algorithms for association rule mining. The most popular are the Apriori, AprioriTid, Partition, FP-growth, and Eclat algorithms (Hipp, Güntzer, & Nakhaeizadeh, 2000).
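The two measures can be illustrated with a minimal Python sketch. The function names and the five transactions below are hypothetical, chosen only for illustration; they are not the data or the implementation used in this work. Support is computed as the fraction of transactions that contain every item of A and B, and confidence as that count divided by the number of transactions that contain A.

```python
from typing import Iterable, List, FrozenSet


def support_count(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(a: Iterable[str], b: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Support of the rule A -> B: count(A ∪ B) / number of transactions."""
    both = frozenset(a) | frozenset(b)
    return support_count(both, transactions) / len(transactions)


def confidence(a: Iterable[str], b: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Confidence of the rule A -> B: count(A ∪ B) / count(A)."""
    count_a = support_count(a, transactions)
    if count_a == 0:
        return 0.0  # assumption: a rule whose antecedent never occurs gets confidence 0
    both = frozenset(a) | frozenset(b)
    return support_count(both, transactions) / count_a


# Hypothetical transactions (NOT the ten transactions of Figure 1).
transactions = [
    frozenset({"Dell", "Apple"}),
    frozenset({"Dell", "Samsung"}),
    frozenset({"Dell", "Apple", "Sony"}),
    frozenset({"Apple", "LG"}),
    frozenset({"Dell", "Apple", "Toshiba"}),
]

s = support({"Dell"}, {"Apple"}, transactions)     # 3 of 5 transactions
c = confidence({"Dell"}, {"Apple"}, transactions)  # 3 of the 4 transactions with Dell
print(f"support = {s:.2f}, confidence = {c:.2f}")
```

For these hypothetical transactions, the rule {Dell} -> {Apple} holds in 3 of 5 transactions (support 0.60) and in 3 of the 4 transactions that contain Dell (confidence 0.75).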

Example of Data Mining Association Rules
Figure 1 shows an example of 10 transactions over 6 items (Dell, Apple, Samsung, Sony, LG, Toshiba). The example shows how the support (s) and confidence (c) are computed from these transactions. As mentioned in section 3.1, support (s) defines how frequently a rule is applicable to a particular data set, whereas confidence (c) defines how often items in Y appear in transactions which contain X (Tan, Steinbach, & Kumar, 2005).
In the following steps, … inclu… … least twice …
Step 1: Ca…
Step 2: Re…
Step 3: Ge…
Step 4: Ca… in {A, L, S…
Step 5: Re…
Step 6: Pro…
Step 7: Fr… are frequen… AprioTid a…
Step 4: Ch… round or sc…
Step 5: Sta… second ite… support co…
Step 6: De… third round … LG} and {…
Step 7: Th… in order to … two colum… {Samsung…
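The level-wise search performed by Apriori-style algorithms can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `apriori`, the five transactions and the minimum support count of 2 are all hypothetical, and the transactions are not those of Figure 1. Each round counts the candidate itemsets of size k, keeps those that meet the minimum count, and joins them into candidates of size k + 1, pruning any candidate that has an infrequent subset.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def apriori(transactions: Iterable[Iterable[str]], min_count: int) -> Dict[FrozenSet[str], int]:
    """Return {itemset: support count} for every itemset with count >= min_count."""
    data: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    # Round 1: every single item is a candidate.
    items = sorted({item for t in data for item in t})
    candidates = [frozenset([i]) for i in items]

    frequent: Dict[FrozenSet[str], int] = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in data if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)

        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}

        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical transactions (NOT the data of Figure 1).
example = [
    {"Dell", "Apple", "Samsung"},
    {"Dell", "Apple"},
    {"Apple", "Samsung", "Sony"},
    {"Dell", "Samsung"},
    {"LG"},
]

frequent = apriori(example, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With these hypothetical transactions, the three single items Dell, Apple and Samsung and all three of their pairs are frequent, while {Dell, Apple, Samsung} is generated as a candidate in the third round but rejected because it appears in only one transaction.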

Data M
The … a low value and decreases when the value is high. Additional solutions and algorithms can be used in this system to recommend products and make it easier for users. New techniques for preferences and constraints can be implemented and tested on the system to see whether it can handle more complex preferences and constraints. This recommender system can be generalized and added to any interactive recommender application where the user or customer is involved in the procedure of choosing products.

Figure 5. Red