A Comparative Analysis Of Memory-based And Model-based Collaborative .

Transcription

A Comparative Analysis of Memory-based andModel-based Collaborative Filtering on theImplementation of Recommender System for Ecommerce in Indonesia : A Case Study PT XP. H. Aditya, I. Budi, Q. Munajat1Faculty of Computer Science, University of IndonesiaDepok, Indonesiapandhu.hutomo@ui.ac.id, indra@cs.ui.ac.id, qoribmunajat@cs.ui.ac.idAbstract The increasing growth of e-commerce industry inIndonesia motivates e-commerce sites to provide better servicesto its customer. One of the strategies to improves e-commerceservices is by providing personal recommendation, which can bedone using recommender systems. However, there is still lack ofstudies exploring the best technique to implement recommendersystems for e-commerce in Indonesia. This study compares theperformance of two implementation approaches of collaborativefiltering, which are memory-based and model-based, using datasample of PT X e-commerce. The performance of each approachwas evaluated using offline testing and user-based testing. Theresult of this study indicates that the model-based recommendersystem is better than memory-based recommender system inthree aspects: a) the accuracy of recommendation, b)computation time, and c) the relevance of recommendation. Fornumber of transaction less than 300,000 in database, respondentsperceived that the computation time of memory-basedrecommender system is tolerable, even though the computationaltime is longer than model-based.Keywords e-commerce; collaborative filtering; recommendersystem; memory-based; model-based.I.INTRODUCTIONIt is important for e-commerce sites to provide innovativefeatures to compete with others. There are three categories of ecommerce features which are Transactional, Relational, andSocial [1]. Recommender system, as a relational feature, is oneof important features that need to be implemented to improvethe quality of e-commerce services. Some e-commerce sitesthat have implemented the recommenders are amazon.com andeBay [2].Based on the method of implementation, recommendersystems generally can be divided into two, memory-based ation by accessing the database directly, whilemodel-based method uses the transaction data to create a modelthat can generate recommendation [3]. By accessing directly todatabase, memory-based method is adaptive to data changes,but requires large computational time according to the datasize. As for model-based method, it has a constant computingtime regardless the size of the data but not adaptive to datachanges.McCarey, Cinneide, and Kushmerick [4] conducted a studyto evaluate memory-based and model-based collaborativefiltering on software library. The research results show thatmemory-based approach is superior on two aspects which areprecision and recall. Robillard and Walker [5] states that thenature of recommender system on software engineering and ecommerce domain are different. In the domain of softwareengineering, the recommendations are made by the taskcontext, whereas in the domain of e-commerce, therecommendations are very dependent on the user's profile [5].Considering the differences, it is necessary to study theimplementation of recommender systems in e-commercedomain. Currently, there is still lack of studies that conductcomparative studies of model-based and memory-basedrecommender systems on the domain of e-commerce inIndonesia.Motivated by the growth of e-commerce industry inIndonesia, it is important for e-commerce sites in Indonesia toimplement recommender systems to improve its servicequality. However, there is still lack of studies that provide thebest practices to implement recommender systems within thedomain of e-commerce in Indonesia. Therefore, this studywants to give contribution by exploring two approaches ofrecommender system implementation which are memory-basedand model-based collaborative filtering on e-commerce inIndonesia. In order to perform the study, one e-commercecompany in Indonesia is selected as a case study. Theperformance of each method is evaluated based on thecomputation time, accuracy, and relevance of therecommendation.To explain the conduct of the study, the paper is structuredas follows. We first explained the context of e-commerce andthe related theories in recommender systems followed by theresearch methodology explanation. Then, we explained theimplementation of the recommender systems followed by theevaluation. The results and analysis is discussed in section 5and finally section 6 conclude the finding of this research.II.RECOMMENDER SYSTTEMS IN E-COMMERCEThis study applied recommender systems within thedomain of e-commerce. Kalakota and Whinson [6] defines e-

commerce from various perspectives, one of which is onlineperspective. This study uses the online-perspective whichdefines e-commerce as an online system that can provideproduct information and enable users to perform a transaction[6]. In e-commerce domain, recommender systems contributeby improving its information and service quality which are twoof the six dimensions in IS success model [7]. Recommendersystems improves the information and service quality byproviding personalized product recommendations to users.In the context of e-commerce, recommender system isdefined as software and method that provide suggestions aboutproducts to consumers [8]. In general, recommender systemconsists of several components which are databases, filteringalgorithm, implementation method, and evaluation method. [3].Among those components, filtering algorithm is the maincomponent that defines how recommender systems generatesuggestions. There are several filtering algorithms that can beused in recommender systems; which are content-basedfiltering, demographic filtering and collaborative filtering. Inrecommender systems domain, collaborative filtering is one ofthe most successful filtering algorithm [9][10] so that it ischosen as the filtering algorithm in this study.The basic idea of collaborative filtering is that collaborativefiltering make predictions based on the opinions of users withsimilar characteristics [9]. In a general scenario of collaborativefiltering, there are a list of m user, i.e U {u1, u2, ., um}, a listof n items, i.e I {i1, i2, ., in}, as well as the opinion about theitem which is also known as rating [9]. Collaborative filteringcan be implemented using two approaches, model-based andmemory-based. Memory-based collaborative filtering uses allthe data in the database to generate a prediction while themodel-based collaborative filtering uses the data in thedatabase to create a model that can then be used to generatepredictions [11].A. Memory-based Collaborative FilteringMemory-based collaborative filtering utilizes the entireuser-item data to generate predictions. The system usesstatistical methods to search for a set of users who have similartransactions history to the active user. This method is alsocalled nearest-neighbor or user-based collaborative filtering[9]. Bobadilla et al. [3] explained that there are three processesin nearest neighbor method: (1) choosing other users that aresimilar to a user; (2) predicting rating of the item i to a user bycalculating the results of aggregating similar users, and (3)providing recommendations based on the results predicted instage 2.Su and Khoshgoftaar [12] stated that the advantages ofmemory-based collaborative filtering are easy to implementand able to accommodate the new data with ease. However,memory-based collaborative filtering has decreasingperformance in data with high sparsity and have limitedscalability for large datasets [12].B. Model-based Collaborative recommendations by developing a model from user ratings [9].In addition to using explicit data such as ratings, collaborativefiltering can also use implicit information by observing thehabits of users, such as music played, applications downloaded,websites visited, or books read [3]. To develop a model, thereare two approaches that can be used, which are probabilityapproach or rating prediction [9]. The modeling process isconducted by machine learning techniques such asclassification, clustering, and rule-based approach [9]. Basedon its characteristics, model-based also has its advantages anddisadvantages. Su and Khoshgoftaar [12] stated that modelbased approach has better predictions than memory-based. It isalso capable of handling the problem of sparsity and scalabilitybetter than memory-based. However, model-based approachrequires a great resource, such as time and memory, to developthe model and may lose information when using dimensionalityreduction [12].III.RESEARCH METHODOLOGYThis study uses case study and quantitative analysis toevaluate the performance of memory-based and model-basedrecommender systems. The research process is depicted in Fig.1.Fig. 1. Research MethodologyThe data used in this research is data from PT X, one ofthe most popular e-commerce company in Indonesia. PT Xofficially launched its e-commerce on 1 March 2014. The ecommerce site of PT X offers products from eight categories,namely Fashion, Beauty / Health Babies / Kids, Home /Garden, Gadgets / Computers, Electronics, Sport / Hobby /Automotive and Service / Food. By 2016, e-commerce PT Xalready has 2.3 million subscribers and 4 million products[15]. The recommender system is developed using the datafrom PT X which is then evaluated using offline and onlineapproach. The technique used in the evaluation process isexplained in the evaluation section.IV.IMPLEMENTATIONModel-based collaborative filtering uses learningtechniques to create a model to generate recommendation.According to Sarwar et al. [9], learning techniques inrecommender system can be categorized into two approaches:a) using a probability approach, for example, BayesianClassifier, and b) using rating prediction of an item, forexample, Singular Value Decomposition. In this study, we usedthe probability approach to construct the model since thedataset does not include rating information. The selectedprobability approach is Improved Naïve Bayes, a modifiedform of Naïve Bayes, developed for collaborative filteringapplication in e-commerce.A. Improved Bayesian NetworkTo get recommendations using Naïve Bayes, we need tocalculate the probability of an item will be bought by a userIn Naive

Bayes technique, the probability can be determined by usingthe following equation [13]:(1)Letis an item andare the transactionhistory of an active user, sois theprobability of the active user to buy item . However, inNaïve Bayes, there is an assumption that the features (itembought) are independent. Meanwhile, the nature of transactionproved otherwise; therefore, there is a bias of result. The mainidea of Improved Naive Bayes is by adding a constant to theNaive Bayes equation to reduce bias. The equation forimproved Naïve Bayes is:In Fig. 2, an active user had two items in his transactionhistory, which are Camera N and Notebook X. The nextprocess is finding users that had similar transaction history. Inthis case, the users are User 1 who had Notebook X, User 2who had Camera N, and User 3 who had Notebook X andCamera X in their transaction history. User 4 is excluded sincehe did not purchase any item the active user had bought. Thelast process is getransaction history that have not been bought by the activeuser. In this case, the items are Bicycle W and Guitar Y. Theitems are then sorted by best-selling criteria. In this case,Bicycle W is the first recommendation and Guitar Y is thesecond recommendation.V. EVALUATION(2)The value of is the number of item in transaction historyand is a constant value of 3 determined by Wang & Tan [13]experiment. Improved techniques Naive Bayes is selectedbecause it has been adapted to the conditions of the nature ofcollaborative filtering [13].B. Nearest NeighborIn contrast to the model-based approach that conduct themost computation, i.e. developing model, before making actualrecommendations; memory-based approach directly usestransaction data in the database to make recommendations. Oneof the methods in memory-based approach is nearest neighbor.Using nearest neighbor, to recommend items for user U, thesystem looks for other users that have similar transactionhistory to user U. Having obtained the list of users that aresimilar to user U, the system searches for products purchasedby the users similar to user U and recommends products thatuser U, sorted by best-sellingcriteria. Fig. 2 illustrates how nearest neighbor methodgenerates recommendations.In recommender systems domain, evaluation process canbe done using offline analysis, experiments on live users(online), and a combination of them [14]. In this study, weused both offline analysis and user-based testing to getcomprehensive evaluation that can complement eachevaluation method weaknesses. We conducted a case study ona real e-commerce company in Indonesia to demonstrate theimplementation of recommender system on actual data of ecommerce in Indonesia.A. Data PreprocessingThe data used in this study is the user, products, andtransactions data-commerce site. The productdata has three attributes, namely ID, product names, andproduct categories which consists of three levels of categories.In this study, we used 95,468 records of product data. As forthe user data, there are three attributes, namely ID, age range,and gender. In this study, we used 50,000 records of user data.Lastly, we used purchasing data (transaction data), whichconsists of month, year, user ID, product ID, product names,and product categories attributes. In this study, we used290,060 records of purchasing data.Before using the data, the data is pre-processed to removethe anomaly or outlier within the data. After discarding someirrelevant data, we obtained dataset as follows.TABLE I.DATASET BEFORE AND AFTER PREPROCESSINGDatasetBefore Preprocessing290.060 Transactions95.468 Products50.000 UsersAfter Preprocessing290.060 Transactions40.640 Products26.672 UsersAfter that, the dataset was divided into three datasets inorder to simulate the growing of data in e-commerce sites.Fig. 2. Nearest Neighbor ProcessB. Evaluation ScenarioTo perform the evaluation, dataset is divided into two sets,training and test set. Training set contains data that will be usedas input to generate recommendations on memory-basedapproach and to create model on model-based approach. Thedata that is not included in training set is used as the test set toevaluate the recommendation result. The partition of the sets is

70% for the training set and 30% for the test set, as alsoperformed by Godbole and Sarawagi [16].In order to simulate the conditions of data growing in ecommerce site, we formed three datasets from the originaldataset (namely Dataset 1, Dataset 2, Dataset 3). Dataset 1 has33.33% of the total transaction data, Dataset 2 has 66.67% ofthe total transaction data and Dataset 3 has 100% of the totaltransaction data. The three dataset is formed based ontransaction time in ascending. These three data sets simulatehow the data size is growing overtime so that can be used toevaluate how the growing data size affects the performance ofeach recommender systems approach. The following tableshow the number of data for each set.TABLE II.Dataset 1100.016 Transactions17.574 Products22.329 Users3 Avg. TransactionsDATASETS FOR EXPERIMENTSDataset 2200.016 Transactions28.639 Products25.217 Users5 Avg. TransactionsDataset 3290.060 Transactions40.640 Products26.672 Users6 Avg. TransactionsC. Offline TestingPrecision and recall, metrics that is commonly used in thefield of information retrieval, can be used to evaluate therecommendation accuracy. According to Sarwar, Karypsis,Constant, and Riedl [10], the definition of recall and precisionin the context of the recommenders system is as follows:Recall defined as:(3)evaluation, we developed an online system that can beaccessed by respondents. The scenario of user-based testing isas follow:1. User access the system through a browser.2. User fill out the basic information form (age andgender).3. User chooses three items as transaction history4. The system generates five recommended items usingmemory-based approach.5. User chooses a number of recommended items thatare considered relevant by user and determinewhether the computation time can be tolerated.6. The system generates five recommended items usingmodel-based approach.7. User chooses a number of recommended items thatare considered relevant by user and determinewhether the computation time can be tolerated8. Repeat step 4-7 for each dataset (Dataset 1, 2, and 3)There are 31 respondents with demographics of a) 42%female (13 people), 58% male (18 people), and b) agedbetween 18-24 years. All the respondent is familiar with ecommerce sites.VI.RESULT AND ANALYSISA. Offline Testing ResultThe offline testing evaluated the accuracy of therecommendation using F1 metric. The evaluation is conductedfor all variety of N value (5, 8, 11, 14, 17, and 20) where N isthe number of recommended item generated. The result ofoffline testing is displayed on Table III, as follows.TABLE III.Precision defined as:N(4)Precision and recall are often in conflict, for example, theaddition of the value of N increases the recall value butreduces the precision value [10]. However, both of precisionand recall are important to determine the quality of the system.Therefore, it is required a metric that can combine theprecision and recall value, which is F1 metric. F1 Score can becalculated by using the following formula.(5)D. User-based TestingOnline testing or live user experiments is conducted as anevaluation method for evaluating user performance,satisfaction, participation, and other measurements [14].Online testing or user-based testing is an approach that istaken to cover the weaknesses of offline testing. In this study,user-based testing is performed to measure how relevant theproducts recommended by the system and to evaluate theuser tolerance toward computation time. To conduct this5811141720Avg.OFFLINE TESTING RESULTDataset 1Dataset 2Dataset 3MeMoMo MeMoMo MeMoMo 18For each dataset, there are three result of recommendationwhich are (1) Me, the result from memory-based approach; (2)Mo, the result form Model-based approach without updatingthe model; and (3) Mo , the result from Model-basedapproach with updated model for each data change. The thirdresult, Mo , is used as a benchmark for other models. In thisevaluation, the higher the F1 Score means better performance.Based on Tabel III, we can conclude that model approach hasbetter performance than memory-based approach in term ofaccuracy. We can also conclude that N 5 is the optimalnumber of recommendations that need to be generated.

B. User-based Testing ResultThe user-based testing evaluated the relevance of therecommendation according to user and the user tolerancetowards computation time for each method. The relevancetesting is conducted to reconfirm the accuracy (F1 Score)obtained from the offline testing.An online system was developed as an evaluation tool forboth of model-based and memory-based recommendersystems. Respondent were requested to use the systems as if itwere a real e-commerce sites. The systems generate 5recommended items that need to be evaluated by respondents.Five (5) is chosen as the number of N (the number ofrecommendation generated) based on the offline testing result.The offline testing result shows that 5 has the highest averageF1 Score, so that it is chosen as the optimal number of N.From the user-based testing, three information arecollected: (1) the computation time, as shown in Tabel IV; (2)user tolerance toward computation time, as shown in TabelV; and (3) the relevance of recommended items as shown inTable VI.TABLE IV.MethodMemorybasedModelbasedseen in the graph, the data growth affects the computation timeof memory-based approach. This is because memory-basedapproach includes all the data, including the new one, in thecalculation process. Meanwhile, the increasing of computationtime on model-based was not significant because the modelthat used on each dataset is the same model. New data didn'taffect computation time significantly.Fig. 3. Computation Time towards Data GrowthCOMPUTATION TIMEDataset 1Dataset 2Dataset 3Average415789122080854809777All data is presented in millisecondTABLE V.MethodMemorybasedModelbasedUSER TOLERANCE TOWARD COMPUTATION TIMEDataset 1Dataset 2Dataset 3Average95,7787,190,3291,4093,5587,183,8788,17Fig. 4. User Tolerance towards Computation TimeAll data is presented in percentTABLE VI.MethodMemorybasedModelbasedRELEVANCE OF RECOMMENDED ITEMSDataset 1Dataset 2Dataset l data is presented in percentThe value in Table V shows the percentage of respondentthat considered the computation time can be tolerated. It meansthe higher the score is better. Meanwhile, the value in Table VIshows the percentage of item that were considered relevance tothe recommendation given the user transaction history. It alsomeans that the higher the score is better. The computation timeis recorded to examine how long the time to generaterecommendation for each method. Together with the usertolerance data, the computation time data can be used toanalyze the limit of user tolerance.C. AnalysisThe graph in Fig. 3 shows that the computation time ofmemory-based is longer than model-based approach. As can beThe graph in Fig. 4 shows that in general, the usertolerance towards computation time from dataset 1 to dataset 3is decreasing. There is an anomaly in which the user toleranceon memory-based better than model-based, despite the growthof memory-based computation time is always at the top of themodel-based. This may happen because of several factors suchas the internet connection, bias occured due to the order oftesting or other factors. This data shows the need for furtherresearch in the perspectives of user behaviour.The graph in Fig. 5 shows that in general, the number ofrelevant products perceived by user is decreasing when thenumber of data is increasing.The average number of relevant products for each datasetin memory-based and model-based are 17.20% and 18.07%respectively. This indicates that model-based is better in termsof recommending products that are relevant to the user. Thereis an anomaly in the experiment on dataset 1 which shows thatthe relevance of the recommended products on memory-basedis better than the model-based despite the better accuracy ofmodel-based (see Table III). This finding indicates that there

are many factors determining the user acceptance towardsgenerated recommendations that need to be explored further.user tolerance toward computational time and useracceptance toward recommendations in e-commercedomain.d) It is also suggested in the future work to increase thenumber and the variety of respondent on user-basedtesting to improve the analysis of the result.IX.ACKNOWLEDGEMENTThis study was supported by International PublicationGrants for Student Thesis. The Grant title is MobileAdvertising Acceptance from Users and Gratifications Theory(UGT) Perspectives and Its Impact Towards Buying Behaviorof Smartphone Users (No: 1753/UN2R12/PPM.00.00/2016).REFERENCES[1]Fig. 5. The Number of Relevant Product for N 5VII. CONCLUSIONBased on the memory-based method and model-basedcollaborative filtering experiment, there are severalconclusions:In term of accuracy of recommendation, the offlineevaluation shows that model-based accuracy is betterthan memory-based accuracy.b) Based on the computation time, model-based has anaverage computation time 10 times faster thanmemory-based. This makes model-based better thanmemory based in terms of computational speed inrecommending products.c) Although memory-based computation time is slowerthan model-based, it was found that the respondentsconsidered memory-based computing time is still astolerable as the model-based within this number ofdata (less than 300,000 transaction in database).d) Based on the average number of relevant productsperceived by user, model-based is better .[2][3][4]a)VIII. LIMITATIONS AND FUTURE WORKThe following are suggestions that can be used for futurework:a) This study used Improved Naïve Bayes and NearestNeighbor techniques in the implementation ofmodel-based and memory-based collaborativefiltering. In the future studies, more techniquesshould be experimented to get more comprehensiveanalysis.b) It is essential to conduct testing on other performanceaspects of recommender system such as diversity,novelty, serendipity, and coverage.c) The finding related to the user tolerance towardscomputational time and user perception towardsrecommended products shows an opportunity to dofurther research to explore the factors that 6]R. G. Curty, & P. Zhang, Website features that gave rise to socialcommerce: a historical analysisElectronic Commerce Research andApplications, 2013, 12(4), pp. 260-279.J. Schafer, J. Konstan, & J. Riedl, Recommender Systems in ECommerceProceedings of the 1st ACM conference on Electroniccommerce, 1999, pp. 158-166.J. Bobadilla, F. Ortega, A. Hernando, & A. Gutiérrez, RecommenderKnowledge-Based Systems, 2013, 46, pp. 109-132.systems surveyF. McCarey, M. Cinneide, & N. Kushmerick, A Recommender Agentfor Software Libraries: An Evaluation of Memory-Based and ModelProceedings of the IEEE/WIC/ACMBased Collaborative Filteringinternational conference on Intelligent Agent Technology, 2006, pp.154-162.M. Robillard, & R. Walker, An Introduction to RecommendationSystems in Software EngineeringRecommendation Systems inSoftware Engineering, 2013, pp.1-11.R. Kalakota and A. Whinston, Electronic commerce. Reading, Mass.:Addison-Wesley, 1997.W. Delone, & E. McLean, Measuring e-Commerce Success: Applyingthe DeLone & McLean Information Systems Success ModelInternational Journal of Electronic Commerce, 2004, 9:1, pp. 31-47.F. Ricci, L. Rokach, & B. Shapira, Introduction to RecommenderSystem HandbookRecommender systems handbook, 2011, pp.1-35.Boston: SpringerB. Sarwar, G. Karypsis, J. Konstan, & J. Riedl, Item-BasedCollaborative Filtering Recommendation Algorithms n WWW10,2001, pp. 285-295.B. Sarwar, G. Karypsis, J. Konstan, & J. Riedl, Application ofDimensionality Reduction in Recommender System A Case Study nACM WebKDD 2000 Web Mining for E-Commerce Workshop, 2000.J. S. Breese, D. Heckerman, & C. Kadie, Empirical analysis ofProceedings of thepredictive algorithms for collaborative filtering14th Conference on Uncertainty in Artificial Intelligence, 1998, 461(8),pp. 43 52.X. Su, & T. Khoshgoftaar, A Survey of Collaborative FilteringTechniques in Advances in Artificial Intelligence, 2009, pp.1-19.K. Wang, & Y. Tan, A New Collaborative Filtering RecommendationAdvances in SwarmApproach Based on Naive Bayesian MethodIntelligence, 2011, 6729, pp. 218-227.J. L. Herlocker, J. A. Konstan, J. T. Riedl, & L. G. Terveen, Evaluatingcollaborative filtering recommender systemsACM Transactions onInformation Systems, 2004, 22 (1), pp. 5 53.Ayyubi, S. A. E-COMMERCE: Elevenia Klaim Memiliki 2,3 JutaPelanggan im-memiliki-23-juta-pelangganS. Godbole, & S. Sarawagi, Discriminative Methods for Multi-labeledProceedings of the 8th Pacific-Asia Conference onClassificationKnowledge Discovery and Data Mining (PAKDD 2004), 2004

There are several filtering algorithms that can be used in recommender systems; which are content-based filtering, demographic filtering and collaborative filtering. In recommender systems domain, collaborative filtering is one of the most successful filtering algorithm [9][10] so that it is chosen as the filtering algorithm in this study.