An Advanced Inventory Data Mining System For Business .

Transcription

2017 IEEE Third International Conference on Big Data Computing Service and ApplicationsAn Advanced Inventory Data Mining System forBusiness IntelligenceQifeng Zhou , Bin Xia§ , Wei Xue† , Chunqiu Zeng† , Ruyuan Han , and Tao Li†‡ AutomationDepartment, Xiamen University, Xiamen, Fujian, 361005 Chinaof Computing and Information Sciences, Florida International University, FL, 33199 USA‡ School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023 China§ School of Computer Science Technology and Engineering, Nanjing University of Science and Technology, Nanjing, 210094 China† SchoolAbstract—Inventory management plays a critical role to trackinventory levels, orders, and sales of the retailing business.Effective inventory management is a capability necessary to leadin the global marketplace. In the current retailing market, ahuge amount of data regarding stocked items in inventory isgenerated and collected every day. Due to the increasing volumeof transaction data and their correlated relations, it is oftena non-trivial task to efficiently manage stocked goods, yet itis imperative to explore the underlying dependencies of theinventory items and give insights into implementing intelligentmanagement systems. However, existing inventory managementsystems rely on statistical analysis of the historical inventorydata, and have a limited capability of intelligent management.For example, they usually do not have the ability to forecastitem demand and detect anomalous patterns of item inventorytransactions. There is little work reported in implementing intelligent inventory management solutions to reveal hidden relationswith integrated data-driven analysis. In this paper, we present anintelligent system, called iMiner, to facilitate managing enormousinventory data. We utilize distributed computing resources toprocess the huge volume of inventory data and incorporate thelatest advances in data mining technologies. iMiner provides comprehensive support for conducting many inventory managementtasks, such as forecasting inventory, detecting anomalous items,and analyzing inventory aging.about stocked goods needs to be processed instantly. For example, in one of our studies, the retailing vendor has 251,874items in total, 1321,400 transactions per day on average, andmore than 600 million records per year, which leads to morethan 1TB data. 6XSSRUW KLJK SHUIRUPDQFH GDWD DQDO\VLV RQ D GLVWULEXWHG V\VWHP 6XSSRUW ODUJH VFDOH DQDO\VLV WDVNV UXQQLQJ LQ SDUDOOHO LQ KHWHURJHQHRXV HQYLURQPHQWV6XSSRUW QHZ DOJRULWKP SOXJ LQV FFXUDWH LQWHUSUHWDEOH UHVXOWV '\QDPLF 3UHGLFWLRQ ,QYHQWRU\ )RUHFDVWLQJ -RLQW 3UHGLFWLRQ QRPDO\ 'HWHFWLRQ ,QYHQWRU\ JLQJ QDO\VLV 3LHFHZLVH /LQHDU 5HSUHVHQWDWLRQ :HLJKWHG 690 &RUUHODWLRQ WWULEXWH 6HOHFWLRQ &RQVWUDLQ EDVHG PXOWLSOH WLPH VHULHV SUHGLFWLRQ 8QVXSHUYLVHG SUREOHP 6XSHUYLVHG SUREOHP 5HJUHVVLRQ SUREOHP &ODVVLILFDWLRQ SUREOHP 6\PPHWULF FRVW PHDVXUH &RVW VHQVLWLYH OHDUQLQJ 5DQGRP IRUHVW EDVHG IHDWXUH VHOHFWLRQ 3URSRVHG DQRPDO\ GHWHFWLRQ DOJRULWKP KDV EHHQ DSSOLHG LQ FRQGLWLRQ PRQLWRULQJ DQG IDXOW GLDJQRVWLFV IRU K\GURSRZHU SODQWV 3URSRVHG MRLQW SUHGLFWLRQ DOJRULWKP KDV EHHQ DSSOLHG LQ DQ RQOLQH EXVLQHVV PDQDJHPHQW Fig. 1: iMiner OverviewA. Motivation for Developing iMinerThe evolution of market demand and business transactioninfluenced the inventory management systems to step beyondthe basic and ERP inventory management, to adopt recentemerging cloud-based inventory management.Most existing inventory management systems and methods, such as InFlow, Inventoria, Inform ERP, and FishbowlInventory1 , are demand-driven and cannot satisfy the needsin mining business big data. These traditional software toolsonly provide basic statistical functionalities, such as, trackingwhere products are stocked, which suppliers they come from,and how long they have been stored. They have limitedcapability and unable to support intelligent management, suchas large-scale inventory data management, forecasting itemdemand automatically, and detecting anomalous patterns ofitem inventory transactions. It is a challenge for researchers toexplore new methods to meet future requirements of inventorymanagement.To address the limitations of existing systems and assistlarge retailing business in efficiently performing inventorymanagement, we design, implement, and deploy an intelligentinventory management system, named as iMiner. As shownin Figure 1, iMiner overcomes the aforementioned limitationsI. I NTRODUCTIONInventory management is the process of monitoring theproduct storage. A good inventory management is criticalto the successful operation of most businesses and supplychains. In operation, inventory management has functionalitiesto avoid product overstocks and outages thus to reduce thecarrying costs. In marketing, inventory management affectscustomer satisfaction. In finance, inventory investment is acompany’s largest asset. With more demanding customers andrising operation costs, it is much more important for retailers toapply inventory management technology to manage businesstransactions and business decisions [3] [17]. Therefore, theability to transform inventory data into meaningful and actionable insight is a significant factor of competitive advantage forlarge retailers.Developing intelligent inventory management system isquite challenging. First, inventory management involves manydifferent facets of the retailers’ business, such as, warehousemanagement, retail loss prevention, and inventory count. Second, in the current retailing market, a large amount of data978-1-5090-6318-5/17 31.00 2017 IEEEDOI 10.1109/BigDataService.2017.36 1 /210

with a carefully-designed architecture and advanced data analysis techniques. More importantly, iMiner provides a set ofkey functionalities that facilitate the businesses conveniencethrough efficient analysis of the large scale inventory data [11][13].Main data analysis algorithms proposed in iMiner have alsobeen extended to other application fields (e.g., the anomalydetection algorithm has also been applied in condition monitoring and fault diagnostics for hydropower plants, and thejoint prediction algorithm has also been used in an onlinebusiness management).;ĂͿ ĂƚĂ džƉůŽƌĂƚŝŽŶ ƚĂƚŝƐƚŝĐĂů ŶĂůLJƐŝƐK Wͬ ĂƚĂ ƵďĞ;ďͿ ĂƚĂ ŶĂůLJƐŝƐKƉĞƌĂƚŝŽŶ WĂŶĞů ŶŽŵĂůLJ ĞƚĞĐƚŝŽŶ/ŶǀĞŶƚŽƌLJ &ŽƌĞĐĂƐƚŝŶŐ/ŶǀĞŶƚŽƌLJ ŐŝŶŐ DŝŶŝŶŐ;ĐͿ ZĞƐƵůƚ DĂŶĂŐĞŵĞŶƚB. Research Challenges and SolutionsBased on our long-term collaboration with retailing companies and the demand for intelligent business in big dataenvironment, we have identified four key issues that need tobe addressed in the traditional inventory management system.1) Business Big Data management. Most existing inventory management software tools suffer from thefollowing limitations: 1) They are memory-based andcannot efficiently support large scale analysis. 2) Theydo not support new algorithm plug-ins.2) Inventory forecasting. Since forecasting inventory canavoid product overstock and greatly reduce the maintenance cost, an accurate and interpretable inventoryforecasting is highly needed. Usually, inventory management process has two types of time series data, i.e.,the amount of stock in and stock out evolving overtime. With the increasing transaction scales and complextransaction types, the following features of inventorydata should be handled carefully for accurate inventoryforecasting: 1) Large amounts of records; 2) Largeamounts of attributes; 3) Item correlations: e.g. whena customer is buying a TV, he or she may also chooseother related products, such as TV mounts or DVDplayers. The correlations further increase the difficultyfor efficient inventory management.3) Inventory anomaly detection. Inventory anomaly detection helps retailers find the unmarketable products andabnormal stock. In traditional inventory managementsystem, the inventory-to-sales ratio (i.e., the ratio of theinventory available for sale versus the actual quantitysold) is a key statistic to measure whether or not an itemis overstocked. When the data scale increases dramatically, and the type of anomalies gets more complicated,the inventory-to-sales ratio cannot accurately and timelyreflect the real anomaly stock. This introduces the challenge of anomaly detection on big time-variant inventoryand identification of the unmarketable products.4) Inventory aging analysis. Inventory aging analysis canhelp companies understand the inventory status anytime, prevent items from overstocking, and reduce theoverstocked items. Aging analysis can also be appliedto liability accounts to obtain a clear picture of thecompany’s obligations. Commonly used statistics-basedmethods provide some basic inventory aging analysissŝƐƵĂůŝnjĂƚŝŽŶZĞƐƵůƚ ŝƐƚ Θ ŶĂůLJƐŝƐFig. 2: Data Analysis Modulesfunctionalities such as inventory aging computing, comparison, and classification. However, how to furtherdiscover in-depth knowledge such as the primary factorscorrelated with overstock items, remains challenging forunderstanding stock issues in advance and making quickresponses.To address the challenges mentioned above and fulfill thedemand of intelligent inventory management for large retailcompany, a good inventory management system should bedesigned with the following principles: 1) The system shouldbe able to handle large-scale data analysis; 2) The systemshould provide users with interactive functionalities; and 3)The system should have accurate, interpretable results.Based on the above design principles, we develop iMineron a powerful data analysis platform with advanced dataanalysis techniques. Specifically, to address challenge 1, weimplement an integrated data analytic platform based on adistributed system to support high-performance data analysis.The platform manages all transactions in a distributed environment, which is capable of configuring and executing dataprocessing and analysis in parallel. To address challenges 2,3, and 4, we integrate a variety of data mining technologies toanalyze inventory data. In particular, iMiner 1) adopts variousregression models on time series data for inventory forecasting; 2) employs classification-based cost-sensitive learningalgorithms to identify unusual items; and 3) utilizes statisticalregression models to perform inventory aging analysis. iMinerprovides various visualization tools and interpretable resultsfor potential, complex transaction rules discovering. It helpsusers evaluate future inventory requests and make good marketdecisions.C. RoadmapThe rest of the paper is organized as follows. Section 2presents an overview of iMiner, starting from introducing thesystem architecture, followed by the system merits. Section 3to Section 5 introduce three core functionalities of inventorymanagement and the data-driven solutions adopted in proposedsystem. Specifically, in Section 3, we propose two models for211

interface to build workflows for executing such tasks automatically. As shown in Figure 2(b), Operation Panel mainlycontains three modules, Inventory Forecasting , Anomaly detection, and Inventory Aging Mining. These modules implement various mathematical models and advanced data miningalgorithms to address the challenges of history data analysisin inventory management.3) Result Management: As shown in Figure 2(c), Resultmanagement templates are designed to support automaticstorage, update, and retrieval of discovered patterns. Resultsare recorded based on analysis tasks and can be organizedaccording to different inventory time series, supply companies,brands, or data source. Dashboard, statistical graphs, and tablesare produced to visualize the company’s whole operations andanalysis results. Also, for each result, customers and domainexperts can refine and give feedbacks to the system.inventory forecasting: the dynamic model for single time seriesand the joint prediction model for multiple time series. InSection 4, we describe our proposed data-mining strategy todetect inventory anomaly from a large amount of inventorytime series. Section 5 focuses on data-driven solutions forinventory aging analysis and attribute correlation mining onoverstocked items. Section 6 presents the system deployment,including system performance evaluation and real results fromactual usage. Finally, Section 7 concludes the paper.II. S YSTEM OVERVIEWA. System ArchitectureiMiner is built upon our previous developed large scale dataanalysis system, FIU-Miner, which is a Fast, Integrated, andUser-friendly system to ease data analysis [19] [12].The system is composed of four layers: User Interface,Data Analysis Layer, Task and System Management Layer, andPhysical Resource Layer. User Interface. This layer contains various interactiveinterfaces for inventory operations. Specially, it providesDashboard and Statistics Interface to allow users to havean overview of the current inventory status. In addition,several key indices of inventory, e.g., turnover rate andinventory-to-sales ratio, are presented in Inventory IndexInterface, assisting users in promptly querying the statusof a particular item. Data Analysis Layer. This layer is the heart of the system.Beside basic data processing and exploration functionalities, Data Analysis Layer consists of appropriate datamining solutions to the corresponding tasks of inventorymanagement, including Inventory Forecasting, AnomalyDetection, and Inventory Aging Analysis. The main module functionalities will be discussed in Section 2.2 andthe detailed algorithms will be discussed in Section 3,Section 4, and Section 5. Task and System Management Layer. Task and SystemManagement Layer provides a fast, integrated, and userfriendly system to configure complex tasks, integratevarious data mining algorithms, and execute tasks in adistributed environment. All the data analysis tasks inData Analysis Layer can be configured as workflows andscheduled automatically.III. I NVENTORY FORECASTINGThe primary goal of inventory forecasting is to minimize theinventory loading. A common practice of inventory forecastingis to predict the demand for a particular item in the futureand reserve the appropriate amount of items, based on theforecasting results. However, inventory data is a type oftime series with large volume, long time span, and fewerregularities. These features bring up two challenges to inventory forecasting: 1) implementing an accurate interpretableinventory prediction; 2) modeling the relationships amongmultiple time series data sets and predicting their future valuessimultaneously.Although, in recent years, there has been an explosion ofinterest in mining time series [2] [14], traditional approachessuch as auto-regression(AR), linear dynamical systems (LDS),Kalman filter(KF) cannot solve above challenge directly [1][8].Hence, more accurate and optimal forecasting methods areexpected. In iMiner we design and deploy two new inventory forecasting models: dynamic prediction model and jointprediction model to solve the prediction problems.A. Dynamic Prediction ModelTo implement an accurate and interpretable inventory forecasting, we propose a two-step dynamic prediction model.First, it adopts machine learning techniques combined withtime series analysis methods to obtain a forecasting basis.Second, it takes into account multiple factors of inventory suchas seasonality, trend, and special events for dynamic inventoryforecasting.The overall framework of dynamic forecasting model isshown in Figure 3.1) Determining the Forecasting Basis: In this step, weemploy machine learning algorithms to capture the hiddenpatterns in stock in/out time series. Each algorithm is used tobuild an inventory forecasting regression model based on thepast inventory transaction data and update on a daily basis.These algorithms include : Linear Regression [16], NeuralNetwork (NN) [7], Gradient Boosting Regression Tree [6],B. Data Analysis Modules1) Data Exploration: As shown in Figure 2(a), StatisticalAnalysis and Data Cube are capable of assisting data analystsin exploring inventory data efficiently and effectively. Statistical product analysis in different sizes and dimensions canquickly discover interesting time frame, product categories,and key indicators. Data Cube provides a convenient approachto explore high-dimensional data so that data analysts can havea better view of the characteristics of the dataset.2) Data Analysis: In our system, the data mining approaches in Algorithm Library can be organized as a configurableprocedure in Operation Panel. Operation Panel is a unified212

trend in time series (reflected in the forecasting basis) and theimpact of monthly periodicity, trend and events (reflected inthe dynamic forecasting process), thus, give an interpretableprediction result.6WHS %DVLV IRUHFDVWLQJ )RUHFDVWLQJ EDVLVĂDWDŶƐĞŵďůĞ ĨŽƌĞĐĂƐƚŝŶŐB. Joint Prediction6WHS '\QDPLF IRUHFDVWLQJ WUHQGVHDVRQDOLW\HYHQWDŽŶƚŚWƌŽĚƵĐƚ ϭWƌŽĚƵĐƚ Ϯ:ƵŶ͘ ϮϬϭϰ:Ƶů͘ ϮϬϭϰ ƵŐ͘ ϮϬϭϰ ĞƉ͘ ϮϬϭϰKĐƚ͘ ϮϳϲϳϯϮϳϵϵϳϰϯ͙ϯϰϭϳϬϮ ϯϱϳϭϴϯ ϯϲϯϳϮϳ ϰϰϮϵϮϴ ϯϴϴϵϱϰ ͙1) Multiple Time Series Prediction: Existing inventorymanagement systems [2] often forecast the two time seriesstock in and stock out separately. Both of them are treated asindependent ignoring their relationship.In practice, the amounts of stock in and stock out in aninventory are often dependent on each other. The amount ofstock out (Sout for short) is usually subject to the amountof stock in (Sin for short) at the same or near time periodsto prevent the situation that an item is out of stock. Also,the scheduled Sin primarily depend on the past Sout to avoidthe situation that a unit is in excess of demand. So two timeseries of Sin and Sout bear some interdependencies accordingto the characteristics of inventory management. The existingsingle time series predictive methods lack the capability ofcapturing the dynamic relationships between multiple timeseries or predicting their future values simultaneously. Littleresearch attention has been paid to predicting the movementof a collection of related time series.In iMiner, we model the interdependencies of time series data and integrate them into the process of time seriesprediction. The model can capture the dynamic relationshipsbetween multiple time series data set and predict their futurevalues simultaneously. Specifically, in the domain of inventorymanagement, the aggregated amount of Sin is often largerthan the aggregated amount of Sout in a given period to avoid“out of stock”. In addition, Sin and Sout should be close toeach other to prevent a unit from “excess demand”. Based onsuch an intuition, we transform the requirement of inventorymanagement into model constraints and perform time seriesprediction under the constraints.2) The Joint Predictive Model: In multiple time series prediction, we have L different time series, and for all time series,(l)we have N available examples {xi : i {1, 2, . . . , N }}. Thegoal is to learn a function fl N X for each l-th time seriesbased on the available data, and in total, we have L suchfunctions. Note that the input space of all L time series mightshare common representation, or their input space could betotally different.Different from independent L single time series predictions, in joint prediction, the output space of different timeseries might correlate with each other, i.e., they may haveto satisfy some specific constraints naturally embedded inthe applications. In the case of inventory management, thestorage capacity should always be greater than the volume ofsales to avoid the situation that supply falls short of demand.Therefore, we add constraints into the model to capture therelationships between different time series.&ŽƌĞĐĂƐƚŝŶŐ ƌĞƐƵůƚƐFig. 3: The Framework of Dynamic Forecasting ModelSupport Vector Machine

the basic and ERP inventory management, to adopt recent emerging cloud-based inventory management. Most existing inventory management systems and meth-ods, such as InFlow, Inventoria, Inform ERP, and Fishbowl Inventory1, are demand-driven and cannot satisfy the needs in m