Streamlining Smart Meter Data Analytics


Streamlining Smart Meter Data Analytics
Liu, Xiufeng; Nielsen, Per Sieverts

Published in: Proceedings of the 10th Conference on Sustainable Development of Energy, Water and Environment Systems
Publication date: 2015
Document Version: Publisher's PDF, also known as Version of record

Citation (APA): Liu, X., & Nielsen, P. S. (2015). Streamlining Smart Meter Data Analytics. In Proceedings of the 10th Conference on Sustainable Development of Energy, Water and Environment Systems. International Centre for Sustainable Development of Energy, Water and Environment Systems.

General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. Users may download and print one copy of any publication from the public portal for the purpose of private study or research. You may not further distribute the material or use it for any profit-making activity or commercial gain. You may freely distribute the URL identifying the publication in the public portal. If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Streamlining Smart Meter Data Analytics

Xiufeng Liu and Per Sieverts Nielsen
Technical University of Denmark
{xiuli, pernn}@dtu.dk

Abstract. Smart meters are increasingly used worldwide. They are advanced meters capable of measuring customer energy consumption at a fine-grained time interval, e.g., every 15 minutes. The resulting data are sizable and may come from different sources, and they are accompanied by socio-economic information such as the geographic location of meters and information about users and their property, all of which makes data management very complex. On the other hand, data mining and emerging cloud computing technologies make the collection, management, and analysis of so-called big data possible. This can improve energy management, e.g., by helping utilities improve the management of energy and services and by helping customers save money. In this regard, the paper focuses on building an innovative software solution to streamline smart meter data analytics, aiming to deal with the complexity of data processing and data analytics. The system offers an information integration pipeline to ingest smart meter data; a scalable data processing and analytics platform for pre-processing and mining big smart meter data sets; and a web-based portal for visualizing data analytics results. The system incorporates hybrid technologies, including the big data technologies Spark and Hive and the high-performance RDBMS PostgreSQL with the in-database machine learning toolkit MADlib, which together are able to satisfy a variety of requirements in smart meter data analytics.

Keywords: Streamline, Software Platform, Smart meter data, Data Analytics

1 INTRODUCTION

Smart meters are increasingly used worldwide for their ability to provide timely readings, automate metering without visits to customer premises, produce fine-grained data, and more. Smart meters collect energy consumption data at a fixed time interval, usually every 15 minutes or hourly.
A smart meter data analytics system is an ICT-based platform for analyzing the collected meter readings, and it has become an indispensable part of running a smart grid for utilities. Smart meter data analytics can help utilities better understand customer consumption patterns, provision energy supply for peak demand, detect energy theft, and provide personalized feedback to customers. Governments can also base decisions about future smart grid development on the analytics results. For customers, smart meter data analytics can help them better understand their own energy consumption, save energy, and reduce their bills. Smart meter analytics is thus seen as so important that the market has been growing rapidly and is expected to reach over four billion dollars by 2020 [19]. Various algorithms for smart meter data analytics have been proposed, mainly in the smart grid literature, such as those for electricity consumption prediction, consumption profile extraction, clustering of similar consumers, and personalized feedback to consumers on how to adjust their habits and reduce their bills. Nevertheless, smart meter analytics applications were lacking in practice until recently, when some database vendors started to offer smart meter analytics software, e.g., SAP and Oracle/Data Raker. So did several startups in this area, e.g., C3Energy.com and OPower.com. Furthermore, some utilities such as California's PG&E have also started to provide online portals where customers can view their electricity consumption and compare it with their neighbourhood's average. However, these systems and tools focus on simple aggregation and simple ways of visualizing consumption, and the details of their implementations are not disclosed. It remains unclear how to build a practical and scalable analytics system to handle smart meter data, which are characterized by high volume and high velocity.

In this paper we present a software platform for streamlining smart meter data analytics.
This platform builds on our benchmark of smart meter data analytics technologies [30] and extends our prototype smart meter data analytics system, SMAS [29]. The platform aims to provide a solution that facilitates the whole process of smart meter data analytics, including data ingestion, data transformation, loading, analysis, and visualization. Utilities and customers obtain the final information once the data have passed through these stages. We adopt a hybrid architecture in the system design, in which the primary building blocks are Spark and Hive in the data processing layer, and PostgreSQL with MADlib [18] in the analytics layer. The design supports both high-performance analytics queries, through the RDBMS, and big data analytics, through Spark and Hive. We decouple the system architecture into three layers, namely the data ingestion layer, the processing layer, and the analytics layer, which makes the system easy for users to implement and extend. Smart meter data pass through the three layers on their way from the data sources to presentation in a web portal. The processing layer is an open platform that can integrate various user-defined processing units, such as units for data transformation, data anonymization, and abnormal data detection. The analytics layer is likewise open to extension with different analytics algorithms. It currently supports multiple types of algorithms, including time-series analytics at different temporal aggregations (e.g., hourly, daily, or weekly), load disaggregation, consumption pattern discovery, segmentation, forecasting, and consumer feedback.
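To make the idea of pluggable processing units and temporal aggregation concrete, the following is a minimal sketch in plain Python. It is not the platform's implementation, which runs on Spark, Hive and PostgreSQL/MADlib; the names `Reading`, `drop_abnormal`, `anonymize`, `run_pipeline` and `aggregate` are hypothetical, as is the rule that treats negative readings as abnormal.

```python
import hashlib
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, List, NamedTuple, Tuple


class Reading(NamedTuple):
    """One meter reading (hypothetical record layout)."""
    meter_id: str
    timestamp: datetime
    kwh: float


# A processing unit is any function mapping a list of readings to a new list.
ProcessingUnit = Callable[[List[Reading]], List[Reading]]


def drop_abnormal(readings: List[Reading]) -> List[Reading]:
    """Assumed anomaly rule for illustration: discard negative readings."""
    return [r for r in readings if r.kwh >= 0]


def anonymize(readings: List[Reading]) -> List[Reading]:
    """Replace each meter ID with a short SHA-256 digest of the ID."""
    return [
        r._replace(meter_id=hashlib.sha256(r.meter_id.encode()).hexdigest()[:8])
        for r in readings
    ]


def run_pipeline(readings: List[Reading], units: List[ProcessingUnit]) -> List[Reading]:
    """Apply the user-defined processing units in order."""
    for unit in units:
        readings = unit(readings)
    return readings


def aggregate(readings: List[Reading], level: str) -> Dict[Tuple[str, str], float]:
    """Sum consumption per meter at 'hourly', 'daily' or 'weekly' granularity."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for r in readings:
        if level == "hourly":
            bucket = r.timestamp.strftime("%Y-%m-%d %H:00")
        elif level == "daily":
            bucket = r.timestamp.strftime("%Y-%m-%d")
        elif level == "weekly":
            year, week, _ = r.timestamp.isocalendar()
            bucket = f"{year}-W{week:02d}"
        else:
            raise ValueError(f"unknown aggregation level: {level}")
        totals[(r.meter_id, bucket)] += r.kwh
    return dict(totals)
```

A caller would pass its own list of units, e.g. `run_pipeline(readings, [drop_abnormal, anonymize])`, and then call `aggregate(cleaned, "daily")`. Adding a new unit requires no change to the pipeline itself, which is the property the open processing layer is meant to provide.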
In this paper we make the following contributions: 1) we propose a hybrid architecture that combines the best of different technologies for streamlining smart meter data analytics; 2) we implement an open data platform that can be easily extended with different data processing units and analytics algorithms; and 3) we implement a smart meter data analytics system that supports both supply- and demand-side analytics, which can help utilities better manage energy supply and help consumers save energy.

The rest of this paper is structured as follows. Section 2 summarizes the related work; Section 3 presents the design principles of the system; Section 4 gives an overview of the system; Sections 5 and 6 present the data processing layer and the analytics layer of the system, respectively; Section 7 concludes the paper with directions for future work.

2 RELATED WORK

Systems and Platforms for Smart Meter Data Analytics. Traditional technologies that support numeric computing and comprehensive statistical analysis, such as R (S-PLUS), Matlab, SAS, SPSS and Stata, can be used in smart meter data analytics. The recent trend in analytics technologies is toward in-memory, in-database, and parallel processing on a cluster. Main-memory based systems, such as KDB [16] and SAP HANA [13], and in-database machine learning toolkits, e.g., PostgreSQL/MADlib [18], are good options for smart meter analytics. Hive (built on top of Hadoop) and Spark are two distinct distributed computing frameworks that are able to handle big data analytics on a cluster. In this paper, we implemented our system with a hybrid architecture using Hive, Spark and PostgreSQL/MADlib. The system combines the best of each technology and is able to do data analytics in-database, in-memory and in parallel.

Systems and prototypes for smart meter data analytics have emerged in both industry and academia.
The companies mentioned in Section 1 have developed smart meter analytics software, but the implementations of their systems and the analytics algorithms they use are unclear, possibly due to licensing restrictions. Nezhad et al. developed a smart meter dashboard in their research work, called SmartD [20], which is orthogonal to the analytics layer of our system; ours provides more comprehensive functionality, and the whole software architecture is a complete solution supporting data ingestion, transformation, analysis and visualization. Liu et al. use analytic procedures in Hive to process smart grid data on cloud storage, and use an RDBMS to handle daily data management transactions on information about meter devices, users, organizations, etc. [31]. This is somewhat similar to our architecture, but our main focus is to streamline the whole process of smart meter analytics by taking advantage of different technologies. Furthermore, our platform is open to extension with more data processing units and algorithms. In addition, the work in [31] primarily studies how to retrieve smart meter data from Hive efficiently, and focuses on simple operational queries rather than the deep analytics that we address in our system. Beyond the electricity sector, smart meter analytics systems and applications have also been developed in the water sector, e.g., WBKMS [25], a web-based application for providing real-time information on water consumption; and Autoflow [21], a tool for categorising residential water consumption. We are currently developing water data analytics algorithms, which will be integrated into the analytics layer of our system. These existing works provide useful information for our implementation.

Benchmarking Smart Meter Data Analytics. Arlitt et al. implement a toolkit, called IoTAbench, to benchmark analytics algorithms for the Internet of Things (IoT). They use synthetic electricity data and evaluate six queries for smart meter data analytics algorithms on an HP Vertica cluster platform. Benchmarking of time series data mining is also discussed in [17], where different implementations of time series similarity search, clustering, classification and segmentation were evaluated. Anil benchmarks data mining operations for power system analysis [5], analyzing voltage measurements from power transmission lines. However, all of these works focus only on benchmarking analytics algorithms.
In our previous work [30], we benchmarked four representative analytics algorithms for smart meter data, as well as five technologies from different categories: Matlab, KDB, PostgreSQL/MADlib, Spark and Hive. They represent traditional (Matlab), in-memory (KDB and Spark), in-database (PostgreSQL/MADlib), in-memory distributed (Spark) and Hadoop-based (Hive) technologies. That work is the foundation for implementing this system, i.e., it provides the reference for choosing the technologies.

Smart Meter Data Analytics Algorithms. Two broad classes of applications of smart meter data analytics are widely studied: consumer-oriented and producer-oriented. Consumer-oriented applications aim to provide feedback to end-users on reducing electricity consumption and saving money (see, e.g., [10, 19, 23]). Producer-oriented applications serve utilities, system operators and governments by providing information about consumers, such as their daily habits, for the purposes of load forecasting and clustering/segmentation (see, e.g., [1, 2, 4, 7, 11, 12, 14, 15, 19, 20]). From a technical standpoint, both classes of applications perform two types of operations: extracting representative features (see, e.g., [7, 10, 12, 14]) and finding similar consumers based on the extracted features (see, e.g., [1, 11, 23, 24, 27]). Household electricity consumption can be broadly decomposed into a temperature-sensitive component (i.e., the heating and cooling load) and a temperature-insensitive component (other appliances). Thus, representative features include those which measure the effect of outdoor temperature on consumption [3, 10, 23] and those which identify consumers' daily habits regardless of temperature [1, 7, 12], as well as those which measure the overall variability (e.g., consumption histograms) [2].
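As an illustration of a temperature-related feature, the sketch below splits consumption into a temperature-insensitive base load and a cooling gradient using ordinary least squares. This is a deliberately simplified stand-in and not any of the cited algorithms; the function names and the 18 °C balance point are assumptions made for the example.

```python
from statistics import mean
from typing import Dict, List, Tuple


def ols_slope_intercept(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def temperature_features(
    temps_c: List[float],
    kwh: List[float],
    balance_point_c: float = 18.0,  # assumed threshold above which cooling starts
) -> Dict[str, float]:
    """Estimate base load and cooling gradient (kWh per degree above threshold)."""
    cooling_degrees = [max(0.0, t - balance_point_c) for t in temps_c]
    gradient, base_load = ols_slope_intercept(cooling_degrees, kwh)
    return {"base_load": base_load, "cooling_gradient": gradient}
```

The intercept approximates the temperature-insensitive component, while the slope measures how strongly consumption reacts to warm weather. A heating gradient could be estimated the same way from degrees below a lower threshold.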
Some of the above existing algorithms have been integrated into our system, along with newly implemented algorithms, which are used to study variability of consumption, load profiling, load segmentation, pattern discovery, load disaggregation, and load similarity to other consumers.

3 DESIGN PRINCIPLES

We now describe the high-level design principles for this smart meter data analytics system.

Scalability. The system must be able to deal with a fast-increasing volume of smart meter data arriving at a high frequency, e.g., every 15 minutes, every 30 minutes or hourly. It may adopt current technologies such as MapReduce to scale big data analytics. Data mining and machine learning algorithms thus have to be rewritten as MapReduce programs so that they can be parallelized in a cluster environment. The system sh
