Transcription
Advanced Analytics for PI Data for Data Scientists
Ahmad Fattahi – Manager, Data Science Enablement, OSIsoft
Dallas Swift – Data Scientist, OSIsoft
#OSIsoftUC #PIWorld 2018 OSIsoft, LLC

Goal: Gain a better understanding of data science practices for process data and the PI System
Agenda
- Definitions and general concepts
- CRISP-DM process
- Best practices and pitfalls
- Case study

Nomenclature
[Diagram labels: Artificial Intelligence, Machine Learning, Deep Learning]
"Data Science is an interdisciplinary field of scientific methods, processes, algorithms and systems to extract knowledge or insights from data in various forms." (Wikipedia)

CRISP-DM
- CRoss Industry Standard Process for Data Mining
- Among the most popular methodologies
- Emphasizes cycles and iterations
Source: KDnuggets

Story: Optimize Building Energy Consumption

Inception: Management or SME
- Start from a “Sharp Question”: “Can the building wake up later?”
- Business owner plays a key role: Facilities Manager
- Envision the delivery mechanism: “Recommendation engine? Direct control?”
- SME and data professionals start engaging: many conversations until they speak the same language

Pitfalls
- Myth: The data scientist can do it all!
- Targeting the wrong question
- Losing sight of bottom-line value to the business
- Getting crushed between political gears

Q: Why did you become data scientists?
A: Because “Superhero” is not a job title!

Building the “Model”
- Engage with data engineers, PI Admins: Python and R libraries by OSIsoft, PI Web API, AF SDK, PI Integrators, PI SQL libraries
- Build the features and the model: some features can be built in PI
- Constantly ask for validation from the SME: does it make sense?

Process Data Can Be Significantly Different!
- Features typically have to be engineered from raw data
- It is usually not the traditional “time-series” analysis
- PI System can do a lot!
  - Raw, summarized, or interpolated data
  - Event Frames
  - Hierarchy in AF is crucial
- SME plays a key role

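To illustrate the difference between raw and interpolated data, here is a minimal sketch that resamples irregular raw readings onto an even time grid by linear interpolation. It is plain Python, not the PI System's own interpolation logic; the function name and the sample readings are hypothetical.

```python
from bisect import bisect_left

def interpolate(raw, grid):
    """Linearly interpolate sorted (time, value) pairs onto the times in grid.

    Grid times outside the range of the raw data yield None.
    """
    times = [t for t, _ in raw]
    out = []
    for g in grid:
        if not times or g < times[0] or g > times[-1]:
            out.append((g, None))
            continue
        i = bisect_left(times, g)
        if times[i] == g:  # exact hit on a raw sample
            out.append((g, raw[i][1]))
            continue
        (t0, v0), (t1, v1) = raw[i - 1], raw[i]
        out.append((g, v0 + (v1 - v0) * (g - t0) / (t1 - t0)))
    return out

# Hypothetical raw temperature readings: (seconds since start, value)
raw = [(0, 20.0), (70, 21.4), (190, 22.6), (300, 23.7)]
even = interpolate(raw, grid=range(0, 301, 60))
```

Evenly spaced values like these are usually easier to align across several streams than raw samples, at the cost of smoothing over what happened between readings.
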
Is the goal of the project to predict? To control?

Explainability

Tradeoff
Source: ResearchGate GmbH

Pitfalls – Veering off the process
- Building a model for something uncontrollable
- Mixing correlation with causation
- Not including data engineering concerns for deployment
- Not leveraging PI capabilities in feature engineering

Evaluation – Loop back with the Business
- Guarantees we answered the right question
- Forces us to measure real value, often in dollars, man-hours, or other tangible resources
- Not trivial!
- Caution: data scientists speak a different language than process people

Deployment – Data Engineers Are Key
- Productizing the model
- Simpler models can be deployed in PI; some control models are built into the control network
- Consult with PI Admins and Data Engineers early
- Data Governance can pose challenges in production

Reproducible Work Is the Differentiator
- Assume your work is going to be repeated and tweaked frequently
- Over time:
  - Models veer off
  - Physical systems change
  - Priorities evolve
  - New business owners come
  - You get reassigned!
- Leverage tools such as Jupyter Notebooks or other commercial platforms

The Cycle Repeats

Case study: Interacting with PI System data

Reduce wasted cooling energy
Optimize the startup of the Variable Air Volume Cooling (VAVCO) units to improve the building’s energy [remainder lost in extraction]
[Figure: plot over time of day, with 7:00 marked]

Setting ourselves up for success
- How are the data streams structured?
- How do the data behave?
- What information is relevant for the problem?

Understanding the Asset Framework
[Diagram labels: PI Server, Data Archive, Asset Framework, Data Sources]

Explore hierarchy and trends
- PI Vision (values)
- PI System Explorer (hierarchy)

Leveraging data science tools
- Data science tools are great for data exploration
- R and Python libraries that use PI Web API are available via PI Developers Club
- https://github.com/osimloeff/PIWeb-API-Client-R

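As a rough sketch of what such a library does under the hood, the snippet below builds a PI Web API request URL for interpolated values of one stream. The host name and WebId are hypothetical placeholders, and the `streams/{webId}/interpolated` path with `startTime`, `endTime`, and `interval` parameters is an assumption taken from the PI Web API reference, so check it against your server's version. No request is sent.

```python
from urllib.parse import quote, urlencode

def interpolated_url(base_url, web_id, start="*-1d", end="*", interval="1h"):
    """Build a URL for interpolated values of a single stream.

    base_url and web_id are supplied by the caller; the defaults use
    PI time syntax ("*" means now), which gets percent-encoded here.
    """
    query = urlencode({"startTime": start, "endTime": end, "interval": interval})
    return f"{base_url.rstrip('/')}/streams/{quote(web_id, safe='')}/interpolated?{query}"

# Hypothetical server and WebId, for illustration only.
url = interpolated_url("https://my-pi-server/piwebapi", "HYPOTHETICAL_WEB_ID")
```

The resulting URL could then be fetched with any HTTP client that supports the authentication scheme configured on the server.
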
Transforming data to information
- How should I aggregate time series data?
- Which features are relevant for model prediction?
- How can I make the data available for modeling?

Time series data are complex!
- VAVCO-1: Temperature, Air flow, Humidity
- VAVCO-2: Temperature, Air flow, Humidity, CO2

Need to shape and export data

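One way to picture the shaping step: readings arrive as one record per asset, attribute, and timestamp, while a model usually wants one row per asset and timestamp with a column per attribute. The sketch below does that pivot and writes CSV text using only the standard library. The function names and sample readings are hypothetical, and this is not how the PI tools themselves export data.

```python
import csv
import io

def pivot_wide(records):
    """Turn (timestamp, asset, attribute, value) records into one row per
    (timestamp, asset), with one column per attribute."""
    rows = {}
    columns = []
    for ts, asset, attribute, value in records:
        rows.setdefault((ts, asset), {})[attribute] = value
        if attribute not in columns:
            columns.append(attribute)
    header = ["timestamp", "asset"] + columns
    table = [[ts, asset] + [vals.get(c, "") for c in columns]
             for (ts, asset), vals in sorted(rows.items())]
    return header, table

def to_csv(header, table):
    """Serialize the table as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(table)
    return buf.getvalue()

# Hypothetical readings; VAVCO-1 has no CO2 attribute, so its cell stays empty.
records = [
    ("2018-01-01T07:00", "VAVCO-1", "Temperature", 74.1),
    ("2018-01-01T07:00", "VAVCO-1", "Air flow", 310.0),
    ("2018-01-01T07:00", "VAVCO-2", "Temperature", 75.3),
    ("2018-01-01T07:00", "VAVCO-2", "CO2", 420.0),
]
header, table = pivot_wide(records)
csv_text = to_csv(header, table)
```

Leaving missing attributes as empty cells keeps assets with different sensor sets in one table, which then has to be handled explicitly during modeling.
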
Labeling the data – easy, right?
- Separate first cooling period of day from others
- When is a cooling period finished?
- Typical process data issues (data alignment, gaps, etc.)
[Figure labels: Cooling rate, Temperature, Setpoint]

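A minimal sketch of the labeling question, under one assumed definition: a cooling period runs while the temperature is more than a tolerance above the setpoint and ends at the first sample back within tolerance. The tolerance, function names, and sample values are hypothetical; the talk does not state the rule that was actually used.

```python
from datetime import datetime

def cooling_periods(samples, tolerance=0.5):
    """Return (start, end) pairs for runs where temperature > setpoint + tolerance.

    samples: time-ordered list of (datetime, temperature, setpoint).
    """
    periods, start = [], None
    for ts, temp, setpoint in samples:
        cooling = temp > setpoint + tolerance
        if cooling and start is None:
            start = ts
        elif not cooling and start is not None:
            periods.append((start, ts))
            start = None
    if start is not None:  # still cooling when the data ends
        periods.append((start, samples[-1][0]))
    return periods

def label_first_of_day(periods):
    """Tag each period with True if it is the first one starting that day."""
    seen, labeled = set(), []
    for start, end in periods:
        day = start.date()
        labeled.append((start, end, day not in seen))
        seen.add(day)
    return labeled

def at(hour, minute):
    """Hypothetical helper: a timestamp on an arbitrary example day."""
    return datetime(2018, 1, 1, hour, minute)

samples = [
    (at(6, 0), 78.0, 72.0), (at(6, 30), 75.0, 72.0), (at(7, 0), 72.3, 72.0),
    (at(12, 0), 72.2, 72.0), (at(13, 0), 74.0, 72.0), (at(13, 30), 72.1, 72.0),
]
labeled = label_first_of_day(cooling_periods(samples))
```

Changing the tolerance moves the end of each period, which is exactly why "when is a cooling period finished?" needs an answer from the SME rather than from the data alone.
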
Event Frames help aggregate data

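Conceptually, an event frame bounds a time window so that data inside it can be summarized as one record. The sketch below imitates that idea in plain Python; it does not use the AF SDK, and the event names, times (in minutes), and values are hypothetical.

```python
from statistics import mean

def summarize_events(events, samples):
    """For each (name, start, end) event, summarize the (time, value)
    samples whose time falls inside the window, bounds included."""
    summaries = []
    for name, start, end in events:
        values = [v for t, v in samples if start <= t <= end]
        summaries.append({
            "event": name,
            "duration": end - start,
            "count": len(values),
            "mean": mean(values) if values else None,
            "max": max(values) if values else None,
        })
    return summaries

# Hypothetical temperature samples: (minutes since midnight offset, value)
samples = [(0, 78.0), (15, 76.0), (30, 74.0), (45, 72.0),
           (60, 72.0), (400, 75.0), (415, 73.0)]
events = [("Morning cooldown", 0, 45), ("Afternoon cooldown", 400, 415)]
summaries = summarize_events(events, samples)
```

Each summary row can serve as one training example, which is often a better unit for modeling than individual sensor samples.
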
Data ready to go into model
- PI Integrator for Business Analytics
- PI OLEDB Enterprise
- Custom AF SDK

That looks funny
[Figure annotations: Data gap? Humidity?]
Warning signs:
- Unexpected straight lines
- Missing data
- System digital states

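The three warning signs can be checked mechanically before modeling. The following is a minimal sketch with assumed thresholds (`max_gap`, `flat_run`) and hypothetical sample data; a non-numeric value such as the string "No Data" stands in for a system digital state.

```python
def find_warning_signs(samples, max_gap=120, flat_run=4):
    """Scan (time_seconds, value) samples and return a list of findings.

    - ("gap", t_prev, t): consecutive samples more than max_gap apart
    - ("flatline", t_start, t_end): flat_run or more identical numbers in a row
    - ("non-numeric", t, value): a value that is not a number
    """
    findings = []
    run_start, run_len = None, 0
    prev_t, prev_v = None, None
    for t, v in samples:
        numeric = isinstance(v, (int, float)) and not isinstance(v, bool)
        if not numeric:
            findings.append(("non-numeric", t, v))
        if prev_t is not None and t - prev_t > max_gap:
            findings.append(("gap", prev_t, t))
        if numeric and v == prev_v:
            run_len += 1
        else:
            if run_len >= flat_run:
                findings.append(("flatline", run_start, prev_t))
            run_start, run_len = t, 1
        prev_t, prev_v = t, v
    if run_len >= flat_run:  # flat run that lasts to the end of the data
        findings.append(("flatline", run_start, prev_t))
    return findings

# Hypothetical humidity samples with a flat stretch, a bad value, and a gap.
samples = [(0, 45.0), (60, 45.2), (120, 45.2), (180, 45.2), (240, 45.2),
           (300, "No Data"), (900, 44.8)]
findings = find_warning_signs(samples)
```

A finding is only a prompt for a conversation with the SME: a flat line may be a stuck sensor, or a value that genuinely did not change.
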
Potential energy savings discovered
- Identified important factors for predicting cooling time
- Linear regression fits the data
$t_{cool} = b + m_1 x_1 + \cdots + m_k x_k$

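To make the linear form concrete (cooling time as an intercept b plus a weighted sum of k features), here is a minimal ordinary least squares sketch in plain Python. The two features (temperature offset from setpoint and outside air temperature), the coefficients, and the noise-free data are all hypothetical, constructed so the fit is easy to verify; they are not the factors or results from the case study.

```python
def fit_linear(X, y):
    """Ordinary least squares: return [b, m1, ..., mk] for y ~ b + sum(mi * xi)."""
    A = [[1.0] + list(row) for row in X]  # prepend an intercept column
    n = len(A[0])
    # Normal equations (A^T A) w = A^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def predict(weights, x):
    """Evaluate b + m1*x1 + ... + mk*xk for one feature vector x."""
    return weights[0] + sum(m * xi for m, xi in zip(weights[1:], x))

# Hypothetical features: (temperature offset, outside air temperature).
X = [(2, 60), (4, 65), (6, 70), (3, 80), (5, 75), (1, 68)]
# Targets generated from assumed coefficients b=5, m1=4, m2=0.5 (no noise).
y = [5 + 4 * x1 + 0.5 * x2 for x1, x2 in X]
weights = fit_linear(X, y)
```

With real process data the fit will not be exact, and a library such as scikit-learn or statsmodels would normally replace this hand-rolled solver.
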
Putting the model to work
- How can I operationalize a model after it has been developed?
- What options are available for recording model predictions?

Data flow implemented in the lab
[Diagram labels: PI System, PI Integrator for Business Analytics Advanced, PI Web API, Apache Kafka, Consumer, Python-based ML Model]
- Microsoft Azure Event/IoT Hubs
- SAP HANA Smart Data Streaming
- Asset Analytics - MATLAB Integration

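The consumer-side scoring step might look roughly like the sketch below. A plain list of JSON strings stands in for messages from a Kafka consumer, so no Kafka client is involved; the message fields, feature names, and model weights are hypothetical.

```python
import json

def score_stream(messages, weights, feature_names):
    """Parse JSON messages and yield (timestamp, prediction) pairs.

    weights is [b, m1, ..., mk]; messages with missing or non-numeric
    features are skipped rather than scored.
    """
    for raw in messages:
        msg = json.loads(raw)
        try:
            x = [float(msg[name]) for name in feature_names]
        except (KeyError, TypeError, ValueError):
            continue
        prediction = weights[0] + sum(m * v for m, v in zip(weights[1:], x))
        yield msg.get("timestamp"), prediction

# Hypothetical messages: one clean, one with a bad value, one missing a field.
messages = [
    '{"timestamp": "t1", "temp_offset": 2, "outside_temp": 60}',
    '{"timestamp": "t2", "temp_offset": "No Data", "outside_temp": 61}',
    '{"timestamp": "t3", "temp_offset": 3}',
]
scored = list(score_stream(messages, [5.0, 4.0, 0.5], ["temp_offset", "outside_temp"]))
```

In a deployed flow each prediction would then be written back somewhere it can be recorded and trended, for example through PI Web API.
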
Different tools for different stages
- Asset Framework
- PI Vision/PI ProcessBook
- PI DataLink (MS Excel)
- Python/R libraries

Different tools for different stages
- PI Integrator for Business Analytics
- PI OLEDB Enterprise

Different tools for different stages
- PI Integrators
- Asset Analytics with MATLAB Integration

Keys to success
- Communication is king
- Process data has unique challenges
- PI System has tools to enable data science
- Your knowledge of data science is a major differentiator. Leverage it!

Keep on learning!
- Labs and online courses
- PI World presentations
- Talk to other users, partners, and us
List of talks available on PI Square: bit.ly/DSPIWorld18

Ahmad Fattahi, afattahi@osisoft.com, Manager, Data Science Enablement, OSIsoft
Dallas Swift, tswift@osisoft.com, Data Scientist, OSIsoft

Questions
- Please wait for the microphone before asking your questions
- State your name & company
- Please remember to complete the online survey for this session

Merci / Thank You / Grazie