Advanced Analytics For PI Data For Data Scientists - OSIsoft

Transcription

Advanced Analytics for PI Data for Data Scientists
Ahmad Fattahi – Manager, Data Science Enablement, OSIsoft
Dallas Swift – Data Scientist, OSIsoft
#OSIsoftUC #PIWorld 2018 OSIsoft, LLC

Goal: Gain a better understanding of data science practices for process data and the PI System
Agenda
  Definitions and general concepts
  CRISP-DM Process
  Best practices and pitfalls
  Case study

Nomenclature
[Diagram labels: Artificial Intelligence, Machine Learning, Deep Learning]
Data Science is an interdisciplinary field of scientific methods, processes, algorithms and systems to extract knowledge or insights from data in various forms. (Wikipedia)

CRISP-DM
  CRoss-Industry Standard Process for Data Mining
  Among the most popular methodologies
  Emphasizes cycles and iterations
Source: KDnuggets

Story: Optimize Building Energy Consumption

Inception: Management or SME
Start from a “Sharp Question”
  “Can the building wake up later?”
Business owner plays a key role
  Facilities Manager
Envision the delivery mechanism
  “Recommendation engine? Direct control?”
SME and data professionals start engaging
  Many conversations until they speak the same language

Pitfalls
Myth: The data scientist can do it all!
Targeting the wrong question
Losing sight of bottom line value to the business
Getting crushed between political gears

Q: Why did you become data scientists?
A: Because “Superhero” is not a job title!

Building the “Model”
Engage with data engineers, PI Admins
  Python and R libraries by OSIsoft, PI Web API, AF SDK, PI Integrators, PI SQL libraries
Build the features and the model
  Some features can be built in PI
Constantly ask for validation from the SME
  Does it make sense?

Process Data Can Be Significantly Different!
Features typically have to be engineered from raw data
It is usually not the traditional “time-series” analysis
PI System can do a lot!
  Raw, summarized, or interpolated data
  Event Frames
  Hierarchy in AF is crucial
SME plays a key role
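
To make the difference between raw, interpolated, and summarized data concrete, here is a minimal sketch in plain Python. It does not call the PI System; the sample values, the 5-minute grid, and the helper name `interpolate_at` are assumptions made up for illustration, and PI's own interpolation and summary calculations may behave differently.

```python
from datetime import datetime, timedelta

def interpolate_at(samples, when):
    """Linearly interpolate a value at `when` from time-sorted (timestamp, value) pairs."""
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= when <= t1:
            frac = (when - t0) / (t1 - t0)  # timedelta / timedelta gives a float
            return v0 + frac * (v1 - v0)
    raise ValueError("timestamp is outside the sampled range")

# Raw: irregularly spaced room temperatures in degrees C (made-up values).
day = datetime(2018, 1, 8)
raw = [
    (day.replace(hour=7, minute=0), 24.0),
    (day.replace(hour=7, minute=7), 23.3),
    (day.replace(hour=7, minute=20), 22.0),
]

# Interpolated: one value every 5 minutes on a regular grid (7:00 to 7:20).
grid = [raw[0][0] + timedelta(minutes=5 * i) for i in range(5)]
interpolated = [interpolate_at(raw, t) for t in grid]

# Summarized: simple statistics of the gridded values over the window.
summary = {
    "min": min(interpolated),
    "max": max(interpolated),
    "mean": sum(interpolated) / len(interpolated),
}
print(interpolated)
print(summary)
```

The same three views (raw events, evenly spaced values, one number per window) often lead to different features, which is one reason the choice is worth discussing with the SME.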

Is the goal of the project to predict? control?

Explainability

Tradeoff
Source: ResearchGate GmbH

Pitfalls – Veering off the process
Building a model for something uncontrollable
Confusing correlation with causation
Not including data engineering concerns for deployment
Not leveraging PI capabilities in feature engineering

Evaluation – Loop back with the Business
Guarantees we answered the right question
Forces us to measure real value, often in dollars, man-hours, or other tangible resources
Not trivial!
Caution: data scientists speak a different language than process people

Deployment – Data Engineers Are Key
Productizing the model
Simpler models can be deployed in PI; some control models are built into the control network
Consult with PI Admins and Data Engineers early
Data Governance can pose challenges in production

Reproducible Work Is the Differentiator
Assume your work is going to be repeated and tweaked frequently
Over time:
  Models veer off
  Physical systems change
  Priorities evolve
  New business owners come
  You get reassigned!
Leverage tools such as Jupyter Notebooks or other commercial platforms

The Cycle Repeats

Case study: Interacting with PI System data

Reduce wasted cooling energy
Optimize the startup of the Variable Air Volume Cooling (VAVCO) units to improve the building’s energy [...]
[Figure: time axis, 7:00]

Setting ourselves up for success
  How are the data streams structured?
  How do the data behave?
  What information is relevant for the problem?

Understanding the Asset Framework
[Diagram labels: PI Server, Data Archive, Asset Framework, Data Sources]

Explore hierarchy and trends
PI Vision (Values)
PI System Explorer (Hierarchy)

Leveraging data science tools
  Data science tools are great for data exploration
  R and Python libraries that use PI Web API are available via PI Developers Club
  https://github.com/osimloeff/PIWeb-API-Client-R
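
As a rough illustration of what such a library does underneath, the sketch below only builds a request URL for interpolated values of a single stream; it sends nothing. It assumes the PI Web API `streams/{webId}/interpolated` resource with `startTime`, `endTime`, and `interval` query parameters. The server name, the WebID, and the helper name `interpolated_url` are hypothetical, and this is not the API of the R or Python client libraries mentioned on the slide.

```python
from urllib.parse import urlencode, quote

def interpolated_url(base_url, web_id, start="*-1d", end="*", interval="1h"):
    """Build a URL for interpolated values of one stream (no request is sent)."""
    query = urlencode({"startTime": start, "endTime": end, "interval": interval})
    return f"{base_url.rstrip('/')}/streams/{quote(web_id, safe='')}/interpolated?{query}"

# Hypothetical server and WebID, for illustration only.
url = interpolated_url("https://pi.example.com/piwebapi", "HYPOTHETICAL_WEBID")
print(url)
```

In practice the request would also need authentication and error handling, which depend on how the PI Web API instance is configured.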

Transforming data to information
  How should I aggregate time series data?
  Which features are relevant for model prediction?
  How can I make the data available for modeling?

Time series data are complex!
VAVCO-1: Temperature, Air flow, Humidity
VAVCO-2: Temperature, Air flow, Humidity, CO2

Need to shape and export data
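
One common shaping step is pivoting per-attribute readings into one row per unit and timestamp, so that units with different attribute sets (for example, a CO2 reading on only one unit) still line up in a single table. The sketch below is an assumption about what "shaping" could look like, not the presenters' method; the record layout, the values, and the function name `to_wide` are made up.

```python
def to_wide(records, columns):
    """Pivot (unit, timestamp, attribute, value) records into one row per unit and timestamp."""
    rows = {}
    for unit, timestamp, attribute, value in records:
        # Start every row with all columns set to None so missing attributes stay visible.
        row = rows.setdefault((unit, timestamp), dict.fromkeys(columns))
        row[attribute] = value
    return [{"unit": u, "timestamp": t, **vals} for (u, t), vals in sorted(rows.items())]

columns = ["Temperature", "Air flow", "Humidity", "CO2"]
records = [  # made-up readings
    ("VAVCO-1", "07:00", "Temperature", 24.0),
    ("VAVCO-1", "07:00", "Air flow", 310.0),
    ("VAVCO-2", "07:00", "Temperature", 23.1),
    ("VAVCO-2", "07:00", "CO2", 520.0),
]
table = to_wide(records, columns)
for row in table:
    print(row)
```

Keeping missing attributes as explicit `None` values makes gaps visible before modeling instead of silently dropping rows.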

Labeling the data – easy, right?
  Separate first cooling period of day from others
  When is a cooling period finished?
  Typical process data issues (data alignment, gaps, etc.)
[Figure labels: Cooling rate, Temperature, Setpoint]
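
The labeling question can be sketched with one possible rule: a cooling period starts at the first sample above the setpoint plus a tolerance and ends at the first later sample at or below it. This rule, the 0.5-degree tolerance, the sample values, and the function name `first_cooling_period` are assumptions for illustration; the slides do not state how the period was actually defined.

```python
from datetime import datetime

def first_cooling_period(samples, setpoint, tolerance=0.5):
    """Return (start, end) of the first cooling period in one day's (timestamp, temperature) samples."""
    start = None
    for timestamp, temperature in samples:
        if start is None:
            if temperature > setpoint + tolerance:
                start = timestamp  # room is still warm: cooling period begins
        elif temperature <= setpoint + tolerance:
            return start, timestamp  # close enough to setpoint: period is finished
    return (start, None) if start is not None else None

day = datetime(2018, 1, 8)
times = [(7, 0), (7, 10), (7, 20), (7, 30), (7, 40), (13, 0), (13, 20)]
temps = [25.0, 24.2, 23.4, 22.4, 22.1, 24.0, 22.3]  # made-up values
samples = [(day.replace(hour=h, minute=m), v) for (h, m), v in zip(times, temps)]

period = first_cooling_period(samples, setpoint=22.0)
start, end = period
# Label each sample: True if it belongs to the first cooling period of the day.
labels = [start <= t <= end for t, _ in samples]
print(period, labels)
```

The afternoon warm-up at 13:00 is deliberately left unlabeled, which is the "separate first cooling period from others" part of the problem.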

Event Frames help aggregate data
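
The idea of aggregating by event can be mimicked outside the PI System: given an event's start and end time, reduce the samples inside that window to a few numbers. The sketch below is a plain-Python stand-in, not the Event Frames API; the field names, the values, and the function name `summarize_event` are hypothetical, and it assumes at least one sample falls inside the window.

```python
from datetime import datetime

def summarize_event(samples, start, end):
    """Aggregate the (timestamp, value) samples that fall inside one event window."""
    values = [v for t, v in samples if start <= t <= end]
    hours = (end - start).total_seconds() / 3600
    return {
        "duration_min": hours * 60,
        "start_value": values[0],
        "end_value": values[-1],
        "cooling_rate_per_hour": (values[0] - values[-1]) / hours,
    }

day = datetime(2018, 1, 8)
samples = [  # made-up room temperatures in degrees C
    (day.replace(hour=7, minute=0), 25.0),
    (day.replace(hour=7, minute=10), 24.0),
    (day.replace(hour=7, minute=20), 23.0),
    (day.replace(hour=7, minute=30), 22.0),
    (day.replace(hour=7, minute=40), 22.0),
]
event = summarize_event(samples, day.replace(hour=7, minute=0), day.replace(hour=7, minute=30))
print(event)
```

Each event then becomes one row of features, which is usually a more convenient modeling input than the underlying time series.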

Data ready to go into model
  PI Integrator for Business Analytics
  PI OLEDB Enterprise
  Custom AF SDK

That looks funny
Data gap?
Warning signs:
  Unexpected straight lines
  Missing data
  System digital states
Humidity?
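
These warning signs can be checked mechanically before modeling. The sketch below flags three things in a time-sorted series: string values where numbers are expected (such as a system digital state), gaps longer than a threshold, and runs of identical values. Treating "unexpected straight lines" as flat runs is an assumption (sloped lines from interpolation across a gap are not detected here), and the thresholds, sample values, and function name `find_warning_signs` are made up.

```python
from datetime import datetime, timedelta

def find_warning_signs(samples, max_gap, flat_run=4):
    """Scan time-sorted (timestamp, value) pairs for three simple warning signs."""
    issues = []
    run = 1  # length of the current run of identical numeric values
    for i, (timestamp, value) in enumerate(samples):
        if isinstance(value, str):
            # A string where a number is expected, e.g. a system digital state.
            issues.append(("digital_state", timestamp, value))
        if i == 0:
            continue
        prev_time, prev_value = samples[i - 1]
        if timestamp - prev_time > max_gap:
            issues.append(("gap", prev_time, timestamp))
        if not isinstance(value, str) and value == prev_value:
            run += 1
            if run == flat_run:
                issues.append(("flatline", timestamp, value))
        else:
            run = 1
    return issues

start = datetime(2018, 1, 8, 7, 0)
minutes = [0, 5, 10, 40, 45, 50, 55]
values = [21.0, 21.4, "I/O Timeout", 22.0, 22.0, 22.0, 22.0]  # made-up humidity-like series
samples = [(start + timedelta(minutes=m), v) for m, v in zip(minutes, values)]
issues = find_warning_signs(samples, max_gap=timedelta(minutes=10))
kinds = [kind for kind, *_ in issues]
print(kinds)
```

Whether a flagged stretch is a real problem or normal behavior is a question for the SME, not something the check can decide.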

Potential energy savings discovered
  Identified important factors for predicting cooling time
  Linear regression fits the data
$t_{cool} = b + m_1 x_1 + \dots + m_k x_k$
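
A linear model of this form, with intercept b and one coefficient m_i per feature x_i, can be fitted by ordinary least squares. The sketch below solves the normal equations with the standard library only; in practice a library such as NumPy or scikit-learn would normally be used. The two features and all numbers are synthetic, generated from known coefficients so the fit can be checked, and they are not the presenters' data.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (A^T A) w = A^T y.

    Returns [b, m_1, ..., m_k] for the model t = b + m_1*x_1 + ... + m_k*x_k.
    """
    # Prepend a constant 1.0 so the first coefficient is the intercept b.
    A = [[1.0] + list(row) for row in X]
    n = len(A[0])
    # Build the augmented matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(A, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Synthetic data generated from t = 10 + 2*x1 + 0.5*x2 (two hypothetical features).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 3)]
y = [10 + 2 * x1 + 0.5 * x2 for x1, x2 in X]

b, m1, m2 = fit_linear(X, y)
print(b, m1, m2)
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients; real process data would also need residual checks and validation with the SME.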

Putting the model to work
  How can I operationalize a model after it has been developed?
  What options are available for recording model predictions?

Data flow implemented in the lab
[Diagram labels: PI System, PI Integrator for Business Analytics Advanced, PI Web API, Apache Kafka, Consumer, Python-based ML Model]
  Microsoft Azure Event/IoT Hubs
  SAP HANA Smart Data Streaming
  Asset Analytics - MATLAB Integration
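
The slide names Apache Kafka, a consumer, and a Python-based ML model. The scoring step of such a consumer can be sketched independently of any Kafka client: the function below accepts any iterable of JSON messages, so in a real deployment the iterable might be a Kafka consumer, while here it is a plain list. The message fields, the toy model, and its coefficients are all hypothetical, and writing predictions back to the PI System is not shown.

```python
import json

def score_messages(messages, predict):
    """Yield one prediction per JSON message from any iterable of strings or bytes."""
    for raw in messages:
        record = json.loads(raw)
        yield {
            "timestamp": record["timestamp"],
            "predicted_cool_minutes": predict(record["features"]),
        }

def toy_model(features):
    # Stand-in for a trained model; the coefficients are made up.
    return 5 + 2 * features["outdoor_temp_c"]

# A plain list stands in for the message stream.
stream = ['{"timestamp": "2018-01-08T06:00:00Z", "features": {"outdoor_temp_c": 20}}']
predictions = list(score_messages(stream, toy_model))
print(predictions)
```

Separating the scoring logic from the transport keeps the model testable without a running broker.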

Different tools for different stages
  Asset Framework
  PI Vision/PI ProcessBook
  PI DataLink (MS Excel)
  Python/R libraries

Different tools for different stages
  PI Integrator for Business Analytics
  PI OLEDB Enterprise

Different tools for different stages
  PI Integrators
  Asset Analytics with MATLAB Integration

Keys to success
  Communication is king
  Process data has unique challenges
  PI System has tools to enable data science
  Your knowledge of data science is a major differentiator. Leverage it!

Keep on learning!
  Labs and online courses
  PI World presentations
  Talk to other users, partners, and us
List of talks available on PI Square
bit.ly/DSPIWorld18

Ahmad Fattahi, Manager, Data Science Enablement, OSIsoft, afattahi@osisoft.com
Dallas Swift, Data Scientist, OSIsoft, tswift@osisoft.com

Questions
Please wait for the microphone before asking your questions
State your name & company
Please remember to complete the Online Survey for this session

Merci
Thank You
Grazie
