MLOps: Machine Learning Operationalization

Transcription

MLOps: Machine Learning Operationalization

MLOps: Machine Learning Operationalization. Nisha Talagala, Co-Founder, CTO & VP Engineering, ParallelM. Boris Tvaroska, Global Artificial Intelligence Solutions Lead, Lenovo.

MLOps: Machine Learning Operationalization. Webinar recording and slides will be available shortly. Share questions with panelists using the Question panel. A Q&A session follows the presentations.

MLOps: Machine Learning Runtime Focus:

Machine Learning Operationalization

Nisha Talagala, Co-Founder, CTO & VP Engineering, ParallelM. nisha.talagala@parallelm.com

MLOps: The Last Mile. From Data Science to Business ROI. Nisha Talagala, CTO, ParallelM. CONFIDENTIAL

Growing AI Investments; Few Deployed at Scale. Out of 160 reviewed AI use cases, 88% did not progress beyond the experimental stage. But successful early AI adopters report profit margins 3–15% higher than the industry average. Survey of 3,073 AI-aware C-level executives. Source: "Artificial Intelligence: The Next Digital Frontier?", McKinsey Global Institute, June 2017.

The ML Development and Deployment Cycle. The bulk of effort today is on the left side of this process (development): many tools and libraries, the democratization of data science, AutoML.

What makes ML uniquely challenging in production? Part I: Dataset dependency. ML is a "black box" into which many inputs (algorithmic, human, dataset, etc.) go to produce an output. It is difficult to get a reproducible, deterministically "correct" result as the input data changes. ML in production may behave differently than in the developer sandbox because live data ≠ training data.
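As a concrete illustration of the dataset-dependency point, here is a minimal sketch (not from the talk; the function name and threshold are hypothetical) of a statistical check that flags when live data has drifted away from the training data:

```python
# Hypothetical sketch: flag drift when the live mean sits more than
# `threshold` standard errors away from the training mean (a basic z-test).
import statistics

def drift_detected(train_feature, live_feature, threshold=3.0):
    mu = statistics.mean(train_feature)
    sigma = statistics.stdev(train_feature)
    stderr = sigma / (len(live_feature) ** 0.5)
    return abs(statistics.mean(live_feature) - mu) / stderr > threshold

train = [i % 10 for i in range(1000)]        # data the model was trained on
live_same = list(train)                      # live data, same distribution
live_shifted = [x + 2 for x in train]        # live data after a shift

print(drift_detected(train, live_same))      # False: sandbox matches production
print(drift_detected(train, live_shifted))   # True: live data != training data
```

Real deployments use richer tests (e.g., two-sample distribution tests per feature), but the shape is the same: compare production inputs against a training-time baseline and alert on divergence.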

What makes ML uniquely challenging in production? Part II: Simple to complex practical topologies. Multiple loosely coupled pipelines run, possibly in parallel, with dependencies and human interactions. Feature engineering pipelines must match for training and inference (code-generation pipelines can help here). Add control pipelines, canaries, A/B tests, etc. There is further complexity if ensembles, federated learning, etc., are used.
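One common way to keep training and inference feature engineering matched is to define the transform once and import it from both pipelines. A minimal sketch, with hypothetical feature and field names:

```python
# Single source of truth for feature engineering, shared by both pipelines.
def build_features(record):
    return [
        record["amount"] / 100.0,                    # scale to a common range
        1.0 if record["country"] == "US" else 0.0,   # one-hot example
    ]

# Training side: applied to the historical dataset.
train_rows = [{"amount": 250, "country": "US"},
              {"amount": 80, "country": "DE"}]
X_train = [build_features(r) for r in train_rows]

# Inference side: the SAME function applied to a live request.
live_request = {"amount": 250, "country": "US"}
x_live = build_features(live_request)

assert x_live == X_train[0]  # identical inputs yield identical features
```

If the two sides instead reimplement the transform independently, training/serving skew creeps in silently; sharing (or code-generating) the transform removes that failure mode.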

What makes ML uniquely challenging in production? Part III: Heterogeneity and scale. Possibly differing engines (Spark, TensorFlow, Caffe, PyTorch, scikit-learn, etc.) and different languages (Python, Java, Scala, R, etc.). Inference vs. training engines: training is frequently batch, while inference (prediction, model serving) can be a REST endpoint, custom code, a streaming engine, micro-batch, etc. Feature manipulation done at training needs to be replicated (or factored in) at inference. Each engine presents its own scale opportunities and issues.

What makes ML uniquely challenging in production? Part IV: Compliance and regulations. Established: for example, Model Risk Management in financial services. Emerging: for example, GDPR requirements on reproducing and explaining ML decisions, and New York City's algorithm fairness monitoring.

What makes ML uniquely challenging in production? Part V: Collaboration and process. Collaboration: the expertise mismatch between data science and ops complicates handoff, continuous management, and optimization. Process: many objects must be tracked and managed (algorithms, models, pipelines, versions, etc.). ML pipelines are code; some teams treat them as code, some do not. Some ML objects (like models and human approvals) are not best handled in source control repositories.

MLOps: Automating the Production ML Lifecycle. Key elements: ML orchestration, model governance, ML health, business impact, and business value.

MLOps, DevOps, and the SDLC. Integrate with the SDLC (source control repositories, etc.) for code, and with DevOps for automation, scale, and collaboration. MLOps spans managing the ML application, measuring business success, managing risk, and compliance.

How it Works: MCenter Architecture. [Architecture diagram: data science platforms (e.g., CDSW) connect to MCenter through developer connectors; MCenter agents run on the analytic engines; models, retraining, control, and statistics flow between MCenter and the engines, along with events and alerts; data arrives from data streams and data lakes.]

Summary. We are at the beginning of ML operationalization. Much as databases (the backbone of production applications) need DBAs and software needs DevOps, ML needs MLOps: specialized operationalization practices, tools, and training. For more information, see https://www.mlops.org for MLOps resources, and https://www.parallelm.com.

Thank you. nisha.talagala@parallelm.com

Machine Learning Operationalization

Boris Tvaroska, Global Artificial Intelligence Solutions Lead, Lenovo. btvaroska@lenovo.com

Integrating Data Science into the SDLC. Boris Tvaroska, September 2018. 2018 Lenovo Internal. All rights reserved.

Evolution of AI: moving from research papers to applications. From research about AI to AI in products and services. [Chart: reports using ML/DL over time.]

What can happen? "I did not change a single line of code." (A junior software engineer, after breaking the build.)

Different rhythms: traditional software vs. data science. Traditional software: starts with a change in code; established practice; iterations in days or weeks. Data science: starts with a change in code, data, or metrics; emerging practice; iterations as fast as possible, several times per day.

Main challenges. Test: an occasional wrong result is acceptable; need to test for false positives; need to test for false negatives; longer test times; more test cases needed. Build & deploy: more artifacts to work with; frequent changes; versioning of artifacts and of source data.
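The false-positive/false-negative testing point can be expressed as a quality-gate test: instead of asserting exact outputs, assert that error rates stay under agreed thresholds. A minimal sketch; the thresholds and data here are hypothetical illustrations:

```python
# Compute false-positive and false-negative rates for a binary classifier.
def error_rates(y_true, y_pred):
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return fp / negatives, fn / positives

y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]   # one false positive, one false negative

fpr, fnr = error_rates(y_true, y_pred)
assert fpr <= 0.30 and fnr <= 0.30   # quality gate, not exact equality
```

The gate fails the build when a model regresses past the threshold, which is the ML analogue of a conventional unit test failing on a wrong return value.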

Training in the test/build cycle. Option 1: training inside the Code → Build → Test cycle; possible for simple models with a small amount of data. Option 2: Train → Code → Test → Build with the existing toolset and hyperparameters; risks: a slow CI/CD cycle and more failing builds. Option 3: independent cycles: an experiment loop (train, cross-validate, CV-test against the dataset) runs separately from the code build.
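The "independent cycles" option can be sketched as a training cycle that publishes approved model artifacts to a registry, while CI simply loads the latest artifact instead of retraining on every commit. All names and paths below are hypothetical:

```python
# Decoupling training from CI: training publishes versioned artifacts on
# its own cadence; the build pipeline only loads the latest approved one.
import json
import pathlib
import tempfile

def publish_model(registry_dir, version, weights):
    """Training cycle: write a versioned artifact and update LATEST."""
    reg = pathlib.Path(registry_dir)
    (reg / f"model-{version}.json").write_text(
        json.dumps({"version": version, "weights": weights}))
    (reg / "LATEST").write_text(version)

def load_latest(registry_dir):
    """CI cycle: fetch the published artifact; no training involved."""
    reg = pathlib.Path(registry_dir)
    version = (reg / "LATEST").read_text()
    return json.loads((reg / f"model-{version}.json").read_text())

registry = tempfile.mkdtemp()
publish_model(registry, "1.0.3", [0.2, -1.1])
model = load_latest(registry)
print(model["version"])  # 1.0.3
```

This keeps the CI/CD cycle fast and avoids builds failing on nondeterministic training runs, at the cost of maintaining a registry and a promotion step between the two cycles.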

Model as a service. The model is independent of the application's Code → Build → Test cycle. Fit: the interface is a vector. Risks: premature service boundaries; multi-step applications. The application tests and integrates against the service, while the model follows its own independent cycle over the dataset.
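A minimal sketch of the model-as-a-service boundary, using a hypothetical service class whose contract is a feature vector in, a prediction out:

```python
# Hypothetical service boundary wrapping a trained (here, toy linear) model.
class ModelService:
    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias

    def predict(self, vector):
        # The interface is a plain vector, as the slide notes; callers must
        # do their own feature engineering (a risk the slide also notes).
        score = sum(w * x for w, x in zip(self.weights, vector)) + self.bias
        return {"label": int(score > 0), "score": score}

svc = ModelService(weights=[0.5, -0.25], bias=0.1)
print(svc.predict([2.0, 1.0]))  # {'label': 1, 'score': 0.85}
```

In production the same contract would sit behind a REST endpoint, but the design question is identical: the vector-shaped interface cleanly decouples the model's release cycle from the application's, at the risk of drawing the service boundary before the feature pipeline has stabilized.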

Software engineering practices emerging in data science. A Code → Build → Test cycle with a clearly defined service, built on the data science toolset and a data science framework. Risks: culture.

Practical example. Libraries → Transform → Train → Validate → Build → Test → Deploy.
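The pipeline above can be sketched as a list of stage functions run in order, each passing a shared context forward. All stage bodies here are hypothetical placeholders:

```python
# Toy stages standing in for Transform -> Train -> Validate -> Deploy.
def transform(ctx):
    ctx["features"] = [x / 10.0 for x in ctx["raw"]]
    return ctx

def train(ctx):
    ctx["model"] = sum(ctx["features"]) / len(ctx["features"])  # toy "model"
    return ctx

def validate(ctx):
    assert 0.0 <= ctx["model"] <= 1.0, "model failed validation"
    return ctx

def deploy(ctx):
    ctx["deployed"] = True
    return ctx

def run_pipeline(stages, ctx):
    for stage in stages:          # each stage is independently replaceable
        ctx = stage(ctx)
    return ctx

result = run_pipeline([transform, train, validate, deploy], {"raw": [2, 4, 6]})
print(result["deployed"], round(result["model"], 2))  # True 0.4
```

The value of the stage-list shape is that Build/Test/Deploy steps can be swapped between CI tooling and data science tooling without touching the other stages, which is exactly the seam the preceding slides argue over.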

Boris Tvaroska, Global Solution Lead for Lenovo AI Innovation Centers. 20 years of experience running engineering teams across Europe, North and South America, and the Middle East. LinkedIn: www.linkedin.com/in/boristvaroska. Twitter: @btvaroska.

Q&A

MLOps: Machine Learning Operationalization. Nisha Talagala, Co-Founder, CTO & VP Engineering, ParallelM. Boris Tvaroska, Global Artificial Intelligence Solutions Lead, Lenovo.

Learn more about our Platform: https://www.activestate.com/platform. Watch a demo: https://www.youtube.com/watch?v=c5AIxN9ehrI. Contact platform@activestate.com for more information.

Platform Presentation. Where to find us: Tel: 1.866.631.4581; Website: www.activestate.com; Twitter: @activestate; Facebook: /activestatesoftware.
