Evolving Autonomous Networks - NetSys 2021

Transcription

Evolving Autonomous Networks
Sep 13th, 2021
Laurent Ciavaglia
Autonomous Networking Research & Innovation Dept.
Rakuten Mobile, Inc.

Content presented here is based on my colleagues' work and art.
They are much more expert than me on the inner workings and dirty details.
All hard questions and complaints should be sent to them :-)
Thank you!

Rakuten Mobile

Rakuten Mobile

Rakuten Communications Platform (RCP)

Autonomous Networking Division
Dedicated to making Truly Autonomous Networks a reality
Team of PhD researchers, SW engineers & experienced Telco professionals

Why Autonomous Networks?
Peter Baer

Why Autonomous Networks
[...]EX
Future Proof

What is Autonomy [...]onomic
Self-*
Intelligent
No single, universally applicable and agreed definition of autonomous networking
But we can refer to common principles and properties

Self-CHOP [*]
Self-configuration
- Adapt to changing conditions by changing their own configurations
- Addition and removal of components or resources without service disruption
Self-optimization
- Constantly monitor predefined system goals and performance levels to ensure that all systems run at optimum levels
Self-healing
- Recognize and diagnose deviations from normal conditions and take action to normalize them
- Proactively circumvent issues that could cause service disruptions
Self-protection
- Incorporation of intelligence to recognize and circumvent security threats
[*] "The vision of autonomic computing" by J.O. Kephart et al.

The four ‘A’
Automatic, Aware, Adaptive, Autonomous
- Automatic, because machines are more proficient for systematic and exhaustive tasks than humans
- Aware, to gain situational awareness and guide reactive/proactive decision processes
- Adaptive, to change its decisions and operations to maintain value delivery; because anomalies and (new) attacks are constantly detected
- Autonomy, as each event translates into different local actions

The four [...]y, the four properties qualify an autonomic system and are referred to as the 4 ‘A’.
Sometimes, a fifth ‘A’ is added:
- Abstraction, to enable coordination between heterogeneous equipment
Ultimately, this boils down to the essential coupling of automation with the intelligence that will drive it towards cognitive operation.
Tackling the automation challenge is necessary but not sufficient. Automation alone can only adapt within the function's pre-defined scope and settings. Higher levels of (networked) autonomy can be reached by combining the automatic, aware and adaptive (and abstraction) properties [*].
[*] partially based on "Towards Autonomic Networks" by S. Schmid et al.

An attempt at terms disambiguation
- Automatic, automated, automation: that occurs without human intervention
- Autonomous, autonomic: that manages itself without external intervention
- Cognitive, cognition: that involves the intellectual processes of gaining knowledge, comprehension, problem solving and decision making
- Self-organizing: that achieves steady state without external control

Our Goal
Devise an "artificial engineer" that has the capability to problem-solve with minimal to no human intervention

Beyond Automation towards Autonomy
- Automation: independent operation of a system within well-defined parameters, based on a limited set of predefined rules or constraints
- Autonomy: implies a large degree of adaptation, learning and decision making by the system itself
01 Decide when to act?
02 Deal with unknown situations?
03 Invent a new approach?

Evolving autonomous networks
- Make New Logic
- Validate Logic
- Apply Logic
- Evolutionary Exploration
- Experimentation
- Real-time Responsive
- Dynamic Adaptation

Autonomy engine
Create Logic, Validate Logic, Apply Logic
- Adaptation Factory: Creating & Validating New Controllers
- Autonomous Control Plane: Network Operation & Management
- Knowledge Center: Storing Controllers & Knowledge of Different Types

Closed Loop (controller): How to Apply [...]age Network
- Modularization: What is an autonomous building block?
- Specification: How to describe building blocks?
- Standardization: What is the right form of interoperability?

Evolution: Make New Logic
"Codify-able process of creativity."
What can evolve: parameters of a module, choice of module, topology of a controller
Diagram labels: Analysis 1, Analysis 2
AutoML-Zero: Evolving Code that Learns
- State space: How to reduce the number of possible choices?
- Convergence Problem: How can we make the right choices in a reasonable time?
- Exploitation vs Exploration: When is the right time to try something new?
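The exploitation vs exploration question can be illustrated with an epsilon-greedy choice between controllers. This is only a sketch of one well-known technique, not the method used by the team; the controller names, utility values and the `choose_controller` helper are hypothetical.

```python
import random

def choose_controller(utility_by_name, epsilon, rng):
    """Pick the controller to run next (epsilon-greedy)."""
    if rng.random() < epsilon:
        # Explore: try any known controller, even one that looks worse.
        return rng.choice(sorted(utility_by_name))
    # Exploit: run the controller with the best utility observed so far.
    return max(utility_by_name, key=utility_by_name.get)

# Hypothetical utilities measured so far (higher is better).
observed = {"controller-a": 0.62, "controller-b": 0.80, "controller-c": 0.45}

rng = random.Random(42)
picks = [choose_controller(observed, epsilon=0.1, rng=rng) for _ in range(1000)]
```

With `epsilon=0.1` the best known controller is chosen most of the time, while roughly one decision in ten is spent trying something else.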

Online Experimentation: Validate Logic
"Humans test their ideas so should machines."
Designed by Freepik
- Trust & Validation: Guarantee controllers are fit for purpose
- How to recreate an effective environment per use case automatically?
- Simulation & Canary Testing: How to balance simulation and Canary testing?
- Digital Twin: How to experiment without breaking the real network?
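One way to combine simulation and canary testing is a staged gate: a candidate controller is exposed to real traffic only after it passes in simulation. The sketch below assumes latency in milliseconds as the utility measure and a 5% tolerance; `passes`, `validate` and both thresholds are hypothetical.

```python
from statistics import mean

def passes(baseline_ms, candidate_ms, tolerance):
    """True if the candidate's mean latency is within `tolerance` of the baseline."""
    return mean(candidate_ms) <= mean(baseline_ms) * (1.0 + tolerance)

def validate(baseline_ms, run_simulation, run_canary, tolerance=0.05):
    """Two-stage validation: simulation first, canary only if simulation passes."""
    # Stage 1: cheap and safe, runs against a simulated or digital twin network.
    if not passes(baseline_ms, run_simulation(), tolerance):
        return "rejected in simulation"
    # Stage 2: a small share of real traffic is served by the candidate.
    if not passes(baseline_ms, run_canary(), tolerance):
        return "rolled back after canary"
    return "promoted"

baseline = [100.0, 102.0, 98.0]
verdict = validate(
    baseline,
    run_simulation=lambda: [101.0, 99.0],
    run_canary=lambda: [97.0, 99.0],
)
```

Because `run_canary` is only called after the simulation stage succeeds, a candidate that fails in simulation never touches the real network.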

Specialized approach: one tool per case
Generic approach: Framework
Motivations
- Generalization reduces workload
- Reuse of technology becomes easy
- All knowledge and state is available

Building Blocks (1)
"All functionality deconstructed into small atomic modules"
Use Case: Traffic Distribution
Modules: Sensing, Analysis, Decision, Action
Diagram labels: HTTP server request metric, Other Service, Deployment DB, Time series DB
Motivations
- Re-use across domains
- Flexible and adaptable
- Extensibility

Hybrid Intelligence (2)
"The AI / ML / EL to use is just another Functional Building Block. We can use a better one, once available."
Use Case: Traffic Distribution
Traffic Prediction: Heuristic Approach, Deep Learning Approach, Rule based Approach
Motivations
- Ease of new algorithm intro
- Adaptable to use case / environment
- It's just another building block
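The idea that the prediction technique is "just another building block" can be sketched with a common interface that several predictors satisfy. The `TrafficPredictor` protocol, the two predictors and `plan_capacity` below are hypothetical illustrations, not the framework's actual API.

```python
from typing import Protocol, Sequence

class TrafficPredictor(Protocol):
    """Interface every traffic prediction block is assumed to satisfy."""
    def predict(self, history: Sequence[float]) -> float: ...

class MovingAveragePredictor:
    """Heuristic approach: mean of the last `window` samples."""
    def __init__(self, window: int = 3) -> None:
        self.window = window

    def predict(self, history: Sequence[float]) -> float:
        recent = history[-self.window:]
        return sum(recent) / len(recent)

class LastValuePredictor:
    """Rule based approach: the next sample is assumed to equal the last one."""
    def predict(self, history: Sequence[float]) -> float:
        return history[-1]

def plan_capacity(predictor: TrafficPredictor, history: Sequence[float],
                  headroom: float = 1.5) -> float:
    """Downstream block; it does not care which predictor is plugged in."""
    return predictor.predict(history) * headroom

history = [100.0, 120.0, 140.0, 160.0]
heuristic_plan = plan_capacity(MovingAveragePredictor(window=3), history)  # 210.0
rule_plan = plan_capacity(LastValuePredictor(), history)                   # 240.0
```

A deep learning predictor would plug in the same way, as long as it exposes the same `predict` method.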

Cognitive Loop (3)
"We build the fundamental cognitive loop out of atomic modules"
Use Case: Traffic Distribution
CONTROLLER: Sensing, Analysis, Decision, Action
Motivations
- "Standard" representation of Cognition
- Same concept can be applied everywhere and changed as needed.
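A controller built from the four atomic modules (Sensing, Analysis, Decision, Action) might look like the following minimal sketch for the traffic distribution use case. All function names, server names and request rates are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]  # e.g. requests per second per server
Weights = Dict[str, float]  # share of traffic sent to each server

@dataclass
class Controller:
    """One cognitive loop assembled from four interchangeable modules."""
    sense: Callable[[], Metrics]
    analyze: Callable[[Metrics], Metrics]
    decide: Callable[[Metrics], Weights]
    act: Callable[[Weights], None]

    def step(self) -> Weights:
        observation = self.sense()
        analysis = self.analyze(observation)
        decision = self.decide(analysis)
        self.act(decision)
        return decision

def sense_request_rate() -> Metrics:
    # Stand-in for reading an HTTP server request metric from a time series DB.
    return {"server-a": 300.0, "server-b": 100.0}

def analyze_load_share(rates: Metrics) -> Metrics:
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}

def decide_inverse_weights(load: Metrics) -> Weights:
    # Send more traffic to the servers that currently carry less load.
    spare = {name: 1.0 - share for name, share in load.items()}
    total = sum(spare.values())
    return {name: value / total for name, value in spare.items()}

applied: List[Weights] = []

def act_record(weights: Weights) -> None:
    # Stand-in for pushing the weights to a load balancer.
    applied.append(weights)

controller = Controller(sense_request_rate, analyze_load_share,
                        decide_inverse_weights, act_record)
weights = controller.step()  # {"server-a": 0.25, "server-b": 0.75}
```

Swapping any one of the four callables changes the controller's behavior without touching the loop itself.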

Composition (4)
"But not just anything can work together. So, we use specialized connectors to ensure modules dock only on to compatible ones."
Use Case: Traffic Distribution
Diagram labels: Sensing A, Sensing B
Motivations
- Ensure API/functionality compatibility
- Keep the research space manageable
- Standardized and generalized interfaces improve reuse and replaceability of similar modules.
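The connector idea can be sketched by giving every module an input and an output connector type and refusing to chain modules whose types do not match. The connector names (`request_rate`, `cpu_load`, and so on) and the `compose` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Module:
    name: str
    consumes: str  # connector type accepted on the input side
    produces: str  # connector type offered on the output side

def can_dock(upstream: Module, downstream: Module) -> bool:
    """Two modules dock only if the output connector matches the input connector."""
    return upstream.produces == downstream.consumes

def compose(*modules: Module) -> List[str]:
    """Chain modules in order, rejecting any incompatible pair."""
    for upstream, downstream in zip(modules, modules[1:]):
        if not can_dock(upstream, downstream):
            raise TypeError(
                f"{upstream.name} -> {downstream.name}: "
                f"{upstream.produces!r} does not match {downstream.consumes!r}"
            )
    return [module.name for module in modules]

sensing_a = Module("Sensing A", consumes="network", produces="request_rate")
sensing_b = Module("Sensing B", consumes="network", produces="cpu_load")
analysis = Module("Analysis", consumes="request_rate", produces="load_share")
decision = Module("Decision", consumes="load_share", produces="weights")

pipeline = compose(sensing_a, analysis, decision)
# compose(sensing_b, analysis) raises TypeError: cpu_load cannot feed Analysis.
```

Rejecting incompatible pairs up front also shrinks the set of candidate controllers that later stages have to consider.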

Online Evolution I: Genetic mutation and recombination (5)
"[...] gives us the tools, we construct various loops, try them out, improve, so that utility improves"
Use Case: Traffic Distribution
Darwin's finches
- Curved parrot like beak for crushing nuts
- Feeds primarily on insects, small arthropods and caterpillars
- Feeds primarily on seeds, eats flowers, buds, and the occasional insect
- Slender beak for catching small insects and spiders
Evolution: Adapting to the conditions
Motivations
- Modern networks are becoming hard to reason about and simulate
- Evolution traverses massive search spaces easily.

Online Evolution II: Natural Selection (6)
"[...] trial and error experimentation."
Developmental research in 2004 showed that the development of the different beak shapes in Darwin's finches is influenced by slightly different timing and spatial expressions of a gene called calmodulin (CaM) and the bone morphogenetic protein 4 (BMP4).
Use Case: Traffic Distribution
Diagram labels: Traffic Conditions, Online Experimentation, Utility measure
Motivations
- Does not require domain knowledge
- Trying out novel solutions in the network can yield unexpected improvements, instead of only reflecting the architects' knowledge.
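Mutation, recombination and selection by a utility measure can be put together in a small evolutionary loop. The sketch below evolves traffic weights for three servers; the capacities, demand, population size and the choice to keep the better half of each generation are all assumptions, not details from the talk.

```python
import random

CAPACITY = [100.0, 200.0, 300.0]  # hypothetical server capacities
DEMAND = 300.0                    # hypothetical total traffic to distribute

def normalize(genome):
    total = sum(genome)
    return [gene / total for gene in genome]

def utility(genome):
    """Higher is better: minus the utilization of the most loaded server."""
    shares = normalize(genome)
    return -max(share * DEMAND / cap for share, cap in zip(shares, CAPACITY))

def mutate(genome, rng, scale=0.1):
    # Small random change to every gene; genes are kept strictly positive.
    return [max(1e-6, gene + rng.gauss(0.0, scale)) for gene in genome]

def recombine(a, b, rng):
    # Uniform crossover: each gene comes from one of the two parents.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def evolve(generations=50, population_size=20, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() + 1e-6 for _ in CAPACITY]
                  for _ in range(population_size)]
    history = []  # best utility seen at the start of each generation
    for _ in range(generations):
        population.sort(key=utility, reverse=True)
        history.append(utility(population[0]))
        parents = population[: population_size // 2]  # selection; parents survive
        children = [
            mutate(recombine(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=utility), history

best, history = evolve()
```

Because the parents survive into the next generation, the best utility in `history` never decreases. Nothing in the loop encodes that good weights are proportional to capacity; selection has to find that.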

Meta Evolution: Self reflective (7)
"We construct our loop by means of another loop."
Use Case: Traffic Distribution
- C-Level: Devise strategy, Select better Manager, Educate Manager
- Middle Manager: Devise Tactic, Select better worker, Train worker
- Worker: Implement Tactic
Motivations
- Autonomy means self-reflection and self-improvement (self-*)
- Flexibility to find best solution requires ability to adapt framework
- Separate control loop to ensure that supervision does not deteriorate.
Credit: 030
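A loop that configures another loop can be sketched with two levels: a worker that hill-climbs with a fixed step size, and a manager that selects the step size whose worker reaches the best utility. The utility function, step sizes and function names are hypothetical.

```python
def utility(x: float) -> float:
    # Hypothetical utility with its optimum at x = 3.
    return -(x - 3.0) ** 2

def worker(step: float, iterations: int = 20, start: float = 0.0) -> float:
    """Inner loop: hill-climb x with a fixed step; return the utility reached."""
    x = start
    for _ in range(iterations):
        # Stay put or move one step in either direction, whichever is best.
        x = max((x - step, x, x + step), key=utility)
    return utility(x)

def manager(candidate_steps) -> float:
    """Outer loop: select the worker configuration that performs best."""
    return max(candidate_steps, key=worker)

best_step = manager([0.01, 0.5, 2.0])  # 0.5
```

Here a step of 0.01 is too timid to reach the optimum in 20 iterations and a step of 2.0 overshoots it, so the outer loop settles on 0.5.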

Controller Hierarchy (8)
"Just like in a company, the upper level supervises and teaches its subordinates."
Use Case: Traffic Distribution
Motivations
- Can integrate all use cases
- Higher layers' jobs get more abstract
- Clear responsibilities and auditing

Putting it all Together (9)
"Humans still produce the modules until we can get emergent behavior right."

Ecosystem
"You will not reach the moon by trying to fly your plane higher"
- Levels of Autonomy
- Partner With Academia: Win-Win relations and virtuous cycle
- R&D
- And more.

Overview of autonomous networks standardization landscape

The need for AN standards
The need for standards is simple
- The problem and challenges are too big to be solved by individual initiatives
- Solutions will emerge from collaborative work and partnerships
- But global scale adoption will require interoperable systems
The key question is: What needs to be standardized?
- Communication interfaces between functional blocks and devices
- Resource models
- Service interfaces
- Common and consistent management principles and language
- Context- and goal-oriented management

Autonomous networking standardization
Timeline figure; year labels: 2004, 2008, 2012, 2013, 2014, 2015, 2017, 2018, 2019, 2020
Tracks: "Holistic", "IP" networks, Mobile networks
Entries shown:
- ITU-T Future Networks Focus Group, Rec. Y.3001
- Autonomic Communications Forum (ACF)
- SON Functions 3GPP Release 8
- SON Functions 3GPP Release 11
- NGMN NGCOR
- ETSI AFI ISG, AFI-002 GS
- IRTF NMRG AN workshop series
- IETF ANIMA WG creation
- RFC7575, RFC7576
- ETSI ZSM
- ETSI ENI
- TMF ZOOM
- TMF ANP
- BBF AIM
- ITU-T ML5G
- ITU-T FG AN

ITU-T Focus Group on Autonomous Networks

AN (pre-)standardization in IRTF and IETF
NMRG
- Autonomic Networking (2013-2014)
  - RFC 7575 - Autonomic Networking: Definitions and Design Goals
  - RFC 7576 - General Gap Analysis for Autonomic Networking
- Intent-based Networking (2016-Present)
ANIMA WG
- Reference model: RFC 8993 - A Reference Model for Autonomic Networking
- Control Loops: ner-anima-control-loops-01
  - Good overview of control loops state-of-the-art and requirements; expired document
- RFC 8969 - A Framework for Automating Service and Network Management with YANG
OPSAWG
- Network Telemetry Framework: psawg-ntf-07
- Service Assurance for Intent-based Networking Architecture: wg-service-assurance-architecture/

Artificial Intelligence for Network and Service Automation

Joint evolution of AI and Ops
Phases
- 2020: Raw AI & Automated Ops
- 2022-2023: Advanced AI & AI-assisted Ops
- 2025: Lean AI & AI-empowered Ops
- Beyond 2025: Intuitive AI & Autonomous Ops
AI & Data
- 2020: Limited view and use of AI potential; big dumb data
- 2022-2023: More diversified, network-adapted AI techniques; smarter data
- 2025: Broad set of AI techniques for N&S environment; intelligent data
Scale & Adoption
- 2020: Use case-driven; isolated, small-scale solutions with limited re-use
- 2022-2023: Cross use cases; large scale application and penetration of AI-based N&S automation solutions
- 2025: "AI-as-a-Service"; full scale deployment and applicability of AI-enabled, plug-n-play solutions
Practice & Integration
- 2020: Retrofit ML technologies for N&S automation; manually-intensive integration
- 2022-2023: AI know-how is leveraged for N&S automation; semi-automated design and integration
- 2025: Designed with AI; seamless design and integration
Confidence & Security
- 2020: Controlled autonomy and confined in scope; no AI-specific security measures
- 2022-2023: Towards operation autonomy; trust framework safeguards AI-based solutions; AI-specific security techniques protect N&S operations
- 2025: Towards mission autonomy; AI continuously and reliably delivers on the business targets; guaranteed AI functional safety
Standards & Regulation
- 2020: Lack of standards; consultations with authorities and stakeholders
- 2022-2023: Emerging standards and basic interoperability; first compliant AI-based N&S automation solutions
- 2025: Comprehensive standards and increased interoperability; fully embedded policies and principles
Beyond 2025
- Zero-touch AI-Ops
- Machine Reasoning
- Symbiotic Human-AI interaction
- Mission autonomy
- Transparent, trusted, open AI
- Reliable, robust and distributed AI
Source: ETSI ZSM, IRTF NMRG

Standardization scope
Enable innovation and differentiation with AI in multi-vendor network and service management environment
Key enablers and functionality
- Mediation between data sources and data processing, augmented with meta-data models and data governance
- Support for unified and expressive data formats to allow AI workflow automation and plug-and-play
- Coordination between multiple, distributed AI applications, ensuring compliance with intents, consistent end-to-end operational view and means to act on it
- AI models life-cycle management, re-usability of generated knowledge and acceleration of models deployment
Support for deployment diversity
- Data: data sources, their locations and characteristics (local, ephemeral), data distribution, data storage
- Compute: computation elements locations, types and capabilities
- Operations: constraints and capabilities for various AI models training and inference options; connecting the AI applications to the orchestration and control end points
- Considering also other factors for regulatory and sustainable approach (energy, data sharing/replication, compute/data co-location)
Trust and adoption
- On-par privacy and security environment; improvement and alignment to capabilities and constraints of AI-based solutions
- Support for different levels of supervision and visibility for human operators
- Support incremental evolution to AI/ML, integration of learnings from experience and deployments to the standardization process
- Openness vs. trust dilemma: new disaggregated solutions add management complexity and call for more transparency and accountability
Source: ETSI ZSM, IRTF NMRG

Key enabling areas
SUSTAINABILITY, INNER [...], [...]THICS, INTER AI
Source: ETSI ZSM, IRTF NMRG

Relevant SDOs landscape
Source: ETSI ZSM

To go deeper
- Rakuten Mobile Innovation Studio website: https://netlab.mobile.rakuten.co.jp/
- Vision paper: Towards a Truly Autonomous pdf/towards a truly autonomous network.pdf
- ITU-T Focus Group on Autonomous n/Pages/default.aspx
