Adventures of Building a Multi-Tenant PaaS on Microsoft Azure

Transcription

Adventures of building a multi-tenant PaaS on Microsoft Azure
Tom Kerkhove
Azure Architect at Codit, Microsoft Azure MVP, Creator of Promitor
Twitter: @TomKerkhove
GitHub: @TomKerkhove
blog.tomkerkhove.be
codit.eu

Disclaimer
You’ll learn about my adventures & findings, not about silver bullets

Scale

Scale up/down
- The easiest way of scaling is to get a bigger box
- The only trade-off is that your app will be unavailable for a while
- At some point you’ll run out of “bigger boxes”
June 2019 | Adventures of building a (multi-tenant) PaaS on Microsoft Azure

Scale out / in
- Provide multiple copies of your application based on your workload
- No impact on your uptime, but more complex
- My preferred way of scaling, but your application needs to be designed for it

Choose the right compute infrastructure
[Diagram labels: Functions, Service, App, Cluster, Bare Metal; Functions, Container Instances, Service Fabric Mesh, Service Fabric, VM Scale Sets, Kubernetes, VMs, Cloud Services]
- As control increases, so does complexity
- Every service has its own characteristics
  - How you run your application
  - How you package your application
  - How you scale your application

Designing for scale: Azure Functions
[Diagram: six Order Function instances]

Designing for scale with serverless
- The good
  - The service handles scaling for you
- The bad
  - The service handles scaling for you
  - Does not provide a lot of awareness
- The ugly
  - Danger of burning a lot of money

Source: ry/

Designing for scale with PaaS
[Diagram: an Autoscaler, Orders and Cloud Services instances; labels “MessageCount”, “Scale!” and “Add instance”]

Designing for scale with PaaS
- The good
  - You need to define how it scales
  - Provides you with scaling awareness
- The bad
  - You need to define how it scales
  - Hard to determine the perfect scaling rules
- The ugly
  - Be aware of “flapping” (http://bit.ly/monitor-autoscale-best-practices)
  - Be aware of infinite scaling loops
Use Azure Monitor Autoscale
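
The scaling rules and pitfalls on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Azure Monitor Autoscale works internally: `AutoscaleRule`, its fields and `decide_instance_count` are hypothetical names, and the thresholds are made up. A cooldown period plus a gap between the scale-out and scale-in thresholds is one common way to reduce flapping, and a maximum instance count bounds a runaway scaling loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscaleRule:
    """Hypothetical rule based on the number of messages waiting in a queue."""
    scale_out_threshold: int  # add an instance above this message count
    scale_in_threshold: int   # remove an instance below this message count
    min_instances: int
    max_instances: int        # hard cap that stops infinite scaling loops
    cooldown_seconds: int     # wait time after a scale action, reduces flapping


def decide_instance_count(rule: AutoscaleRule, current_instances: int,
                          message_count: int, seconds_since_last_scale: int) -> int:
    """Return the desired instance count for one evaluation of the rule."""
    if seconds_since_last_scale < rule.cooldown_seconds:
        return current_instances  # still cooling down, do nothing
    if message_count > rule.scale_out_threshold:
        return min(current_instances + 1, rule.max_instances)
    if message_count < rule.scale_in_threshold:
        return max(current_instances - 1, rule.min_instances)
    return current_instances  # inside the gap between thresholds: stay put


rule = AutoscaleRule(scale_out_threshold=100, scale_in_threshold=20,
                     min_instances=1, max_instances=5, cooldown_seconds=300)
print(decide_instance_count(rule, current_instances=2, message_count=150,
                            seconds_since_last_scale=600))
```

Keeping the decision a pure function makes it easy to replay recorded metrics against a rule in a test environment before the rule ever runs in production.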

Designing for scale with CPaaS
[Diagram: a cluster with Node 1, Node 2 and Node 3 running pods, with a Custom Metric Provider, Horizontal Pod Autoscaler(s) and a Cluster Autoscaler]

Azure Container Instances
[Diagram: the same cluster with Node 1, Node 2 and Node 3, a Custom Metric Provider and Horizontal Pod Autoscaler(s); a Virtual Kubelet runs additional pods as container groups on Azure Container Instances]

Designing for scale with CPaaS
- The good
  - Share resources across different teams
  - Serverless scaling capabilities are available with Virtual Kubelet & Virtual Nodes
- The bad
  - You are in charge of providing enough resources
  - With great power comes great responsibility
  - No autoscaling out-of-the-box
  - Scaling on different levels
  - Scaling can become complex(er)
- The ugly
  - Takes a lot of effort to ramp up on how to scale
  - There’s a lot to manage

Use the tool that fits your needs
- Don’t use a service because you know it, evaluate your options
- Every technology has its trade-offs, learn them
- Don’t overengineer “because we’ll need it later”
- You don’t need hyper scale from day 1

Create awareness around your autoscaling
- Avoid burning money, get notified before it’s too late!
- Gain insights into your autoscaling rules
  - Either configured by you or managed by Azure (e.g. Azure Functions)
  - Learn from them and tweak them
  - Detect autoscaling loops in TEST instead of during a live-site issue
- Choose the approach that fits your needs
  - Configure Azure Monitor notifications
  - Use built-in metrics to visualize and alert on
  - Provide your own tooling around it

Create awareness around your autoscaling

Tips
- The resource consolidation pattern does not play nicely with autoscaling
- Configure a maximum instance count for your autoscaling
- Provide representative metrics of your remaining work
- Azure Monitor Autoscale is a hidden gem in Azure, use it!
  - Does all the great things an autoscaler should do
- Use budget alerts, if feasible

Tenancy

Multi-tenancy is more than data sharding. Multi-tenancy is all about choices.
- What is our pricing model? Do we need to reflect this in our tenancy?
- How much isolation does it require between tenants?
- How much customization will we allow?
- Do our tenants need access to their data?
- Will it run in multiple regions?
- How will you deploy your application?
- Will one region require multiple deployments?

Choosing a tenancy model
- Full isolation between tenants by deploying everything for every tenant (one app per tenant, shown for Tenant A and Tenant B)
- Run a multi-tenant application, but use a sharded data layer (one app, shards #1 to #n)
- App deployed in multiple stamps & geographies with a sharded data layer (Stamp A to Stamp D, shards #1 to #n)

Choosing a sharding strategy
- Spread all your data across multiple smaller databases instead of one big one
  - Good example of scaling out to handle load
- A shard key is used to determine the shard based on the chosen strategy
- Choose your strategy wisely and think about your query patterns
  - Does your customer need access to it? Then you should shard per tenant!
  - You cannot easily change your strategy later on
- More information: http://bit.ly/sharding-pattern
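
To make the role of the shard key concrete, here is a small Python sketch of two possible strategies. Both functions and the shard naming are hypothetical and only illustrate the idea: a per-tenant strategy maps the tenant straight to its own shard, while a hash strategy spreads arbitrary keys over a fixed number of shards.

```python
import hashlib


def shard_per_tenant(tenant_id: str) -> str:
    """Per-tenant strategy: the tenant id is the shard key, one shard per tenant."""
    return f"shard-{tenant_id.lower()}"


def shard_by_hash(shard_key: str, shard_count: int) -> int:
    """Hash strategy: a stable hash of the shard key picks one of N shards."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


print(shard_per_tenant("Sello"))
print(shard_by_hash("order-42", shard_count=8))
```

With the hash strategy, changing `shard_count` later remaps most keys to a different shard, which is one reason the strategy is hard to change once data has been written.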

Locating shards
[Diagram: the Order Processor asks the Shard Manager “How do I connect to “Sello”?”; the Shard Manager gets the secret “Sql-Tenant-Sello” and answers “Use this connection string”; shards #1 to #8]

Using shard managers
- Provide a catalog of all shards in the platform
- Determine the current shard based on the shard key & chosen approach
- Metadata is stored in a store of choice
  - Be careful where you store your secrets
- Choosing a good shard manager
  - They should handle secrets in a secure manner
  - Build your own, e.g. on top of Azure Key Vault
  - Use an existing tool, e.g. Azure SQL Database Elastic Tools
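
A build-your-own shard manager could look roughly like the Python sketch below. It is an assumption-laden illustration: the `ShardManager` class and its methods are hypothetical, and the secret store is a plain callable standing in for a real secret store such as Azure Key Vault. Only the “Sql-Tenant-Sello” secret naming comes from the talk. The catalog holds secret names only, so connection strings never live in the shard metadata.

```python
from typing import Callable, Dict, List


class ShardManager:
    """Hypothetical shard manager: a catalog of tenants mapped to secret names."""

    def __init__(self, secret_store: Callable[[str], str]):
        # secret_store looks up a secret by name; it stands in for a real
        # secret store client and is an assumption of this sketch.
        self._secret_store = secret_store
        self._catalog: Dict[str, str] = {}

    def register(self, tenant: str) -> None:
        """Add a tenant to the catalog; only the secret name is stored."""
        self._catalog[tenant.lower()] = f"Sql-Tenant-{tenant}"

    def list_tenants(self) -> List[str]:
        return sorted(self._catalog)

    def get_connection_string(self, tenant: str) -> str:
        """Resolve the tenant (the shard key) to its connection string."""
        secret_name = self._catalog.get(tenant.lower())
        if secret_name is None:
            raise KeyError(f"No shard registered for tenant '{tenant}'")
        return self._secret_store(secret_name)


# An in-memory dictionary plays the role of the secret store here.
fake_secrets = {"Sql-Tenant-Sello": "Server=example;Database=sello"}
manager = ShardManager(secret_store=fake_secrets.__getitem__)
manager.register("Sello")
print(manager.get_connection_string("sello"))
```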

Cost-efficient sharding
- In a PaaS world you need to pay for every data store instance you have
- We pay 200 for 160 DTU, but only use 20% of it
[Diagram: eight S1 databases (#1 to #8) with utilization labels 5%, 41%, 15%, 21%, 5%, 7%, 50% and 25%]
Use Azure SQL Elastic Pools
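
The figures on this slide can be checked with a little arithmetic. The sketch below assumes that each S1 database provides 20 DTU (which matches 8 x 20 = 160 DTU) and that the eight utilization percentages are read correctly from the diagram; which percentage belongs to which shard does not matter for the totals.

```python
S1_DTU = 20  # DTUs per S1 database (assumption that matches 160 DTU in total)

# Utilization labels from the diagram, one per database.
utilization_percent = [5, 41, 15, 21, 5, 7, 50, 25]

provisioned_dtu = S1_DTU * len(utilization_percent)
used_dtu = sum(utilization_percent) * S1_DTU / 100
average_utilization = sum(utilization_percent) / len(utilization_percent)

print(f"Provisioned: {provisioned_dtu} DTU")
print(f"Actually used: {used_dtu:.1f} DTU")
print(f"Average utilization: {average_utilization:.1f}%")
```

Roughly 34 of the 160 provisioned DTUs are in use, about 21%, which is where the "only use 20% of it" comes from and why pooling the databases is attractive.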

Cost-efficient sharding
- Resource pools have a resource limit for all shards
[Diagram: an Elastic Pool with shards #1 to #8; DTU labels 7, 15, 8, 25, 24, 7, 93 / 40 and 21; callouts “I need more resources!!!” and “We ran out, we only have 200 DTU”]

Cost-efficient sharding
- Enforce resource limitation on a per-shard level
[Diagram: an Elastic Pool with shards #1 to #8; DTU labels 7, 15, 8, 25 / 37, 24, 7, 50 / 40 and 21; callouts “I need more resources!!!” and “You’ve had enough”]

Cost-efficient sharding
- Provide multiple resource pools to reduce the impact of noisy neighbors
[Diagram: Resource Pool A, Resource Pool B and Resource Pool C holding shards #1 to #8 with 7, 15, 8, 37, 24, 7, 50 and 21 DTU]

Cost-efficient sharding
- Reflect your pricing model in your resource pooling
  - Basic Pool: provision 100 Basic DTUs, capped at 50
  - Standard Pool: provision 250 Standard DTUs, capped at 100
  - Premium Pool: provision 500 Premium DTUs, capped at 250
[Diagram: shards #1 to #8 spread over the three pools, with DTU labels 5, 121, 24, 158, 37, 77, 63 and 51]

Cost-efficient sharding
- Consider moving all shards into a resource pool
- Configure maximum consumption per database
- Consider using multiple resource pools to reduce the impact of noisy neighbors
- Resource pools are a great way to reflect your pricing model
- Monitor your pools as you would do for individual databases

Determining tenants
- Determine the tenant that is consuming your service via your API gateway
- Map the authentication key to the registered application and use its context
[Diagram: a request “POST api/v1/orders” with “X-API-Key ABC” reaches the API Gateway (“Bill Bracket owns ABC. Part of “Sello” group”), which forwards “POST api/v1/orders” with “X-Tenant Sello” to the API; the API asks the Shard Manager “What shard is “Sello”?” before using the DB]

Determining tenants
<policies>
    <inbound>
        <base />
        <set-header name="X-Tenant" exists-action="override">
            <value>@(string.Join(";", (from item in context.User.Groups where item.Name.ToLower().Contains("sello -") select item.Name.Replace("Sello - ", String.Empty).Trim())))</value>
        </set-header>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>

Monitoring

Monitoring is a shared responsibility
- You only value good monitoring if you’ve been on the other side of the fence
- Train your developers to use their own toolchain, use automated tests on live infrastructure

Enrich your telemetry
- Correlate all your telemetry to provide a logical flow, not just traces
- Provide app-specific contextual information to all telemetry
- Always return your correlation ids to your consumers
- Never track personally identifiable information
- Use different layers of correlation ids
- Use consistent terminology

Correlate your telemetry
[Diagram labels: Session XYZ; Operation ABC; Operation DEF; Get Products; Create Order; Frontend, API, DB, Order Processor]

Correlate your telemetry
[Diagram labels: Session XYZ; Operation ABC; Operation DEF; Cycle 123; Get Products; Create Order; Frontend, API, DB, Order Processor]
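
The layered correlation ids in these diagrams can be sketched as a small helper that stamps every telemetry event with all known ids. This is a hedged illustration: `build_telemetry` and the field names are hypothetical, and the meaning of the cycle id (here: one processing run of a background worker) is an assumption, since the diagram only shows the label.

```python
from typing import Any, Dict, Optional


def build_telemetry(name: str, session_id: str, operation_id: str,
                    cycle_id: Optional[str] = None, **context: Any) -> Dict[str, Any]:
    """Attach every available correlation layer plus app-specific context."""
    event: Dict[str, Any] = {
        "name": name,
        "sessionId": session_id,      # spans the whole user session
        "operationId": operation_id,  # one logical operation within the session
    }
    if cycle_id is not None:
        event["cycleId"] = cycle_id   # assumed: one background processing run
    event["context"] = dict(context)  # contextual info, never personal data
    return event


# Ids taken from the diagram; the tenant context is an illustrative extra.
print(build_telemetry("Get Products", "XYZ", "ABC", tenant="Sello"))
print(build_telemetry("Create Order", "XYZ", "DEF", cycle_id="123", tenant="Sello"))
```

Because both events share the session id, a query on `sessionId` gives the logical flow across components, while `operationId` narrows it down to a single request.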

Health checks
- Report the status of your application: is my application healthy? Is it ready?
- Use them to verify deployments and measure latency, uptime, cold start and so on
- Always provide throttling to block noisy consumers
- Think about your connection management
- Go as far as you want
- No direct business value, until it’s too late

Health checks

Handling alerts
- Always automate alert creation, they are part of your infrastructure as well
- Build a centralized alert handling process
  - Azure Logic Apps is a good fit for this
- Different alerts have different contracts
  - Use adapters to receive notifications
  - Map to internal metric contract
  - Handle via centralized alert handler
[Diagram: an Azure Monitor Metrics Alert Adapter, an Azure Monitor Classic Alert Adapter and an Azure Monitor Common Alert Adapter feed a Centralized Alert Handler]
Time to move it! Azure Classic Alerts will be deprecated by end of August (monitor/platform/monitoring-classic-retirement)
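
The adapter idea can be illustrated with a short Python sketch. Everything in it is hypothetical: the two payload shapes are simplified stand-ins and not the real Azure Monitor alert schemas, and `InternalAlert`, the adapter functions and `handle_alert` are names invented for this example. The talk itself suggests Azure Logic Apps for this job, so treat the code purely as a picture of the pattern: one adapter per contract, one internal contract, one central handler.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class InternalAlert:
    """Hypothetical internal contract that every alert is mapped to."""
    source: str
    name: str
    resource: str
    fired: bool


def adapt_metric_alert(payload: dict) -> InternalAlert:
    # Made-up flat payload shape, for illustration only.
    return InternalAlert("metric", payload["alertName"], payload["resourceId"],
                         payload["state"] == "Fired")


def adapt_classic_alert(payload: dict) -> InternalAlert:
    # Made-up nested payload shape, for illustration only.
    ctx = payload["context"]
    return InternalAlert("classic", ctx["name"], ctx["resource"],
                         payload["status"] == "Activated")


ADAPTERS: Dict[str, Callable[[dict], InternalAlert]] = {
    "metric": adapt_metric_alert,
    "classic": adapt_classic_alert,
}


def handle_alert(alert_type: str, payload: dict, handled: List[InternalAlert]) -> InternalAlert:
    """Central handler: it only ever sees the internal contract."""
    alert = ADAPTERS[alert_type](payload)
    handled.append(alert)  # a real handler would notify, create tickets, etc.
    return alert


handled: List[InternalAlert] = []
handle_alert("metric", {"alertName": "High DTU", "resourceId": "pool-a", "state": "Fired"}, handled)
print(handled)
```

Supporting a new alert type then means adding one adapter, while the central handler stays untouched.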

Handling alerts
- Use the Logic App template for Azure Monitor!

Write Root Cause Analysis (RCA)
- Train your team for PROD outages, write RCAs in all environments
- Did our alerts detect it? Did we have enough telemetry?
- Provides a structured way of analysing your platform
- Use as a knowledge transfer to customers & team
- Define action points and follow up on them
- Use them to detect recurring issues
- There is no such thing as failure, only opportunities to learn

Webhooks

Consuming webhooks
- Generated URLs are evil, provide good DNS names for your services
  - And this goes for everything, not only webhooks
[Diagram: a 3rd party POSTs to a generated host name (“rovider.com)”, “(12345.provider.com)”); callout “Where did 123456 go?!”]
Use an API Gateway

Consuming webhooks
- Do not reduce your API security because of your 3rd party
[Diagram: a 3rd party POSTs (“der.com)”) to the API; callout “Where is your cert?”]
Use an API Gateway

Consuming webhooks
- Always route webhooks through an API gateway
- This decouples the webhook from your internal architecture
[Diagram: 3rd party -> POST http://sello.com/api/v1/webhooks -> API Gateway -> POST with certificate authentication -> API]

Consuming webhooks
- Always route webhooks through an API gateway
- This decouples the webhook from your internal architecture
- Some gateways support an auth key via query parameters
[Diagram: 3rd party -> POST http://sello.com/api/v1/webhooks -> API Gateway -> POST with certificate authentication -> Orders API and Stock API]

Provide user-friendly webhooks
- Provide a way for consumers to provide context during registration
- Provide a self-service CRUD API to register new subscriptions
- Pass your correlation id via the Request-Id header
- Provide an invocation history
- Think as a webhook consumer, not publisher
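
These recommendations fit together in a small in-memory sketch. It is hypothetical throughout: `WebhookRegistry`, `Subscription` and the request layout are invented for illustration, and no HTTP call is made. Only the Request-Id header name comes from the slide. The sketch shows consumer-supplied context stored at registration, the correlation id travelling in the header, and every delivery attempt recorded in an invocation history.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Subscription:
    id: str
    url: str
    context: Dict[str, Any]  # supplied by the consumer during registration
    history: List[Dict[str, Any]] = field(default_factory=list)


class WebhookRegistry:
    """Hypothetical self-service registry (create, read, delete)."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, Subscription] = {}

    def create(self, url: str, context: Dict[str, Any]) -> Subscription:
        sub = Subscription(id=uuid.uuid4().hex, url=url, context=dict(context))
        self._subscriptions[sub.id] = sub
        return sub

    def get(self, subscription_id: str) -> Subscription:
        return self._subscriptions[subscription_id]

    def delete(self, subscription_id: str) -> None:
        del self._subscriptions[subscription_id]

    def prepare_delivery(self, subscription_id: str, data: Dict[str, Any],
                         correlation_id: str) -> Dict[str, Any]:
        """Build the outgoing request; sending it is left to an HTTP client."""
        sub = self.get(subscription_id)
        return {
            "url": sub.url,
            "headers": {"Request-Id": correlation_id},
            "body": {"context": sub.context, "data": data},
        }

    def record_invocation(self, subscription_id: str, correlation_id: str,
                          status_code: int) -> None:
        self.get(subscription_id).history.append(
            {"requestId": correlation_id, "statusCode": status_code})


registry = WebhookRegistry()
sub = registry.create("https://consumer.example/hooks", {"customerRef": "A-17"})
request = registry.prepare_delivery(sub.id, {"orderId": 1}, correlation_id="op-DEF")
registry.record_invocation(sub.id, "op-DEF", status_code=200)
print(request["headers"], registry.get(sub.id).history)
```

Echoing the consumer's own context back in each delivery lets the receiver route the call without keeping a lookup table of subscription ids.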

Spaghetti infrastructure
[Diagram: services calling each other directly, among them Payment Service, Stock Service and Invoice Service]

Spaghetti infrastructure 2.0?
- Long-term this can start to become a burden
  - A lot of bookkeeping to know who to update, how we should authenticate, etc.
  - No central place to route all webhooks through
- Your platform needs to be robust
  - What if subscriber II is not responding? Let’s build a retry mechanism!
  - Who says subscriber II owns foo.bar.com?
- Webhooks should use an “I don’t care, here’s an update” approach

Event Routers
[Diagram: Warehouse Service, Stock Service, Payment Service and Invoice Service connected through an event router]

Event Routers
- Event Routers do all the heavy lifting for you
  - Provide a centralized hub for all things events
  - Bookkeeping of who subscribes to what webhooks and events
  - They will retry sending events when subscribers did not reply
  - They will perform webhook validations
- Publishers can publish events to the event router, which takes it from there
- Great for internal usage, but harder to use with 3rd parties
  - Webhook validation is not always easy to set up

Tips
- Webhooks are not durable, if you are not around you will miss it
  - If you need to ensure at-least-once delivery, consider using a broker instead
- Store an audit entry of webhooks that are being pushed
  - Can be important in case of a dispute
  - Optionally even include the response of the consumer
- Do not only allow global registrations, consider serving more granular updates
  - For example, I want updates of one flight instead of all flights
- Provide rate limiting on your webhook endpoints
  - Don’t let your platform go down because of your 3rd party provider
- Webhooks are contracts as well
  - Provide good documentation and version them

Use Webhooks & Events internally
- Build fully automated reactive applications / data ingestion pipelines
- Decouple teams from each other
- Provide capability to extend

Azure Event Grid, the heart of Azure
Example
- Azure Key Vault is working on native Event Grid events
- This provides the capability to fully automate certificate management
- The power of these events can enable closer integration by other services such as Azure App Services and API Management, which can consume the latest version of
-heart-of-azure/

No.

Embrace Change

How we used to ship
- Stable releases every few years
- Hard to shift product focus

And then came agile

DevOps: releasing software to production multiple times a day

DevOops

You are not Netflix
- DevOps is a culture and requires a mind shift
- Manual interventions are evil, automate (as much as possible)
- Create automated pipelines for shipping software (deployment rings are awesome)
- Use infrastructure/build as code

We live in a world of constant change
- Our underlying infrastructure is constantly moving & changing
- Cloud vendors are competing to offer unique services
- Staying up to date is a lot harder

Who knows these services?
- Service Bus for Windows Server
- Azure BizTalk Services
- Azure Hybrid Connection
- Azure Data Factory v1
- Azure RemoteApp
- Azure Container Services
- Azure Access Control Service
- Azure Alerts

The lifecycle of a service
- Private Preview
  - Rough version of the product
  - Shared under NDA with a limited group
- Public Preview
  - Available to the masses
- Generally Available
  - Covered by SLA
  - Supported version
- The End
  - Deprecation
  - Silent sunsetting
  - Reincarnation in 2.0

The end of the road
- Official deprecation
  - Officially announced as deprecated
  - Migration is required before service shutdown
- Reincarnation
  - The service arises in a new version
  - Can be part of the service or the service in total
- Silent deprecation
  - No further development in the product
  - Service is still running smoothly
  - Does not mean you should stop using it
Can also be tooling, cf. Azure DevOps cloud-based load testing

Choosing an Alternative

Let’s use the shiny one, right?! Maybe.

Choosing an Alternative
- Use the tool that fits your needs, not necessarily the one you know
- Be careful with the latest shiny technology
- Decide as a team
- Build or buy
- There is no silver bullet

Questions you should ask
- What is the learning curve? Is it worthwhile?
- Does it have a vendor lock-in?
- Is it operable?
- Does it have a future?

You learn by doing. And sometimes, you regret your choices.

Cloud platforms are never finished
- Your platform evolves, and so do its needs
- Prepare for your migrations
- Nothing is written in stone
- Use a product mindset
- Change is coming, so you’d better be prepared

Stay up to date with Azure Deprecation Notices
- Dashboard with deprecation notices concerning Azure services, regions, features, APIs and SDKs
- Search for services which you depend on
- Get automated reminders (WIP)
- @AzureEndOfLife on Twitter

Conclusion
- Technologies have scalability capabilities & trade-offs
- Provide user-friendly webhooks & route them via API gateways
- Define & roll out a good monitoring strategy
- Automate everything, it will save you one day
- You build it, you run it
- We live in a world of constant change, so be prepared

Questions?
Twitter: @TomKerkhove
GitHub: @TomKerkhove
blog.tomkerkhove.be
codit.eu
