Kubernetes Security Guide - Sysdig


Contents

Intro

CHAPTER 1: Securing your container images and CI/CD pipeline
    Image scanning
    What is image scanning
    Sysdig Secure Container image scanning: Scan engine
    Securing your CI/CD pipeline
    Image scanning in CI/CD

CHAPTER 2: Securing Kubernetes Control Plane
    Kubelet security
    Access to the kubelet API
    Kubelet access to Kubernetes API
    RBAC example, accessing the kubelet API with curl
    Kubernetes API audit and security log
    Audit log policies configuration
    Extending the Kubernetes API using security admission controllers
    Securing Kubernetes etcd
    PKI-based authentication for etcd
    etcd peer-to-peer TLS
    Kubernetes API to etcd cluster TLS
    Using a trusted Docker registry
    Kubernetes trusted image collections: Banning non trusted registry
    Kubernetes TLS certificates rotation and expiration
    Kubernetes kubelet TLS certificate rotation
    Kubernetes serviceAccount token rotation
    Kubernetes user TLS certificate rotation
    Securing Kubernetes hosts
    Using a minimal host OS
    Update system patches
    Node recycling
    Running CIS benchmark security tests

CHAPTER 3: Understanding Kubernetes RBAC
    Kubernetes role-based access control (RBAC)
    RBAC configuration: API server flags
    How to create Kubernetes users and serviceAccounts
    How to create a Kubernetes serviceAccount step by step
    How to create a Kubernetes user step by step
    Using an external user directory

CHAPTER 4: Security at the pod level: K8s security context, PSP, Network Policies
    Kubernetes admission controllers
    Kubernetes security context
    Kubernetes security policy
    PSP and RBAC
    Implementing PSPs
    Working with PSPs by example
    Practical considerations
    Kubernetes network policies
    Kubernetes resource allocation management

CHAPTER 5: Securing workloads at runtime
    How to implement runtime security
    Challenges implementing abnormal behavior detection
    Threat blocking and Incident remediation with open source tools

Intro

Kubernetes has become the de facto operating system of the cloud. This rapid success is understandable, as Kubernetes makes it easy for developers to package their applications into portable microservices. However, Kubernetes can be challenging to operate. Teams often put off addressing security processes until they are ready to deploy code into production.

Kubernetes requires a new approach to security. After all, legacy tools and processes fall short of meeting cloud-native requirements by failing to provide visibility into dynamic container environments. Fifty-four percent of containers live for five minutes or less, which makes investigating anomalous behavior and breaches extremely challenging.

One of the key points of cloud-native security is addressing container security risks as soon as possible. Doing it later in the development life cycle slows down the pace of cloud adoption, while raising security and compliance risks.

The Cloud/DevOps/DevSecOps teams are typically responsible for security and compliance as critical cloud applications move to production. This adds to their already busy schedule to keep the cloud infrastructure and application health in good shape.

We’ve compiled this security guide to provide guidance on choosing your approach to security as you ramp up the use of containers and Kubernetes.

Kubernetes attack surface

Let’s first take a glance at a Kubernetes cluster to understand which elements you need to protect.

[Figure: Kubernetes attack surface. A cluster with a master node (control plane components, kube-apiserver) and a node (kubelet, container runtime, pods, containers, apps). Attack vectors: A, access via Kubernetes API, proxy, etcd API; B, exploit vulnerability in apps or 3rd party libraries; C, access via API; D, access to the servers or virtual machines.]

The first area to protect is your applications and libraries. Vulnerabilities in the base OS images for your applications can be exploited to steal data, crash your servers or escalate privileges. Third-party libraries are another component you need to secure. Often, attackers won’t bother to search for vulnerabilities in your code because it’s easier to use known exploits in your application’s libraries.

The next vector is the Kubernetes control plane, your cluster brain. Programs like the controller manager, etcd or kubelet can be accessed via the Kubernetes API. An attacker with access to the API could completely stop your server, deploy malicious containers or delete your entire cluster.

Additionally, your cluster runs on servers, so access to them needs to be protected. Undesired access to these servers, or the virtual machines where the nodes run, will enable an attacker to have access to all of your resources and the ability to create serious security exposures.

CHAPTER 1: Securing your container images and CI/CD pipeline

One of the final steps of the CI (Continuous Integration) pipeline involves building the container images that will be pulled and executed in our environment. Therefore, whether you are building Docker images from your own code or using unmodified third-party images, it’s important to identify any known vulnerabilities that may be present in those images. This process is known as Docker vulnerability scanning.

Image scanning

Docker images are composed of several immutable layers, basically a diff over a base image that adds files and other changes. Each one is associated with a unique hash id.

Any new Docker image that you create will most likely be based on an existing image (FROM statement in the Dockerfile). That’s why you can leverage this layered design to avoid having to re-scan the entire image every time you make a new one with a small change. If a parent image is vulnerable, any other images built on top of it will be vulnerable too.
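The caching idea behind layered scanning can be sketched in a few lines of Python. This is only an illustration, not how any particular scanner works: `layer_digest`, `scan_image` and `toy_scanner` are hypothetical names, layers are modeled as raw bytes, and the digest is a plain SHA-256 of that content rather than a real Docker layer ID.

```python
import hashlib


def layer_digest(content: bytes) -> str:
    """Return a sha256 digest string standing in for a layer's hash id."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def scan_image(layers, cache, scan_layer):
    """Scan only the layers whose digest is not in the cache yet.

    Returns (all findings for the image, number of layers actually scanned).
    """
    findings, scanned = [], 0
    for content in layers:
        digest = layer_digest(content)
        if digest not in cache:
            cache[digest] = scan_layer(content)
            scanned += 1
        findings.extend(cache[digest])
    return findings, scanned


def toy_scanner(content: bytes):
    # Hypothetical rule: flag any layer that installs an old OpenSSL package.
    if b"openssl-1.0.1" in content:
        return ["openssl-1.0.1 (example finding)"]
    return []


base_layers = [b"debian base files", b"apt install openssl-1.0.1"]
cache = {}
first = scan_image(base_layers, cache, toy_scanner)
# A child image adds one layer on top of the same (vulnerable) parent layers.
second = scan_image(base_layers + [b"COPY app /app"], cache, toy_scanner)
print(first)
print(second)
```

In this sketch the second scan only analyzes the new layer, yet it still reports the finding inherited from the parent image, which matches both points made above.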

The Docker build process follows a manifest (Dockerfile) that includes relevant security information that you can scan and evaluate, including the base images, exposed ports, environment variables, entrypoint script, external installed binaries and more. By the way, don’t miss our Docker security best practices article for more hints on building your Dockerfiles.

In a secure pipeline, Docker vulnerability scanning should be a mandatory step of your CI/CD process, and any image should be scanned and approved before entering the “Running” state in the production clusters.

What is image scanning

The Docker security scanning process typically includes:

- Checking the software packages, binaries, libraries, operating system files and more against well-known vulnerability databases. Some Docker scanning tools have a repository containing the scanning results for common Docker images. These tools can be used as a cache to speed up the process.
- Analyzing the Dockerfile and image metadata to detect security-sensitive configurations like running as a privileged (root) user, exposing insecure ports, using base images tagged with “latest” rather than specific versions for full traceability, user credentials, etc.
- User-defined policies, or any set of requirements that you want to check for every image. This includes software package blacklists, base image whitelists, whether a SUID file has been set, etc.

You can classify and group the different security issues you might find in an image by assigning different priorities: a warning notification is sufficient for some issues, while others will be severe enough to justify aborting the build.
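As a rough sketch of that prioritization, the function below maps a list of findings to a build decision. The severity names, the default thresholds and the `evaluate` helper are assumptions made for this example, not part of any specific scanner.

```python
# Assumed severity scale, from least to most severe.
SEVERITY_ORDER = ["negligible", "low", "medium", "high", "critical"]


def evaluate(findings, warn_at="medium", fail_at="high"):
    """Return 'pass', 'warn' or 'fail' for a list of (issue, severity) pairs."""
    rank = SEVERITY_ORDER.index
    # The decision is driven by the worst finding; -1 means no findings at all.
    worst = max((rank(severity) for _, severity in findings), default=-1)
    if worst >= rank(fail_at):
        return "fail"   # severe enough to abort the build
    if worst >= rank(warn_at):
        return "warn"   # notify, but let the build continue
    return "pass"


print(evaluate([("base image tagged latest", "medium")]))
print(evaluate([("outdated package", "low"), ("known exploit", "critical")]))
```

A CI job could turn the "fail" result into a non-zero exit code so that the pipeline stops at the scanning step.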

Sysdig Secure Container image scanning: Scan engine

Sysdig Secure Scan Engine addresses Docker vulnerability scanning as part of the Secure DevOps methodology. DevOps teams can use it as a centralized service for inspection and analysis while applying user-defined acceptance policies to allow automated validation and certification of container images.

- Scan Engine allows developers to perform detailed analysis on their container images, run queries, produce reports and define policies that can be used in CI/CD pipelines.
- Using Scan Engine, container images can be downloaded from Docker V2 compatible container registries, as well as analyzed and evaluated against user-defined policies.
- It can be accessed directly through a RESTful API or via CLI.
- The scanning includes not just CVE-based security scans but also policy-based scans that can include checks around security, compliance, and operational best practices.

Scan Engine sources and endpoints

Scan Engine architecture is comprised of six components that can either be deployed in a single container or scaled out:

- API Service: Central communication interface that can be accessed by code, using a REST API, or directly, using the command line.
- Image Analyzer Service: Executed by the “worker”, these nodes perform the actual Docker image scanning.
- Catalog Service: Internal database and system state service.
- Queuing Service: Organizes, persists and schedules the engine tasks.
- Policy Engine Service: Policy evaluation and vulnerabilities matching rules.
- Kubernetes Webhook Service: Kubernetes-specific webhook service to validate images before they are spawned.

[Figure: Scan Engine architecture, showing the external API, Kubernetes webhook, catalog (state), simple queue, policy engine and analysis workers.]

Securing your CI/CD pipeline

DevOps has introduced some interesting concepts in the world of software development. “You code it, you run it” means that there are no longer two separate teams for development and operations. This indicates that there is a strong decoupling between the applications and the infrastructure in which they run. CI/CD concepts allow developers to deploy very fast and often. But with these new tools come new challenges. Security teams have lost control over many aspects of the systems running the applications, as containers are now the atomic unit of work. A container is opaque in many regards, and security teams are no longer responsible for what it has installed. Since many aspects of development have moved left in the workflow, security has to move left as well, creating Secure DevOps: You code it, you run it, you secure it.

[Figure: Pre-deployment scanning. Code from Dev goes through CI/CD to the Sysdig Secure engine, which evaluates it against the security and compliance policies from SecOps and returns PASS, WARN or FAIL.]

This new responsibility of DevOps teams means there is a need for new tools and procedures to establish the new security processes. It is not enough to adapt old security operations; there are new requirements that need new tools.

Image scanning in CI/CD

Secure DevOps teams need to ensure that the containers they’re shipping are secure. The best way to do this is to include image scanning in the CI/CD pipelines. Some of the benefits of this are:

- Early detection of security issues. This allows quicker responses.
- If the issue is detected in the pipeline, the problem is much easier to fix.
- The problems are detected before deployments. This means the chances of outages due to security incidents are significantly reduced.

Like most things in IT, the earlier you detect container security issues, the easier they are to fix without further consequences.

Embedding container security in your build pipeline is a best practice for several reasons:

- The vulnerabilities will never reach your production clusters, or even worse, a client environment.
- You can adopt a secure-by-default approach by knowing any image available in your Docker container registry has already passed all of the security policies you have defined for your organization, as opposed to manually checking container and Kubernetes compliance after the fact.
- The original container builder will be (almost) instantly informed, while the developer still has all the context. The issue will be substantially easier to fix than if it was found by another person months later.

Inline scanning

Some security requirements restrict access to some registries in various environments in order to keep them safe. These access or permission restrictions can make image scanning tricky or even impossible. Inline scanning allows you to scan the images locally, at build time, without using any registry.

Metadata from the analysis can be uploaded to a database or security backend in order to store the information and use it for image deployment control.

Inline image scanning provides several benefits over traditional image scanning within the registry:

- The analysis is performed inline (locally) on the runner. This means that if vulnerabilities are found, you can prevent the image from being published at all.
- As verifications are done locally, the image contents are not sent anywhere for analysis, so any confidential information is kept under control without being exposed. During the analysis, only metadata information is extracted from the image.
- The metadata obtained from the analysis can be reevaluated later if new vulnerabilities are discovered or policies are modified, all without requiring a new scan of the image.
- You can set policies to enforce and adhere to various container compliance standards (like NIST 800-190 and PCI) and provide checks for containers running in Kubernetes and OpenShift.
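The point about reevaluating stored metadata can be made concrete with a small sketch. Everything in it is hypothetical: the metadata is reduced to a package-to-version mapping, versions are assumed to be purely numeric and dot-separated, and the vulnerability IDs and feed format are invented for the example.

```python
def parse_version(version):
    """Turn '1.2.11' into (1, 2, 11); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def reevaluate(stored_packages, vuln_feed):
    """Match stored image metadata against a vulnerability feed.

    stored_packages: {package name: installed version} saved at scan time.
    vuln_feed: list of {"id", "package", "fixed_in"} entries.
    Returns the IDs of the vulnerabilities that affect the image.
    """
    hits = []
    for vuln in vuln_feed:
        installed = stored_packages.get(vuln["package"])
        if installed is not None and parse_version(installed) < parse_version(vuln["fixed_in"]):
            hits.append(vuln["id"])
    return hits


# Metadata extracted once, when the image was scanned inline.
image_metadata = {"openssl": "1.1.0", "zlib": "1.2.11"}

feed_day_1 = [{"id": "EXAMPLE-0001", "package": "zlib", "fixed_in": "1.2.11"}]
# A new vulnerability is published later; the image itself is not touched.
feed_day_2 = feed_day_1 + [{"id": "EXAMPLE-0002", "package": "openssl", "fixed_in": "1.1.1"}]

print(reevaluate(image_metadata, feed_day_1))
print(reevaluate(image_metadata, feed_day_2))
```

Only the feed changes between the two calls, which is why no new scan of the image contents is needed.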

Integrating image scanning with Jenkins

Jenkins is an open source automation server with a plugin ecosystem that supports the typical tools that are a part of your delivery pipelines. Jenkins helps to automate the CI/CD process. Scan Engine has been designed to plug seamlessly into a CI/CD pipeline; a developer commits code into the source code management system, like Git. This change triggers Jenkins to start a build which creates a container image, etc.

In a typical workflow, this container image is then run through various automated tests. If an image does not pass the Docker security scanning (doesn’t meet the organization’s requirements for security or compliance), then it doesn’t make sense to invest the time required to perform automated tests on the image. A better approach is to “learn fast” by failing the build and returning the appropriate reports back to the developer to address the issues.

Docker scanner with Jenkins

You can use the “Sysdig Secure Container Image Scanner plugin” available in the official plugin list that you can access via the Jenkins interface.

You can further read how to integrate image scanning in Jenkins in this article.

Integrating image scanning with Azure pipelines

Azure DevOps gives teams tools like version control, reporting, project management, automated builds, lab management, testing and release management. Azure Pipelines automates the execution of CI/CD tasks, like building the container images when a commit is pushed to your git repository or performing vulnerability scanning on the container image.

Image scanning allows DevOps teams to shift security left, detecting known vulnerabilities and validating container build configuration early in their pipelines. This is done before the containers are deployed in production or images are pushed into any container registry. This allows you to detect and fix issues faster, improving delivery to production time.

You can find here detailed information on how to introduce image scanning in Azure pipelines.

Integrating image scanning with AWS CodePipeline and AWS CodeBuild

AWS provides several tools for DevOps teams: CodeCommit for version control, CodeBuild for building and testing code, and CodeDeploy for automatic code deployment. The block on top of all of these tools is CodePipeline, which allows them to visualize and automate these different stages.

Image Scanning for AWS CodePipeline raises the confidence that DevOps teams have in the security of their deployments, detecting known vulnerabilities and validating container build configuration early in their pipelines. By detecting those issues before the images are published into a container registry or deployed in production, fixes can be applied faster and delivery to production time improves.

You can learn more about how to use inline scanning with AWS CodePipeline in this article.

Integrating image scanning with Bamboo

Atlassian Bamboo is a continuous integration and delivery server integrated with the Atlassian software development and collaboration platform. Some of the features that distinguish Bamboo from similar CI/CD tools are its native integration with other Atlassian products (like Jira project management and issue tracker), improved support for Git workflows (branching and merging) and flexible scalability of worker nodes using ephemeral Amazon EC2 virtual machines.

Learn more in the article: Integrating Sysdig Secure with Atlassian Bamboo CI/CD.

Integrating image scanning with Gitlab CI/CD

Gitlab CI/CD is an open source continuous integration and delivery server integrated with the Gitlab software development and collaboration platform.

Once you have configured Gitlab CI/CD for your repo, every time a developer pushes a commit to the tracked repository branches, the pipeline scripts will be automatically triggered.

You can use these pipelines to automate many processes. Common tasks include QA testing, building software distribution artifacts (like Docker images or Linux packages) or, as is the case for this article, checking compliance with security policies.

Learn more in the article: Integrating Gitlab CI/CD with Sysdig Secure.

CHAPTER 2: Securing Kubernetes Control Plane

In addition to configuring the Kubernetes security features, a fundamental part of Kubernetes security is securing sensitive Kubernetes components such as kubelet and internal Kubernetes etcd. We also shouldn’t forget the common external resources, like the Docker registry, that we pull images from. In this part, we will learn best practices on how to secure the Kubernetes kubelet and the Kubernetes etcd cluster, as well as how to configure a trusted Docker registry.

Kubelet security

The kubelet is a fundamental piece of any Kubernetes deployment. It’s often described as the “Kubernetes agent” software, and is responsible for implementing the interface between the nodes and the cluster logic.

The main task of a kubelet is managing the local container engine (e.g. Docker) and ensuring that the pods described in the API are defined, created, run and remain healthy, and also that they are destroyed when appropriate.

There are two different communication interfaces to be considered:

- Access to the kubelet REST API from users or software (typically just the Kubernetes API entity).
- Kubelet binary accessing the local Kubernetes node and Docker engine.

[Figure: Kubernetes API, kubelet, and Kubernetes node / Docker daemon, with the kubelet sitting between the other two.]

These two interfaces are secured by default using:

- Security-related configuration (parameters) passed to the kubelet binary. See the next section (Kubelet security: access to the kubelet API).
- NodeRestriction admission controller. See below, Kubelet security: kubelet access to Kubernetes API.
- RBAC to access the kubelet API resources. See below, RBAC example, accessing the kubelet API with curl.

Access to the kubelet API

The kubelet security configuration parameters are often passed as arguments to the binary exec. For newer Kubernetes versions (1.10+) you can also use a kubelet configuration file. Either way, the parameters syntax remains the same.

Let’s use this example configuration as reference:

    /home/kubernetes/bin/kubelet
      --v=2
      --kube-reserved=cpu=70m,memory=1736Mi
      --allow-privileged=true
      --cgroup-root=/
      --pod-manifest-path=[...]ath=/home/kubernetes/containerized[...]ties-before-mount=true
      --cert-dir=/var/lib/kubelet/pki/
      --enable-debugging-handlers=true
      --bootstrap-kubeconfig=/var/lib/kubelet/bootstrap-kubeconfig
      --kubeconfig=/var/lib/kubelet/kubeconfig
      --anonymous-auth=false
      --authorization-mode=Webhook
      --client-ca-file=[...]bin-dir=/home/kubernetes/bin
      --network-plugin=cni
      --non-masquerade-cidr=0.0.0.0/0
      --feature-gates=ExperimentalCriticalPodAnnotation=true

Verify the following Kubernetes security settings when configuring kubelet parameters:

- --anonymous-auth is set to false to disable anonymous access (it will send 401 Unauthorized responses to unauthenticated requests).
- kubelet has a --client-ca-file flag, providing a CA bundle to verify client certificates.
- --authorization-mode is not set to AlwaysAllow, as the more secure Webhook mode will delegate authorization decisions to the Kubernetes API server.
- --read-only-port is set to 0 to avoid unauthorized connections to the read-only endpoint (optional).
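The checklist above lends itself to a small automated check. The following Python sketch parses a kubelet command line and reports which of the four settings look wrong. The helper names (`parse_flags`, `check_kubelet`) and the sample paths are hypothetical, the parser only understands the `--flag=value` form, and treating a missing flag as insecure is an assumption of this sketch rather than documented kubelet behavior.

```python
import shlex


def parse_flags(cmdline):
    """Parse '--name=value' arguments of a command line into a dict."""
    flags = {}
    for token in shlex.split(cmdline)[1:]:  # skip the binary itself
        if token.startswith("--"):
            name, _, value = token[2:].partition("=")
            flags[name] = value
    return flags


def check_kubelet(cmdline):
    """Return a list of findings for the four settings listed above."""
    flags = parse_flags(cmdline)
    problems = []
    if flags.get("anonymous-auth") != "false":
        problems.append("anonymous-auth should be false")
    if not flags.get("client-ca-file"):
        problems.append("client-ca-file is not set")
    # Assumption: an absent flag is treated like AlwaysAllow.
    if flags.get("authorization-mode", "AlwaysAllow") == "AlwaysAllow":
        problems.append("authorization-mode should not be AlwaysAllow")
    if flags.get("read-only-port") != "0":
        problems.append("read-only-port should be 0 (optional)")
    return problems


hardened = ("kubelet --anonymous-auth=false --authorization-mode=Webhook "
            "--client-ca-file=/path/to/ca.crt --read-only-port=0")
permissive = "kubelet --anonymous-auth=true --authorization-mode=AlwaysAllow"

print(check_kubelet(hardened))
print(check_kubelet(permissive))
```

On a real node the command line could come from the process list; clusters that use a kubelet configuration file would need the equivalent fields checked there instead.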

Kubelet access to Kubernetes API

As we mentioned in the first part of this guide, the level of access granted to a kubelet is determined by the NodeRestriction Admission Controller (on RBAC-enabled versions of Kubernetes, stable in 1.8+).

kubelets are bound to the system:node Kubernetes clusterrole.

If NodeRestriction is enabled in your API, your kubelets will only be allowed to modify their own Node API object, and only modify Pod API objects that are bound to their node. It’s just a static restriction for now.

You can check whether you have this admission controller from the Kubernetes nodes executing the apiserver binary:

    ps aux | grep apiserver | grep admission-control
    --admission-control=[...]Quota

RBAC example, accessing the kubelet API with curl

Typically, only the Kubernetes API server will need to use the kubelet REST API. As we mentioned before, this interface needs to be protected as you can execute arbitrary pods and exec commands on the hosting node.

You can try to communicate directly with the kubelet API from the node shell:

    # curl -k https://localhost:10250/pods
    Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=proxy)

Kubelet uses RBAC for authorization and it’s telling you that the default anonymous system account is not allowed to connect.
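The verb, resource and subresource in that Forbidden message come from how the kubelet translates an HTTP request into authorization attributes. The Python sketch below reproduces the mapping as described in the Kubernetes kubelet authorization documentation, to the best of our understanding; newer releases may add finer-grained subresources, and `kubelet_request_attributes` is a hypothetical helper name.

```python
# HTTP method -> request verb used for the authorization check.
VERBS = {
    "POST": "create",
    "GET": "get",
    "HEAD": "get",
    "PUT": "update",
    "PATCH": "patch",
    "DELETE": "delete",
}

# Kubelet API path prefix -> subresource of the "nodes" resource.
SUBRESOURCES = [
    ("/stats", "stats"),
    ("/metrics", "metrics"),
    ("/logs", "log"),
    ("/spec", "spec"),
]


def kubelet_request_attributes(method, path):
    """Return the attributes the kubelet would ask the authorizer about."""
    subresource = "proxy"  # every other path falls back to "proxy"
    for prefix, name in SUBRESOURCES:
        if path == prefix or path.startswith(prefix + "/"):
            subresource = name
            break
    return {
        "verb": VERBS[method.upper()],
        "resource": "nodes",
        "subresource": subresource,
    }


print(kubelet_request_attributes("GET", "/pods"))
print(kubelet_request_attributes("GET", "/metrics/cadvisor"))
```

For `GET /pods` the sketch yields verb `get`, resource `nodes` and subresource `proxy`, the same triple shown in the curl output, so a role that should be able to call this endpoint needs `get` on `nodes/proxy`.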

You need to impersonate the API server kubelet client to contact the secure port:

    # curl --cacert /etc/kubernetes/pki/ca.crt --key /etc/kubernetes/pki/apiserver-kubelet-client.key --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt -k https://localhost:10250/pods | jq .
    {
      "kind": "PodList",
      "apiVersion": "v1",
      "metadata": {},
      "items": [
        {
          "metadata": {
            "name": [...]e": "kube-system",
            [...]

Your port numbers may vary depending on your specific deployment method and initial configuration.

Kubernetes API audit and security log

Kube-apiserver provides a security-relevant chronological set of records documenting the sequence of activities that have affected the system by individual users, administrators or other components. It allows the cluster administrator to answer the following questions:

- What happened?
- When did it happen?
- Who initiated it?
- What was affected?
- Where was it observed?
- From where was it initiated?
- To where was it going?

The API audit output, if correctly filtered and indexed, can become an extremely useful resource for the forensics, early incident detection and traceability of your Kubernetes cluster.

The audit log uses the JSON format by default. A log entry looks like this:

    {
      "kind": "Event",
      "apiVersion": "audit.k8s.io/v1beta1",
      "metadata": {"creationTimestamp": "2018-10-08T08:26:55Z"},
      "level": "Request",
      "timestamp": "2018-10-08T08:26:55Z",
      "auditID": [...]e": "ResponseComplete",
      "requestURI": "/api/v1/pods?limit=500",
      "verb": "list",
      "user": {"username": "admin", "groups": ["system:authenticated"]},
      "sourceIPs": ["10.0.138.91"],
      "objectRef": {"resource": "pods", "apiVersion": "v1"},
      "responseStatus": {"metadata": {}, "code": 200},
      "requestReceivedTimestamp": [...]p": [...]: {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"admin-cluster-binding\" of ClusterRole \"cluster-admin\" to User \"admin\""
      }
    }

From this document you can easily extract the user (or serviceaccount software entity) that originated the request, the request URI, the API objects involved, the timestamp and the API response (allow, in this example).

You can define which events you want to log by passing a YAML-formatted policy configuration to the API executable.

For instance, if you append the following parameters to the kube-apiserver command line:

    - --audit-log-path=/var/log/apiserver/audit.log
    - --audit-policy-file=/extra/policy.yaml

The API will load the configuration from the path above and output the log to /var/log/apiserver/audit.log.

There are many other flags you can configure to tune the audit log, like log rotation, max time to live, and more. It’s important to note that you can also configure your API to send audit entries, using a webhook trigger, in case you want to store and index them using an external engine (like Elasticsearch or Splunk).

Audit log policies configuration

The YAML policies file has the following structure:

    apiVersion: audit.k8s.io/v1beta1
    kind: Policy
    omitStages:
      - "RequestReceived"
    rules:
      - level: Request
        users: ["admin"]
        resources:
          - group: ""
            resources: ["*"]
      - level: Request
        users: ["system:anonymous"]
        resources:
          - group: ""
            resources: ["*"]

Using this config, you can match the different keys of a log entry to a specific value, set of values or a catch-all wildcard. The example above will log the requests made by the admin user as well as any request made by an anonymous system user.

If you create a new user (see above) that is not associated to any Role or ClusterRole, and then try to get the list of pods:

    kubectl get pods
    No resources found.
    Error from server (Forbidden): pods is forbidden: User "system:anonymous" cannot list pods in the namespace "default"
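How such a policy selects a level for a request can be sketched as follows. This is a deliberately reduced model written for illustration: it only matches on the user name, treats a rule without a `users` key as matching everyone, and returns the string "None" when nothing matches. The `audit_level` helper and the user "jane" are hypothetical; the real API server evaluates more keys (verbs, resources, namespaces and so on).

```python
def audit_level(rules, username):
    """Return the level of the first rule that matches the user."""
    for rule in rules:
        users = rule.get("users")
        if users is None or username in users:
            return rule["level"]  # first match wins; later rules are ignored
    return "None"  # no rule matched, so the event is not logged


# Same two rules as the policy above, reduced to the keys this sketch uses.
policy = [
    {"level": "Request", "users": ["admin"]},
    {"level": "Request", "users": ["system:anonymous"]},
]

print(audit_level(policy, "system:anonymous"))
print(audit_level(policy, "jane"))

# Putting a catch-all rule first shadows everything that follows it.
shadowed = [{"level": "Metadata"}] + policy
print(audit_level(shadowed, "admin"))
```

In this model the anonymous `kubectl get pods` call above matches the second rule, so it is recorded at the Request level.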

The log will register the request:

    {
      "kind": "Event",
      "apiVersion": "audit.k8s.io/v1beta1",
      "metadata": {"creationTimestamp": "2018-10-08T10:00:20Z"},
      "level": "Request",
      "timestamp": "2018-10-08T10:00:20Z",
      "auditID": [...]e": "ResponseComplete",
      "requestURI": "/api/v1/namespaces/default/pods?limit=500",
      "verb": "list",
      "user": {"username": "system:anonymous", "groups": ["system:unauthenticated"]},
      "sourceIPs": ["10.0.141.137"],
      "objectRef": {"resource": "pods", "namespace": "default", "apiVersion": "v1"},
      "responseStatus": {"metadata": {}, "status": "Failure", "reason": "Forbidden", "code": 403},
      "requestReceivedTimestamp": [...]p": [...]: {
        "authorization.k8s.io/decision": "forbid",
        "authorization.k8s.io/reason": ""
      }
    }

You have a comprehensive audit policy example here. Rule ordering is important because the decision is taken in a top-down, first-match fashion.

Extending the Kubernetes API using security admission controllers

Kubernetes was designed to be highly extensible, offering you the possibility to plug in any security software that you might want to use to process and filter the workloads launched in your system.

A feature that makes admission webhooks especially interesting for security compliance is that they are evaluated before actually executing the requests. That means you can block the access to any suspicious software before the pods are even created.

You can create your own admission controller implementing the webhook interface defined by Kubernetes.

They can also block pods from running if the cluster is out of resources or if the images are not secure. And, as we saw earlier, they can even mutate the request to tweak the resources request from a pod.

Again, all of this is done before the request is persisted in etcd, which means before it is executed. This is what makes Kubernetes admission controllers such a perfect candidate to deploy preventive security controls on your cluster.

There are three specific admission controllers that let you expand the API functionality via webhooks:

- ImagePolicyWebhook to decide if an image should be admitted.
- MutatingAdmissionWebhook to modify a request.
- ValidatingAdmissionWebhook to decide whether the request should be allowed to run at all.

Let’s imagine we want to implement an ImagePolicyWebhook.

First, we’ll need to make sure that the webhook is enabled when we start kube-apiserver:

    kube-apiserver --enable-admission-plugins=ImagePolicyWebhook

We also need to configure the webhook server that will be called by the API server:

    kube-apiserver --admission-control-config-file=admission-config.yaml

An example admission-config.yaml contains an AdmissionConfiguration object:

    apiVersion: apiserver.config.k8s.io/v1
    kind: AdmissionConfiguration
    plugins:
    - name: [...]
      [...]nfigFile: path-to-kubeconfig-file
      allowTTL: 50
      denyTTL: 50
      retryBackoff: 500
      defaultAllow: true

And then, the webhook server is configured into a kubeconfig file:

    # clusters refers to the remote service.
    clusters:
    - name: [...]
      [...]ate-authority: /path/to/ca.pem        # CA for verifying the remote service.
      server: https://images.example.com/policy  # URL of remote service to query. Must use 'https'.

    # users refers to the API server's webhook configuration.
    users:
    - name: name-of-api-server
      user:
        client-certificate: /path/to/cert.pem  # cert for the webhook admission controller to use
        client-key: /path/to/key.pem           # key matching the cert

Please refer to the ImagePolicyWebhook documentation for a detailed description of the configuration options and alternatives.

We can now code our HTTP server to handle the webhook requests.
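As a starting point, the decision logic of such a server could look like the Python sketch below. It assumes the ImageReview request and response shape from the Kubernetes ImagePolicyWebhook documentation (`imagepolicy.k8s.io/v1alpha1`, with `spec.containers[].image` in the request and `status.allowed` in the response). The trusted-registry rule, the `review` function, the registry names and the namespace are all hypothetical, and the HTTPS serving layer is intentionally left out.

```python
import json


def review(image_review, trusted_prefixes):
    """Build an ImageReview response that only admits trusted registries."""
    containers = image_review.get("spec", {}).get("containers", [])
    untrusted = [
        c["image"] for c in containers
        if not any(c["image"].startswith(prefix) for prefix in trusted_prefixes)
    ]
    status = {"allowed": not untrusted}
    if untrusted:
        status["reason"] = "untrusted image(s): " + ", ".join(untrusted)
    return {
        "apiVersion": "imagepolicy.k8s.io/v1alpha1",
        "kind": "ImageReview",
        "status": status,
    }


# Example request body, as the API server might send it (values are made up).
request_body = json.dumps({
    "apiVersion": "imagepolicy.k8s.io/v1alpha1",
    "kind": "ImageReview",
    "spec": {
        "containers": [
            {"image": "registry.example.com/team/app:1.4.2"},
            {"image": "docker.io/someone/unknown:latest"},
        ],
        "namespace": "example-ns",
    },
})

response = review(json.loads(request_body), ["registry.example.com/"])
print(json.dumps(response, indent=2))
```

A real webhook would wrap this function in an HTTPS endpoint that decodes the POST body, calls `review` and writes the JSON response back to the API server.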

Once the Kubernetes API server receives a request for a deployment, our webhook will receive a JSON request similar }],“namespace”:”mynam
