Azure Kubernetes Service Key Points — 2

Part 2 — Cluster Security

Access

  • One of the most important factors in cluster security is restricting access to the API server. If you want to close the API server to the outside completely, set up a private cluster. A private cluster is subject to all the restrictions of Private Link, along with various other limitations. The second method is to set up a public cluster and allow access to the API server only from certain IP ranges. This way you can, for example, block access to your cluster's API server from outside your company's network.
  • Suppose you do not want any IP restrictions on a public cluster but do want access to the API server to go through Active Directory. If you integrate Kubernetes RBAC with Azure Active Directory, you can easily restrict access to the API server. You can also manage who works with admin rights and who works as a standard user through the Azure Kubernetes Service Cluster Admin Role and the Azure Kubernetes Service Cluster User Role.
  • With Azure Active Directory enabled in the cluster, Kubernetes RBAC lets you decide which of your AD users can do what in which namespace. For example, if you authorize a user group for only one namespace, that group cannot access resources outside it.
  • A final note on AAD: Contributor is a highly privileged role in Azure. If you do not want your users to access the cluster as admin, grant them the Reader role, not Contributor, on the relevant resource group or subscription.
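
The namespace restriction described above can be sketched with a Role and a RoleBinding. This is a minimal illustration, not a definitive setup: the namespace "dev", the object names and the group ID are placeholders, and it assumes the cluster is already integrated with Azure Active Directory so that an AD group can be referenced by its object ID.

    # Role: full access to common resources, but only inside the "dev" namespace
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: dev-full-access            # hypothetical name
      namespace: dev                   # hypothetical namespace
    rules:
    - apiGroups: ["", "apps", "batch"]
      resources: ["*"]
      verbs: ["*"]
    ---
    # RoleBinding: grants the Role above to one Azure AD group
    kind: RoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: dev-full-access-binding    # hypothetical name
      namespace: dev
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: dev-full-access
    subjects:
    - kind: Group
      apiGroup: rbac.authorization.k8s.io
      # Placeholder: replace with the object ID of your Azure AD group
      name: "00000000-0000-0000-0000-000000000000"

Because the binding is namespaced, members of that group get no permissions in any other namespace unless another binding grants them.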

Cluster Security

  • Containers' access to node resources should be restricted. Root privileges should not be given to pods. A pod running with root privileges can access all resources in the operating system of the node it runs on, so it can render the entire node unusable. Privilege escalation can be blocked by setting “allowPrivilegeEscalation: false” in the pod manifest. Built-in Linux security features such as AppArmor and seccomp can be used for more granular control of container actions.
  • AppArmor: It is enabled by default on AKS nodes and controls access at the application level. The default AppArmor profile restricts access to most of the /proc and /sys folders.
  • seccomp: It works at the process level and restricts the system calls a container can make; for example, the chmod call can be blocked.
  • You can use licensed tools to manage all these security rules. For example, Twistlock scans both your image registry and the containers that try to run in your cluster. It determines which commands these containers run at runtime, which container can access which other container, and so on. It offers dashboards where you can see all vulnerabilities by classification. It also sets up alerting based on these rules and lets you define policies.
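
As a rough sketch of the pod hardening described above, the manifest below runs a container as a non-root user, blocks privilege escalation and applies the runtime's default seccomp profile. The pod name, image and user ID are placeholders, and the seccompProfile field assumes Kubernetes 1.19 or later.

    apiVersion: v1
    kind: Pod
    metadata:
      name: hardened-app                          # hypothetical name
    spec:
      securityContext:
        runAsNonRoot: true                        # refuse to start if the image runs as root
        runAsUser: 1000                           # hypothetical non-root user ID
        seccompProfile:
          type: RuntimeDefault                    # default seccomp filter of the container runtime
      containers:
      - name: app
        image: myregistry.azurecr.io/app:1.0      # hypothetical image
        securityContext:
          allowPrivilegeEscalation: false         # process cannot gain more privileges than its parent
          capabilities:
            drop: ["ALL"]                         # drop all Linux capabilities

Dropping capabilities is an extra precaution that the text does not mention; it can be removed if the application needs specific capabilities.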

Up-to-date Cluster

  • The Kubernetes version must be kept up to date. The “az aks upgrade” command cordons and drains one node at a time, so the upgrade proceeds safely. However, the governance process that triggers the upgrade has to be managed manually.
  • Linux node updates and reboots can be handled with kured. AKS automatically downloads and installs Linux security fixes on the nodes, but it does not reboot a node when a fix requires it. Kured detects pending reboots and lets nodes reboot one at a time. You can integrate Prometheus with kured to control when reboots happen.
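
The kured behaviour above is configured through arguments on its DaemonSet container. The fragment below is only a sketch: it shows the container section of the pod spec, the flag names are taken from the kured README as I understand it and should be checked against the version you deploy, and the Prometheus URL, image tag and time window are assumptions.

    # Fragment of the kured DaemonSet pod spec (container section only)
    containers:
    - name: kured
      image: ghcr.io/kubereboot/kured:<version>   # placeholder tag, pick a released version
      command:
      - /usr/bin/kured
      args:
      # Assumed in-cluster Prometheus address; kured holds reboots while alerts are firing
      - --prometheus-url=http://prometheus.monitoring.svc.cluster.local:9090
      # Only reboot during a weekend maintenance window (hypothetical schedule)
      - --reboot-days=sat,sun
      - --start-time=01:00
      - --end-time=05:00
      - --time-zone=UTC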

Container Image Management

  • Scan for and remediate image vulnerabilities. Images can be scanned in the CI/CD process using Twistlock or Aqua.
  • When the base image used by an application is updated, rebuilding the application image is recommended. For example, if there is a security fix in the dotnet runtime, the application image should pick it up. This can be automated with Azure Pipelines or Jenkins.

Network

  • Compare the kubenet and Azure CNI network modes before choosing one. Kubenet may be preferred for clusters where performance is not a priority. Azure CNI is used for integration with existing virtual networks and on-premises networks. Note: With Azure CNI, each pod receives its own IP address from the virtual network, so pods can be reached directly and faster, with no additional routing. When Azure CNI is used, the virtual network is in a separate resource group from the cluster, and the AKS cluster service principal must have at least Network Contributor permissions on that network.

  • If you use Azure CNI, you need to plan the IP ranges for the AKS subnets. These ranges must be large enough to cover every node, every pod and all other network resources. Additional IP addresses must also be available for scale-out and node upgrades, so take those into account as well.

  • Traffic is distributed with Ingress. An ingress controller and ingress paths provide access to multiple applications over a single IP address. Ingress controllers should run on Linux nodes (iptables).
  • Traffic is secured with a Web Application Firewall (WAF). WAFs such as Barracuda WAF for Azure help block potential attacks.
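
To illustrate how ingress paths expose several applications behind one IP address, here is a minimal Ingress sketch. It assumes an NGINX ingress controller is already installed with the ingress class "nginx"; the service names, ports and paths are hypothetical.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: apps-ingress               # hypothetical name
    spec:
      ingressClassName: nginx          # assumes an NGINX ingress controller
      rules:
      - http:
          paths:
          - path: /shop                # requests to /shop go to the first application
            pathType: Prefix
            backend:
              service:
                name: shop-service     # hypothetical Service
                port:
                  number: 80
          - path: /blog                # requests to /blog go to the second application
            pathType: Prefix
            backend:
              service:
                name: blog-service     # hypothetical Service
                port:
                  number: 80

Both applications are reached through the single public IP of the ingress controller, and the path decides which Service receives the request.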

Storage

  • If you deploy an application that uses volumes on AKS, you need to select the appropriate storage type. SSD-backed (Premium) disks must be used to store critical data in production. For multiple concurrent connections, network-based storage can be planned. Data such as archives can be stored on Standard (HDD) disks. When AKS is set up, you see two storage classes:


    1 — default: Azure standard storage. Azure standard storage is appropriate for infrequently used clusters without performance concern. HDD is used in the background.

    2 — managed-premium: Azure premium storage. It is performance oriented and recommended for use in production. SSD is used in the background.


  • Both default storage classes use the Delete reclaim policy, so the underlying disk is deleted when the persistent volume claim that uses it is deleted. In other words, if an application that uses the default storage classes provided by Azure is removed together with its claims, the data is deleted too. To protect the data, create a storage class that retains it.


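    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: managed-premium-retain
    provisioner: kubernetes.io/azure-disk
    # Retain keeps the Azure disk after the claim is deleted
    reclaimPolicy: Retain
    parameters:
      # Premium (SSD) locally redundant managed disk
      storageaccounttype: Premium_LRS
      kind: Managed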

  • When determining the size of the nodes, storage needs should also be taken into account. Storage performance is an important factor when choosing VMs. VM sizes with an s in the name (for example, Standard_DS2_v2) support Premium SSD storage. These VMs must be used in production.
  • Use dynamic provisioning for volumes: the application requests storage with a PersistentVolumeClaim (PVC) and the disk is created on demand. The reclaim policy of the storage class must be taken into consideration.
  • The data must be backed up without exception. Azure Site Recovery or Velero can be used for this.
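
Dynamic provisioning with a PVC could look like the sketch below. It assumes the managed-premium-retain storage class shown earlier has been created; the claim name and size are hypothetical.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: app-data                             # hypothetical name
    spec:
      accessModes:
      - ReadWriteOnce                            # an Azure disk attaches to one node at a time
      storageClassName: managed-premium-retain   # storage class with reclaimPolicy: Retain
      resources:
        requests:
          storage: 32Gi                          # hypothetical size

With the Retain policy, the disk stays in Azure after this claim is deleted, so unused disks have to be cleaned up manually.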

Continuity and Disaster Recovery

  • It is recommended to create AKS clusters in multiple regions. When selecting regions, check AKS availability and the paired regions. Clusters installed in two separate regions can be run as hot / hot, hot / warm or hot / cold. In each case, applications should be deployed to both clusters.
  • If hot / hot is used, in other words, if more than one cluster is up and serving traffic, Azure Traffic Manager is used. Azure Traffic Manager directs clients to the nearest cluster and prevents performance degradation.
  • Container registries should be geo-replicated so that each cluster pulls images from the registry in its own region. This makes registry access for the two clusters faster, more reliable and cheaper.
  • State should not be kept in the container. Code should follow the twelve-factor app principles. Although this is the subject of a completely different and long article, the applications that you run on AKS should be written as cloud native to the extent possible. If there is a file dependency, Azure database services or Azure Storage can be used.
  • Create a storage migration plan. Tools such as Gluster, Ceph, Rook or Portworx can be used.

Development

  • Azure Dev Spaces can be used for development. Imagine you have an application built on a microservice architecture. Debugging one service can turn into a nightmare because of its dependencies on other services. Azure Dev Spaces lets the service you are debugging in your local development environment talk to the other services on AKS, which gives you a debugging experience on the cluster and makes debugging much easier.
  • Using the VS Code Kubernetes extension is recommended.

We have covered many points to consider when installing and using AKS, but more keep coming to mind as I write. For example, there are topics such as monitoring (Azure Monitor, App Insights, Log Analytics, Prometheus, Jaeger, Kiali), logging (Log Analytics, ELK / EFK, etc.), performance monitoring (New Relic, AppDynamics, Dynatrace, etc.) and lessons that only come with experience. It is not easy to fit them all into two articles, so let us leave it here for now and keep the rest for other articles. Thank you for reading.
