Translator's note: The excitement our teammates felt when they saw this piece on the IBM Cloud blog, a kind of "extension" of the legendary Twelve-Factor App, speaks for itself. The issues the author raises are not just on everyone's lips, but genuinely vital to everyday operations. Understanding them is useful not only for DevOps engineers, but also for developers who build modern applications running in Kubernetes.
The well-known "12-factor app" methodology is a set of clear-cut rules for developing microservices. They are widely used for running, scaling, and deploying applications. On the IBM Cloud Private platform, we follow the same 12 principles when developing containerized applications. The article "Kubernetes 12-factor apps" discusses the specifics of these 12-factor applications and how the Kubernetes container orchestration model supports them.
Reflecting on the principles of developing containerized microservices running under Kubernetes, we came to the following conclusion: the 12 factors above are perfectly valid, but others are extremely important for organizing a production environment, in particular:
- observability (observable);
- predictability (schedulable);
- upgradability (upgradable);
- least privilege;
- controllability (auditable);
- security (securable);
- measurability (measurable).
Let us dwell on these principles and try to assess their importance. To keep the numbering uniform, we will append them to the twelve we already have — accordingly, we start with XIII…
Principle XIII: Observability
Applications must provide information about their current status and metrics.
Distributed systems can be difficult to manage because many microservices are combined into one application; essentially, all the different cogs must move in concert for the mechanism (the application) to work. If one of the microservices fails, the system must automatically detect and fix it. Kubernetes provides excellent rescue mechanisms for this, such as readiness and liveness probes.
With readiness probes, Kubernetes makes sure the application is ready to receive traffic. If the readiness probe fails, Kubernetes stops sending traffic to the pod until a subsequent check shows that the pod is ready.
Suppose we have an application that consists of three microservices: a frontend, business logic, and a database. For the application to work, the frontend must make sure the business logic and the database are ready before it accepts traffic. This can be done with a readiness probe, which verifies that all dependencies are up.
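Such a dependency check could be wired up roughly as follows — a minimal sketch, where the `/readiness` endpoint, port, and image name are illustrative assumptions, and the handler behind `/readiness` is assumed to return HTTP 200 only when the business logic and database are reachable:

```yaml
# Fragment of a pod spec for the frontend with a readiness probe
containers:
- name: frontend
  image: frontend:1.0
  ports:
  - containerPort: 8080
  readinessProbe:
    # Kubernetes sends traffic to the pod only while this probe succeeds
    httpGet:
      path: /readiness
      port: 8080
    initialDelaySeconds: 10   # wait before the first check
    periodSeconds: 5          # check every 5 seconds
    failureThreshold: 3       # mark not-ready after 3 consecutive failures
```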
The animation shows that no requests are sent to the pod until the readiness probe reports that it is ready:
Readiness probe in action: Kubernetes uses the readiness probe to check whether pods are ready to receive traffic
There are three types of probes: HTTP requests, TCP requests, and commands. You can control their configuration, e.g. how often they run, the success/failure thresholds, and how long to wait for a response. In the case of liveness probes, there is one very important parameter, initialDelaySeconds. Make sure the probe starts only after the application is ready: if this parameter is set incorrectly, the application will restart constantly. Here is how it can be configured:
```yaml
livenessProbe:
  # an http probe
  httpGet:
    path: /readiness
    port: 8080
  initialDelaySeconds: 20
  periodSeconds: 5
```
With liveness probes, Kubernetes checks whether your application is actually running. If it is functioning normally, Kubernetes does nothing. If it is "dead", Kubernetes kills the pod and starts a new one in its place. This matches the microservices' need for statelessness and their disposability (factor IX, Disposability). The animation below illustrates Kubernetes restarting a pod after a failed liveness probe:
Liveness probe in action: Kubernetes checks whether pods are alive
A huge advantage of these probes is that applications can be deployed in any order without worrying about dependencies.
However, we found that these probes are not sufficient for a production environment. Typically, applications have their own metrics that need to be tracked, such as the number of transactions per second. Clients set thresholds on these metrics and configure notifications. IBM Cloud Private fills this gap with a fully secured monitoring stack consisting of Prometheus and Grafana with role-based access control. For more information, see IBM Cloud Private cluster monitoring.
Prometheus scrapes target data from metrics endpoints. Your application must expose its metrics endpoint using the following annotation:
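A typical setup uses the de facto standard `prometheus.io` pod annotations (the port and path here are illustrative — they depend on your application, and on the Prometheus service-discovery configuration honoring these annotations):

```yaml
metadata:
  annotations:
    # Tell Prometheus to scrape this pod...
    prometheus.io/scrape: "true"
    # ...on this port and path (application-specific)
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
```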
Prometheus then automatically detects the endpoint and collects metrics from it (as shown in the following animation):
Collection of custom metrics
Translator's note: It would be more correct to point the arrows in the opposite direction, since Prometheus itself polls the endpoints and Grafana itself pulls data from Prometheus, but for a general illustration this is not critical.
Principle XIV: Predictability
Applications must provide predictability of resource requirements.
Imagine that management has chosen your team to experiment with a project on Kubernetes. You worked hard to create the appropriate environment and ended up with an application with exemplary response time and performance. Then another team joined in, created their own application, and ran it in the same environment. After the second application started, the performance of the first one suddenly dropped. The reason for this behavior should be looked for in the compute resources (CPU and memory) available to your containers: most likely they are in short supply. This begs the question: how do you guarantee that your application gets the compute resources it needs?
Kubernetes has a great option that lets you set resource requests and limits for containers. Requests are guaranteed: if a container requests a resource, Kubernetes will schedule it only on a node that can provide that resource. Limits, on the other hand, ensure that a container's appetite never exceeds a certain value.
Requests and limits for containers
The following fragment of YAML code shows the configuration of computing resources :
```yaml
resources:
  requests:
    memory: "64Mi"
    cpu: "150m"
  limits:
    memory: "64Mi"
    cpu: "200m"
```
Translator's note: Learn more about Kubernetes resource scheduling, requests, and limits from our recent talk and its review, "Autoscaling and resource management in Kubernetes", and also see the K8s documentation.
Another interesting option for administrators of a production environment is to set quotas for namespaces. If a quota is set, Kubernetes will not create containers for which no requests/limits are defined in that namespace. An example of namespace quotas can be seen in the figure below:
Quotas for namespaces
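A namespace quota like the one described above could be defined roughly as follows — a sketch in which the quota name, namespace, and all numeric values are illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-a        # the namespace the quota applies to
spec:
  hard:
    # Total requests and limits allowed across all pods in the namespace;
    # pods without requests/limits set will be rejected once a quota exists
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```

Applied with `kubectl apply -f quota.yaml`, this caps the aggregate resource consumption of the namespace and forces every container in it to declare its requests and limits.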
Principle XV: Upgradability
Applications must update data formats from previous generations.
It is often necessary to patch a running production application, whether to close a vulnerability or to extend its functionality, and it is important to do so without interrupting service. Kubernetes provides a rolling update mechanism that lets you update an application without downtime: pods are updated one at a time, without stopping the whole service. Here is a schematic representation of this process (updating the application to the second version):
An example of a corresponding YAML description :
```yaml
minReadySeconds: 5
strategy:
  # specify which strategy to update
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 1
```
Note the following parameters:
- maxUnavailable – an optional parameter that sets the maximum number of pods that may be unavailable during the update. Although optional, it is worth setting an explicit value to keep the service available;
- maxSurge – another optional but critical parameter. It sets the maximum number of pods that may be created above the desired count.
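In context, the strategy fragment above sits inside a Deployment; here is a minimal sketch showing where it goes (the deployment name, labels, and image are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  minReadySeconds: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 1 extra pod above the desired count
      maxUnavailable: 1    # at most 1 pod down during the rollout
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: frontend:2.0
```

With this in place, changing the image (e.g. `kubectl set image deployment/frontend frontend=frontend:2.0`) triggers a rolling update rather than recreating all pods at once.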
Principle XVI: Minimal Privilege
Containers should run with a minimum of privileges.
This may sound pessimistic, but you should treat every permission a container holds as a potential vulnerability (see illustration). For example, if a container runs as root, anyone with access to it can inject a malicious process into it. Kubernetes provides Pod Security Policies (PSP), which let you restrict access to the file system, host ports, Linux capabilities, and more. IBM Cloud Private offers an out-of-the-box set of PSPs that are bound to containers when they are provisioned in a namespace. More information is available in Using namespaces with Pod Security Policies.
Every permission is a potential attack vector
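A restrictive PSP of the kind described above might look roughly like this — a sketch using the `policy/v1beta1` PodSecurityPolicy API that existed at the time of writing, with an illustrative policy name:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false                  # no privileged containers
  allowPrivilegeEscalation: false    # block setuid-style escalation
  requiredDropCapabilities:
  - ALL                              # drop all Linux capabilities
  runAsUser:
    rule: MustRunAsNonRoot           # forbid running as root
  hostNetwork: false                 # no access to the host network
  hostPID: false                     # no access to the host PID namespace
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:                           # only safe volume types
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim
```

Pods in a namespace bound to this policy can only be created if they satisfy every one of these constraints.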
Principle XVII: Controllability
Need to know who, what, where, and when for all critical operations.
Controllability is critical for any action on a Kubernetes cluster or application. For example, if an application processes credit card transactions, you need to enable auditing to have an audit trail of every transaction. IBM Cloud Private uses the industry-standard Cloud Auditing Data Federation (CADF) format, which is invariant to specific cloud implementations. More information is available in Audit logging in IBM Cloud Private.
The CADF event contains the following data :
- initiator_id – the ID of the user who performed the operation;
- target_uri – the CADF target URI (for example: data/security/project);
- action – the action performed.
Principle XVIII: Security (identification, network, scope, certificates)
You need to protect the application and resources from outsiders.
This point deserves its own article. Suffice it to say that production applications need end-to-end security. IBM Cloud Private applies the following measures to secure production environments :
- authentication: proof of identity;
- authorization: checking the access rights of authenticated users;
- certificate management: handling digital certificates, including creation, storage, and renewal;
- data protection: ensuring the security of data in transit and at rest;
- network security and isolation: preventing unauthorized users and processes from accessing the network;
- vulnerability advisor: identifying vulnerabilities in images;
- mutation advisor: detecting mutations in containers.
For more details, see the IBM Cloud Private security guide.
The certificate manager deserves special attention. This service in IBM Cloud Private is based on Jetstack's open source cert-manager project. It allows you to issue and manage certificates for services running in IBM Cloud Private, supports both public and self-signed certificates, and is fully integrated with kubectl and role-based access control.
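With cert-manager, requesting a certificate is declarative — a sketch using the `certmanager.k8s.io/v1alpha1` API from cert-manager's early releases, with hypothetical resource, issuer, and host names:

```yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: my-service-cert
spec:
  # cert-manager stores the issued key pair in this Secret,
  # which the service can then mount or reference for TLS
  secretName: my-service-tls
  issuerRef:
    name: my-ca-issuer   # an Issuer configured elsewhere in the cluster
    kind: Issuer
  commonName: my-service.example.com
  dnsNames:
  - my-service.example.com
```

cert-manager watches such resources, obtains the certificate from the referenced issuer, and renews it before expiry.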
Principle XIX: Measurability
The use of the application must be measurable for quota purposes and inter-unit settlements.
Ultimately, companies have to pay for their IT costs (see figure below). The compute resources allocated to run containers must be measurable, and the organizations using the cluster must be accountable for them. Make sure you follow Principle XIV, predictability. IBM Cloud Private offers a metering service that collects compute resource data for each container and aggregates it at the namespace level for further calculations (as part of showbacks or chargebacks).
Application usage must be measurable
I hope you have enjoyed the topics raised in this article, noted the factors you are already using, and thought about those that have so far been left out.
For more information, I recommend watching the recording of our talk at KubeCon 2019 in Shanghai, in which Michael Elder and I discuss the 12+7 principles of Kubernetes-based container orchestration.