Kubernetes CPU Resource Requests at Runtime

  • besteffort: no resource requests
  • burstable: CPU request of 200m
  • guaranteed: CPU request and limit of 200m
  • The root cgroup has 1024 CPUShares; but regardless of value represents 100% of the available CPU time
  • The user.slice and system.slice cgroups have 1024 CPUShares; used for processes other than containers
  • The kubepods cgroup’s, for Kubernetes container processes, CPUShares is based on the Node’s allocatable CPU, e.g., the example Node had 940 mCPU allocatable; translates to a CPUShares of 962 [962 = 940 * 1024 / 1000]
  • Each Containers’ CPUShare (cgroup name based on the Containers’ containerId) is based on its CPU resource request, e.g., for the Container in the burstable Pod its CPUShares is 204 [204 = 200 * 1024 / 1000]. For Containers without a CPU resource request, they get the minimum CPUShare of 2
  • Each Pods’ CPUShare (cgroup name based on the Pods’ uid) is the sum of it’s Containers’ CPUShares (except for the hidden pause container which have the minimum CPUShare of 2)
  • The besteffort cgroup contains all Pods with besteffort QoS with a CPUShare of 2
  • The burstable cgroup containers all Pods with burstable QoS with a CPUShare that is the sum of all its Pod’s CPUShares; in addition to our burstable Pod the DaemonSet pods all have burstable QoS
  • Pods with guaranteed QoS are direct children of the kubepods cgroup
  • At each level in the hierarchy, a croup’s allocated CPU time is a fraction of its parent cgroup’s allocated CPU time; the fraction is determined by dividing the cgroup’s CPUShares by the total the of cgroups’ CPUShares at the level, e.g., for kubepods we get 32% [962 / (962 + 1024 + 1024) * 100%]
  • Containers of Pods with besteffort QoS always get allocated less than 1% of allocated CPU time
  • Containers of Pods with burstable and guaranteed QoS get allocated a varying (inversely related) of CPU time based on the amount of the total CPU request of a Node, i.e., Containers running on Nodes with a low total CPU request get a higher allocation than on Nodes with a high total CPU request
  • Depending on the number of Node CPUs, a Container’s allocated CPU time can be less or more than the fraction of CPU request to Node CPUs, e.g., this example with a Node with 2 CPUs the burstable and besteffort Pods Container’s 8% CPU allocation is less than 10% [200m / 2000m]. The same set of Pods on a Node with 4 CPUs the Container’s CPU allocation is 23% which is more than 5% [200m / 4000m] (see diagram below)
  • While the workload’s Containers were allocated a small (besteffort <1%, burstable / guaranteed 8%), their actual CPU utilization exceeded their allocation; this is because once the CPU has been allocated the remaining CPU is available to be shared across the cgroups
  • As expected, the guaranteed workload was limited to 0.2 s / s due to its CPU limit of 200m
  • The CPU utilization of the burstable workload exceeded that of the besteffort workload; this is expected as the burstable workload was allocated 8% (160m s /s) of the CPU time before the remaining was shared
  • The remaining CPU is shared evenly across the cgroups; the evidence is that the besteffort workload (even with its small CPU allocation) received around the expected amount of CPU time assuming this was true

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
John Tucker

John Tucker

4.93K Followers

Broad infrastructure, development, and soft-skill background