Kubernetes CPU Resource Requests at Runtime

  • besteffort: no resource requests
  • burstable: CPU request of 200m
  • guaranteed: CPU request and limit of 200m
  • The root cgroup has 1024 CPUShares; regardless of that value, it represents 100% of the available CPU time
  • The user.slice and system.slice cgroups each have 1024 CPUShares; they are used for processes other than containers
  • The kubepods cgroup holds the Kubernetes container processes; its CPUShares is based on the Node’s allocatable CPU, e.g., the example Node had 940 mCPU allocatable, which translates to a CPUShares of 962 [962 = 940 * 1024 / 1000, rounded down]
  • Each Container’s CPUShares (the cgroup is named after the Container’s containerId) is based on its CPU resource request, e.g., for the Container in the burstable Pod the CPUShares is 204 [204 = 200 * 1024 / 1000, rounded down]. Containers without a CPU resource request get the minimum CPUShares of 2
  • Each Pod’s CPUShares (the cgroup is named after the Pod’s uid) is the sum of its Containers’ CPUShares (excluding the hidden pause container, which has the minimum CPUShares of 2)
  • The besteffort cgroup, which has a CPUShares of 2, contains all Pods with besteffort QoS
  • The burstable cgroup contains all Pods with burstable QoS; its CPUShares is the sum of its Pods’ CPUShares. In addition to our burstable Pod, the DaemonSet Pods all have burstable QoS
  • Pods with guaranteed QoS are direct children of the kubepods cgroup
  • At each level in the hierarchy, a cgroup’s allocated CPU time is a fraction of its parent cgroup’s allocated CPU time; the fraction is the cgroup’s CPUShares divided by the total CPUShares of all cgroups at that level, e.g., for kubepods we get 32% [962 / (962 + 1024 + 1024) * 100%]
  • Containers of Pods with besteffort QoS always get allocated less than 1% of allocated CPU time
  • Containers of Pods with burstable and guaranteed QoS get allocated an amount of CPU time that varies inversely with the total CPU request on the Node, i.e., Containers running on Nodes with a low total CPU request get a higher allocation than on Nodes with a high total CPU request
  • Depending on the number of Node CPUs, a Container’s allocated CPU time can be less or more than the ratio of its CPU request to the Node’s CPUs. In this example, on a Node with 2 CPUs, the 8% CPU allocation of the burstable and guaranteed Pods’ Containers is less than 10% [200m / 2000m]; for the same set of Pods on a Node with 4 CPUs, the Containers’ CPU allocation is 23%, which is more than 5% [200m / 4000m] (see diagram below)
  • While the workload’s Containers were allocated a small share of CPU time (besteffort <1%, burstable / guaranteed 8%), their actual CPU utilization exceeded their allocation; this is because once the allocated CPU time has been handed out, the remaining CPU time is available to be shared across the cgroups
  • As expected, the guaranteed workload was limited to 0.2 s / s due to its CPU limit of 200m
  • The CPU utilization of the burstable workload exceeded that of the besteffort workload; this is expected, as the burstable workload was allocated 8% (160m s / s) of the CPU time before the remaining CPU time was shared
  • The remaining CPU time appears to be shared evenly across the cgroups; the evidence is that the besteffort workload, even with its small CPU allocation, received roughly the amount of CPU time one would expect under that assumption
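
The arithmetic in the bullets above can be reproduced with a short Python sketch. It only re-derives the numbers quoted in this article (940 mCPU allocatable, a 200m request, sibling cgroups with 1024 CPUShares each); the function and constant names are made up for illustration and are not taken from any Kubernetes code.

```python
from typing import Sequence

# Values quoted in the article; the names themselves are illustrative.
SHARES_PER_CPU = 1024  # CPUShares that correspond to one full CPU (1000m)
MIN_SHARES = 2         # minimum CPUShares given to a cgroup


def milli_cpu_to_shares(milli_cpu: int) -> int:
    """Convert millicores to CPUShares, rounding down, with a floor of MIN_SHARES."""
    return max(MIN_SHARES, milli_cpu * SHARES_PER_CPU // 1000)


def fraction_of_parent(shares: int, sibling_shares: Sequence[int]) -> float:
    """Fraction of the parent's CPU time a cgroup gets relative to its siblings."""
    return shares / (shares + sum(sibling_shares))


kubepods_shares = milli_cpu_to_shares(940)    # Node allocatable CPU
burstable_shares = milli_cpu_to_shares(200)   # Container requesting 200m
besteffort_shares = milli_cpu_to_shares(0)    # Container with no CPU request

# kubepods competes with user.slice and system.slice under the root cgroup.
kubepods_fraction = fraction_of_parent(kubepods_shares, [1024, 1024])

print(kubepods_shares, burstable_shares, besteffort_shares)
print(f"{kubepods_fraction:.0%}")
```

Applying fraction_of_parent again at each lower level (QoS cgroup, Pod, Container) and multiplying the results gives a Container's share of the total CPU time; that step needs the CPUShares of every sibling cgroup on the Node, which are not listed here.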



John Tucker
