How Admission Control Really Works

confusionThere is a moment in every vSphere admin’s life when he faces vSphere Admission Control. Quite often this moment is not the most pleasant one. In one of my previous posts I talked about some of the common issues that Admission Control may cause and how to avoid them. And quite frankly Admission Control seems to do more harm than good in most vSphere environments.

Admission Control is a vSphere feature that is built to make sure that VMs with reservations can be restarted in a cluster if one of the cluster hosts fails. “Reservations” is the key word here. There is a common belief that Admission Control protects all other VMs as well, but that’s not true.

Let me go through all three vSphere Admission Control policies and explain why you’re better of disabling Admission Control altogether, as all of these policies give you little to no benefit.

Host failures cluster tolerates

This policy is the default when you deploy a vSphere cluster and policy which causes the most issues. “Host failures cluster tolerates” uses slots to determine if a VM is allowed to be powered on in a cluster. Depending on whether VM has CPU and memory reservations configured it can use one or more slots.

Slot Size

To determine the total number of slots for a cluster, Admission Control uses slot size. Slot size is either the default 32MHz and 128MB of RAM (for vSphere 6) or if you have VMs in the cluster configured with reservations, then the slot size will be calculated based on the maximum CPU/memory reservation. So say if you have 100 VMs, 98 of which have no reservations, one VM has 2 vCPUs and 8GB of memory reserved and another VM has 4 vCPUs and 4GB of memory reserved, then the slot size will jump from 32MHz / 128MB to 4 vCPUs / 8GB of memory. If you have 2.0 GHz CPUs on your hosts, the 4 vCPU reservation will be an equivalent of 8.0 GHz.

Total Number of Slots

Now that we know the slot size, which happens to be 8.0 GHz and 8GB of memory, we can calculate the total number of slots in the cluster. If you have 2 x 8 core CPUs and 256GB of RAM in each of 4 ESXi hosts, then your total amount of resources is 16 cores x 2.0 GHz x 4 hosts = 128 GHz and 256GB x 4 hosts = 1TB of RAM. If your slot size is 4 vCPUs and 8GB of RAM, you get 64 vCPUs / 4 vCPUs = 16 slots (you’ll get more for memory, but the least common denominator has to be used).

total_slots

Practical Use

Now if you configure to tolerate one host failure, you have to subtract four slots from the total number. Every VM, even if it doesn’t have reservations takes up one slot. And as a result you can power on maximum 12 VMs on your cluster. How does that sound?

Such incredibly restrictive behaviour is the reason why almost no one uses it in production. Unless it’s left there by default. You can manually change the slot size, but I have no knowledge of an approach one would use to determine the slot size. That’s the policy number one.

Percentage of cluster resources reserved as failover spare capacity

This is the second policy, which is commonly recommended by most to use instead of the restrictive “Host failures cluster tolerates”. This policy uses percentage-based instead of the slot-based admission.

It’s much more straightforward, you simply specify the percentage of resources you want to reserve. For example if you have four hosts in a cluster the common belief is that if you specify 25% of CPU and memory, they’ll be reserved to restart VMs in case one of the hosts fail. But it won’t. Here’s the reason why.

When calculating amount of free resources in a cluster, Admission Control takes into account only VM reservations and memory overhead. If you have no VMs with reservations in your cluster then HA will be showing close to 99% of free resources even if you’re running 200 VMs.

failover_capacity

For instance, if all of your VMs have 4 vCPUs and 8GB of RAM, then memory overhead would be 60.67MB per VM. For 300 VMs it’s roughly 18GB. If you have two VMs with reservations, say one VM with 2 vCPUs / 4GB of RAM and another VM with 4 vCPUs / 2GB of RAM, then you’ll need to add up your reservations as well.

So if we consider memory, it’s 18GB + 4GB + 2GB = 24GB. If you have the total of 1TB of RAM in your cluster, Admission Control will consider 97% of your memory resources being free.

For such approach to work you’d need to configure reservations on 100% of your VMs. Which obviously no one would do. So that’s the policy number two.

Specify failover hosts

This is the third policy, which typically is the least recommended, because it dedicates a host (or multiple hosts) specifically just for failover. You cannot run VMs on such hosts. If you try to vMotion a VM to it, you’ll get an error.

failover_host

In my opinion, this policy would actually be the most useful for reserving cluster resources. You want to have N+1 redundancy, then reserve it. This policy does exactly that.

Conclusion

When it comes to vSphere Admission Control, everyone knows that “Host failures cluster tolerates” policy uses slot-based admission and is better to be avoided.

There’s a common misconception, though, that “Percentage of cluster resources reserved as failover spare capacity” is more useful and can reserve CPU and memory capacity for host failover. But in reality it’ll let you run as many VMs as you want and utilize all of your cluster resources, except for the tiny amount of CPU and memory for a handful of VMs with reservations you may have in your environment.

If you want to reserve failover capacity in your cluster, either use “Specify failover hosts” policy or simply disable Admission Control and keep an eye on your cluster resource utilization manually (or using vROps) to make sure you always have room for growth.

Advertisements

Tags: , , , , , , , , , , , , , , , , ,

2 Responses to “How Admission Control Really Works”

  1. My vSphere Best Practices | Notes from MWhite Says:

    […] failed host, you may want to disable HA Admission Control (I am starting to do this now – see this for more […]

  2. Newsletter: October 29, 2016 | Notes from MWhite Says:

    […] Admission Control really works This is a good reminder on how admission control really […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: