Just a heads up on a new KB article we published on how to determine whether a cluster is over-committed in System Center Virtual Machine Manager 2008. If you do any kind of admin work with System Center Virtual Machine Manager 2008, you'll want to add this to your favorites.
=====
Symptoms
The cluster status displayed in the System Center Virtual Machine Manager 2008 Administrator Console may change from "OK" to "over-committed" after a cluster refresh operation completes.
Note: The host status values do not change in the VMM Administrator Console until the VMM server performs a host refresh, which runs automatically every 30 minutes. You can run a refresh on demand by right-clicking the host and then clicking Refresh.
When the displayed status of a managed cluster becomes over-committed, the administrator cannot use the cluster for placement of new or migrated virtual machines.
Cause
This occurs when the sum of free slots across the rest of the cluster is not greater than the sum of slots (both free and used) on the largest host or hosts that the cluster reserve must cover.
Resolution
The resolution depends on a variety of factors. The primary factors are:
· The cluster reserve value defined in the SCVMM console
· The amount of memory in each cluster node
· The amount of memory assigned to each VM
· The placement of the VMs on nodes in the cluster
The discussion below explains how to determine whether a cluster is over-committed and shows how the factors listed above contribute to a cluster showing as over-committed.
In some cases, the resolution might be as simple as ensuring an even distribution of VMs on each cluster node or ensuring all nodes have the same amount of RAM. A more advanced resolution might require considering the memory requirements of the VMs and placing VMs with similar memory requirements in the same cluster.
How VMM determines if a cluster is over-committed:
1. Find the highly available virtual machine (HAVM) with the largest allocated memory across all nodes in the cluster. The allocated memory of this VM represents the size of a slot.
2. Calculate the number of “used slots” on each host:
Note: More than one virtual machine can be used to fill a slot. For instance, if the slot size is 8 GB, two virtual machines with 4 GB of RAM each fill one slot.
a. Using the slot size determined in step 1, group as many VMs as possible into a slot based on their allocated memory.
i. For example: Consider three VMs with 2, 4 and 8 GB of memory allocated and a slot size of 8 GB. Two slots are required to group the three VMs without exceeding the slot size.
b. Continue the process until all VMs have been grouped into a slot.
c. The number of groupings is the number of “used slots”.
3. Calculate the number of “free slots” on each host:
a. Determine the physical memory on the host
b. Determine the host memory reserve defined in SCVMM for each node
c. Determine the total memory allocated to the VMs on the host
d. Plug the values into the following formula, where SlotSize is the slot size determined in step 1, and round the result down to a whole number of slots:
(PhysicalMemory – HostMemoryReserve – VMMemory) / SlotSize
4. Determine the number of slots that need to be in reserve.
The cluster reserve, R, defines the number of node failures that the cluster must be able to withstand. By summing the number of “used slots” and “free slots” on the R largest host(s), we determine the number of slots to be held in reserve.
5. Determine if the cluster is over-committed. As long as the number of “free slots” on the remaining hosts (the sum of the values obtained in step 3, excluding the R largest hosts) is greater than the number of slots that need to be in reserve (step 4), the cluster is not over-committed.
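Steps 1 to 3 above can be sketched in a few lines of Python. This is only an illustration, not VMM's actual code: the function names are hypothetical, memory is expressed in GB, and the grouping in step 2 uses a first-fit decreasing strategy as an assumption, because the article does not say how VMM groups the VMs.

```python
import math


def slot_size(vm_memory_by_host):
    """Step 1: the slot size is the largest memory allocation of any HAVM."""
    return max(mem for vms in vm_memory_by_host.values() for mem in vms)


def used_slots(vm_memory, size):
    """Step 2: group VMs into slots without exceeding the slot size.

    Assumption: first-fit decreasing is used as the grouping strategy.
    """
    remaining_room = []  # free space left in each slot opened so far
    for mem in sorted(vm_memory, reverse=True):
        for i, room in enumerate(remaining_room):
            if mem <= room:
                remaining_room[i] = room - mem
                break
        else:
            # No existing slot has room for this VM, so open a new slot.
            remaining_room.append(size - mem)
    return len(remaining_room)


def free_slots(physical_gb, host_reserve_gb, vm_memory, size):
    """Step 3: whole slots left after the host reserve and VM memory."""
    spare_gb = physical_gb - host_reserve_gb - sum(vm_memory)
    return max(0, math.floor(spare_gb / size))


# Hypothetical two-host cluster; the first host holds the VMs from step 2a.
cluster = {"HostA": [2, 4, 8], "HostB": [4, 4]}
size = slot_size(cluster)                          # 8
print(used_slots(cluster["HostA"], size))          # 2
print(free_slots(32, 0.5, cluster["HostA"], size)) # 2
```

The first printed value matches the example in step 2a: the 8 GB VM fills one slot and the 4 GB and 2 GB VMs share a second one.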
Example: Overcommitted formula implementation
Cluster name: VMM-Cluster1
Cluster reserve = 1
Cluster nodes: All cluster nodes run on identical hardware with 32 GB RAM each
N1: VMM-ClusterN1
N2: VMM-ClusterN2
N3: VMM-ClusterN3
N4: VMM-ClusterN4
Virtual Machines on this cluster: 16
Note: This example assumes the default host memory reserve of 512 MB. It is common for this value to be increased, which in turn affects this calculation.
1. Find the HAVM with the largest allocated memory to define the slot size. Using the above table as our example, we see the largest memory allocated for any VM is 8 GB. This represents the slot size for this cluster at this time.
2. Calculate the number of “used slots” on each host:
Based on an 8 GB slot size, and fitting more than one VM per slot up to the 8 GB maximum, we determine the number of “used slots” per host (see above table for details):
VMM-ClusterN1: 3
VMM-ClusterN2: 3
VMM-ClusterN3: 2
VMM-ClusterN4: 1
3. Calculate the number of “free slots” on each host:
(PhysicalMemory – HostMemoryReserve – VMMemory) / SlotSize
Note: We can’t have a partial slot. If the formula returns 1.8 slots, then use a value of 1.
VMM-ClusterN1: 1 slot ((32 GB - 512 MB - 20 GB in use) / 8 GB)
VMM-ClusterN2: 1 slot ((32 GB - 512 MB - 18 GB in use) / 8 GB)
VMM-ClusterN3: 2 slots ((32 GB - 512 MB - 10 GB in use) / 8 GB)
VMM-ClusterN4: 2 slots ((32 GB - 512 MB - 8 GB in use) / 8 GB)
4. Determine the number of slots that need to be in reserve
The cluster reserve is 1, so we look for the single largest host. In our example, nodes 1, 2 and 3 tie for the largest total number of slots per node (4 each, counting both free and used slots). We will use VMM-ClusterN1 as the host that represents the total number of slots that must be held in reserve.
5. Determine if the cluster is over-committed
Add up the free slots of the remaining hosts:
VMM-ClusterN2: 1
VMM-ClusterN3: 2
VMM-ClusterN4: 2
Total: 5
Because the number of free slots on the remaining hosts (the sum of the values obtained in step 3, excluding the largest host), 5, is greater than the number of slots on the largest host (both free and used), 4, the cluster is not over-committed.
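The arithmetic of this example can be checked with a short Python sketch. The per-host figures are the ones stated above; the variable names are hypothetical, and the host memory reserve is taken as 0.5 GB, an assumption that is too small to change the rounded results. Breaking the tie between equally large hosts by listing order is also an assumption, chosen because it selects VMM-ClusterN1 as the article does.

```python
import math

SLOT_GB = 8
PHYSICAL_GB = 32
HOST_RESERVE_GB = 0.5   # assumed host memory reserve, in GB
CLUSTER_RESERVE = 1

# Per host: (GB allocated to VMs, used slots), as stated in the example.
hosts = {
    "VMM-ClusterN1": (20, 3),
    "VMM-ClusterN2": (18, 3),
    "VMM-ClusterN3": (10, 2),
    "VMM-ClusterN4": (8, 1),
}

# Free slots per host, rounded down because partial slots do not count.
free = {
    name: math.floor((PHYSICAL_GB - HOST_RESERVE_GB - vm_gb) / SLOT_GB)
    for name, (vm_gb, _) in hosts.items()
}
# Total slots per host = free slots + used slots.
total = {name: free[name] + used for name, (_, used) in hosts.items()}

# Pick the largest host(s). sorted() is stable, so tied hosts keep their
# listing order and VMM-ClusterN1 is chosen (assumed tie-break).
reserved = sorted(hosts, key=lambda n: total[n], reverse=True)[:CLUSTER_RESERVE]

reserve_slots = sum(total[n] for n in reserved)
remaining_free = sum(free[n] for n in hosts if n not in reserved)
over_committed = not (remaining_free > reserve_slots)

print(free)            # free slots per host: 1, 1, 2, 2
print(reserve_slots)   # 4
print(remaining_free)  # 5
print(over_committed)  # False
```

Note that the choice among the tied hosts affects the outcome: holding VMM-ClusterN3 in reserve instead would leave only 4 free slots on the remaining hosts, which is not greater than 4.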
More Information
HAVM Placement and “Over-committed” Status
Cluster reserve is a unique feature of VMM 2008 and VMM 2008 R2. The cluster reserve specifies the number of node failures a cluster must be able to sustain while still supporting all virtual machines that are currently deployed on the clustered hosts. If a host cluster cannot withstand the specified number of node failures and still keep all of the virtual machines running, the cluster is placed in an over-committed state.
For example, if you specify a cluster reserve of 2 for an 8-node cluster, the rule is applied in the following ways:
- If all 8 nodes of the cluster are functioning, the host cluster is marked over-committed if any combination of 6 nodes (8-2) in the cluster lacks the capacity to accommodate existing virtual machines.
- If only 5 nodes in the cluster are functioning, the cluster is marked over-committed if any combination of 3 (5-2) nodes in the cluster lacks the capacity to accommodate existing virtual machines.
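The "any combination of nodes" rule can be illustrated with a simplified Python sketch. This is not how VMM evaluates capacity (VMM counts slots per host, as described in the KB article above); it only compares the combined usable memory of each possible set of surviving nodes against the total memory of the existing VMs. The function name, the parameters and the figures are all hypothetical.

```python
from itertools import combinations


def is_over_committed(node_capacity_gb, vm_demand_gb, cluster_reserve):
    """Return True if some set of surviving nodes cannot hold the VM demand.

    node_capacity_gb: usable memory of each functioning node, in GB
    vm_demand_gb:     total memory of the existing virtual machines, in GB
    cluster_reserve:  number of node failures the cluster must withstand
    """
    survivors = len(node_capacity_gb) - cluster_reserve
    if survivors <= 0:
        # Assumption: a cluster with no surviving nodes cannot host anything.
        return True
    # Check every combination of surviving nodes, as the rule describes.
    return any(
        sum(combo) < vm_demand_gb
        for combo in combinations(node_capacity_gb, survivors)
    )


# Hypothetical cluster: identical nodes with 31.5 GB usable, 180 GB of VMs.
print(is_over_committed([31.5] * 8, 180, 2))  # False: any 6 nodes give 189 GB
print(is_over_committed([31.5] * 5, 180, 2))  # True: any 3 nodes give 94.5 GB
```

In practice only the worst case matters (the nodes with the most capacity failing), so enumerating every combination is done here purely to mirror the wording of the rule.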
When placing a virtual machine in a failover cluster, the placement process in VMM calculates whether the new virtual machine will over-commit the cluster. If the action will over-commit the cluster, the cluster hosts are not made available for placement.
Note: An administrator can override this and place a VM on a host in an over-committed cluster during manual placement.
VMM’s cluster refresher updates the host cluster’s over-committed status after each of the following events:
- A change in the cluster reserve value
- The failure or removal of nodes from the host cluster
- The addition of nodes to the host cluster
- The discovery of new virtual machines on nodes in the host cluster
The cluster reserve is set on the General tab of the host cluster properties. For a procedure, see How to View and Modify the Properties of a Host Cluster (http://go.microsoft.com/fwlink/?LinkID=162986).
=====
For the latest version of this article see the link below:
J.C. Hornbeck | System Center Knowledge Engineer