Why Does VM Sizing Matter?
“Understanding elemental behavior is crucial for building a stable, consistent and proper performing infrastructure” – Frank Denneman, Sr. Staff Architect, VMware
When creating a VM in ESXi, the cores per socket layout matters to the guest OS, because it is ultimately the operating system that makes scheduling decisions based on the hardware architecture it sees. Assigning virtual CPUs and cores per socket deserves some thought when a VM is created, since performance can vary between 1×16, 16×1, 2×8 and 4×4 configurations. VMs with large memory requirements may change the optimal virtual CPU socket allocation, depending on the physical resources of the host. This post discusses sizing VMs and why the default cores per socket value might not be the best for the performance of the virtual machine. I’m not going to get into right-sizing, which is a different topic, so I’ll cover that in a future post. Keep reading for a few lists of best practices for cores-per-socket values, when to use Hot Add and using hyper-threads to create monster VMs.
What is vNUMA?
Virtual NUMA (vNUMA) exposes the NUMA structure of the virtual machine to the guest OS running in the virtual machine, not the NUMA topology of the host. It’s called virtual NUMA because it is the logical representation, abstracted through the hypervisor running on the server, of the physical process by which the CPU accesses RAM. This CPU/RAM relationship is called Non-Uniform Memory Access, or NUMA.
Ever hear the term “Wide VM”? A wide VM is simply a VM with multiple vNUMA nodes. vNUMA is a combination of two things: a Virtual Proximity Domain (VPD), which exposes vNUMA to the VM, and a Physical Proximity Domain (PPD), which groups vCPUs within a CPU package. In VMware ESXi, vNUMA is automatically enabled when a VM has 9 vCPUs (or more) and the number of allocated vCPUs is greater than the number of physical cores of one socket. One little gotcha is that in vSphere, vNUMA optimizations are automatic for CPU, but memory topology is manual, left to the admin building the VM to optimize.
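The enablement rule above can be written down as a tiny check. This is only a sketch of the rule as stated in this post, not ESXi's actual logic. The function name and parameters are made up for illustration, and the 9 vCPU threshold is taken straight from the paragraph above.

```python
def vnuma_exposed(vcpus: int, cores_per_socket: int, min_vcpus: int = 9) -> bool:
    """Rule of thumb from the text: vNUMA is exposed when the VM has at
    least `min_vcpus` vCPUs AND more vCPUs than one physical socket has cores."""
    return vcpus >= min_vcpus and vcpus > cores_per_socket


if __name__ == "__main__":
    # Assumed host: 10 physical cores per socket
    for vcpus in (8, 10, 12):
        print(f"{vcpus} vCPUs -> vNUMA exposed: {vnuma_exposed(vcpus, 10)}")
```

On a host with 10 cores per socket, a 10 vCPU VM still fits inside one socket, so only the 12 vCPU VM becomes a wide VM under this rule.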
For a deeper dive into NUMA (and so much more), jump into Frank Denneman’s series on the topic.
General Best Practices
- Alleviate bottlenecks between CPUs and memory for better scalability
- A VM’s vNUMA topology should be correct and optimal regardless of the version of vSphere you are using
- Contain VMs within the fewest number of physical NUMA nodes to reduce the need for remote memory to be accessed
- Configure the Cores per Socket to align with the physical characteristics of the CPU package
- Configure cores & sockets for proper alignment with most operating system and application licensing models
- When multiple vNUMA nodes exist, avoid odd vCPU counts
- Provide the OS/Application in the VM the best chance for optimal self-configuration
Mark Achtemichuk (VMware Performance Engineering) put together these example charts to use as a rule of thumb. When followed, this guidance will help maintain proper VM sizing with optimal vNUMA nodes. His original post is worth a read. The following charts use a host with 2 sockets with 10 cores each. If your physical server has more than 20 cores, just follow the pattern shown here.
As mentioned previously, when a VM has a large memory requirement that is more than 50% of the host’s total memory capacity, it’s recommended to slightly adjust the vCPU cores per socket for some vCPU counts.
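As a rough illustration of the best practices listed above (fewest NUMA nodes, sockets aligned with the physical package, no odd counts across multiple nodes), here is a minimal sketch that suggests a sockets × cores layout for a given vCPU count. It does not reproduce Mark’s charts. The function name `suggest_layout` and its defaults (the 2 socket, 10 core host used in this post) are assumptions for illustration, and the large memory adjustment is deliberately left out.

```python
from typing import Optional, Tuple


def suggest_layout(vcpus: int, host_sockets: int = 2,
                   cores_per_socket: int = 10) -> Optional[Tuple[int, int]]:
    """Return (virtual sockets, cores per socket) using the fewest sockets
    that fit the vCPUs with an even split, or None if no such layout exists."""
    if vcpus < 1 or vcpus > host_sockets * cores_per_socket:
        return None  # more vCPUs than physical cores: not handled here
    for sockets in range(1, host_sockets + 1):
        fits = vcpus <= sockets * cores_per_socket
        splits_evenly = vcpus % sockets == 0
        if fits and splits_evenly:
            return sockets, vcpus // sockets
    return None  # e.g. an odd vCPU count that needs two sockets


if __name__ == "__main__":
    for n in (8, 11, 12, 20):
        print(n, suggest_layout(n))
```

With the defaults, 8 vCPUs stay in one socket as (1, 8), 12 vCPUs split as (2, 6), and 11 vCPUs return None because an odd count cannot be split evenly across two nodes.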
Best Practices for CPU Hot Add
- CPU Hot Add is not compatible with vNUMA
- With CPU Hot Add enabled, the VM will only ever see one vNUMA node no matter how many cores are allocated
- For optimal performance, disable Hot Add on VMs that have more vCPUs allocated than there are physical cores on a single socket of the host
- Use Hot Add up to the point where the VM’s allocated cores equal the number of cores on a single socket of the host’s CPU
- Leaving Hot Add enabled on VMs allocated more cores than the physical socket has results in a single vNUMA node. (This is why hot-add isn’t recommended for large VMs)
- Don’t enable CPU Hot Add in VM templates
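The Hot Add guidance above boils down to one comparison: is the VM wider than a single physical socket? Here is a minimal sketch of that check. The function name and the wording of the returned messages are made up for illustration, and the logic only encodes the bullets in this section.

```python
def hot_add_advice(vcpus: int, cores_per_socket: int, hot_add_enabled: bool) -> str:
    """Apply the Hot Add rules of thumb from the list above."""
    wider_than_socket = vcpus > cores_per_socket
    if wider_than_socket and hot_add_enabled:
        # Per the text: Hot Add collapses the VM to a single vNUMA node
        return "disable Hot Add: VM is wider than one socket and sees one vNUMA node"
    if wider_than_socket:
        return "ok: Hot Add is off, so vNUMA can be exposed"
    return "ok: VM fits within one socket"


if __name__ == "__main__":
    # Assumed host: 10 physical cores per socket
    print(hot_add_advice(16, 10, hot_add_enabled=True))
    print(hot_add_advice(8, 10, hot_add_enabled=True))
```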
Best Practices for Sizing OVA Deployments
- OVAs are most often released with 1 core and multiple sockets to be most adaptable to any environment they are being deployed into
- When deploying, check CPU configuration and reconfigure to best align with the physical host’s CPU architecture
- Consult vendor deployment guides or resources for their specific recommendation
Best Practices for Hyper Threading
Hyper-threading is a great technology for making more use of server processors. Best practice is to enable hyper-threading on hosts used for virtualization. However, if you’re asked to create a monster VM by allocating more vCPUs to the VM than the host has physical cores (e.g. 40 vCPUs on a host that has 20 cores, which is 40 hyper-threads), this is usually a bad idea that leads to performance problems. This scenario is not the same as creating four VMs, each with 10 vCPUs, on the 20 core server. That would be an example of a 2:1 over-allocation, which leverages VMware CPU scheduling to keep everything in check. The problem with the 40 vCPU monster VM is that the OS tells the application it has 40 cores to use, and if the application tries to schedule work on all 40, the physical host can’t comply.
- While it is possible to assign a VM as many vCPUs as there are logical processors (cores x hyper-threads), this is generally not an optimal configuration
- The purpose of hyperthreading is to keep the physical CPU cores active. If a process on one hyper-thread is consuming 100% of the physical core, then the second hyper-thread is unable to run
- Allocating vCPUs beyond physical cores and leveraging logical cores leads very quickly to CPU contention and most often decreases performance
- If the specific workload benefits from sharing cache and memory, it might be desirable to have the NUMA scheduler count the available hyper-threads with numa.vcpu.preferHT=TRUE
- There is a potential danger in allocating one VM more vCPUs than there are physical cores: it leads to pCPU saturation and high CPU Ready
This quote from the VMware Performance team sums it up plainly:
“VMware’s conservative guidance about overcommitting your pCPU:vCPU ratio for Monster virtual machines is simple – don’t do it.” – Mark Achtemichuk
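To make the difference between host-level over-allocation and a single monster VM concrete, here is a small sketch using the 20 core host from the example above. Both helper names are hypothetical, and the check is only the arithmetic described in this section, not anything the ESXi scheduler exposes.

```python
from typing import Sequence


def overcommit_ratio(vm_vcpus: Sequence[int], physical_cores: int) -> float:
    """Total allocated vCPUs divided by physical cores on the host."""
    return sum(vm_vcpus) / physical_cores


def is_monster(vcpus: int, physical_cores: int) -> bool:
    """True when ONE VM is allocated more vCPUs than the host has physical cores."""
    return vcpus > physical_cores


if __name__ == "__main__":
    cores = 20  # 2 sockets x 10 cores; hyper-threads are deliberately not counted
    four_vms = [10, 10, 10, 10]
    one_vm = [40]
    for vms in (four_vms, one_vm):
        flagged = any(is_monster(v, cores) for v in vms)
        print(vms, "ratio", overcommit_ratio(vms, cores), "monster VM:", flagged)
```

Both layouts come out at the same 2:1 ratio, but only the single 40 vCPU VM is flagged. That is the point of the section: the ratio alone does not tell you whether one guest is being promised more cores than the host can actually run at once.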
If you’re running a 5.x vSphere version and upgrading to a newer one, the VMware performance team offers these sizing considerations when migrating to vSphere 6.0, 6.5 or 6.7:
- Reconfigure all VMs that are larger than a pNUMA node. If you’re not sure where to start, first check the cores per socket configuration on VMs with 9 vCPUs or more.
- Reconfigure templates so all new VMs are created appropriately. Crack open those templates and align the cores per socket to match the best practices chart above.
- Evaluate the effort of reconfiguring everything else (good hygiene). If you have hundreds of VMs to sort through, this CPU tweak might be more effort than the payoff is worth. But, as the steward of your environment, just make a plan and slowly work your way through it.
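If you have an inventory export to work from, the first bullet can be turned into a simple triage list. The sketch below assumes a made-up `VM` record with just a name and a vCPU count. How you get that data out of vCenter is up to you and is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    name: str   # hypothetical inventory fields, for illustration only
    vcpus: int


def review_order(vms: List[VM], pnuma_cores: int, min_vcpus: int = 9) -> List[str]:
    """VMs wider than a pNUMA node first (largest first), then the
    remaining VMs with `min_vcpus` or more vCPUs. Smaller VMs are left out."""
    wide = sorted((vm for vm in vms if vm.vcpus > pnuma_cores),
                  key=lambda vm: vm.vcpus, reverse=True)
    large = [vm for vm in vms if min_vcpus <= vm.vcpus <= pnuma_cores]
    return [vm.name for vm in wide + large]


if __name__ == "__main__":
    inventory = [VM("web01", 4), VM("db01", 16), VM("app01", 10), VM("dw01", 20)]
    print(review_order(inventory, pnuma_cores=10))
```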
When given the opportunity, experiment with CPU cores per socket settings and compare the results against a known baseline. Unless you have a good reason, backed up by data or experimentation, the best bet is to use the default settings. Configure virtual machine CPU allocations to cross the fewest physical NUMA boundaries. The objective is to align the VM’s sockets with the physical layout of the host when possible. For VMs that have been running a while and have some historical performance data, review the actual usage and right-size the VM using metrics like CPU demand, CPU Ready and contention. For new VM deployments, size VMs small to begin with and add resources as required, making data-driven decisions.
“If you’re VM is right sized, and it actually needs lots of vCPU’s and Memory, then crossing a NUMA boundary, regardless of the penalty, is much more beneficial for performance than not having the necessary resources at all.” – Michael Webster, Technical Director, Business Critical Applications Engineering at Nutanix
VMware vSphere 6.5 Host Resources Deep Dive
Free e-book: https://www.rubrik.com/blog/vmware-vsphere-ebook
VMware VROOM! Blog
Sysinternals CoreInfo – Use this nifty executable to view NUMA configurations in Windows. It’s easy to run and you’ll quickly see how different cores per socket settings change the way Windows Server modifies its memory usage.
DVD Store – This is a great way to benchmark applications and infrastructure using a close to real life application
VMDumper – A really cool one-liner (a long one-liner) to see the NUMA configuration of a VM from within ESXi