As of June 30, 2016 the vSphere Enterprise licence will no longer be available. As more and more customers move to the Enterprise Plus licensing scheme, we will see wider adoption of Enterprise Plus features such as vSphere Distributed Switch, SIOC, NIOC and Storage DRS.
There will therefore be a continuing demand for better coverage of these features, and I want to start blogging about them more to meet it. The first post is about one of the hidden gems: vSphere Distributed Switch Health Check.
Feature overview
I picked health check specifically because it’s very helpful when troubleshooting connectivity issues on vSphere Distributed Switch uplinks. At the same time it’s lesser known, because it’s buried deep in the vDS settings section, is available only from the Web Client and is disabled by default.
vDS health check can run the following tests:
- VLAN and MTU
- Teaming and failover
By sending broadcasts from one uplink and receiving them on another, vDS health check can determine whether a VLAN is not allowed on a trunk or whether there is an MTU mismatch. In the same way, if you’re using LACP, vDS will alert you to any port channel misconfigurations.
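To make the VLAN part of that idea concrete, here is a minimal Python sketch of the probe logic. It is a simplified model, not VMware's actual implementation: the function name find_untrunked_vlans, the uplink names (vmnic0 and so on) and the data layout are all made up for illustration. It assumes a probe sent on a VLAN from one uplink is only seen if that VLAN is allowed on the sender's trunk port and on at least one other uplink's trunk port.

```python
from typing import Dict, List, Set


def find_untrunked_vlans(
    portgroup_vlans: Set[int],
    trunk_allowed: Dict[str, Set[int]],
) -> Dict[str, List[int]]:
    """Return, per uplink, the port group VLANs whose probes would be lost.

    portgroup_vlans: VLAN IDs configured on the vDS port groups.
    trunk_allowed:   hypothetical map of uplink name -> VLANs allowed on the
                     physical switch port that uplink is plugged into.
    """
    problems: Dict[str, List[int]] = {}
    for uplink, allowed in trunk_allowed.items():
        lost = []
        for vlan in sorted(portgroup_vlans):
            # Other uplinks whose trunk port would let the probe through.
            peers = [
                name
                for name, peer_allowed in trunk_allowed.items()
                if name != uplink and vlan in peer_allowed
            ]
            # The probe is lost if this uplink's own trunk drops the VLAN,
            # or if no other uplink is able to receive it.
            if vlan not in allowed or not peers:
                lost.append(vlan)
        if lost:
            problems[uplink] = lost
    return problems


# VLAN 120 is missing from the trunk on only one of three uplinks.
print(find_untrunked_vlans(
    {100, 120},
    {"vmnic0": {100, 120}, "vmnic1": {100, 120}, "vmnic2": {100}},
))
# prints {'vmnic2': [120]}
```

Note that in this model a host with only two uplinks cannot tell which side is at fault: if the VLAN is missing on one of them, both are reported, because neither probe gets through.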
Usage example
Before you can start using vDS health check, you need to enable it in vSphere Web Client > Networking > dvSwitch > Manage > Settings > Health check. Click the Edit button and enable both tests.
Now go to the Monitor tab and click on the Health section. After a few minutes of initial checks you will see a per-host breakdown of identified issues.
In my case I was able to determine immediately that VLAN 120 was not trunked on the physical switch. The port group this VLAN ID was assigned to had no VMs at the time, so the issue was fixed proactively, before it could cause any problems.
Possible use cases
The above example is a very straightforward one. The VLAN had not been added to the trunk on any of the physical switch ports the uplinks connect to, so the issue would have surfaced as soon as the first VM was added to the port group.
But what if the VLAN was missing on only one host’s uplinks? A VM would be running fine on another host, and after a vMotion (for example, during maintenance work on that host) it could be migrated to the affected host and lose connectivity. The result: impact to production workloads and time wasted on troubleshooting.
MTU checks are particularly helpful in environments where a non-standard MTU size is used, such as 9000-byte jumbo frames for iSCSI. It’s important for the MTU to match on both the vDS and the physical switch. This check confirms exactly that.
And last but not least, the teaming and failover test can be useful when you’re using the LACP capability of the vDS and one of the uplinks has not been added to the port channel configuration, which can also cause some nasty issues.
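As a rough illustration of what an MTU mismatch means in numbers, here is a small Python sketch. The function names and the port names are hypothetical, and it makes one assumption that goes slightly beyond a strict "must match": it only flags physical ports whose MTU is smaller than the vDS MTU, since those are the ones that would drop full-size frames. It also works out the largest ICMP payload that fits in a single unfragmented IPv4 packet, which is the figure you would use for a manual don't-fragment ping test.

```python
from typing import Dict

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def undersized_ports(vds_mtu: int, port_mtus: Dict[str, int]) -> Dict[str, int]:
    """Return the physical ports (hypothetical names) whose MTU is below the vDS MTU."""
    return {port: mtu for port, mtu in port_mtus.items() if mtu < vds_mtu}


print(max_ping_payload(9000))
# prints 8972

# One switch port was left at the default 1500 bytes.
print(undersized_ports(9000, {"Eth1/1": 9000, "Eth1/2": 1500}))
# prints {'Eth1/2': 1500}
```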
Conclusion
In my opinion, vSphere Distributed Switch Health Check is one of those valuable but overlooked features. I suggest giving it a go if you haven’t already done so. It will notify you of any newly introduced network issues and, who knows, maybe it will even find a mismatch in your current vDS configuration.