Posts Tagged ‘MTU’

vDS Health Check: Useful, but Overlooked

June 11, 2016

As of June 30, 2016, the vSphere Enterprise licence will no longer be available. As more and more customers move to the Enterprise Plus licensing scheme, we will see wider adoption of Enterprise Plus features, such as vSphere Distributed Switch, SIOC, NIOC and Storage DRS.

Therefore, there will be a continuing demand for better coverage of these features, and I want to start blogging about them more to meet it. The first post will be about one of the hidden gems – the vSphere Distributed Switch Health Check.

Feature overview

I picked health check specifically because it's very helpful when troubleshooting connectivity issues on vSphere Distributed Switch uplinks. At the same time it's lesser known, because it's buried deep in the vDS settings section, is available only from the Web Client and is disabled by default.

vDS health check is capable of doing the following tests:

  • VLAN and MTU
  • Teaming and failover

By sending broadcasts out of one uplink and receiving them on another, vDS health check can determine if a VLAN is not allowed on a trunk or if there is an MTU mismatch. In the same way, if you're using LACP, vDS will alert you to any port channel misconfigurations.

Usage example

Before you can start using vDS health check you need to enable it in vSphere Web Client > Networking > dvSwitch > Manage > Settings > Health check. Click on the Edit button and enable both tests.

enable_healthcheck

Now if you go to the Monitor tab and click on the Health section, after a few minutes of initial checks you will see a per-host breakdown of identified issues.

healthcheck_results

In my case I was able to immediately determine that VLAN 120 was not trunked on the physical switch. The port group this VLAN ID was assigned to had no VMs at the time, so the issue was fixed proactively, before it could cause any problems.

vlan_mismatch

Possible use cases

The above example is a very straightforward one. The VLAN was not added to the trunk on any of the physical switch ports facing the uplinks, so the issue would've been discovered right after the first VM was added to the port group.

But what if the VLAN was missing only on one of the host's uplinks? A VM would run fine on another host, but after a vMotion (during potential maintenance work on that host, for example) it could get migrated to the affected host and lose connectivity. The result – impact to production workloads and time wasted on troubleshooting.

MTU checks are particularly helpful for environments where a non-standard MTU size is used, such as 9000-byte jumbo frames for iSCSI. It's important for the MTU to match on both the vDS and the physical switch, and this check confirms exactly that.
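
If health check reports an MTU mismatch, you can cross-check the jumbo frame path from the ESXi shell with vmkping. This is just a complementary test; the vmkernel interface name and target IP below are placeholders for your own iSCSI vmkernel port and storage target (-d sets the don't-fragment bit, and 8972 is 9000 bytes minus the IP and ICMP headers):

# vmkping -I vmk1 -d -s 8972 10.10.10.20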

And last but not least, teaming and failover tests can be useful when you're using the LACP capability of vDS and one of the uplinks is not added to the port channel configuration, which can also cause some nasty issues.
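
If the teaming and failover check flags a problem and you want to see the LACP state from the host's point of view, the ESXi shell can show it as well (the command is available from ESXi 5.5 onwards; the exact output depends on your build):

# esxcli network vswitch dvs vmware lacp status get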

Conclusion

In my opinion vSphere Distributed Switch Health Check is one of those valuable but overlooked features. I suggest giving it a go if you haven't already done so. It will notify you of any newly introduced network issues, or who knows, maybe it will even find a network mismatch in your current vDS configuration.

Beginner’s Guide to Dell N4000 Series Switches

January 18, 2016

Dell N-Series switches run on Dell Network Operating System (DNOS) version 6.x. Unlike Dell S-Series switches, which run on DNOS 9.x derived from the Force10 Operating System (FTOS), DNOS 6.x came from the PowerConnect switch series and shares the same codebase. So if you've ever worked with PowerConnect switches, N-Series syntax should be very familiar.

In my case I had two Dell N4032F switches. But the same set of commands applies to any other N4000 Series switch.

Initial Configuration

When you first turn the switch on, it gives you 60 seconds to enter the wizard, where you can set up network settings for the Out-of-Band (OOB) management interface and change the admin password. If you miss it, you can reboot the switch and it will show the same wizard prompt again when it boots up. Or you can set it up from the CLI:

# interface out-of-band
# ip address 10.10.10.10 255.255.255.0 10.10.10.254

# show ip interface out-of-band
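
The wizard also prompts you to change the admin password. If you skipped it, the same can be done from the CLI. The syntax below is my recollection of the DNOS 6.x username command and the password is obviously a placeholder, so double-check against the CLI reference for your firmware:

# username admin password YourNewPassword privilege 15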

Once you get to the CLI prompt, configure hostname and enable SSH:

# hostname n4032f-prod

# crypto key generate rsa
# crypto key generate dsa
# ip ssh server
# ip telnet server disable

Stacking

Dell N4000 Series switches support both stacking and MLAG (Multi-chassis Link Aggregation). One of the drawbacks of the stack configuration is disruptive firmware upgrades. When you update firmware on the stack master, firmware is distributed to all stack members and all switches are rebooted simultaneously.

In MLAG each switch has its own control plane and can be rebooted independently. This is MLAG's shortcoming at the same time: unlike a stack, where all units act as one switch, with MLAG you have to manage each switch separately.

In my case I chose stacking for its simplicity.

Dell N4000

N4000 switches are stacked using the two 40Gb QSFP ports located at the front. QSFP ports are not configured in stack mode by default, which you need to change on both switches before you can build a stack:

# stack
# stack-port Fortygigabitethernet 1/1/1 stack
# stack-port Fortygigabitethernet 1/1/2 stack

# show switch stack-ports

Once QSFP ports on both switches are configured, disconnect power from both switches and boot the switch you want to be the stack master first (typically the top switch). When the first switch has fully booted, boot the second switch and check the status. This is what you should see:

# show switch

n4000_stack

Firmware Upgrade

If it’s not a brand new switch, save the config before doing the firmware upgrade:

# copy run start
# copy running-config tftp://10.10.10.100/backup.txt
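
Should the upgrade go sideways, the same backup can be pulled back from the TFTP server. A minimal sketch, assuming the file name used above:

# copy tftp://10.10.10.100/backup.txt startup-config
# reload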

You can use any TFTP server for the firmware upgrade, such as the free Tftpd64 server.

tftpd64

Then you upload the firmware image to the stack master and reload the stack:

# copy tftp://10.10.10.100/N4000v6.2.7.2.stk backup
# boot system backup
# reload
# show version

The firmware is uploaded to the backup image. Then you select the backup image for the next boot and reload the stack. When both switches come back up, you should see something similar to this:

frimware_upgraded

As part of the upgrade process the new firmware is automatically copied from the master to all stack members, which is the default behaviour. You can confirm it is enabled using the following command:

# show auto-copy-sw

Flow Control, Jumbo Frames and iSCSI Optimization

In my case I used two N4032F switches for an iSCSI backbone, so I needed to make sure that Flow Control and Jumbo Frames were enabled on the switches.

Flow Control is enabled by default, which you can confirm with the following command:

# show storm-control
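
If it turns out to be disabled on your switch, my understanding is that flow control on DNOS 6.x is turned back on with a single global configuration command, but verify this against the CLI reference for your firmware:

# flowcontrol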

To globally enable Jumbo Frames on all ports type:

# system jumbo mtu 9216

# show system mtu

Interestingly, Dell N4000 Series switches also have built-in iSCSI optimization, which can detect iSCSI sessions by snooping the traffic on ports 3260 and 860. The switch then prioritizes iSCSI traffic over other types of traffic to guarantee low latency for storage I/O. To show the iSCSI settings:

# show iscsi

By default the switches only track the sessions. Traffic prioritization is disabled and has to be enabled manually. This didn't matter in my case, as the switches were dedicated to storage traffic. But if you share switches between storage and server traffic, you may want to enable it. Refer to the switch User's Configuration Guide for details.
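
For reference, on the switches I've worked with the prioritization piece is toggled with the iscsi cos commands. Treat this as a sketch to look up in the Configuration Guide rather than a copy-paste recipe; the CoS value of 5 is just an example:

# iscsi cos enable
# iscsi cos vpt 5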

If you’re using a Dell Compellent storage array with N4000 switches, also make sure to apply a Compellent profile to the ports where storage array is connected to:

# macro global apply profile-compellent-nas $interface_name te1/0/1
# macro global apply profile-compellent-nas $interface_name te1/0/2
# macro global apply profile-compellent-nas $interface_name te1/0/3
# macro global apply profile-compellent-nas $interface_name te1/0/4

VLANs, Trunks and Port Channels

Again, I didn’t use any VLANs and Trunks, because switches were dedicated for iSCSI traffic and were separate from the LAN core. And I didn’t need Port Channels either, as they are not required for iSCSI.

Your scenario might be different. For instance, if you have vSphere hosts connected to a NetApp array over NFS, you may want to create a Multi-Mode (LACP) VIF on the NetApp side. If that’s the case, to create a port channel on the Multi-Mode VIF ports use the following:

# interface range te1/0/2,te2/0/2
# channel-group 1 mode active
# show interfaces po1
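
For completeness, this is roughly what the NetApp side looks like in Data ONTAP 7-Mode; the ifgrp name and member interfaces (e0a/e0b) are hypothetical, and clustered ONTAP uses different commands:

> ifgrp create lacp ifgrp0 -b ip e0a e0b
> ifgrp status ifgrp0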

If the switches are used for both storage and VM traffic, then you’ll need to configure the server ports and uplink them to your network core. Create your VLANs first:

# vlan 10,20,30

Configure vSwitch uplinks from the ESXi hosts. In a typical vSphere environment, traffic is tagged on the vSwitch side, which means that server ports should be configured as trunks:

# interface range te1/0/3-6,te2/0/3-6
# switchport mode trunk
# switchport trunk allowed vlan 10,20,30

And finally configure uplinks to the network core. Depending on how your LAN core is set up, you may want to create a port channel to the upstream switch and trunk the required VLANs:

# interface range te1/0/1,te2/0/1
# channel-group 2 mode active
# switchport mode trunk
# switchport trunk allowed vlan 10,20,30
# show interfaces po2
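
Whichever combination you end up with, it's worth double-checking the end result before plugging in the uplinks:

# show vlan
# show interfaces status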

Conclusion

This guide didn’t include information on Spanning Tree, QoS or any of the switch Layer 3 features, but I hope it could get you started. At the end of the day, every environment is different. If you need additional information refer to the following guides from the Dell web-site: