Posts Tagged ‘dell’

Dell Force10 Part 3: VLT Domain Configuration

July 31, 2016

dell-force10In my previous post here I went through VLT basics and how it helps to establish a loop-free network topology in a modern datacenter. Now lets dive deeper and see how VLT is configured from FTOS CLI.

VLT Configuration

The first step is to configure the backup links and VLT interconnect. Dell S4048-ON switches have six 40Gb QSFP+ ports, two of which 1/49 and 1/50 we will use for VLTi. Repeat the same configuration on both switches.

# int range fo 1/49-1/50
# no shutdown

# interface port-channel 127
# description “VLT interconnect”
# channel-member fo 1/49
# channel-member fo 1/50
# no shutdown

Now that we have a VLT interconnect set up, let’s join the first switch to a VLT domain:

# vlt domain 1
# back-up destination 172.10.10.11
# peer-link port-channel 127
# primary-priority 1

First switch points to the second switch management IP for a backup destination, uses port channel 127 as a VLT interconnect and becomes a primary peer, because it’s given the lowest priority of 1.

Do the same on the second switch, but now point to the first switch management IP for backup and use the highest priority to make this switch a secondary peer:

# vlt domain 1
# back-up destination 172.10.10.10
# peer-link port-channel 127
# primary-priority 8192

To confirm the VLT state use the following command:

# sh vlt brief

vlt_brief

As you can see, the VLTi and backup links are up and the switch can see its peer. For some additional VLT specific information use these commands:

# sh vlt statistics
# sh vlt backup-link

I would also recommend to use the following command to see the port channel state and confirm that both VLTi links are in up state:

# sh int po127

po_state

Conclusion

In this part of the Dell Force10 switch configuration series we quickly went through the initial VLT setup. We haven’t touched on VLT LAG configuration yet. We will take a closer look at it in the next blog post.

Advertisement

Dell Force10 Part 2: VLT Basics

July 10, 2016

dell-force10Last time I made a blog post on initial configuration of Force10 switches, which you can find here. There I talked about firmware upgrade and basic features, such as STP and Flow Control. In this blog post I would like to touch on such a key feature of Force10 switches as Virtual Link Trunking (VLT).

VLT is Force10’s implementation of Multi-Chassis Link Aggregation Group (MLAG), which is similar to Virtual Port Channels (vPC) on Cisco Nexus switches. The goal of VLT is to let you establish one aggregated link to two physical network switches in a loop-free topology. As opposed to two standalone switches, where this is not possible.

You could say that switch stacking gives you similar capabilities and you would  be right. The issue with stacked switches, though, is that they act as a single switch not only from the data plane point of view, but also from the control plane point of view. The implication of this is that if you need to upgrade a switch stack, you have to reboot both switches at the same time, which brings down your network. If you have an iSCSI or NFS storage array connected to the stack, this may cause trouble, especially in enterprise environments.

With VLT you also have one data plane, but individual control planes. As a result, each switch can be managed and upgraded separately without full network downtime.

VLT Terminology

Virtual Link Trunking uses the following set of terms:

  • VLT peer – one of the two switches participating in VLT (you can have a maximum of two switches in a VLT domain)
  • VLT interconnect (VLTi) – interconnect link between the two switches to synchronize the MAC address tables and other VLT-related data
  • VLT backup link – heartbeat link to send keep alive messages between the two switches, it’s also used to identify switch state if VLTi link fails
  • VLT – this is the name of the feature – Virtual Link Trunking, as well as a VLT link aggregation group – Virtual Link Trunk. We will call aggregated link a VLT LAG to avoid ambiguity.
  • VLT domain – grouping of all of the above

VLT Topology

This’s what a sample VLT domain looks like. S4048-ON switches have six 40Gb QSFP+ ports, two of which we use for a VLT interconnect. It’s recommended to use a static LAG for VLTi.

basic_vlt

Two 1Gb links are used for VLT backup. You can use switch out-of-band management ports for this. Four 10Gb links form a VLT LAG to the upstream core switch.

Use Cases

So where is this actually helpful? Vast majority of today’s environments are virtualized and do not require LAGs. vSphere already uses teaming on vSwitch uplinks for traffic distribution across all network ports by default. There are some use cases in VMware environments, where you can create a LAG to a vSphere Distributed Switch for faster link failure convergence or improved packet switching. Unless you have a really large vSphere environment this is generally not required, but you may use this option later on if required. Read Chris Wahl’s blog post here for more info.

Where VLT is really helpful is in building a loop-free network topology in your datacenter. See, all your vSphere hosts are connected to both Force10 switches for redundancy. Since traffic comes to either of the switches depending on which uplink is being picked on a ESXi host, you have to make sure that VMs on switch 1 are able to communicate to VMs on switch 2. If all you had in your environment were two Force10 switches, you would establish a LAG between the two and be done with it. But if your network topology is a bit larger than this and you have at least a single additional core switch/router in your environment you’d be faced with the following dilemma. How can you ensure efficient traffic switching in your network without creating loops?

stp_loop

You can no longer create a LAG between the two Force10 switches, as it will create a loop. Your only option is to keep switches connected only to the core and not to each other. And by doing that you will cause all traffic from VMs on switch 1 destined to VMs on switch 2 and vise versa to traverse the core.

east_west_traffic

And that’s where VLT comes into play. All east-west traffic between servers is contained within the VLT domain and doesn’t need to traverse the core. As shown above, if we didn’t use VLT, traffic from one switch to another would have to go from switch 1 to core and then back from core to switch 2. In a VLT domain traffic between the switches goes directly form switch 1 to switch 2 using VLTi.

Conclusion

That’s a brief introduction to VLT theory. In the next few posts we will look at how exactly VLT is configured and map theory to practice.

Dell Force10 Part 1: Initial Configuration

July 3, 2016

force10_S4048_on
When it comes to networking Dell has two main series of switches. PowerConnect/N-series, which run DNOS 6.x operating system. And S/Z-series switches, which run on DNOS 9.x derived from Force10 OS (FTOS). In this series of blogs we will go through the configuration of Force10 switch series and use Dell S4048-ON top of the rack switch as an example.

Interesting to note, that unlike other S-series switches S4048-ON is an Open Networking switch. Dell is one of the first companies which apart from its own OS lets customers run other operating systems on its network switches, such as Cumulus Linux OS and Big Switch Networks Switch Light OS. While Cumulus and Big Switch has its own use cases, in this blog we will look specifically at configuring FTOS.

Boot process

S4048-ON comes from the factory pre-configured for bare metal provisioning (BMP). This is what you will see when you boot the switch for the first time:

s4048_bmp

If you just want to boot FTOS, simply skip the BMP by choosing A and switch will boot the OS.

After some time BMP will time out. If you’ve missed the above wizard, you can also disable BMP from CLI using the following commands:

> enable
# stop bmp
# config
# reload-type normal-reload
# exit
# reload

When prompted choose to save the configuration and proceed with reload. After the switch has rebooted check that the next boot is set to normal reload:

# show reload-type

Initial configuration

First steps of any switch installation is assigning a hostname and management interface settings:

# hostname DELL4048-SWITCH
# int managementethernet 1/1
# ip address 172.10.10.2/24
# no shut
# management route 0.0.0.0/0 172.10.10.10

Then set admin / enable passwords and allow remote management via SSH:

# enable password 123456
# username admin password 123456
# ip ssh server enable

Configure time zone and NTP:

# clock timezone UTC 11
# ntp server 172.10.10.20
# show ntp associations
# show ntp status
# show clock

Firmware upgrade

Force10 switches have two boot banks A: and B:. It’s a good practice to upload new firmware into one boot bank and keep the old firmware in the other in case you need to roll back.

The easiest way to upgrade is via TFTP using Tftpd64, which you can download for free from here. If you’re upgrading an existing switch, make sure to save the running config and make a backup. If it’s an initial install you can skip this step.

# copy run start
# copy start tftp://10.0.0.1/FORCE10_SWITCH_01.01.16.conf

Then upload new firmware to image B:, change active boot bank to B: and reload:

# show version
# show boot system stack-unit 1
# upgrade system tftp://10.0.0.1/FTOS-SK-9.9.0.0P9.bin b:
# conf t
# boot system stack-unit 1 primary system b:
# exit
# reload

You will be prompted to save the configuration and reboot. After the reboot you may be asked to enable SupportAssist. SuppotAssist helps to automatically open Dell service tickets if there is a switch fault. You can enable SupportAssist by running the following commands and answering prompts:

supportassist

# conf t
# support-assist activate
# support-assist activity full-transfer start now
# show support-assist status

My pair of switches were configured in a Virtual Link Trunking (VLT) domain. I’ll explain how VLT works later in the series. But from the upgrade point of view, each switch in a VLT domain is treated as a separate switch and has to be upgraded separately. If you decided to use a stack instead of VLT, you can find the upgrade process for a Force10 stack in my other post about Dell MXL switches here.

Spanning tree

Spanning Tree Protocol (STP) helps to prevent network topology loops and is highly recommended for use in any network. Switches connected in an actual loop topology in today’s networks are rare. But STP can save you from consequences of a potential human error, such as port channel misconfiguration. If instead of creating one port channel with two links, you by mistake create two port channels with one link each and both carry the same VLANs, you’ve accidentally created a loop, which will bring your whole network to an immediate halt.

It’s a good practice to enable STP as a safeguard mechanism from such configuration errors. S4048-ON supports STP, RSTP, MSTP and PVST+. In my case S4048s were uplinked into HP core, which supported STP, RSTP and MSTP. If you have Cisco switches in your network core you can use PVST+. In my case I used RSTP, which is a good choice if you don’t require enhancements of MSTP and PVST+ in your network. Just make sure to not use the basic STP protocol, as it provides the slowest convergence.

# protocol spanning-tree rstp
# no disable
# show spanning-tree rstp

In every STP topology there is also a root switch, which by default is selected automatically. For a more deterministic STP behaviour it’s recommended to select the root switch manually, by assigning the lowest STP priority to it. Typically your core switch should be your root switch. In my case it was a HP core switch, which was assigned priority of “0”.

When configuring server and storage facing ports make sure to enable EdgePort mode to minimize the time it takes for the port to come online:

# int range Te1/45-1/48
# spanning-tree rstp edge-port
# switchport
# no shut

If you want to know more about how STP works, you can read a few of my previous blog posts on STP here and here.

Flow control

To avoid dropped packets on 10Gb switch ports at times of potential heavy utilization it is also a best practice to as a minimum enable bi-directional Flow Control on the storage array ports. I enabled it on the iSCSI links connected from the Dell Compellent storage array:

# int range Te1/17-1/18
# flowcontrol rx on tx on

If you specifically interested in switch best practices for Compellent and EqualLogic storage arrays, Dell has a full list of guides for various switches at communitites wiki here.

Port channels and VLANs

Port channels and VLANs are configured similarly to any other switch, but I include them here in case you want to know the syntax. In this example we have two access ports 1/46 and 1/47 and an uplink to the core configured as port channel 1:

# interface port-channel 1
# switchport
# no shutdown

# interface range Te1/1-1/2
# port-channel-protocol LACP
# port-channel 1 mode active
# no shutdown

# int vlan 254
# untagged Te1/46-1/47
# tagged po 1

Keep in mind, that port channels are used either in one switch configurations or when two or more switches are stacked together. If you’re using Virtual Link Trunking (VLT), you will need to create Virtual Link Trunks (VLTs). Which are similar to port channels, but have a slightly different syntax. We will talk about VLT in much more detail in the following Force10 blogs.

Conclusion

One feature which I didn’t specifically mentioned in this blog post was Jumbo Frames. I tend not to use it in my deployments until I see convincing evidence of it making a difference for iSCSI/NFS storage implementations. I did a post about Jumbo Frames long time ago here and hasn’t changed my opinion ever since. Interested to here your thoughts if have a different take on that.

References

Force10 and vSphere vDS Interoperability Issue

June 10, 2016

dell-force10Recently I had an opportunity to work with Dell FX2 platform from the design and delivery point of view. I was deploying a FX2s chassis with FC630 blades and FN410S 10Gb I/O aggregators.

I ran into an interesting interoperability glitch between Force10 and vSphere distributed switch when using LLDP. LLDP is an equivalent of Cisco CDP, but is an open standard. And it allows vSphere administrators to determine which physical switch port a given vSphere distributed switch uplink is connected to. If you enable both Listen and Advertise modes, network administrators can get similar visibility, but from the physical switch side.

In my scenario, when LLDP was enabled on a vSphere distributed switch, uplinks on all ESXi hosts started disconnecting and connecting back intermittently, with log errors similar to this:

Lost uplink redundancy on DVPorts: “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”, “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”, “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”, “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”. Physical NIC vmnic1 is down.

Network connectivity restored on DVPorts: “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”, “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”. Physical NIC vmnic1 is up

Uplink redundancy restored on DVPorts: “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”, “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”, “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”, “1549/03 4b 0b 50 22 3f d7 8f-28 3c ff dd a4 76 26 15”. Physical NIC vmnic1 is up

Issue Troubleshooting

FX2 I/O aggregator logs were reviewed for potential errors and the following log entries were found:

%STKUNIT0-M:CP %DIFFSERV-5-DSM_DCBX_PFC_PARAMETERS_MISMATCH: PFC Parameters MISMATCH on interface: Te 0/2

%STKUNIT0-M:CP %IFMGR-5-OSTATE_DN: Changed interface state to down: Te 0/2

%STKUNIT0-M:CP %IFMGR-5-OSTATE_UP: Changed interface state to up: Te 0/2

This clearly looks like some DCB negotiation issue between Force10 and the vSphere distributed switch.

Root Cause

Priority Flow Control (PFC) is one of the protocols from the Data Center Bridging (DCB) family. DCB was purposely built for converged network environments where you use 10Gb links for both Ethernet and FC traffic in the form of FCoE. In such scenario, PFC can pause Ethernet frames when FC is not having enough bandwidth and that way prioritise the latency sensitive storage traffic.

In my case NIC ports on Qlogic 57840 adaptors were used for 10Gb Ethernet and iSCSI and not FCoE (which is very uncommon unless you’re using Cisco UCS blade chassis). So the question is, why Force10 switches were trying to negotiate FCoE? And what did it have to do with enabling LLDP on the vDS?

The answer is simple. LLDP not only advertises the port numbers, but also the port capabilities. Data Center Bridging Exchange Protocol (DCBX) uses LLDP when conveying capabilities and configuration of FCoE features between neighbours. This is why enabling LLDP on the vDS triggered this. When Force10 switches determined that vDS uplinks were CNA adaptors (which was in fact true, I was just not using FCoE) it started to negotiate FCoE using DCBX. Which didn’t really go well.

Solution

The easiest solution to this problem is to disable DCB on the Force10 switches using the following command:

# conf t
# no dcb enable

Alternatively you can try and disable FCoE from the ESXi end by using the following commands from the host CLI:

# esxcli fcoe nic list
# esxcli fcoe nic disable -n vmnic0

Once FCoE has been disabled on all NICs, run the following command and you should get an empty list:

# esxcli fcoe adapter list

Conclusion

It is still not clear why PFC mismatch would cause vDS uplinks to start flapping. If switch cannot establish a FCoE connection it should just ignore it. Doesn’t seem to be the case on Force10. So if you run into a similar issue, simply disable DCB on the switches and it should fix it.

Dell Repository Manager: Bootable ISO Issues

May 23, 2016

problem_solutionIn one of my previous posts I described the process of upgrading a Dell FX2 chassis firmware using Dell Repository Manager (DRM).

In an ideal world you just follow the process and in an hour or two you can get your chassis upgraded. You may sometimes run into issues. I want to go through some of them in this post, including possible remediation.

Issue Description

When exporting firmware to a bootable ISO you can find DRM not being able to download some of the bundle components with the following error in the Job Result:

Processing failed:
Failed downloading files:
Diagnostics_Application_PWMC8_LN64_OSC_1.1_A00.BIN

And errors in the Log:


60. 24/03/2016 5:58:50 PM Export to Bootable ISO : Downloaded 34 / 56
61. 24/03/2016 5:59:44 PM Export to Bootable ISO : Error downloading some files
62. 24/03/2016 5:59:45 PM Export to Bootable ISO : Failed exporting to Bootable ISO.

Workaround #1: Skip the Component

You can try the following option “Continue download irrespective of any error (in the selected components)” in the export dialog. It won’t help to get the component downloaded, but you will got a bootable ISO.

However, DRM will still keep the failed component in the bundle and try to install it during the upgrade, which will obviously fail (update 16/56):

failed_update

Once the upgrade is finished you will get the following error at the end:

Note: Some update requires machine reboot. Please reboot to CD/DVD to continue for the failed update because of the dependency…

upgrade_status

No matter how many times you reboot you will obviously get the same errors. You can ignore it if you 100% sure this is what causes the upgrade to fail or use Workaround #2.

Workaround #2: Create Custom ISO

When you create a repository in DRM it’s populated with pre-built components and bundles. But you can create custom repositories. The idea is that you can exclude the failed component from the repository by creating it manually.

Assuming you already have the base repository configured, do the following:

  • Open the existing repository and click on the Components tab
  • Deselect the failed component in the component list (in my case it was Diagnostics_Application_PWMC8_LN64_OSC_1.1_A00.BIN)
  • Click on the “Copy To” button:

custom_components

  • In the opened dialogue select “Create NEW Repository and copy component(s) into it”
  • Follow the wizard and when you click finish, components will be copied to the newly created repository
  • Open the new repository and click on the Componenets tab
  • Select all components and click on the “Copy to” button once again
  • This time select “Create a NEW Bundle in the same repository and add component(s) into it”
  • On the next screen give the bundle a name and make sure to choose “Linux 32-bit and 64-bit” in the OS Type

custom_bundle

As a result you should get a new bundle created which you can export to a bootable ISO using the same process.

Workaround #3: Use Server Update Utility

If none of the above helps you can fall back to a proven upgrade approach and use Server Update Utility (SUU). SUU is a huge 12GB ISO to download, but you can use Dell Download Manager, which supports resuming interrupted downloads. Make sure to disable proxy! Dell Download Manager does not support resuming an interrupted download if you’re using a proxy server.

SUU is not a bootable ISO. Previously you had to use Dell Systems Build and Update Utility (SBUU) to boot from it first and then mount the ISO to proceed with the upgrade. Starting with Dell 11G servers you don’t need it anymore and can upgrade firmware straight form Dell Lifecycle Controller (LC).

You’ll need to boot into the Lifecycle Controller and choose Firmware Update > Launch Firmware Update > Local Drive(CD or DVD or USB). Mount the SUU ISO and the rest is fairly straightforward. LC will upgrade the firmware and reboot the blade.

lc_upgrade

Conclusion

Dell Repository Manager is the recommended approach to upgrade firmware on Dell hardware. Unlike SUU, DRM downloads the latest updates and only the necessary components. It is also capable of making a bootable ISO.

If you have issues, rely on Server Update Utility as it’s bulletproof and always work. But be prepared to download a 12GB ISO image and make sure you have an option to bypass proxy.

Dell Compellent is not an ALUA Storage Array

May 16, 2016

dell_compellentDell Compellent is Dell’s flagship storage array which competes in the market with such rivals as EMC VNX and NetApp FAS. All these products have slightly different storage architectures. In this blog post I want to discuss what distinguishes Dell Compellent from the aforementioned arrays when it comes to multipathing and failover. This may help you make right decisions when designing and installing a solution based on Dell Compellent in your production environment.

Compellent Array Architecture

In one of my previous posts I showed how Compellent LUNs on vSphere ESXi hosts are claimed by VMW_SATP_DEFAULT_AA instead of VMW_SATP_ALUA SATP, which is the default for all ALUA arrays. This happens because Compellent is not actually an ALUA array and doesn’t have the tpgs_on option enabled. Let’s digress for a minute and talk about what the tpgs_on option actually is.

For a storage array to be claimed by VMW_SATP_ALUA it has to have the tpgs_on option enabled, as indicated by the corresponding SATP claim rule:

# esxcli storage nmp satp rule list

Name                 Transport  Claim Options Description
-------------------  ---------  ------------- -----------------------------------
VMW_SATP_ALUA                   tpgs_on       Any array with ALUA support

This is how Target Port Groups (TPG) are defined in section 5.8.2.1 Introduction to asymmetric logical unit access of SCSI Primary Commands – 3 (SPC-3) standard:

A target port group is defined as a set of target ports that are in the same target port asymmetric access state at all times. A target port group asymmetric access state is defined as the target port asymmetric access state common to the set of target ports in a target port group. The grouping of target ports is vendor specific.

This has to do with how ports on storage controllers are grouped. On an ALUA array even though a LUN can be accessed through either of the controllers, paths only to one of them (controller which owns the LUN) are Active Optimized (AO) and paths to the other controller (non-owner) are Active Non-Optimized (ANO).

Compellent does not present LUNs through the non-owning controller. You can easily see that if you go to the LUN properties. In this example we have four iSCSI ports connected (two per controller) on the Compellent side, but we can see only two paths, which are the paths from the owning controller.

compellent_psp

If Compellent presents each particular LUN only through one controller, then how does it implement failover? Compellent uses a concept of fault domains and control ports to handle LUN failover between controllers.

Compellent Fault Domains

This is Dell’s definition of a Fault Domain:

Fault domains group front-end ports that are connected to the same Fibre Channel fabric or Ethernet network. Ports that belong to the same fault domain can fail over to each other because they have the same connectivity.

So depending on how you decided to go about your iSCSI network configuration you can have one iSCSI subnet / one fault domain / one control port or two iSCSI subnets / two fault domains / two control ports. Either of the designs work fine, this is really is just a matter of preference.

You can think of a Control Port as a Virtual IP (VIP) for the particular iSCSI subnet. When you’re setting up iSCSI connectivity to a Compellent, you specify Control Ports IPs in Dynamic Discovery section of the iSCSI adapter properties. Which then redirects the traffic to the actual controller IPs.

If you go to the Storage Center GUI you will see that Compellent also creates one virtual port for every iSCSI physical port. This is what’s called a Virtual Port Mode and is recommended instead of a Physical Port Mode, which is the default setting during the array initialization.

Failover scenarios

Now that we now what fault domains are, let’s talk about the different failover scenarios. Failover can happen on either a port level when you have a transceiver / cable failure or a controller level, when the whole controller goes down or is rebooted. Let’s discuss all of these scenarios and their variations one by one.

1. One Port Failed / One Fault Domain

If you use one iSCSI subnet and hence one fault domain, when you have a port failure, Compellent will move the failed port to the other port on the same controller within the same fault domain.

port_failed

In this example, 5000D31000B48B0E and 5000D31000B48B0D are physical ports and 5000D31000B48B1D and 5000D31000B48B1C are the corresponding virtual ports on the first controller. Physical port 5000D31000B48B0E fails. Since both ports on the controller are in the same fault domain, controller moves virtual port 5000D31000B48B1D from its original physical port 5000D31000B48B0E to the physical port 5000D31000B48B0D, which still has connection to network. In the background Compellent uses iSCSI redirect command on the Control Port to move the traffic to the new virtual port location.

2. One Port Failed / Two Fault Domains

Two fault domains scenario is slightly different as now on each controller there’s only one port in each fault domain. If any of the ports were to fail, controller would not fail over the port. Port is failed over only within the same controller/domain. Since there’s no second port in the same fault domain, the virtual port stays down.

port_failed_2

A distinction needs to be made between the physical and virtual ports here. Because from the physical perspective you lose one physical link in both One Fault Domain and Two Fault Domains scenarios. The only difference is, since in the latter case the virtual port is not moved, you’ll see one path down when you go to LUN properties on an ESXi host.

3. Two Ports Failed

This is the scenario which you have to be careful with. Compellent does not initiate a controller failover when all front-end ports on a controller fail. The end result – all LUNs owned by this controller become unavailable.

two_ports_failed_2

luns_down

This is the price Compellent pays for not supporting ALUA. However, such scenario is very unlikely to happen in a properly designed solution. If you have two redundant network switches and controllers are cross-connected to both of them, if one switch fails you lose only one link per controller and all LUNs stay accessible through the remaining links/switch.

4. Controller Failed / Rebooted

If the whole controller fails the ports are failed over in a similar fashion. But now, instead of moving ports within the controller, ports are moved across controllers and LUNs come across with them. You can see how all virtual ports have been failed over from the second (failed) to the first (survived) controller:

controller_failed

Once the second controller gets back online, you will need to rebalance the ports or in other words move them back to the original controller. This doesn’t happen automatically. Compellent will either show you a pop up window or you can do that by going to System > Setup > Multi-Controller > Rebalance Local Ports.

Conclusion

Dell Compellent is not an ALUA storage array and falls into the category of Active/Passive arrays from the LUN access perspective. Under such architecture both controller can service I/O, but each particular LUN can be accessed only through one controller. This is different from the ALUA arrays, where LUN can be accessed from both controllers, but paths are active optimized on the owning controller and active non-optimized on the non-owning controller.

From the end user perspective it does not make much of a difference. As we’ve seen, Compellent can handle failover on both port and controller levels. The only exception is, Compellent doesn’t failover a controller if it loses all front-end connectivity, but this issue can be easily avoided by properly designing iSCSI network and making sure that both controllers are connected to two upstream switches in a redundant fashion.

Changing the Default PSP for Dell Compellent

April 26, 2016

dell_compellentIf you’ve ever worked with Dell Compellent storage arrays you may have noticed that when you initially connect it to a VMware ESXi host, by default VMware Native Multipathing Plugin (NMP) uses Fixed Path Selection Policy (PSP) for all connected LUNs. And if you have two ports on each of the controllers connected to your storage area network (be it iSCSI or FC), then you’re wasting half of your bandwidth.

compellent_psp

Why does that happen? Let’s dig deep into VMware’s Pluggable Storage Architecture (PSA) and see how it treats Compellent.

How Compellent is claimed by VMware NMP

If you are familiar with vSphere’s Pluggable Storage Architecture (PSA) and NMP (which is the only PSA plug-in that every ESXi host has installed by default), then you may know that historically it’s always had specific rules for such Asymmetric Logical Unit Access (ALUA) arrays as NetApp FAS and EMC VNX.

Run the following command on an ESXi host and you will see claim rules for NetApp and DGC devices (DGC is Data General Corporation, which built Clariion array that has been later re-branded as VNX by EMC):

# esxcli storage nmp satp rule list

Name              Vendor  Default PSP Description
----------------  ------- ----------- -------------------------------
VMW_SATP_ALUA_CX  DGC                 CLARiiON array in ALUA mode
VMW_SATP_ALUA     NETAPP  VMW_PSP_RR  NetApp arrays with ALUA support

This tells NMP to use Round-Robin Path Selection Policy (PSP) for these arrays, which is always preferable if you want to utilize all available active-optimized paths. You may have noticed that there’s no default PSP in the VNX claim rule, but if you look at the default PSP for the VMW_SATP_ALUA_CX Storage Array Type Plug-In (SATP), you’ll see that it’s also Round-Robin:

# esxcli storage nmp satp list

Name              Default PSP  
----------------- -----------
VMW_SATP_ALUA_CX  VMW_PSP_RR

There is, however, no default claim rule for Dell Compellent storage arrays. There are a handful of the following non array-specific “catch all” rules:

Name                 Transport  Claim Options Description
-------------------  ---------  ------------- -----------------------------------
VMW_SATP_ALUA                   tpgs_on       Any array with ALUA support
VMW_SATP_DEFAULT_AA  fc                       Fibre Channel Devices
VMW_SATP_DEFAULT_AA  fcoe                     Fibre Channel over Ethernet Devices
VMW_SATP_DEFAULT_AA  iscsi                    iSCSI Devices

As you can see, the default PSP for VMW_SATP_ALUA is Most Recently Used (MRU) and for VMW_SATP_DEFAULT_AA it’s VMW_PSP_FIXED:

Name                Default PSP   Description
------------------- ------------- ------------------------------------------
VMW_SATP_ALUA       VMW_PSP_MRU
VMW_SATP_DEFAULT_AA VMW_PSP_FIXED Supports non-specific active/active arrays

Compellent is not an ALUA storage array and doesn’t have the tpgs_on option enabled. As a result it’s claimed by the VMW_SATP_DEFAULT_AA rule for the iSCSI transport, which is why you end up with the Fixed path selection policy for all LUNs by default.

Changing the default PSP

Now let’s see how we can change the PSP from Fixed to Round Robin. First thing you have to do before even attempting to change the PSP is to check VMware Compatibility List to make sure that the round robin PSP is supported for a particular array and vSphere combination.

vmware_hcl

As you can see, round robin path selection policy is supported for Dell Compellent storage arrays in vSphere 6.0u2. So let’s change it to get the benefit of being able to simultaneously use all paths to Compellent controllers.

For Compellent firmware versions 6.5 and earlier use the following command to change the default PSP:

# esxcli storage nmp satp set -P VMW_PSP_RR -s VMW_SATP_DEFAULT_AA

Note: technically here you’re changing PSP not specifically for the Compellent storage array, but for any array which is claimed by VMW_SATP_DEFAULT_AA and which also doesn’t have an individual SATP rule with PSP set. Make sure that this is not the case or you may accidentally change PSP for some other array you may have in your environment.

The above will change PSP for any newly provisioned and connected LUNs. For any existing LUNs you can change PSP either manually in each LUN’s properties or run the following command in PowerCLI:

# Get-Cluster ClusterNameHere | Get-VMHost | Get-ScsiLun | where {$_.Vendor -eq
“COMPELNT” –and $_.Multipathpolicy -eq “Fixed”} | Set-ScsiLun -Multipathpolicy
RoundRobin

This is what you should see in LUN properties as a result:

compellent_psp_2

Conclusion

By default any LUN connected from a Dell Compellent storage array is claimed by NMP using Fixed path selection policy. You can change it to Round Robin using the above two simple commands to make sure you utilize all storage paths available to ESXi hosts.

Painless Dell FX2 Firmware Upgrade

April 10, 2016

Overview

Recently I’ve had a chance to play with Dell’s FX2 chassis for a bit. Dell FX2 falls into the category of blade chassis and can hold up to 8 blades with Atom or 4 blades with Xeon CPUs in a 2U chassis.

Dell_FX2

Besides the compute blades FX2 also supports storage blades, which you can dedicate to particular compute blades and use as additional storage.

On the networking side you can choose from either pass-through modules or three types of I/O aggregators – four 10G SFP+ ports, four 10GBASE-T ports, or two Fibre Channel plus two SFP+ external ports.

The chassis itself also comes in two flavors – FX2 or FX2s. The main difference between the two is that FX2s additionally has PCIe slots at the back, which can be mapped to the server blades to provide additional connectivity.

Dell_FX2_Rear

First step of every hardware solution deployment is a firmware upgrade. But when it comes to firmware on Dell blade equipment be it M1000e, VRTX or FX2 you can quickly get confused. Especially when you go to the blade section and see a dozen of hardware components. Download and update each of them individually would be daunting. Fortunately there is an better way.

blade_firmware

CMC Firmware

Upgrade starts from the chassis management controller, which has two components: Chassis Infrastructure Firmware (or Main Board) and the CMC itself. You can find them on the Chassis Overview > Update tab.

CMC firmware comes as an .exe package, which you can extract. You really need just the fx2_cmc.bin file. During upgrade you will lose access to CMC for 5-10 minutes, while CMC is rebooting.

For the infrastructure firmware you’ll need the fx2_mainboard.bin file. The gotcha with the infrastructure firmware upgrade is that you’ll need all blades to be powered off. So if you have just one chassis this might be tricky.

Blade Firmware

Blades firmware is where this gets interesting. You can certainly upgrade all blades from the CMC by downloading firmware from the Dell support web-site and choosing one component at a time in Chassis Overview > Server Overview > Update section. CMC is capable of upgrading say iDRAC across all blades simultaneously, but it’s still about a dozen components.

The easier approach would be to use Dell Repository Manager (DRM). DRM can download firmware for virtually any blade or rack server (including some of the storage and network hardware) and build a bootable ISO image for an easy upgrade.

To build a bootable ISO follow the following steps:

  • Download and install Dell Repository Manager from the Dell support web-site
  • Add a source by going to Source > View Dell Online Catalog
  • Create a repository by going to Repository > New > Create New Repository
  • In the wizard select your hardware (I selected PowerEdge FC630 from the Blade category) and choose Linux (32-bit and 64-bit) as a DUP format (I’ll explain that later).
  • Go to the newly created repository, select the bundle and click Export

export_bundle

DRM can export bundles in multiple forms, we are interested in a bootable ISO and this is why we selected the Linux DUP format when we created the repository. DRM creates a Linux bootable ISO, so there was no point selecting Windows bundles.

  • Select “Create Bootable ISO (Linux Only)” and continue with the default settings for the rest

As a result you will get an .iso file, which you can mount to the server via iDRAC Remote Console and boot from it for a firmware upgrade.

Network I/O Aggregators

FX2 I/O aggregators are Dell Force10 switches, which use Force10 OS (FTOS). FTOS firmware is NOT available from the Dell web-site. You’ll need to register an account at https://www.force10networks.com to download the firmware.

Make sure to download firmware release specifically built for FX2 I/O aggregators, which can be found in M-Series Software section.

aggregators_firmware

To upgrade the aggregators go to Chassis Overview > I/O Module Overview > Update. Aggregators reset after a reboot, so make sure to upgrade them one at a time. Or if you stacked them instead of using VLT or standalone mode, you’ll have to have a downtime, as stacked switches reboot together.

Conclusion

There is nothing fancy in upgrading firmware on a blade chassis, you want it to be quick and painless. Make sure to use Dell Repository Manager for blades upgrade. It may save you heaps of time and make your life easier.

Dell Compellent Enterprise Manager: SQL Server Setup

January 25, 2016

Dell Compellent Enterprise Manager is a separate piece of software, which comes with every Compellent storage array deployment and allows you to monitor, manage, and analyze one or multiple arrays from a centralized management console.

Probably the most valuable feature of Enterprise Manager for an average user is its historical performance statistics. From the Storage Center GUI you can see only real-time data. Enterprise Manager is capable of keeping statistics for up to a year. And can obviously do a multitude of other things, such as assist with capacity planning, allow you to configure replication between production and DR sites, generate reports or simply serve as a single pain of glass interface to all of your Compellent storage arrays.

Enterprise Manager installation does not include an embedded database. If you want to deploy EM database on an existing Microsoft SQL database or install a new dedicated Microsoft SQL Express instance, you need to do it manually. The following guide describes how to install Enterprise manager with Microsoft SQL Server 2012 Express.

SQL Server Authentication

During installation, Enterprise Manager will ask you for authentication credentials to connect to the SQL database. In SQL Server you can use either the Windows Authentication Mode or Mixed Mode. Mixed Mode allows both Windows authentication and SQL Server authentication via local SQL Server accounts.

For a typical SQL Server setup, Microsoft does not recommend enabling SQL Server authentication and especially with the default “sa” account, as it’s seen as insecure. So if you have a centralized Microsoft SQL Server in your network you’ll most likely be using Microsoft Authentication. But for a dedicated SQL Server Express database it’s fine to enable Mixed Mode and use SQL Server authentication.

Make sure to enable Mixed Mode during SQL Server Express installation and enter a password for the “sa” account.

sql_authentication

SQL Server Network Configuration

Enterprise Manager uses TCP/IP to connect to the SQL database over port 1433. TCP/IP is not enabled by default in SQL Server Express, you will need to enable it manually.

sql_network

Use SQL Server Configuration Manager, which is installed with the database, and browse to SQL Server Network Configuration > Protocols for SQLEXPRESS > TCP/IP. Enable TCP/IP on the Protocols tab and assign port 1433 to TCP Port field in IPAll section.

Make sure to restart the SQL server to apply the settings and you should now be able to connect Enterprise Manager to the database.

If you get stuck, refer to Dell Compellent Enterprise Manager Installation Guide and specifically the section “Prepare a Microsoft SQL Server Database”. This guide is available in Dell Compellent Knowledge Center.

Beginner’s Guide to Dell N4000 Series Switches

January 18, 2016

Dell N-Series switches run on Dell Network Operating System (DNOS) version 6.x. Unlike Dell S-Series switches which run on DNOS 9.x, derived from  Force10 Operation System (FTOS), DNOS 6.x came from the PowerConnect switch series and share the same codebase. So if you’ve ever worked with PowerConnect switches, N-Series syntax should be very familiar.

In my case I had two Dell N4032F switches. But the same set of commands applies to any other N4000 Series switch.

Initial Configuration

When you first turn the switch on, it gives you 60 seconds to enter the wizard, where you can set up network settings for the Out-of-Band (OOB) management interface and change the admin password. If you miss it you can reboot the switch and it will show the same wizard prompt again when it boots up. Or you can set it up from the CLI:

# interface out-of-band
# ip address 10.10.10.10 255.255.255.0 10.10.10.254

# show ip interface out-of-band

Once you get to the CLI prompt, configure hostname and enable SSH:

# hostname n4032f-prod

# crypto key generate rsa
# crypto key generate dsa
# ip ssh server
# ip telnet server disable

Stacking

Dell N4000 Series switches support both stacking and MLAG (Multi-chassis Link Aggregation). One of the drawbacks of the stack configuration is disruptive firmware upgrades. When you update firmware on the stack master, firmware is distributed to all stack members and all switches are rebooted simultaneously.

In MLAG each switch has its own Control Plane and can be rebooted independently. Which is MLAG’s shortcoming at the same time, because unlike stack, where all units act as one switch, in MLAG you have to manage each switch separately.

In my case I chose stacking for its simplicity.

Dell N4000

N4000 switches are stacked using the two 40Gb QSFP ports located at the front. QSFP ports are not configured in stack mode by default. Which you need to change on both switches before you can build a stack:

# stack
# stack-port Fortygigabitethernet 1/1/1 stack
# stack-port Fortygigabitethernet 1/1/2 stack

# show switch stack-ports

Once QSFP ports on both switches are configured, disconnect power from both switches and boot the switch you want to be the stack master first (typically the top switch). When the first switch has fully booted, boot the second switch and check the status. This is what you should see:

# show switch

n4000_stack

Firmware Upgrade

If it’s not a brand new switch, save the config before doing the firmware upgrade:

# copy run start
# copy running-config tftp://10.10.10.100/backup.txt

You can use any TFTP server for the firmware upgrade, such as the free Tftpd64 server.

tftpd64

Then you upload the firmware image to the stack master and reload the stack:

# copy tftp://10.10.10.100/N4000v6.2.7.2.stk backup
# boot system backup
# reload
# show version

Firmware is uploaded to a backup image. Then you select the backup image for the next boot and reload the stack. When both switches reboot you should see something similar to this:

frimware_upgraded

As part of the upgrade process the new firmware is automatically uploaded from the master to all stack members, which is a default behaviour. You can confirm it is enabled using the following command:

# show auto-copy-sw

Flow Control, Jumbo Frames and iSCSI Optimization

In my case I used two N4032F switches for an iSCSI backbone, so I needed to make sure that Flow Control and Jumbo Frames are enabled on the switch.

Flow Control is enabled by default, which you can confirm by the following command:

# show storm-control

To globally enable Jumbo Frames on all ports type:

# system jumbo mtu 9216

# show system mtu

Interestingly, Dell N4000 Series switches also have built-in iSCSI optimization, which can detect iSCSI sessions by snooping the traffic on ports 3260 and 860. It then prioritizes iSCSI traffic over the other types of traffic to guarantee low latency for storage I/O. To show iSCSI settings:

# show iscsi

By default switches only track the sessions. Traffic prioritization is disabled by default and has to be enabled manually. This didn’t matter in my case, as the switches were dedicated for storage traffic. But if you share switches between storage and server traffic, you may want to enable it. Refer to the switch User’s Configuration Guide for details.

If you’re using a Dell Compellent storage array with N4000 switches, also make sure to apply a Compellent profile to the ports where storage array is connected to:

# macro global apply profile-compellent-nas $interface_name te1/0/1
# macro global apply profile-compellent-nas $interface_name te1/0/2
# macro global apply profile-compellent-nas $interface_name te1/0/3
# macro global apply profile-compellent-nas $interface_name te1/0/4

VLANs, Trunks and Port Channels

Again, I didn’t use any VLANs and Trunks, because switches were dedicated for iSCSI traffic and were separate from the LAN core. And I didn’t need Port Channels either, as they are not required for iSCSI.

Your scenario might be different. For instance, if you have vSphere hosts connected to a NetApp array over NFS, you may want to create a Multi-Mode (LACP) VIF on the NetApp side. If that’s the case, to create a port channel on the Multi-Mode VIF ports use the following:

# interface range te1/0/2,te2/0/2
# channel-group 1 mode active
# show intefaces po1

If the switches are used for both storage and VM traffic, then you’ll need to configure the server ports and uplink them to your network core. Create your VLANs first:

# vlan 10,20,30

Configure vSwitch uplinks from the ESXi hosts. In a typical vSphere environment, traffic is tagged on the vSwitch side, which means that server ports should be configured as trunks:

# interface range te1/0/3-6,te2/0/3-6
# switchport mode trunk
# switchport trunk allowed vlan 10,20,30

And finally configure uplinks to the network core. Depending on how your LAN core is set up, you may want to create a port channel to the upstream switch and trunk the required VLANs:

# interface range te1/0/1,te2/0/1
# channel-group 2 mode active
# switchport mode trunk
# switchport trunk allowed vlan 10,20,30
# show intefaces po2

Conclusion

This guide didn’t include information on Spanning Tree, QoS or any of the switch Layer 3 features, but I hope it could get you started. At the end of the day, every environment is different. If you need additional information refer to the following guides from the Dell web-site: