
Merging Brocade Fabrics

February 23, 2016

Recently I needed to merge two pairs of Brocade fibre channel fabrics for one of our customers. While doing my own research I realised that there is very scarce information on how to do that on the Interwebs. There were a few community posts on the Brocade forums, but there seemed to be some confusion around how zoning should be configured to let the switches merge successfully. I thought I would fill the gap with this post and share my own experience.

Prerequisites

First, make sure you have the right transceivers. Short wave 8Gb FC transceivers are limited to 190m even when using OM4 fibre. If you need to connect switches over a longer distance, use long wave SFP+ modules, which have a maximum distance of 10km.

Second, change the default switch Domain IDs. All switches within the same fabric must have unique Domain IDs. By default Brocade switches ship with the Domain ID set to 1. If you’re merging two redundant fabrics, make sure that the second pair of switches has its Domain IDs set to 2.
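The current Domain IDs are easy to check from the CLI before you start: switchshow reports the local switchDomain value and fabricshow lists every switch in the fabric along with its Domain ID:

> switchshow
> fabricshow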

Third, verify that the switches you’re interconnecting have compatible zoning configurations. Brocade is very specific about how zoning should be configured for two fabrics to merge. There are at least nine different scenarios, but we’ll touch only on the three most common ones. If you want more details, refer to the Brocade Fabric OS Administrator’s Guide, specifically the section called “Zone merging scenarios”.

Zone merging scenarios

Scenario 1: Switch A does not have a defined configuration. Switch B has a defined configuration.

This is the most straightforward scenario, where you are adding a brand new Switch A to an existing fabric. As a result of the merge, the configuration from Switch B propagates to Switch A.

Scenario 2: Switch A and Switch B have different defined configurations. Switch B has an effective configuration.

This is the scenario where you have two individual fabrics, each with its own set of aliases, zones and defined configurations. There is a catch here. If you want to merge such fabrics, you MUST have a unique set of aliases, zones and configurations on each fabric. If this requirement is not met, the fabrics won’t merge and you will end up with two segmented fabrics because of the zoning conflict. You also MUST disable the effective zoning configuration on Switch A.
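Before interconnecting the switches it’s worth dumping the zoning database on both sides and comparing the output for overlapping alias, zone and config names:

> cfgshow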

An outage is not required, because you typically have two redundant fabrics (fabric A and fabric B) in each location and you can do one switch at a time. If you are still concerned, implement Scenario 3.

Scenario 3: Switch A and Switch B have the same defined and effective configuration.

This is the easiest path and is what Brocade calls a “clean merge”. Under this scenario you will have to pre-create the same configuration on both fabrics. That means you MUST have completely identical aliases, zones and configs on Switch A and Switch B.

This is the least disruptive path if you are worried that disabling the effective configuration on the switches may cause issues.

Real world scenario

In my case I went with scenario 2 for two reasons: one, it was a DR site where I could temporarily bring down both fabrics; and two, I didn’t need to manually add aliases/zones/configs to the switches as I would have to in scenario 3. Once the fabrics are merged, zones from Switch B propagate to Switch A and you can simply combine them into one configuration in the GUI, which is just a few mouse clicks.

[Image: site topology]

Here is the step by step process. The first step is to change the Domain IDs on the second pair of switches. You can do that from both the GUI and the CLI. Bear in mind that even if you’ve picked scenario 3 as the least disruptive approach for merging zones, changing Domain IDs will still be disruptive, because the switch has to be disabled before making the change.

From the Web Tools go to Switch Administration, disable the switch in the Switch Status section, type in the new Domain ID and re-enable the switch:

[Image: changing the Domain ID in Switch Administration]

If you want to take the CLI path, run the following. The switch will ask you a series of questions; you can accept all the defaults, except for the Domain field:

> switchdisable               (the switch must be offline to change the Domain ID)
> configure                   (the Domain ID is under the Fabric parameters section)
> switchenable
> fabricshow                  (verify the Domain IDs of all switches in the fabric)

Next, disable the effective configuration on Switch A, either from the GUI or the CLI:

> cfgdisable                  (disables the effective zoning configuration)
> cfgactvshow                 (confirms that no configuration is now in effect)

At this point you can interconnect the switches and you should see the following log entry on Switch A:

The effective configuration has changed to SWITCHB_CONFIG

The fabrics are now merged and you should see both switches in Web Tools. If you see the switch in the Segmented Switches section, it means that something went wrong:

[Image: merged fabrics in Web Tools]

Clean up steps

Once the fabrics are merged you will see all zones in the Zone Admin interface; however, the effective configuration will be the configuration from Switch B. You will need to create a new configuration which combines all zones to enable connectivity between the devices connected to Switch A.
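This can be done from the Zone Admin interface or from the CLI. As a sketch, assuming hypothetical zone names ZONE_A1 and ZONE_B1 coming from the two original fabrics and a new configuration called MERGED_CFG:

> cfgcreate "MERGED_CFG", "ZONE_A1; ZONE_B1"
> cfgsave
> cfgenable "MERGED_CFG"

cfgenable makes MERGED_CFG the new effective configuration.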

From the operational perspective, you can now manage zoning on either of the switches; when you save or enable a configuration, it propagates to all switches in the fabric automatically.

If you have redundant fabrics, which you normally do, repeat the steps for the second pair of switches.

Conclusion

The steps described in this post are for a basic switch setup. If you have a non-standard switch configuration or are using some of the advanced features, make sure to check the “Zone Merging” section in the Fabric OS Administrator’s Guide for any additional considerations.

Let me know if this was helpful.

 


Traffic Load Balancing in Cisco UCS

December 21, 2015

Whenever I deploy Cisco UCS at a customer site, the question I get asked a lot is how traffic flows within the system: between VMs running on the blades and the FEX modules, between the FEX modules and the Fabric Interconnects, and finally how it’s uplinked to the network core.

Cisco has a range of CNA cards for UCS blades. With a VIC 1280 you get 8 x 10Gb ports split between two FEX modules for redundancy. FEX modules on their own can have up to 8 x 10Gb Fabric Interconnect facing interfaces, which can give you up to 160Gb of bandwidth per chassis. All these numbers may sound impressive, but unless you understand how your VMs’ traffic flows through UCS it’s easy to make wrong assumptions about what per-VM and aggregate bandwidth you can achieve. So let’s dive deep into UCS and shed some light on how VM traffic is load-balanced within the system.

UCS Hardware Components

Each Fabric Extender (FEX) has external and internal ports. External FEX ports are patched to the FIs and internal ports are internally wired to the blade adapters. FEX 2204 has 4 external and 16 internal ports; FEX 2208 has 8 external and 32 internal ports.

External ports are connected to the FIs in powers of two: 1, 2, 4 or 8 ports per FEX, and form a port channel (make sure to use the “Port Channel” link grouping preference under the Chassis/FEX Discovery Policy). The same rule applies to the blade Virtual Interface Cards (VIC). The most common VIC 1240 and 1280 have 4 x 10Gb and 8 x 10Gb ports respectively and also form a port channel to the internal FEX ports. Every VIC adaptor is connected to both FEX modules for redundancy.

[Image: chassis network connectivity]

Fabric Interconnects are then patched to your network core and FC fabric (if you have one). Whether the Ethernet uplinks are individual links or port channels will depend on your network topology. For FC uplinks the rule of thumb is to patch FI A to your FC fabric A and FI B to FC fabric B, which follows the common FC traffic isolation principle.

Virtual Circuits

To provide network and storage connectivity to the blades you create virtual NICs and virtual HBAs on each blade. Since internally UCS uses FCoE to transfer FC frames, both vNICs and vHBAs use the same 10GbE uplinks to send and receive traffic. It’s worth mentioning that Cisco uses the Data Center Bridging (DCB) protocol with its sub-protocols Priority Flow Control (PFC) and Enhanced Transmission Selection (ETS), which guarantee that FC frames have higher priority in the queue and are processed first to ensure low latency. But I digress.

UCS assigns a virtual circuit to each virtual adaptor, which is a representation of how the traffic traverses the system all the way from the VIC port to a FEX internal port, then a FEX external port, an FI server port and finally an FI uplink. You can trace the full path of each virtual adaptor in UCS Manager by selecting a Service Profile and viewing the VIF Paths tab.

[Image: VIF paths in UCS Manager]

In this example we have a blade with four vNICs and two vHBAs which are split between two fabrics. All virtual adaptors on fabric A are connected through VIC port channel PC-1283 which is represented as port channel PC-1025 on the FEX A side. Then traffic leaves FEX A and reaches the Fabric Interconnect A which sends the traffic out to the network core through port channel A/PC-1.

You can also get the list of port channels from the FI CLI:

# connect nxos
# show port-channel summary

[Image: port channel summary output]

Network Load Balancing

Now that we know how all components are interconnected to each other, let’s discuss the traffic flow in a typical VMware environment and how we achieve the massive network throughput that UCS provides.

As an example, let’s take a look at the vSwitch where your VM Network port group is configured. The vSwitch will have two uplinks: one goes to fabric A and the other one to fabric B for redundancy. The default load balancing policy on a vSwitch is “Route based on originating virtual port ID”, which essentially pins all traffic for a VM to a particular uplink. vSphere makes sure that VMs are evenly distributed between the uplinks to use all the network bandwidth available.
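If you want to double-check which policy a host is using, you can query it from the ESXi shell (a quick check, assuming a standard vSwitch named vSwitch0):

# esxcli network vswitch standard policy failover get -v vSwitch0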

From each uplink (or vNIC in the UCS world) traffic is forwarded through an adapter port channel to a FEX, then to a Fabric Interconnect, and leaves UCS from an FI uplink. Within UCS, traffic is distributed between port channel members using a source/destination IP hash algorithm, which is even more granular and is capable of very efficient traffic distribution between all members of a port channel all the way up to your network core.
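You can verify which hashing algorithm the Fabric Interconnect uses from the same NX-OS shell (the exact hash parameters may vary between UCS firmware versions):

# connect nxos
# show port-channel load-balance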

[Image: UCS traffic load balancing]

If you look at the vSwitch you’ll see that with UCS each uplink shows the maximum available bandwidth of the vNIC and is not limited to the 10Gb speed of a single port channel member. Why is this so powerful? Because with UCS you don’t need to slice the adapter’s available bandwidth between different types of traffic. Even though you provision multiple vNICs and vHBAs for the vSphere hosts, UCS uses the same port channel links (20Gb in the example below) from the VIC adapter to transfer all traffic and takes care of load balancing for you.

[Image: vSwitch uplink speeds]

You may legitimately ask: if UCS uses the same pipe to transfer all data regardless of which vSwitch uplink is being used, then how can I make sure that different types of traffic, such as vMotion, storage, VM traffic and replication, do not compete for the same pipe? First you need to ask yourself whether you can saturate that much bandwidth with your workloads. If the answer is yes, then you can use another great feature available in UCS: QoS. QoS lets you assign a minimum available bandwidth guarantee on a per vNIC/vHBA basis. But that’s a topic for another blog post.

References

In this post I tried to summarise the logic behind UCS traffic distribution. If you want to dig deeper into UCS network architecture, there are a lot of great bloggers out there who cover it in much more detail.

 

Brocade 300 Initial Setup

December 8, 2015

There are a few steps you need to take on the Brocades before moving on to cabling and zoning. The process is pretty straightforward, but worth documenting, especially for those who are doing it for the first time.

After you power on the switch there are two ways of setting it up: GUI or CLI. We’ll go hardcore and do all configuration in the CLI, but if you wish you can assign a static IP to your laptop from the 10.77.77.0/24 subnet and browse to https://10.77.77.77 (the factory default switch IP). Default credentials are admin/password for both the GUI and the CLI.

Network Settings

To configure network settings, such as the hostname, management IP address, DNS and NTP, use the following commands:

> switchname PRODFCSW01       (set the switch hostname)
> ipaddrset                   (interactive; set the management IP address)
> dnsconfig                   (interactive; set the DNS servers)
> tsclockserver 10.10.10.1    (point the switch at your NTP server)

Most of these commands are interactive and ask for parameters. The only caveat is: if you have multiple switches in the same fabric, make sure to set the NTP server to LOCL on all subordinate switches. This instructs them to synchronize their time with the principal switch.
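On a subordinate switch that looks like this (running tsclockserver with no arguments shows the current setting):

> tsclockserver LOCL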

Firmware Upgrade

This is the fun part. You can upgrade the switch firmware using a USB stick, but the most common way is to upgrade over FTP. This obviously means that you need an FTP server. You can use FileZilla Server, which is decent and free.

Download the server and the client parts and install both. The default settings work just fine. Go to Edit > Users and add an anonymous user. Give it a home folder and unpack the downloaded firmware into it. This is what it should look like:

[Image: FileZilla anonymous user configuration]

To upgrade the firmware, run the following command on the switch (it is also interactive) and then reboot:

> firmwaredownload -s

If you’re running a Fabric OS revision older than 7.0.x, such as 6.3.x or 6.4.x, then you will need to upgrade to version 7.0.x first and then to your target version, such as 7.3.x or 7.4.x.
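To check the revision you’re currently running, and to verify the result after the reboot, use:

> firmwareshow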

In the next blog post I will discuss firmware upgrades in more detail, such as how to do a non-disruptive upgrade on a production switch and where to download vendor-specific FOS firmware from.

VNX/VNXe array negotiates FC port as L-Port

March 23, 2015

Hit an issue today where VNXe array FC ports negotiate to L-Port instead of F-Port when the Fill Word is set to Mode 3 (ARB/ARB then IDLE/ARB). The result: loss of connectivity on the affected link.

[Image: port negotiated as L-Port]

The recommended FC Fill Word for VNX/VNXe arrays is Mode 3. Generally it’s a good idea to set it according to best practice as part of each installation. Apparently, when changing the Fill Word from the legacy Mode 0 (IDLE/IDLE) to Mode 3 (ARB/ARB then IDLE/ARB), the array might negotiate as L-Port and the FC path goes down.

The solution is to statically configure the port as F-Port in the port settings.

[Image: port statically configured as F-Port]
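The screenshot shows the Web Tools way. From the CLI the same effect should be achievable by locking the port so it skips loop initialisation; a sketch for a hypothetical port 0 (check the Fabric OS command reference for your FOS version before applying):

> portcfgshow 0               (check the current port configuration)
> portcfggport 0, 1           (lock the port as G_Port, preventing L-Port negotiation)
> portcfgfillword 0, 3        (set Fill Word Mode 3, as recommended for VNX/VNXe)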

Environment:

  • Dell M5424 8Gb Fibre Channel Switch: Brocade FOS v7.2.1b
  • EMC VNXe 3200: Block OE v3.1.1.4993502

Overview of NetApp Replication and HA features

August 9, 2013

NetApp has quite a few features related to replication and clustering:

  • HA pairs (including mirrored HA pairs)
  • Aggregate mirroring with SyncMirror
  • MetroCluster (Fabric and Stretched)
  • SnapMirror (Sync, Semi-Sync, Async)

It’s easy to get lost here, so let’s try to understand what goes where.

[Image: simple MetroCluster topology]

SnapMirror

SnapMirror is volume-level replication, which normally works over an IP network (SnapMirror can work over FC, but only with FC-VI cards, and it is not widely used).

The asynchronous version of SnapMirror replicates data according to a schedule. SnapMirror Sync uses NVLOG shipping (described briefly in my previous post) to synchronously replicate data between two storage systems. SnapMirror Semi-Sync is in between and synchronizes writes at the Consistency Point (CP) level.

SnapMirror provides protection from data corruption inside a volume. But with SnapMirror you don’t have automatic failover of any sort. You need to break the SnapMirror relationship and present the data to clients manually, then resynchronize the volumes when the problem is fixed.
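In 7-Mode that manual failover boils down to a few commands run on the destination system (a sketch; filerA, filerB and vol1 are hypothetical names):

> snapmirror quiesce vol1
> snapmirror break vol1

And once the source system is healthy again:

> snapmirror resync -S filerA:vol1 filerB:vol1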

SyncMirror

SyncMirror mirrors aggregates and works at the RAID level. You can configure mirroring between two shelves of the same system and prevent an outage in case of a shelf failure.

SyncMirror uses the concept of plexes to describe the mirrored copies of data. You have two plexes: plex0 and plex1. Each plex consists of disks from a separate pool: pool0 or pool1. Disks are assigned to pools depending on cabling. Disks in each of the pools must be in separate shelves to ensure high availability. Once the shelves are cabled, you enable SyncMirror and create a mirrored aggregate using the following syntax:

> aggr create aggr_name -m -d disk-list -d disk-list
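The -m flag requests a mirrored aggregate and the two -d lists specify the disks for plex0 and plex1 respectively. A hypothetical example (disk names will differ on your system):

> aggr create aggr1 -m -d 0a.16 0a.17 0a.18 -d 0b.16 0b.17 0b.18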

HA Pair

An HA pair is basically two controllers which both have connections to their own and their partner’s shelves. When one of the controllers fails, the other one takes over. This is called Cluster Failover (CFO). The controllers’ NVRAM is mirrored over the NVRAM interconnect link, so even the data which hasn’t been committed to disk isn’t lost.
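In 7-Mode you can check and exercise the failover with the cf commands (a sketch; giveback returns the resources once the partner is healthy again):

> cf status
> cf takeover
> cf giveback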

MetroCluster

MetroCluster provides failover at the storage system level. It uses the same SyncMirror feature underneath to mirror data between two storage systems (instead of between two shelves of the same system, as in a pure SyncMirror implementation). Now even if a storage controller fails together with all of its storage, you are safe. The other system takes over and continues to service requests.

An HA pair, by contrast, can’t fail over when a disk shelf fails, because the partner doesn’t have a copy of the data to service requests from.

Mirrored HA Pair

You can think of a Mirrored HA Pair as an HA pair with SyncMirror between the systems. You can implement almost the same configuration on an HA pair with SyncMirror inside (not between) the systems, because the odds of a whole storage system (controller + shelves) going down are highly unlikely. But it can give you more peace of mind if the data is mirrored between two systems.

It cannot fail over like MetroCluster when one of the storage systems goes down; the whole process is manual. The reasonable question here is: why can’t it fail over if it has a copy of all the data? Because MetroCluster is a separate piece of functionality, which performs all the checks and carries out a cutover to the mirror. That is called Cluster Failover on Disaster (CFOD). SyncMirror is only a mirroring facility and doesn’t even know that the cluster exists.
