Posts Tagged ‘fabric’

Merging Brocade Fabrics

February 23, 2016

Recently I needed to merge two pairs of Brocade fibre channel fabrics for one of our customers. When I was doing a bit of research I realised that there is very little information on how to do that on the Interwebs. There were a few community posts on the Brocade forums, but there seemed to be some confusion around how zoning should be configured to let the switches merge successfully. I thought I would fill the gap with this post and share my own experience.

Prerequisites

First, make sure you have the right transceivers. Short wave 8Gb FC transceivers are limited to 190m when using OM4 fibre. If you need to connect switches over a longer distance, use long wave SFP+ modules, which have a maximum distance of 10km.

Second, change the default switch Domain IDs. All switches within the same fabric must have unique IDs. By default Brocade switches come with the Domain ID set to 1. If you’re merging two redundant fabrics, make sure the switches in the second pair have their Domain IDs set to 2.

Third, verify that the switches you’re interconnecting have compatible zoning configurations. Brocade is very specific about how zoning should be configured for two fabrics to merge. There are at least nine different scenarios, but we’ll touch only on the three most common ones. If you want more detail, refer to the Brocade Fabric OS Administrator’s Guide and specifically the section called “Zone merging scenarios”.

Zone merging scenarios

Scenario 1: Switch A does not have a defined configuration. Switch B has a defined configuration.

This is the most straightforward scenario, when you are adding a brand new Switch A to an existing fabric. As a result of the merge, the configuration from Switch B propagates to Switch A.

Scenario 2: Switch A and Switch B have different defined configurations. Switch B has an effective configuration.

This is the scenario where you have two individual fabrics with their own sets of aliases, zones and defined configurations. There is a catch here. If you want to merge such fabrics, you MUST have a unique set of aliases, zones and configurations on each fabric. If this requirement is not met, the fabrics won’t merge and you will end up with two segmented fabrics because of the zoning conflict. You also MUST disable the effective zoning configuration on Switch A.
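Before cabling the switches together it’s worth confirming that nothing overlaps. On each switch, cfgshow lists the defined aliases, zones and configurations and cfgactvshow shows the effective configuration, so you can compare the outputs and check for duplicate names:

> cfgshow
> cfgactvshow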

An outage is not required, because you typically have two redundant fabrics – fabric A and fabric B in each location – and you can do one switch at a time. If you are still concerned, implement Scenario 3.

Scenario 3: Switch A and Switch B have the same defined and effective configuration.

This is the easiest path and is what Brocade calls a “clean merge”. Under this scenario you will have to recreate the same configs on both fabrics. That means you MUST have completely identical aliases, zones and configs on Switch A and Switch B.

This is the easiest and least disruptive path if you are worried that disabling effective configuration on the switches may cause issues.
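If you go down this path, the zoning objects can be recreated identically on both switches from the CLI. The sketch below uses hypothetical alias names and WWPNs, which you would replace with your own:

> alicreate "ESX_HOST1", "10:00:00:00:c9:00:00:01"
> alicreate "ARRAY_SPA", "50:00:09:72:00:00:00:01"
> zonecreate "ESX_HOST1_ARRAY_SPA", "ESX_HOST1; ARRAY_SPA"
> cfgcreate "FABRIC_A_CFG", "ESX_HOST1_ARRAY_SPA"
> cfgsave
> cfgenable "FABRIC_A_CFG"

Run the same commands on Switch B, so that both the defined and the effective configuration match before the switches are connected.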

Real world scenario

In my case I went with scenario 2 for two reasons: one – it was a DR site where I could temporarily bring down both fabrics, and two – I didn’t need to manually add aliases/zones/configs to the switches as I would have to in scenario 3. Once the fabrics are merged, the zones from Switch B propagate to Switch A and you can simply combine them into one configuration in the GUI, which is just a few mouse clicks.

[Figure: site topology]

Here is the step-by-step process. The first step is to change the Domain IDs on the second pair of switches. You can do that from either the GUI or the CLI. Bear in mind that even if you’ve picked scenario 3 as the least disruptive approach for merging zones, changing Domain IDs will still be disruptive, because the switch has to be disabled before making the change.

From the Web Tools go to Switch Administration, disable the switch in the Switch Status section, type in the new Domain ID and re-enable the switch:

[Figure: changing the Domain ID in Web Tools]

If you want to take the CLI path, run the following. The switch will ask you a series of questions; you can accept all defaults, except for the Domain field:

> switchdisable
> configure
> switchenable
> fabricshow

Next, disable the effective configuration on Switch A, either from the GUI or the CLI:

> cfgdisable
> cfgactvshow

At this point you can interconnect the switches and you should see the following log entry on Switch A:

The effective configuration has changed to SWITCHB_CONFIG

The fabrics are now merged and you should see both switches under the Web Tools. If you see the switch in the Segmented Switches section, it means that something went wrong:

[Figure: merged fabrics]
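If you prefer to verify the merge from the CLI, fabricshow should now list both switches with their Domain IDs, islshow shows the inter-switch link, and cfgactvshow should report the same effective configuration on both switches:

> fabricshow
> islshow
> cfgactvshow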

Clean up steps

Once the fabrics are merged you will see all zones in the Zone Admin interface; however, the effective configuration will be the configuration from Switch B. You will need to create a new configuration which combines all zones to enable connectivity between the devices connected to Switch A.
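From the CLI this boils down to creating a configuration that references the zones from both former fabrics and enabling it. The configuration and zone names below are hypothetical:

> cfgcreate "MERGED_CFG", "SWITCHA_ZONE1; SWITCHB_ZONE1"
> cfgsave
> cfgenable "MERGED_CFG"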

From an operational perspective, you can now manage zoning on either of the switches, and when you save or enable a configuration it will propagate to all switches in the fabric automatically.

If you have redundant fabrics, which you normally do, repeat the steps for the second pair of switches.

Conclusion

The steps described in this post are for a basic switch setup. If you have a non-standard switch configuration or are using some of the advanced features, make sure to check the “Zone Merging” section in the Fabric OS Administrator’s Guide for any additional considerations.

Let me know if this was helpful.

 


Overview of NetApp Replication and HA features

August 9, 2013

NetApp has quite a few features related to replication and clustering:

  • HA pairs (including mirrored HA pairs)
  • Aggregate mirroring with SyncMirror
  • MetroCluster (Fabric and Stretched)
  • SnapMirror (Sync, Semi-Sync, Async)

It’s easy to get lost here, so let’s try to understand what goes where.

[Figure: simple MetroCluster]

SnapMirror

SnapMirror is volume-level replication, which normally works over an IP network (SnapMirror can work over FC, but only with FC-VI cards, and it is not widely used).

The asynchronous version of SnapMirror replicates data according to a schedule. SnapMirror Sync uses NVLOGM shipping (described briefly in my previous post) to synchronously replicate data between two storage systems. SnapMirror Semi-Sync is in between and synchronizes writes at the Consistency Point (CP) level.

SnapMirror provides protection from data corruption inside a volume. But with SnapMirror you don’t have automatic failover of any sort. You need to break the SnapMirror relationship and present the data to clients manually, then resynchronize the volumes when the problem is fixed.
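In 7-Mode, run from the destination filer, a manual failover and the later resync would look roughly like this (the filer and volume names are hypothetical):

> snapmirror quiesce dst_vol
> snapmirror break dst_vol
> snapmirror resync -S filer1:src_vol dst_vol

break makes the destination volume writable so you can present it to clients, and resync re-establishes the relationship once the source is healthy again.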

SyncMirror

SyncMirror mirrors aggregates and works at the RAID level. You can configure mirroring between two shelves of the same system and prevent an outage in case of a shelf failure.

SyncMirror uses the concept of plexes to describe mirrored copies of data. You have two plexes: plex0 and plex1. Each plex consists of disks from a separate pool: pool0 or pool1. Disks are assigned to pools depending on cabling. Disks in each of the pools must be in separate shelves to ensure high availability. Once the shelves are cabled, you enable SyncMirror and create a mirrored aggregate using the following syntax:

> aggr create aggr_name -m -d disk-list -d disk-list
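For example, a mirrored aggregate built from three disks in each pool might look like this (the disk names are hypothetical and depend on your shelves and cabling, with each -d list pulling from a different pool):

> aggr create aggr_mir -m -d 0a.16 0a.17 0a.18 -d 0b.16 0b.17 0b.18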

HA Pair

An HA Pair is basically two controllers which both have connections to their own and their partner’s shelves. When one of the controllers fails, the other one takes over. It’s called Cluster Failover (CFO). Controller NVRAMs are mirrored over the NVRAM interconnect link, so even data which hasn’t been committed to disk isn’t lost.
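The takeover state can be checked, and a takeover or giveback triggered manually, from the 7-Mode CLI; a quick sketch:

> cf status
> cf takeover
> cf giveback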

MetroCluster

MetroCluster provides failover on a storage system level. It uses the same SyncMirror feature beneath it to mirror data between two storage systems (instead of two shelves of the same system as in pure SyncMirror implementation). Now even if a storage controller fails together with all of its storage, you are safe. The other system takes over and continues to service requests.

An HA Pair can’t fail over when a disk shelf fails, because the partner doesn’t have a copy of the data to service requests from.

Mirrored HA Pair

You can think of a Mirrored HA Pair as an HA Pair with SyncMirror between the systems. You can implement almost the same configuration on an HA Pair with SyncMirror inside (not between) the systems, because the odds of a whole storage system (controller + shelves) going down are very low. But it can give you more peace of mind if data is mirrored between two systems.

It cannot fail over like MetroCluster when one of the storage systems goes down; the whole process is manual. The reasonable question here is: why can’t it fail over if it has a copy of all the data? Because MetroCluster is a separate piece of functionality, which performs all the checks and carries out a cutover to the mirror. It’s called Cluster Failover on Disaster (CFOD). SyncMirror is only a mirroring facility and doesn’t even know that the cluster exists.
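For comparison, on a 7-Mode MetroCluster the site disaster cutover is initiated from the surviving controller with a single command, while a Mirrored HA Pair has no equivalent:

> cf forcetakeover -d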


Monitoring ESX Storage Queues

July 30, 2013

Queue Limits

I/O data goes through several storage queues on its way to the disk drives. VMware is responsible for the VM queue, the LUN queue and the HBA queue. VM and LUN queues are usually equal to 32 operations. It means that each ESX host at any moment can have no more than 32 active operations to a LUN. The same is true for VMs: each VM can have as many as 32 active operations to a datastore. And if multiple VMs share the same datastore, their combined I/O flow can’t go over the 32-operation limit (the per-LUN queue for QLogic HBAs has been increased from 32 to 64 operations in vSphere 5). The HBA queue size is much bigger and can hold several thousand operations (4096 for QLogic, however I can see in my config that the driver is configured with 1014 operations).
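Both limits can be changed if they don’t fit your workload. As a sketch for the vSphere 5 era, assuming a QLogic HBA with the qla2xxx driver (check your own driver and parameter names first), the HBA/LUN queue depth and the per-device limit that applies when multiple VMs share a datastore could be raised like this:

> esxcfg-module -s ql2xmaxqdepth=64 qla2xxx
> esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding

The first change requires a host reboot to take effect.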

Queue Monitoring

You can monitor the storage queues of an ESX host from the console. Run “esxtop”, press “d” to view disk adapter stats, then press “f” to open the field selection and add Queue Stats by pressing “d”.

The AQLEN column will show the queue depth of the storage adapter. CMDS/s is the real-time number of IOPS. DAVG is the latency which comes from the frame traversing the “driver – HBA – fabric – array SP” path and should be less than 20ms; otherwise it means that the storage is not coping. KAVG shows the time an operation spent in the hypervisor kernel queue and should be less than 2ms.

Press “u” to see disk device statistics. Press “f” to open the add or remove fields dialog and select Queue Stats by pressing “f”. Here you’ll see the number of active (ACTV) and queued (QUED) operations per LUN. %USD is the queue load. If you’re hitting 100 in %USD and see operations under the QUED column, then again it means that your storage cannot manage the load and you need to redistribute your workload between spindles.
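If you want to capture these counters over time rather than watch them live, esxtop can also run in batch mode and dump everything into a CSV file that you can open in perfmon or Excel; for example, a 5-second interval for 60 samples:

> esxtop -b -d 5 -n 60 > esxtop_stats.csv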


Zoning vs. LUN masking explained

September 28, 2012

Zoning and masking are terms often confused by those who have just started working with SANs. But it takes 5 minutes of googling to understand that the main difference is that zoning is configured on a SAN switch on a port (or WWN) basis, while masking is a storage array feature with LUN granularity. All modern hardware supports both zoning and masking. Given that, the much more interesting question is what’s the point of zoning if there is masking with finer granularity.

Both security features do the same thing: restrict access to particular storage targets. And it seems that there is no point in configuring both of them. But that’s not true. One, not that convincing, argument is that in case one of the features is accidentally misconfigured, you still maintain security. But the much bigger issue in a no-zoning configuration is RSCNs. RSCNs are Registered State Change Notification messages which are issued by the SAN Name Server service when the fabric changes its configuration (a new device has been added to the fabric, a zone has changed, a switch name or IP address has changed, etc.). RSCNs can be disruptive to fabric operation. And if you don’t have zones, RSCNs are flooded to everyone each time something changes in the fabric, even if it has nothing to do with the majority of devices. So zoning is a SAN best practice and its configuration is highly recommended.

In fact, Brocade recommends adopting the so-called Single Initiator Zoning (SIZ) practice, where one host pWWN (initiator) is zoned to one or more storage pWWNs. It reduces the RSCN issue to a minimum.
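On a Brocade switch a single initiator zone is created like any other zone, just with one host alias and its storage targets as the only members. A sketch with hypothetical names and WWPNs, assuming an existing configuration called PROD_CFG:

> alicreate "HOST1_HBA0", "10:00:00:00:c9:00:00:02"
> alicreate "ARRAY_SPA0", "50:00:09:72:00:00:00:02"
> zonecreate "HOST1_HBA0__ARRAY_SPA0", "HOST1_HBA0; ARRAY_SPA0"
> cfgadd "PROD_CFG", "HOST1_HBA0__ARRAY_SPA0"
> cfgsave
> cfgenable "PROD_CFG"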

As the best reference, read Brocade’s Secure SAN Zoning – Best Practices.