Posts Tagged ‘cabling’

Force10 MXL Switch: Stacking

March 3, 2015

Overview

There are two typical scenarios for stacking MXL’s – within the chassis and across the chassis. In both cases it’s recommended to use ring topology. Daisy chaining is also supported, but not desirable because of the lack of redundancy.

In this post I will be describing the more common case, which is intra-stacking. For inter-stacking configuration you can refer to Dell or Force10 documentation.

Cabling

dell_chassis

In my case I have four MXL switches in bays A1, B1, B2, A2. Cabling is simple, you basically daisy chain all switches and then plug the last switch to the first one.

Stack roles and unit numbers

When stack is built each switch is assigned an ID starting 0 and a role in the stack. There are three roles: Master, Standby and Member:

  • Master – is the switch you’ll use for all configuration. If you currently have IPs assigned to all your MXL switches, all of them except for one will be reset and only the Master will be accessible via SSH.
  • Standby – is the switch which takes over if Master switch fails. Master switch IP address is transferred to Standby in a failover scenario and stack continues to be managed via the same IP.
  • Member switch provides port capacity and doesn’t play any additional roles in the stack.

When you plug cables in, assign stack ports and restart the switches, they will go through election process and automatically pick up roles, as well as IDs. There’s an algorithm that assigns stack IDs and roles, which switches follow. But this algorithm has nothing to do with interconnect bay IDs in the chassis or order in which you cable the switches. You end up with pretty much random numbering.

If order matters, then you’ll have to reboot switches one by one in a particular order to have the desired IDs assigned. In that case IDs are assigned sequentially in a controlled fashion.

Stack configuration

If you don’t have any additional 40GbE modules in slots 0 and 1, then you’ll end up with two QSFP+ ports in a built-in module – ports 33 and 37 (refer to my Force10 MXL Switch: Port Numbering post for port numbering details). All you need to do is to designate them as stack ports on all switches, save config and reboot.

# stack-unit 0 stack-group 0
# stack-unit 0 stack-group 1
# copy run start
# reload

By default each switch is unit 0 in its own stack and stack-group is basically just a 40GbE stack port. You can have maximum of six such ports numbered from 0 to 5. To check that stack ports have been enabled run:

# do show system stack-unit 0 stack-group configured

enabled_ports

It could be that your 40GbE ports are in quad 10GbE mode and are not shown. You’ll need to convert them back to 40GbE mode to proceed. To show the list of available ports type in the command below. Switch shows empty expansion slots as stack ports as well (port 0/41 and 0/45), which is a bit confusing.

# show system stack-unit 0 stack-group

port_list

After a reboot, switches will join the stack and get a role and an id. This process is automatic by default. To see if stack ports have come up after a reboot type:

# show system stack-port status

stack_up

Conclusion

In my example I let switches to go through election process and select roles and IDs on their own. If you want to control the assignment process refer to Dell and Force10 documentation for instructions.

Now you may wonder if unit IDs are assigned automatically, how do you know which stack unit corresponds to which chassis bay ID. The hint for that is to show system inventory and map them by the Service Tag ID which is also shown in the Chassis Management Controller:

# show system brief
# show inventory

Advertisements

Overview of NetApp Replication and HA features

August 9, 2013

NetApp has quite a bit of features related to replication and clustering:

  • HA pairs (including mirrored HA pairs)
  • Aggregate mirroring with SyncMirror
  • MetroCluster (Fabric and Stretched)
  • SnapMirror (Sync, Semi-Sync, Async)

It’s easy to get lost here. So lets try to understand what goes where.

Simple-Metrocluster

SnapMirror

SnapMirror is a volume level replication, which normally works over IP network (SnapMirror can work over FC but only with FC-VI cards and it is not widely used).

Asynchronous version of SnapMirror replicates data according to schedule. SnapMiror Sync uses NVLOGM shipping (described briefly in my previous post) to synchronously replicate data between two storage systems. SnapMirror Semi-Sync is in between and synchronizes writes on Consistency Point (CP) level.

SnapMirror provides protection from data corruption inside a volume. But with SnapMirror you don’t have automatic failover of any sort. You need to break SnapMirror relationship and present data to clients manually. Then resynchronize volumes when problem is fixed.

SyncMirror

SyncMirror mirror aggregates and work on a RAID level. You can configure mirroring between two shelves of the same system and prevent an outage in case of a shelf failure.

SyncMirror uses a concept of plexes to describe mirrored copies of data. You have two plexes: plex0 and plex1. Each plex consists of disks from a separate pool: pool0 or pool1. Disks are assigned to pools depending on cabling. Disks in each of the pools must be in separate shelves to ensure high availability. Once shelves are cabled, you enable SyncMiror and create a mirrored aggregate using the following syntax:

> aggr create aggr_name -m -d disk-list -d disk-list

HA Pair

HA Pair is basically two controllers which both have connection to their own and partner shelves. When one of the controllers fails, the other one takes over. It’s called Cluster Failover (CFO). Controller NVRAMs are mirrored over NVRAM interconnect link. So even the data which hasn’t been committed to disks isn’t lost.

MetroCluster

MetroCluster provides failover on a storage system level. It uses the same SyncMirror feature beneath it to mirror data between two storage systems (instead of two shelves of the same system as in pure SyncMirror implementation). Now even if a storage controller fails together with all of its storage, you are safe. The other system takes over and continues to service requests.

HA Pair can’t failover when disk shelf fails, because partner doesn’t have a copy to service requests from.

Mirrored HA Pair

You can think of a Mirrored HA Pair as HA Pair with SyncMirror between the systems. You can implement almost the same configuration on HA pair with SyncMirror inside (not between) the system. Because the odds of the whole storage system (controller + shelves) going down is highly unlike. But it can give you more peace of mind if it’s mirrored between two system.

It cannot failover like MetroCluster, when one of the storage systems goes down. The whole process is manual. The reasonable question here is why it cannot failover if it has a copy of all the data? Because MetroCluster is a separate functionality, which performs all the checks and carry out a cutover to a mirror. It’s called Cluster Failover on Disaster (CFOD). SyncMirror is only a mirroring facility and doesn’t even know that cluster exists.

Further Reading

NetApp Active/Active Cabling

October 9, 2011

Cabling for active/active NetApp cluster is defined in Active/Active Configuration Guide. It’s described in detail but may be rather confusing for beginners.

First of all we use old DATA ONTAP 7.2.3. Much has changed since it’s release, particularly in disk shelves design. If documentation says:

If your disk shelf modules have terminate switches, set the terminate switches to Off on all but the last disk shelf in loop 1, and set the terminate switch on the last disk shelf to On.

You can be pretty much confident that you won’t have any “terminate switches”. Just skip this step.

To configuration types. We have two NetApp Filers and four disk shelves – two FC and two SATA. You can connect them in several ways.

First way is making two stacks (formely loops) each will be built from shelves of the same type. Each filer will own its stack. This configuration also allows you to implement multipathing. Lets take a look at picture from NetApp flyer:

Solid blue lines show primary connection. Appliance X (AX) 0a port is connected to Appliance X Disk Shelf 1 (AXDS1) A-In port, AXDS1 A-Out port is connected to AXDS2 A-In port. This comprises first stack. Then AY 0c port is connected to AYDS1 A-In port, AYDS1 A-Out port is connected to AYDS2 A-In port. This comprises second stack. If you leave it this way you will have to fully separate stacks.

If you want to implement active/active cluster you should do the same for B channels. As you can see in the picture AX 0c port is connected to AYDS1 B-In port, AYDS1 B-Out port is connected to AYDS2 B-In port. Then AY 0a port is connected to AXDS1 B-In port, AXDS1 B-Out port is connected to AXDS2 B-In port. Now both filers are connected to both stacks and in case of one filer failure the other can takeover.

Now we have four additional free ports: A-Out and B-Out in AXDS2 and AYDS2. You can use these ports for multipathing. Connect AX 0d to AXDS2 B-Out, AYe0d to AXDS2 A-Out, AX 0b to AYDS2 A-Out and AY 0b to AXDS2 B-Out. Now if disk shelf module, connection, or host bus adapter fails there is also a redundant path.

Second way which we implemented assumes that each filer owns one FC and one SATA disk shelf. It requires four loops instead of two, because FC and SATA shelves can’t be mixed in one loop. The shortcoming of such configuration is inability to implement multipathing, because each Filer has only four ports and each of it will be used for its own loop.

This time cabling is simpler. AX 0a is connected to AXDS1 A-In, AX 0b is connected to AYDS1 A-In, AY 0a is connected to AXDS2 A-In, AY 0b is connected to AYDS2 A-In. And to implement clustering you need to connect AX 0c to AXDS2 B-In, AX 0d to AYDS2 B-In, AYe0c to AXDS1 B-In and AY 0d to AYDS1 B-In.

Also I need to mention hardware and software disk ownership. In older system ownership was defined by cable connections. Filer which had connection to shelf A-In port owned all disks in this shelf or stack if there were other shelves daisy chained to channel A. Our FAS3020 for instance already supports software ownership where you can assign any disk to any filer in cluster. It means that it doesn’t matter now which port you use for connection – A-In or B-In. You can reassign disks during configuration.