Posts Tagged ‘upgrade’

[SOLVED] Migrating vCenter Notifications

January 6, 2018

Why is this a problem?

VMware upgrades and migrations still comprise a large chunk of what I do in my job. If it is an in-place upgrade it is often more straightforward. The main consideration is making sure all compatibility checks are made. But if it is a rebuild, things get a bit more complicated.

Take for example a vCenter Server to vCenter Server Appliance migration. If you are migrating between 5.5, 6.0 and 6.5 you are covered by the vCenter Server Migration Tool. Recently I came across a customer using vSphere 5.1 (yes, it is not as uncommon as you might think). vCenter Server Migration Tool does not support migration from vSphere 5.1, which is fair enough, as it is end of support was August 2016. But as a result, you end up being on your own with your upgrade endeavours and have to do a lot of the things manually. One of such things is migrating vCenter notifications.

You can go and do it by hand. Using a few PowerCLI commands you can list the currently configured notifications and then recreate them on the new vCenter. But knowing how clunky and slow this process is, I doubt you are looking forward to spend half a day configuring each of the dozens notifications one by one by hand (I sure am not).

I offer an easy solution

You may have seen a comic over on xkcd called “Is It Worth The Time?“. Which gives you an estimate of how long you can work on making a routine task more efficient before you are spending more time than you save (across five years). As an example, if you can save one hour by automating a task that you do monthly, even if you spend two days on automating it, you will still brake even in five years.

Knowing how often I do VMware upgrades, it is well worth for me to invest time in automating it by scripting. Since you do not do upgrades that often, for you it is not, so I wrote this script for you.

If you simply want to get the job done, you can go ahead and download it from my GitHub page here (you will also need VMware PowerCLI installed on your machine for it to work) and then run it like so:

.\copy-vcenter-alerts-v1.0.ps1 -SourceVcenter old-vc.acme.com -DestinationVcenter new-vc.acme.com

Script includes help topics, that you can view by running the following command:

Get-Help -full .\copy-vcenter-alerts-v1.0.ps1

Or if you are curious, you can read further to better understand how script works.

How does this work?

First of all, it is important to understand the terminology used in vSphere:

  • Alarm trigger – a set of conditions that must be met for an alarm warning and alert to occur.
  • Alarm action – operations that occur in response to triggered alarms. For example, email notifications.

Script takes source and destination vCenter IP addresses or host names as parameters and starts by retrieving the list of existing alerts. Then it compares alert definitions and if alert doesn’t exist on the destination, it will be skipped, so be aware of that. Script will show you a warning and you will be able to make a decision about what to do with such alert later.

Then for each of the source alerts, that exists on the destination, script recreates actions, with exact same triggers. Trigger settings, such as repeats (enabled/disabled) and trigger state changes (green to yellow, yellow to red, etc) are also copied.

Script will not attempt to recreate an action that already exists, so feel free to run the script multiple times, if you need to.

What script does not do

  1. Script does not copy custom alerts – if you have custom alert definitions, you will have to recreate them manually. It was not worth investing time in such feature at this stage, as custom alerts are rare and even if encountered, there us just a handful, that can be moved manually.
  2. Only email notification actions are supported – because they are the most common. If you use other actions, like SNMP traps, let me know and maybe I will include them in the next version.

PowerCLI cmdlets used

These are some of the useful VMware PowerCLI cmdlets I used to write the script:

  • Get-AlarmDefinition
  • Get-AlarmAction
  • Get-AlarmActionTrigger
  • New-AlarmAction
  • New-AlarmActionTrigger
Advertisement

Beginner’s Guide to HPE 5000 Series Switches

October 14, 2017

I don’t closely track the popularity of my blog. If what I share helps people in their day to day job, it’s already good enough to me. But I do look at site statistics now and then just out of curiosity and it seems that network-related posts get a lot of popularity. A blog post I wrote a while ago on Dell N4000 switches has quickly got in top five over the last year.

So it seems that there is a demand for entry-level switch configuration guides. I’ve worked with a quite a few different switch brands over the years, so I thought I will build on the success of the Dell blog post and this time write about HPE FlexNetwork/FlexFabric 5000 switch series.

Operating Systems

HPE has several network switch product lines. I won’t even try to cover all of them in this post. But it’s important to know that there are a few different operating systems you can encounter, while working with HPE network switches. There is a familiar ProCurve product portfolio (now merged with Aruba), which is based on ProVision operating system.

HPE FlexNetwork/FlexFabric 5000 series, on the other hand, is based on Comware operating system. It has a different CLI command set and can be a complete surprise if you’ve worked only with ProCurve switches before. So this blog post will be particularly valuable for those who’re dealing with HPE 5000 for the first time.

The following guide has been tested on a pair of HPE FlexFabric 5700-series switches. Even though commands are mostly the same, on other switch series, like FlexNetwork 5800, there might be some minor differences.

Initial Configuration

When the switch is booted for the first time it will start automatic configuration by trying to obtain settings over DHCP, which you can interrupt by Ctrl+C to get straight to CLI.

You start in user view where you can run display commands to review switch settings. To start the configuration, change to system view:

> system-view

Let’s start by configuring remote access to the switch. There are two ways you can do that. You either use the out-of-band management port:

> interface M-GigabitEthernet 0/0/0
> ip address 10.10.10.10 255.255.255.0
> ip route-static 0.0.0.0 0.0.0.0 10.10.10.1

Or you can configure a VLAN interface IP address:

> interface vlan-interface 1
> ip address 10.10.10.10 255.255.255.0
> ip route-static 0.0.0.0 0.0.0.0 10.10.10.1

Then configure switch name, enable SSH, set passwords and you can start managing the switch over SSH:

> sysname switchname

> public-key local create rsa
> ssh server enable
> user-interface vty 0 15
> authentication-mode scheme
> protocol inbound ssh

> super password simple yourpassword
> local-user admin
> password simple yourpassword
> authorization-attribute user-role level-0
> service-type ssh

User “admin” will have an unprivileged role. You will need to run the following command and enter password once logged in, to elevate to network admin rights:

> super

Intelligent Resilient Framework

In small non-business-critical environments one standalone switch is usually sufficient. In larger environments switches are typically deployed in pairs for redundancy. To simplify management and to avoid network loops most switches support some sort of MLAG or stacking. IRF is HPE’s version of it.

Determine what ports you’re going to use for IRF. There are two QSFP+ ports on 5700-series dedicated for it. And then on on the first switch (master) run the following commands (it’s recommended to shut down the ports before you set them up as IRF):

> irf member 1 priority 32
> int range FortyGigE 1/0/41 to FortyGigE 1/0/42
> shutdown
> irf-port 1/1
> port group interface FortyGigE 1/0/41
> irf-port 1/2
> port group interface FortyGigE 1/0/42
> int range FortyGigE 1/0/41 to FortyGigE 1/0/42
> undo shut
> save
> irf-port-configuration active

On the second switch (slave) run the following commands to change the IRF ID to 2:

> irf member 1 renumber 2
> reboot

When the switch comes up, configure IRF ports:

> irf member 2 priority 30
> int range FortyGigE 2/0/41 to FortyGigE 2/0/42
> shutdown
> irf-port 2/1
> port group interface FortyGigE 2/0/41
> irf-port 2/2
> port group interface FortyGigE 2/0/42
> int range FortyGigE 2/0/41 to FortyGigE 2/0/42
> undo shut
> save
> irf-port-configuration active

Now you can connect the physical IRF ports. IRF is a ring topology, that means (in my case) port 1/0/41 should connect to 2/0/42 and port 1/0/42 should connect to 2/0/41.

Second switch will automatically reboot and if all is configured correctly, you should see both switches join the IRF fabric. Member switch 1 has the highest priority of 32 and becomes the master:

> display irf

Firmware Upgrade

Firmware upgrade is the next logical step after you set up IRF. The latest firmware revision for the switches can be download from HPE web-site. Keep in mind you will need a HPE passport account, with a valid service agreement (SAID) added to it.

You will also need a TFTP server to upgrade the firmware. There are a few of them out there, but the most commonly used is probably Tftpd64.

When you get the TFTP server up and running and copy the firmware file to it, perform an upgrade:

> tftp 10.10.10.20 get 5700-CMW710-R2432P03.ipe
> boot-loader file flash:/5700-CMW710-R2432P03.ipe slot 1 main
> boot-loader file flash:/5700-CMW710-R2432P03.ipe slot 2 main
> irf auto-update enable
> reboot

Confirm firmware has been updated:

> display version

VLANs, Aggregation Groups and Tagging

In Comware the term “aggregation group” is used to describe what is a “port channel” in Cisco world. Trunk/access ports are also called tagged/untagged ports throughout the documentation.

In this section we will discuss a few common port configuration scenarios:

  • Untagged ports, which can be your iSCSI storage array ports
  • Tagged ports, such as your VMware host uplinks
  • Aggregation groups, typically used for LAGs to upstream switches

First of all create all VLANs and give them descriptions:

> vlan 10
> description iSCSI
> vlan 20
> description Server
> vlan 30
> description Dev and test

Then specify untagged ports:

> vlan 10
> port te 1/0/1
> port te 2/0/1

To configure tagged ports and allow certain VLANs (ports will be added to the VLANs automatically):

> int te 1/0/2
> description ESX01 vmnic0
> port link-type trunk
> port trunk permit vlan 20 30
> int te 2/0/2
> description ESX02 vmnic0
> port link-type trunk
> port trunk permit vlan 20 30

And to create an LACP aggregation group:

> interface bridge-aggregation 1
> description Trunk to upstream switch
> link-aggregation mode dynamic
> port link-type trunk
> port trunk permit vlan 20 30

> interface te 1/0/3
> port link-aggregation group 1
> interface te 2/0/3
> port link-aggregation group 1

Common Commands

Other useful commands that don’t fall under any specific category, but handy to know.

Display switch configuration:

> display current-configuration

Save switch configuration:

> save

Shut down a port:

> int te 1/0/27
> shutdown

Undo a command:

> undo shutdown

Conclusion

Whether you are a network engineer new to the Comware operating system or a VMware administrator looking for a quick cheat sheet for FlexNetwork/FlexFabric switches, I hope this guide has helped you get the job done.

If this blog post gets the same amount of popularity, maybe it will turn into another series. But for now – over and out.

Dell Force10 Part 1: Initial Configuration

July 3, 2016

force10_S4048_on
When it comes to networking Dell has two main series of switches. PowerConnect/N-series, which run DNOS 6.x operating system. And S/Z-series switches, which run on DNOS 9.x derived from Force10 OS (FTOS). In this series of blogs we will go through the configuration of Force10 switch series and use Dell S4048-ON top of the rack switch as an example.

Interesting to note, that unlike other S-series switches S4048-ON is an Open Networking switch. Dell is one of the first companies which apart from its own OS lets customers run other operating systems on its network switches, such as Cumulus Linux OS and Big Switch Networks Switch Light OS. While Cumulus and Big Switch has its own use cases, in this blog we will look specifically at configuring FTOS.

Boot process

S4048-ON comes from the factory pre-configured for bare metal provisioning (BMP). This is what you will see when you boot the switch for the first time:

s4048_bmp

If you just want to boot FTOS, simply skip the BMP by choosing A and switch will boot the OS.

After some time BMP will time out. If you’ve missed the above wizard, you can also disable BMP from CLI using the following commands:

> enable
# stop bmp
# config
# reload-type normal-reload
# exit
# reload

When prompted choose to save the configuration and proceed with reload. After the switch has rebooted check that the next boot is set to normal reload:

# show reload-type

Initial configuration

First steps of any switch installation is assigning a hostname and management interface settings:

# hostname DELL4048-SWITCH
# int managementethernet 1/1
# ip address 172.10.10.2/24
# no shut
# management route 0.0.0.0/0 172.10.10.10

Then set admin / enable passwords and allow remote management via SSH:

# enable password 123456
# username admin password 123456
# ip ssh server enable

Configure time zone and NTP:

# clock timezone UTC 11
# ntp server 172.10.10.20
# show ntp associations
# show ntp status
# show clock

Firmware upgrade

Force10 switches have two boot banks A: and B:. It’s a good practice to upload new firmware into one boot bank and keep the old firmware in the other in case you need to roll back.

The easiest way to upgrade is via TFTP using Tftpd64, which you can download for free from here. If you’re upgrading an existing switch, make sure to save the running config and make a backup. If it’s an initial install you can skip this step.

# copy run start
# copy start tftp://10.0.0.1/FORCE10_SWITCH_01.01.16.conf

Then upload new firmware to image B:, change active boot bank to B: and reload:

# show version
# show boot system stack-unit 1
# upgrade system tftp://10.0.0.1/FTOS-SK-9.9.0.0P9.bin b:
# conf t
# boot system stack-unit 1 primary system b:
# exit
# reload

You will be prompted to save the configuration and reboot. After the reboot you may be asked to enable SupportAssist. SuppotAssist helps to automatically open Dell service tickets if there is a switch fault. You can enable SupportAssist by running the following commands and answering prompts:

supportassist

# conf t
# support-assist activate
# support-assist activity full-transfer start now
# show support-assist status

My pair of switches were configured in a Virtual Link Trunking (VLT) domain. I’ll explain how VLT works later in the series. But from the upgrade point of view, each switch in a VLT domain is treated as a separate switch and has to be upgraded separately. If you decided to use a stack instead of VLT, you can find the upgrade process for a Force10 stack in my other post about Dell MXL switches here.

Spanning tree

Spanning Tree Protocol (STP) helps to prevent network topology loops and is highly recommended for use in any network. Switches connected in an actual loop topology in today’s networks are rare. But STP can save you from consequences of a potential human error, such as port channel misconfiguration. If instead of creating one port channel with two links, you by mistake create two port channels with one link each and both carry the same VLANs, you’ve accidentally created a loop, which will bring your whole network to an immediate halt.

It’s a good practice to enable STP as a safeguard mechanism from such configuration errors. S4048-ON supports STP, RSTP, MSTP and PVST+. In my case S4048s were uplinked into HP core, which supported STP, RSTP and MSTP. If you have Cisco switches in your network core you can use PVST+. In my case I used RSTP, which is a good choice if you don’t require enhancements of MSTP and PVST+ in your network. Just make sure to not use the basic STP protocol, as it provides the slowest convergence.

# protocol spanning-tree rstp
# no disable
# show spanning-tree rstp

In every STP topology there is also a root switch, which by default is selected automatically. For a more deterministic STP behaviour it’s recommended to select the root switch manually, by assigning the lowest STP priority to it. Typically your core switch should be your root switch. In my case it was a HP core switch, which was assigned priority of “0”.

When configuring server and storage facing ports make sure to enable EdgePort mode to minimize the time it takes for the port to come online:

# int range Te1/45-1/48
# spanning-tree rstp edge-port
# switchport
# no shut

If you want to know more about how STP works, you can read a few of my previous blog posts on STP here and here.

Flow control

To avoid dropped packets on 10Gb switch ports at times of potential heavy utilization it is also a best practice to as a minimum enable bi-directional Flow Control on the storage array ports. I enabled it on the iSCSI links connected from the Dell Compellent storage array:

# int range Te1/17-1/18
# flowcontrol rx on tx on

If you specifically interested in switch best practices for Compellent and EqualLogic storage arrays, Dell has a full list of guides for various switches at communitites wiki here.

Port channels and VLANs

Port channels and VLANs are configured similarly to any other switch, but I include them here in case you want to know the syntax. In this example we have two access ports 1/46 and 1/47 and an uplink to the core configured as port channel 1:

# interface port-channel 1
# switchport
# no shutdown

# interface range Te1/1-1/2
# port-channel-protocol LACP
# port-channel 1 mode active
# no shutdown

# int vlan 254
# untagged Te1/46-1/47
# tagged po 1

Keep in mind, that port channels are used either in one switch configurations or when two or more switches are stacked together. If you’re using Virtual Link Trunking (VLT), you will need to create Virtual Link Trunks (VLTs). Which are similar to port channels, but have a slightly different syntax. We will talk about VLT in much more detail in the following Force10 blogs.

Conclusion

One feature which I didn’t specifically mentioned in this blog post was Jumbo Frames. I tend not to use it in my deployments until I see convincing evidence of it making a difference for iSCSI/NFS storage implementations. I did a post about Jumbo Frames long time ago here and hasn’t changed my opinion ever since. Interested to here your thoughts if have a different take on that.

References

Dell Repository Manager: Bootable ISO Issues

May 23, 2016

problem_solutionIn one of my previous posts I described the process of upgrading a Dell FX2 chassis firmware using Dell Repository Manager (DRM).

In an ideal world you just follow the process and in an hour or two you can get your chassis upgraded. You may sometimes run into issues. I want to go through some of them in this post, including possible remediation.

Issue Description

When exporting firmware to a bootable ISO you can find DRM not being able to download some of the bundle components with the following error in the Job Result:

Processing failed:
Failed downloading files:
Diagnostics_Application_PWMC8_LN64_OSC_1.1_A00.BIN

And errors in the Log:


60. 24/03/2016 5:58:50 PM Export to Bootable ISO : Downloaded 34 / 56
61. 24/03/2016 5:59:44 PM Export to Bootable ISO : Error downloading some files
62. 24/03/2016 5:59:45 PM Export to Bootable ISO : Failed exporting to Bootable ISO.

Workaround #1: Skip the Component

You can try the following option “Continue download irrespective of any error (in the selected components)” in the export dialog. It won’t help to get the component downloaded, but you will got a bootable ISO.

However, DRM will still keep the failed component in the bundle and try to install it during the upgrade, which will obviously fail (update 16/56):

failed_update

Once the upgrade is finished you will get the following error at the end:

Note: Some update requires machine reboot. Please reboot to CD/DVD to continue for the failed update because of the dependency…

upgrade_status

No matter how many times you reboot you will obviously get the same errors. You can ignore it if you 100% sure this is what causes the upgrade to fail or use Workaround #2.

Workaround #2: Create Custom ISO

When you create a repository in DRM it’s populated with pre-built components and bundles. But you can create custom repositories. The idea is that you can exclude the failed component from the repository by creating it manually.

Assuming you already have the base repository configured, do the following:

  • Open the existing repository and click on the Components tab
  • Deselect the failed component in the component list (in my case it was Diagnostics_Application_PWMC8_LN64_OSC_1.1_A00.BIN)
  • Click on the “Copy To” button:

custom_components

  • In the opened dialogue select “Create NEW Repository and copy component(s) into it”
  • Follow the wizard and when you click finish, components will be copied to the newly created repository
  • Open the new repository and click on the Componenets tab
  • Select all components and click on the “Copy to” button once again
  • This time select “Create a NEW Bundle in the same repository and add component(s) into it”
  • On the next screen give the bundle a name and make sure to choose “Linux 32-bit and 64-bit” in the OS Type

custom_bundle

As a result you should get a new bundle created which you can export to a bootable ISO using the same process.

Workaround #3: Use Server Update Utility

If none of the above helps you can fall back to a proven upgrade approach and use Server Update Utility (SUU). SUU is a huge 12GB ISO to download, but you can use Dell Download Manager, which supports resuming interrupted downloads. Make sure to disable proxy! Dell Download Manager does not support resuming an interrupted download if you’re using a proxy server.

SUU is not a bootable ISO. Previously you had to use Dell Systems Build and Update Utility (SBUU) to boot from it first and then mount the ISO to proceed with the upgrade. Starting with Dell 11G servers you don’t need it anymore and can upgrade firmware straight form Dell Lifecycle Controller (LC).

You’ll need to boot into the Lifecycle Controller and choose Firmware Update > Launch Firmware Update > Local Drive(CD or DVD or USB). Mount the SUU ISO and the rest is fairly straightforward. LC will upgrade the firmware and reboot the blade.

lc_upgrade

Conclusion

Dell Repository Manager is the recommended approach to upgrade firmware on Dell hardware. Unlike SUU, DRM downloads the latest updates and only the necessary components. It is also capable of making a bootable ISO.

If you have issues, rely on Server Update Utility as it’s bulletproof and always work. But be prepared to download a 12GB ISO image and make sure you have an option to bypass proxy.

Painless Dell FX2 Firmware Upgrade

April 10, 2016

Overview

Recently I’ve had a chance to play with Dell’s FX2 chassis for a bit. Dell FX2 falls into the category of blade chassis and can hold up to 8 blades with Atom or 4 blades with Xeon CPUs in a 2U chassis.

Dell_FX2

Besides the compute blades FX2 also supports storage blades, which you can dedicate to particular compute blades and use as additional storage.

On the networking side you can choose from either pass-through modules or three types of I/O aggregators – four 10G SFP+ ports, four 10GBASE-T ports, or two Fibre Channel plus two SFP+ external ports.

The chassis itself also comes in two flavors – FX2 or FX2s. The main difference between the two is that FX2s additionally has PCIe slots at the back, which can be mapped to the server blades to provide additional connectivity.

Dell_FX2_Rear

First step of every hardware solution deployment is a firmware upgrade. But when it comes to firmware on Dell blade equipment be it M1000e, VRTX or FX2 you can quickly get confused. Especially when you go to the blade section and see a dozen of hardware components. Download and update each of them individually would be daunting. Fortunately there is an better way.

blade_firmware

CMC Firmware

Upgrade starts from the chassis management controller, which has two components: Chassis Infrastructure Firmware (or Main Board) and the CMC itself. You can find them on the Chassis Overview > Update tab.

CMC firmware comes as an .exe package, which you can extract. You really need just the fx2_cmc.bin file. During upgrade you will lose access to CMC for 5-10 minutes, while CMC is rebooting.

For the infrastructure firmware you’ll need the fx2_mainboard.bin file. The gotcha with the infrastructure firmware upgrade is that you’ll need all blades to be powered off. So if you have just one chassis this might be tricky.

Blade Firmware

Blades firmware is where this gets interesting. You can certainly upgrade all blades from the CMC by downloading firmware from the Dell support web-site and choosing one component at a time in Chassis Overview > Server Overview > Update section. CMC is capable of upgrading say iDRAC across all blades simultaneously, but it’s still about a dozen components.

The easier approach would be to use Dell Repository Manager (DRM). DRM can download firmware for virtually any blade or rack server (including some of the storage and network hardware) and build a bootable ISO image for an easy upgrade.

To build a bootable ISO follow the following steps:

  • Download and install Dell Repository Manager from the Dell support web-site
  • Add a source by going to Source > View Dell Online Catalog
  • Create a repository by going to Repository > New > Create New Repository
  • In the wizard select your hardware (I selected PowerEdge FC630 from the Blade category) and choose Linux (32-bit and 64-bit) as a DUP format (I’ll explain that later).
  • Go to the newly created repository, select the bundle and click Export

export_bundle

DRM can export bundles in multiple forms, we are interested in a bootable ISO and this is why we selected the Linux DUP format when we created the repository. DRM creates a Linux bootable ISO, so there was no point selecting Windows bundles.

  • Select “Create Bootable ISO (Linux Only)” and continue with the default settings for the rest

As a result you will get an .iso file, which you can mount to the server via iDRAC Remote Console and boot from it for a firmware upgrade.

Network I/O Aggregators

FX2 I/O aggregators are Dell Force10 switches, which use Force10 OS (FTOS). FTOS firmware is NOT available from the Dell web-site. You’ll need to register an account at https://www.force10networks.com to download the firmware.

Make sure to download firmware release specifically built for FX2 I/O aggregators, which can be found in M-Series Software section.

aggregators_firmware

To upgrade the aggregators go to Chassis Overview > I/O Module Overview > Update. Aggregators reset after a reboot, so make sure to upgrade them one at a time. Or if you stacked them instead of using VLT or standalone mode, you’ll have to have a downtime, as stacked switches reboot together.

Conclusion

There is nothing fancy in upgrading firmware on a blade chassis, you want it to be quick and painless. Make sure to use Dell Repository Manager for blades upgrade. It may save you heaps of time and make your life easier.

Upgrading Cisco UCS Fabric Interconnects

March 17, 2016

I have to do this first, as this is a high-risk change for any environment:

disclaimerDISCLAMER: I ACCEPT NO RESPONSIBILITY FOR ANY DAMAGE OR CORRUPTION OF DATA THAT MAY OCCUR AS A RESULT OF CARRYING OUT STEPS DESCRIBED BELOW. YOU DO THIS AT YOUR OWN RISK.

And now to the point. Cisco has two generations of Fabric Interconnects with the third generation released just recently. There is 6100 series, which includes 6120XP and 6140XP. Second generation is 6200 series, which introduced unified ports and also has two models in its range – 6248UP and 6296UP. And there is now a third generation of 40Gb fabric interconnects with 6324, 6332 and 6332-16UP models.

We are yet to see mass adoption of 40Gb FIs. And some of the customers are still upgrading from the first to the second generation.

In this blog post we will go through the process of upgrading 6100 fabric interconnects to 6200 by using 6120 and 6248 as an example.

Prerequisites

Cisco UCS has a pair of fabric interconnects which work in an active/passive mode from a control plane perspective. This lets us do an in-place upgrade of a FI cluster by upgrading interconnects one at a time without any further reconfiguration needed in UCS Manager in most cases.

For a successful upgrade old and new interconnects MUST run on the same firmware revision. That means you will need to upgrade the first new FI to the same firmware before you can join it to the cluster to replace the first old FI.

This can be done by booting the FI in a standalone mode, giving it an IP address and installing firmware via UCS Manager.

The second FI won’t need a manual firmware update, because when a FI of the same hardware model is joined to a cluster it’s upgraded automatically from the other FI.

Preparation tasks

It’s a good idea to make a record of all connections from the current fabric interconnects and make a configuration backup before an upgrade.

ucs_backup

If you have any unused connections which you’re not planning to move, it’s a good time to disconnect the cables and disable these ports.

Cisco strongly suggests to also upgrade the firmware on all software and hardware components of the existing UCS to the latest recommended version first.

Upgrading firmware on the first new FI

Steps to upgrade firmware on the first new fabric interconnect are as follows:

  • Rack and stack the new FI close enough to the old interconnects to make sure all cables can reach it.
  • Connect a console cable to the new FI, boot it up and when you are asked “Is this Fabric interconnect part of a cluster”, select NO to boot the FI in a standalone mode.
  • Assign an IP address to the FI and connect to it using UCS Manager.
  • Upgrade the firmware, which will reboot the fabric interconnect.
  • Reset the configuration on the FI, which will cause another reboot:
    • # connect local-mgmt
      # erase config

  • Once the FI is upgraded and reset to factory defaults you can proceed with joining it to the cluster.

Replacing the first FI

  • Determine which old FI is in the subordinate mode (upgrade a FI only if it’s in subordinate mode!) and disable server ports on it.
  • Shut down the old subordinate FI.
  • Move L1/L2, management, server and Ethernet/FC/FCoE uplink ports to the new FI.
  • Boot the new FI. This time the new FI will detect the presence of the peer FI. When you see the following prompt type YES:
    • Installer has detected the presence of a peer Fabric interconnect. This Fabric interconnect will be added to the cluster. Continue (y/n) ?

  • Follow the console prompts and assign an IP address to the new FI. The rest of the settings will be pulled from the peer FI.

Once the new FI joins the cluster you should see the following equipment topology in UCS Manager (This screenshot was made after the primary role had been moved to the new FI. Initially you should see the new FI as subordinate.):

two_fis

  • At this stage make sure that all configuration has been applied to the new FI and you can see all LAN and SAN uplinks and port channels.
  • Enable server ports on the new FI and reacknowledge all chassis.

Reacknowledging a chassis might be disruptive to the traffic flow from the blades. So make sure you don’t have any production workloads running on it. If you have two chassis and enough capacity to run all VMs on either of them, you can temporarily move VMs between the chassis and reacknowledge one chassis at a time.

Replacing the second FI

You will need to promote the new FI to be the primary, before proceeding with an upgrade of the second FI. To change the roles, use SSH to log in to the old FI, which is currently the primary (you can’t change roles from the subordinate FI) and run the following commands:

# connect local-mgmt
# cluster lead b
# show cluster state

The rest of the process is exactly the same.

After the upgrade, if needed, reconfigure any of the links which may have had their port numbers changed, such as if you had an expansion module in the old FIs, but not on the new FIs.

References

Cisco has a guide which has a step by step procedures for upgrading fabric interconnects, I/O modules, VIC cards as well as rack-mount servers. Refer to this guide for any further clarifications:

 

Beginner’s Guide to Dell N4000 Series Switches

January 18, 2016

Dell N-Series switches run on Dell Network Operating System (DNOS) version 6.x. Unlike Dell S-Series switches which run on DNOS 9.x, derived from  Force10 Operation System (FTOS), DNOS 6.x came from the PowerConnect switch series and share the same codebase. So if you’ve ever worked with PowerConnect switches, N-Series syntax should be very familiar.

In my case I had two Dell N4032F switches. But the same set of commands applies to any other N4000 Series switch.

Initial Configuration

When you first turn the switch on, it gives you 60 seconds to enter the wizard, where you can set up network settings for the Out-of-Band (OOB) management interface and change the admin password. If you miss it you can reboot the switch and it will show the same wizard prompt again when it boots up. Or you can set it up from the CLI:

# interface out-of-band
# ip address 10.10.10.10 255.255.255.0 10.10.10.254

# show ip interface out-of-band

Once you get to the CLI prompt, configure hostname and enable SSH:

# hostname n4032f-prod

# crypto key generate rsa
# crypto key generate dsa
# ip ssh server
# ip telnet server disable

Stacking

Dell N4000 Series switches support both stacking and MLAG (Multi-chassis Link Aggregation). One of the drawbacks of the stack configuration is disruptive firmware upgrades. When you update firmware on the stack master, firmware is distributed to all stack members and all switches are rebooted simultaneously.

In MLAG each switch has its own Control Plane and can be rebooted independently. Which is MLAG’s shortcoming at the same time, because unlike stack, where all units act as one switch, in MLAG you have to manage each switch separately.

In my case I chose stacking for its simplicity.

Dell N4000

N4000 switches are stacked using the two 40Gb QSFP ports located at the front. QSFP ports are not configured in stack mode by default. Which you need to change on both switches before you can build a stack:

# stack
# stack-port Fortygigabitethernet 1/1/1 stack
# stack-port Fortygigabitethernet 1/1/2 stack

# show switch stack-ports

Once QSFP ports on both switches are configured, disconnect power from both switches and boot the switch you want to be the stack master first (typically the top switch). When the first switch has fully booted, boot the second switch and check the status. This is what you should see:

# show switch

n4000_stack

Firmware Upgrade

If it’s not a brand new switch, save the config before doing the firmware upgrade:

# copy run start
# copy running-config tftp://10.10.10.100/backup.txt

You can use any TFTP server for the firmware upgrade, such as the free Tftpd64 server.

tftpd64

Then you upload the firmware image to the stack master and reload the stack:

# copy tftp://10.10.10.100/N4000v6.2.7.2.stk backup
# boot system backup
# reload
# show version

Firmware is uploaded to a backup image. Then you select the backup image for the next boot and reload the stack. When both switches reboot you should see something similar to this:

frimware_upgraded

As part of the upgrade process the new firmware is automatically uploaded from the master to all stack members, which is a default behaviour. You can confirm it is enabled using the following command:

# show auto-copy-sw

Flow Control, Jumbo Frames and iSCSI Optimization

In my case I used two N4032F switches for an iSCSI backbone, so I needed to make sure that Flow Control and Jumbo Frames are enabled on the switch.

Flow Control is enabled by default, which you can confirm by the following command:

# show storm-control

To globally enable Jumbo Frames on all ports type:

# system jumbo mtu 9216

# show system mtu

Interestingly, Dell N4000 Series switches also have built-in iSCSI optimization, which can detect iSCSI sessions by snooping the traffic on ports 3260 and 860. It then prioritizes iSCSI traffic over the other types of traffic to guarantee low latency for storage I/O. To show iSCSI settings:

# show iscsi

By default switches only track the sessions. Traffic prioritization is disabled by default and has to be enabled manually. This didn’t matter in my case, as the switches were dedicated for storage traffic. But if you share switches between storage and server traffic, you may want to enable it. Refer to the switch User’s Configuration Guide for details.

If you’re using a Dell Compellent storage array with N4000 switches, also make sure to apply a Compellent profile to the ports where storage array is connected to:

# macro global apply profile-compellent-nas $interface_name te1/0/1
# macro global apply profile-compellent-nas $interface_name te1/0/2
# macro global apply profile-compellent-nas $interface_name te1/0/3
# macro global apply profile-compellent-nas $interface_name te1/0/4

VLANs, Trunks and Port Channels

Again, I didn’t use any VLANs and Trunks, because switches were dedicated for iSCSI traffic and were separate from the LAN core. And I didn’t need Port Channels either, as they are not required for iSCSI.

Your scenario might be different. For instance, if you have vSphere hosts connected to a NetApp array over NFS, you may want to create a Multi-Mode (LACP) VIF on the NetApp side. If that’s the case, to create a port channel on the Multi-Mode VIF ports use the following:

# interface range te1/0/2,te2/0/2
# channel-group 1 mode active
# show intefaces po1

If the switches are used for both storage and VM traffic, then you’ll need to configure the server ports and uplink them to your network core. Create your VLANs first:

# vlan 10,20,30

Configure vSwitch uplinks from the ESXi hosts. In a typical vSphere environment, traffic is tagged on the vSwitch side, which means that server ports should be configured as trunks:

# interface range te1/0/3-6,te2/0/3-6
# switchport mode trunk
# switchport trunk allowed vlan 10,20,30

And finally configure uplinks to the network core. Depending on how your LAN core is set up, you may want to create a port channel to the upstream switch and trunk the required VLANs:

# interface range te1/0/1,te2/0/1
# channel-group 2 mode active
# switchport mode trunk
# switchport trunk allowed vlan 10,20,30
# show intefaces po2

Conclusion

This guide didn’t include information on Spanning Tree, QoS or any of the switch Layer 3 features, but I hope it could get you started. At the end of the day, every environment is different. If you need additional information refer to the following guides from the Dell web-site:

 

Brocade 300 Firmware Upgrade

December 14, 2015

In my previous post Brocade 300 Initial Setup I briefly went through the firmware upgrade process, which is a part of every new switch installation. Make sure to check the post out for instructions on how to install a FTP server. You will need it to upload firmware to the switch.

I intentionally didn’t go into all details of firmware upgrade in my previous post, as it’s not necessary for a green field install. For a production switch the process is different. The reason is, if you’re upgrading to a Fabric OS version which is two or more versions apart from the current switch firmware revision, it will be disruptive and take the FC ports offline. Which is fine for a new deployment, but not ideal for production. 

Disruptive and Non-Disruptive Upgrades

Brocade Fabric OS major firmware release versions are 6.3.x, 6.4.x, 7.0.x, 7.1.x, 7.2.x, etc. For a NDU the rule of thumb is to apply all major releases consecutively. For example, if my production FC switch is running FOS version 6.3.2b and I want to upgrade to version 7.2.1d, which is the latest recommended version for my hardware platform, then I’ll have to upgrade from 6.3.2b to 6.4.x to 7.0.x to 7.1.x and finally to 7.2.1d.

First and foremost save the current switch config and make a config backup via FTP (give write permissions to your FTP user’s home folder). Don’t underestimate this step. The last thing you want to do is to recreate all zoning if switch loses config during the upgrade:

> cfgSave
> configUpload

configUpload

In case you need to restore, you can run the following command to download the backed up config back to the switch:

> configDownload

Next step is to install every firmware revision up to the desired major release (-s key is not required):

> firmwaredownload

brocade_ndu.JPG

Brocade switch has two firmware partitions – primary and secondary. Primary is the partition the switch boots from. And the secondary partition is used for firmware upgrades.

After each upgrade switch does a warm reboot. All FC ports stay up and switch continues to forward FC frames with no disruption to FC traffic. To accomplish that, switch uses the secondary partition to upload the new firmware to and then quickly swap them without disrupting FC switching.

At a high level the upgrade process goes as follows:

  • The Fabric OS downloads the firmware to the secondary partition.
  • The system performs a high availability reboot (haReboot). After the haReboot, the former secondary partition is the primary partition.
  • The system replicates the firmware from the primary to the secondary partition.

Each upgrade may take up to 30 minutes to complete, but in my experience it doesn’t take more than 10 minutes. Once the first switch is upgraded, log back in and check the firmware version. And you will see how secondary partition has now become primary and firmware is uploaded to the secondary partition.

brocade_upgrade3

As a last step, check that FC paths on all hosts are active and then move on to the second switch. The steps are exactly the same for each upgrade.

Firmware Upload and Commit

Under normal circumstances when you run the firmwareDownload command, switch does the whole upgrade in an automated fashion. After the upgrade is finished you end up with both primary and secondary partitions on the same firmware version. But if you’re a large enterprise, you may want to test the firmware first and have an option to roll-back.

To accomplish that you can use -s key and disable auto-commit:

single_mode

Switch will upload the firmware to the secondary partition, switch secondary and primary partitions after a reboot, but won’t replicate the firmware to the secondary partition. You can use the following command to restore firmware back to the previous version:

> firmwareRestore

Or if you’re happy with the firmware, commit it to the secondary partition:

>  firmwareCommit

The only caveat here, a non-disruptive upgrade is not supported in this scenario. When switch reboots, it’ll be disruptive to FC traffic.

Important Notes

When downloading firmware for your switch, make sure to use switch’s vendor web-site. EMC Connectrix DS-300B, Brocade 300 and IBM SAN24B-4 are essentially the same switch, but firmware and supported versions for each OEM vendor may slightly vary. Here are the links where you can get FC switch firmware for some of the vendors:

  • EMC: sign in to http://support.emc.com > find your switch model under the product section and go to downloads
  • Brocade: sign in to http://www.brocade.com > go to Downloads section > enter FOS in the search field
  • Dell: http://www.brocadeassist.com/dellsoftware/public/DELLAssist includes a subset of Fabric OS versions, which are tested and approved by Dell
  • IBM: http://ibm.brocadeassist.com/public/FabricOSv6xRelease and http://ibm.brocadeassist.com/public/FabricOSv7xRelease are the links where you can download FOS for IBM switches. You can also go to http://support.ibm.com, search for the switch in the Product Finder and find FOS under the “Downloads (drivers, firmware, PTFs)” section

References

Brocade 300 Initial Setup

December 8, 2015

There are a few steps you need to do on the Brocades before moving on to cabling and zoning. The process is pretty straightforward, but worth documenting especially for those who are doing it for the first time.

After you power on the switch there are two ways of setting it up: GUI or CLI. We’ll go hardcore and do all configuration in CLI, but if you wish you can assign a static IP to your laptop from the 10.70.70.0/24 subnet and browse to https://10.77.77.77. Default credentials are admin/password for both GUI and CLI.

Network Settings

To configure network settings, such as a hostname, management IP address, DNS and NTP use the following commands:

> switchname PRODFCSW01
> ipaddrset
> dnsconfig
> tsclockserver 10.10.10.1

Most of these commands are interactive and ask for parameters. The only caveat is, if you have multiple switches under the same fabric, make sure to set NTP server to LOCL on all subordinate switches. It instructs them to synchronize their time with the principal switch.

Firmware Upgrade

This is the fun part. You can upgrade switch firmware using a USB stick, but the most common way is to upgrade using FTP. This obviously means that you need to install a FTP server. You can use FileZilla FTP server, which is decent and free.

Download the server and the client parts and install both. Default settings work just fine. Go to Edit > Users and add an anonymous user. Give it a home folder and unpack downloaded firmware into it. This is what it should look like:

filezilla.JPG

To upgrade firmware run the following command on the switch, which is also interactive and then reboot:

> firmwaredownload -s

If you’re running a Fabric OS revision older than 7.0.x, such as 6.3.x or 6.4.x, then you will need to upgrade to version 7.0.x first and then to your target version, such as 7.3.x or 7.4.x.

In the next blog post I will discuss firmware upgrades in more detail, such as how to do a non-disruptive upgrade on a production switch and where to download vendor-specific FOS firmware from.

Force10 MXL: Firmware Upgrade

March 19, 2015

Uploading new firmware image

MXL switches keep two firmware images – A and B. You can set either one of them to be active. Use the following command to list firmware version of all stack members and see which one is active:

 # show boot system stack-unit all

mxl_firmware

To upload firmware you will need a TFTP server. You can use TFTPD64 (also called TFTPD32), which can be downloaded from Philippe Jounin page here .

If the active firmware image is in A, upload new firmware to B. You’ll also be asked if you want to upload firmware to all switches in the stack.

# upgrade system tftp://10.0.0.1/FTOS-XL-9.5.0.0P2.bin b:

firmware_upload

At the time of upgrade the latest version was 9.7. This version was 2 weeks old and wasn’t recommended for use in production. Version 9.6 had a major bug. As a result version 9.5SP2 was chosen for the upgrade.

Double-check that new firmware has been successfully distributed to all switches:

 # show boot system stack-unit all

check_firmware

Backing up startup config

Make sure you do not miss this step. If something goes wrong and switch looses its config, you’ll have to recreate all configuration from scratch. Imagine the consequences.

# copy start tftp://10.0.0.1/MXL_01.01.15.conf

Reloading the stack

Once firmware is uploaded and config is saved, reload the stack. Be mindful that it’s a disruptive procedure and all links connected to the stack will go down. A reboot shouldn’t take more than a couple of minutes, but make sure you do that after hours.

# conf t
# boot system stack-unit all primary system b:
# exit
# copy run start
# reload

Confirm that the active firmware image is now B. And that concludes the switch stack upgrade.

firmware_upgraded