Posts Tagged ‘port’

HCX Perftest Issue

May 9, 2021

Introduction

VMware HCX is a great tool, which simplifies VM migrations at scale, whether between on-prem environments or from on-prem to the cloud. I’ve worked with many different VM migration tools before, and what I particularly like about HCX is its ability to stretch network subnets between source and destination environments. It reduces (or completely removes) the need to re-IP VMs, which simplifies the migration and reduces the risk of inadvertently introducing issues into migrated applications.

Perftest Tool

HCX is a complex set of technologies, and getting the initial deployment right is key to building a reliable migration fabric. Perftest is a CLI tool available on the interconnect (IX) and network extension (NE) HCX appliances, which allows you to perform validation testing to ensure everything is functioning correctly, as well as give you a performance baseline. To run this tool you will need to SSH into HCX Manager, enter CCLI and then go to one of your IX or NE appliances:

# ccli
# list
# go 0
# perftest all

Issue Description

There is one issue you can come across when running perftest, where it partially completes with the following errors:

Message Error: map[string]interface {}{"grpc_code":14, "http_code":503, "http_status":"Service Unavailable", "message":"rpc error: code = Unavailable desc = transport is closing"}

and

Internal failure happens. Err: http.Post(https://appliance_ip:9443/perftest/stoptest) return statusCode: 503

Solution

The reason for this error is blocked connectivity on port TCP/4500. HCX uses ports UDP/4500 and UDP/500 for establishing tunnels between IX and NE appliance pairs, but that’s not enough for perftest.
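If there is a firewall between the underlay networks of the source and destination appliances, it needs to allow all three ports. Purely as an illustration, for an iptables-style Linux firewall the rules would look roughly like this (a sketch only; SRC_NET and DST_NET are hypothetical placeholders for the appliance uplink subnets, and your actual firewall platform and rule syntax will differ):

# IKE and NAT-T for the HCX tunnels (SRC_NET/DST_NET are placeholders)
iptables -A FORWARD -s SRC_NET -d DST_NET -p udp --dport 500 -j ACCEPT
iptables -A FORWARD -s SRC_NET -d DST_NET -p udp --dport 4500 -j ACCEPT
# additionally required for perftest
iptables -A FORWARD -s SRC_NET -d DST_NET -p tcp --dport 4500 -j ACCEPT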

Perftest actually gives you a hint at the very beginning of its output, but it’s easy to overlook. This requirement is not well documented (at least at the time of writing), so keep it in mind next time you deploy HCX.


RecoverPoint VE: Common Deployment Issues

April 19, 2016

In one of my previous posts I discussed iSCSI connectivity considerations when deploying RecoverPoint VE. In this post I want to describe common issues you may encounter when deploying RecoverPoint clusters, most of which are applicable to both the physical appliance and virtual editions.

VNX MirrorView ports

I already touched on that briefly in my previous post. But it’s worth mentioning again that you can NOT use MirrorView ports for iSCSI connectivity between RPAs and VNX arrays. When you try to use a MirrorView iSCSI port for RecoverPoint, it gets upset and doesn’t communicate with the array.

If you make the mistake of connecting only one port per SP and that port is a MirrorView port, you will have no communication with the array at all and will get the following error in Unisphere for RecoverPoint:

Error Splitter ARRAYNAME-A is down
Error Splitter ARRAYNAME-B is down

splitter_error

If you connect two ports per SP, one of which is a MirrorView port, and use two iSCSI network subnets, you may get the following error when running a SAN connectivity test from the RPA boxmgmt interface. In this case the RPA can communicate with the array only over one subnet:

On array ABCD1234567890, all paths for device with UID=0x1234567890abcdef go through RPA Ethernet port eth2 …

multipathing_issue

The solution is as simple as moving the link from port 0 to port 1 on a 10Gb I/O module, or from port 0 to port 1, 2 or 3 on a 1Gb I/O module.

If you don’t want to lose two iSCSI ports (one per SP), especially if they are 10Gb, and you’re not using MirrorView, you can uninstall the MirrorView enabler from the array. Just keep in mind that it will require an array reboot. Service processors are rebooted one at a time, so there is no downtime, but if it’s a heavily used storage array it’s recommended to schedule the uninstallation out of hours to minimize the impact.
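If you’re not sure whether the MirrorView enabler is installed in the first place, you can check the list of installed software packages with naviseccli (a sketch; SP_A_IP, the username and the password are placeholders for your own array details):

# list installed packages and look for a MirrorView entry
naviseccli -h SP_A_IP -user admin -password YourPassword -scope 0 ndu -list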

Error when redeploying a cluster

If you’ve made configuration mistakes while deploying a RecoverPoint cluster and want to blow the whole thing away and redeploy it from scratch, you may encounter the following error when deploying for the second time:

VNX path set with IP 10.10.10.1 already exists in a different path set (RP_0x123abc456def789g_0_iSCSI1)

rpa_redeploy

The cause of the issue is iSCSI sessions that stayed behind on the VNX after you deleted the RPA VMs. You need to connect to the VNX and delete them manually in Unisphere by right-clicking the storage array name on the dashboard and selecting iSCSI > Connections Between Storage Systems. This is what duplicate sessions look like:

duplicate_rp

As you can see, there are three sets of RecoverPoint cluster iSCSI connections after three unsuccessful attempts.

You will need to delete the old sessions before you can proceed with the deployment in RecoverPoint Deployment Manager.

Wrong initiator names

I’ve seen this on multiple occasions: RecoverPoint registers initiators on the VNX with inconsistent hostnames.

As you’ve seen in the screenshots above, the hostname field of every initiator consists of the cluster ID and the RPA ID (I’m not sure what the third field means), such as this:

RP_0x123abc456def789g_1_0

In this example you can see that RPA1 has two hostnames with suffixes _0_0 and _1_0.

wrong_initiators

This issue is purely cosmetic and doesn’t affect RecoverPoint operation, but if you want to fix it you will need to restart the Management Server on each VNX service processor. It’s a non-disruptive procedure and can be performed by opening http://SP_IP/setup and clicking the “Restart Management Server” button.

After the restart, the array will update the hostnames to reflect the actual configuration.

Joining two clusters with the licences already applied

This is just not going to work. Make sure to join the production and DR clusters before applying RecoverPoint licences, or the Deployment Manager “Connect Cluster” wizard will fail.

It’s one of the prerequisites specified in the RecoverPoint “Installation and Deployment Guide”:

If you plan to connect the new cluster immediately after preparing it for connection,
ensure:

  • You do not install a license in, or modify the settings of, the new cluster before
    connecting it to the existing system.

Conclusion

There are always many more things that can potentially go wrong, but if any of the above helped you solve your RecoverPoint deployment issues, make sure to let me know in the comments below!

Dell Compellent Enterprise Manager: SQL Server Setup

January 25, 2016

Dell Compellent Enterprise Manager is a separate piece of software, which comes with every Compellent storage array deployment and allows you to monitor, manage, and analyze one or multiple arrays from a centralized management console.

Probably the most valuable feature of Enterprise Manager for an average user is its historical performance statistics. From the Storage Center GUI you can see only real-time data, while Enterprise Manager is capable of keeping statistics for up to a year. It can obviously do a multitude of other things as well, such as assist with capacity planning, let you configure replication between production and DR sites, generate reports, or simply serve as a single pane of glass for all of your Compellent storage arrays.

Enterprise Manager installation does not include an embedded database. Whether you want to host the EM database on an existing Microsoft SQL Server instance or on a new dedicated Microsoft SQL Server Express instance, you need to set it up manually. The following guide describes how to install Enterprise Manager with Microsoft SQL Server 2012 Express.

SQL Server Authentication

During installation, Enterprise Manager will ask you for authentication credentials to connect to the SQL database. In SQL Server you can use either the Windows Authentication Mode or Mixed Mode. Mixed Mode allows both Windows authentication and SQL Server authentication via local SQL Server accounts.

For a typical SQL Server setup, Microsoft does not recommend enabling SQL Server authentication, especially with the default “sa” account, as it’s seen as insecure. So if you have a centralized Microsoft SQL Server in your network, you’ll most likely be using Windows authentication. But for a dedicated SQL Server Express database it’s fine to enable Mixed Mode and use SQL Server authentication.

Make sure to enable Mixed Mode during SQL Server Express installation and enter a password for the “sa” account.

sql_authentication
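If the instance was originally installed in Windows-only authentication mode, you don’t necessarily have to reinstall it. As a rough sketch (run under a Windows account with sysadmin rights on the instance; switching the server itself to Mixed Mode is done in SSMS server properties and requires a service restart), the “sa” login can be enabled afterwards with sqlcmd:

REM connect to the local SQLEXPRESS instance with Windows authentication
REM and enable the sa login ('YourStrongPassword' is a placeholder)
sqlcmd -S .\SQLEXPRESS -E -Q "ALTER LOGIN sa ENABLE; ALTER LOGIN sa WITH PASSWORD = 'YourStrongPassword'"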

SQL Server Network Configuration

Enterprise Manager uses TCP/IP to connect to the SQL database over port 1433. TCP/IP is not enabled by default in SQL Server Express, so you will need to enable it manually.

sql_network

Use SQL Server Configuration Manager, which is installed with the database, and browse to SQL Server Network Configuration > Protocols for SQLEXPRESS > TCP/IP. Enable TCP/IP on the Protocol tab and set the TCP Port field in the IPAll section of the IP Addresses tab to 1433.

Make sure to restart the SQL Server service to apply the settings, and you should then be able to connect Enterprise Manager to the database.
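A quick way to confirm that the instance now accepts TCP connections on port 1433 is a sqlcmd test from the Enterprise Manager server (a sketch; SQL_SERVER_IP and the “sa” password are placeholders for your own values):

REM a successful test returns the SQL Server version string
sqlcmd -S tcp:SQL_SERVER_IP,1433 -U sa -P YourStrongPassword -Q "SELECT @@VERSION"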

If you get stuck, refer to the Dell Compellent Enterprise Manager Installation Guide, specifically the section “Prepare a Microsoft SQL Server Database”. The guide is available in the Dell Compellent Knowledge Center.

Force10 MXL: Initial Configuration

March 14, 2015

Continuing the series of posts on how to deal with Force10 MXL switches, this one is about VLANs, port channels, tagging and all the basic stuff. It’s not much different from other vendors like Cisco or HP; at the end of the day it’s the same networking standards.

If you want to match the terminology with Cisco, for instance, then what you are used to as EtherChannels are called Port Channels on Force10, and trunk/access ports from Cisco are called tagged/untagged ports on Force10.

Configure Port Channels

If you are after dynamic LACP port channels (as opposed to static), they are configured in two steps. The first step is to create the port channel itself:

# conf t
# interface port-channel 1
# switchport
# no shutdown

Then you enable LACP on the interfaces you want to add to the port channel. I have a four-switch stack and use the 0/.., 1/.. type of syntax:

# conf t
# int range te0/51-52 , te1/51-52 , te2/51-52 , te3/51-52
# port-channel-protocol lacp
# port-channel 1 mode active

To check if the port channel has come up, use the command below. The port channel obviously won’t come up if it’s not configured on the other side as well.

# show int po1 brief

port_channel

Configure VLANs

Then you create your VLANs and add ports to them. Typically, if you have vSphere hosts connected to the switch, you tag traffic at the ESXi host level, so both your host ports and the port channel will need to be added to the VLANs as tagged. If you have any standalone non-virtualized servers, you’ll use untagged ports (see the sketch further down).

# conf t
# interface vlan 120
# description Management
# tagged Te0/1-4
# tagged Te2/1-4
# tagged Po1
# no shutdown
# copy run start

I have four hosts. Each host has a dual-port NIC which connects to two fabrics – switches 0 and 2 in the stack (1 port per fabric). I allow VLAN 120 traffic from these ports through the port channel to the upstream core switch.

You’ll most likely have more than one VLAN: at least one for Management and one for Production if it’s vSphere. But the process for the rest is exactly the same.
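For a standalone non-virtualized server the port is added to the VLAN as untagged instead. A quick sketch, assuming a hypothetical server-facing port Te0/10 (the physical interface still has to be in switchport mode and enabled):

# conf t
# interface te0/10
# switchport
# no shutdown
# interface vlan 120
# untagged te0/10
# copy run start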

The other switch

Just to give you the whole picture, I’ll include the configuration of the switch on the other side of the trunk. I had a modular HP switch with 10Gb modules, and the config for it would look like the following:

# conf t
# trunk I1-I8 trk1 lacp
# vlan 120 tagged trk1
# write mem

I1 to I8 here are ports, where I is the module and 1 to 8 are the ports within that module.
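To verify the trunk on the HP side, something like this should do (a sketch for a ProCurve-style CLI; exact commands vary between HP switch families):

# show trunks
# show lacp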

Zoning vs. LUN masking explained

September 28, 2012

Zoning and masking terms are often confused by those who have just started working with SAN, but it takes five minutes of googling to understand that the main difference is that zoning is configured on a SAN switch on a port (or WWN) basis, while masking is a storage array feature with LUN granularity. All modern hardware supports both zoning and masking. Given that, the much more interesting question is what’s the point of zoning if there is masking with finer granularity.

Both security features do the same thing: they restrict access to particular storage targets. It may seem that there is no point in configuring both of them, but that’s not true. One, not that convincing, argument is that in case one of the features is accidentally misconfigured, you still maintain security. But the much bigger issue in a no-zoning configuration is RSCNs. RSCNs are Registered State Change Notification messages, which are issued by the SAN Name Server service when the fabric changes its configuration (a new device has been added to the fabric, a zone has changed, a switch name or IP address has changed, etc.). RSCNs can be disruptive to fabric operation, and if you don’t have zones, RSCNs are flooded to everyone each time something changes in the fabric, even if it has nothing to do with the majority of devices. So zoning is a SAN best practice and its configuration is highly recommended.

In fact, Brocade recommends adopting the so-called Single Initiator Zoning (SIZ) practice, where one host pWWN (initiator) is zoned to one or more storage pWWNs. It reduces the RSCN issue to a minimum.
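To illustrate, a single initiator zone on a Brocade switch is created roughly like this (a sketch only, with made-up WWPNs and alias/zone/config names; add the zone to whatever zoning configuration is actually in effect in your fabric):

# alicreate "host1_hba0", "10:00:00:00:c9:aa:bb:cc"
# alicreate "array_spa_p0", "50:06:01:60:3b:a0:11:22"
# zonecreate "z_host1_hba0__array_spa_p0", "host1_hba0; array_spa_p0"
# cfgadd "prod_cfg", "z_host1_hba0__array_spa_p0"
# cfgsave
# cfgenable "prod_cfg"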

For the best reference, read Brocade’s “Secure SAN Zoning – Best Practices”.