Posts Tagged ‘SP’

RecoverPoint VE: Common Deployment Issues

April 19, 2016

fixIn one of my previous posts I discussed iSCSI connectivity considerations when deploying RecoverPoint VE. In this post I want to describe common issues you may encounter when deploying RecoverPoint clusters, most of which are applicable to both physical appliance and virtual editions.

VNX MirrorView ports

I already touched on that briefly in my previous post. But it’s worth mentioning again that you can NOT use MirrorView ports for iSCSI connectivity between RPAs and VNX arrays. When you try to use a MirrorView iSCSI port for RecoverPoint, it gets upset and doesn’t communicate with the array.

If you make a mistake of connecting one port per SP and this port is a MirrorView port, you will have no communication with the array at all and get the following error in Unisphere for RecoverPoint:

Error Splitter ARRAYNAME-A is down
Error Splitter ARRAYNAME-B is down

splitter_error

If you connect two ports per SP, one of which is MirrorView port and use two iSCSI network subnets you may get the following error when running a SAN connectivity test from the RPA boxmgmt interface. In this case RPA can communicate with the array only over one subnet:

On array ABCD1234567890, all paths for device with UID=0x1234567890abcdef go through RPA Ethernet port eth2 …

multipathing_issue

The solution is as simple as moving the link from port 0 to port 1 on a 10Gb I/O module. And from port 0 to port 1,2 or 3 on a 1Gb I/O module.

If you don’t want to lose two iSCSI ports (1 per SP), especially if it’s 10Gb, and you’re not using MirrorView, you can uninstall MirrorView enabler from the array. Just keep in mind that it will require an array reboot. Service processors will be rebooted one at a time, so there is no downtime. But if it’s a heavily used storage array it’s recommended to schedule uninstallation out of hours to minimize the impact.

Error when redeploying a cluster

If you’ve made configuration mistakes while deploying a RecoverPoint cluster and want to blow the whole thing away and redeploy it from scratch you may encounter the following error when deploying for the second time:

VNX path set with IP 10.10.10.1 already exists in a different path set (RP_0x123abc456def789g_0_iSCSI1)

rpa_redeploy

The cause of the issue is iSCSI sessions which stayed on the VNX after you deleted RPA VMs. You need to connect to the VNX and delete them in Unisphere manually by right-clicking on the storage array name on the dashboard and selecting iSCSI > Connections Between Storage Systems. This is what duplicate sessions look like:

duplicate_rp

As you can see there’re three sets of RecoverPoint cluster iSCSI connections after three unsuccessful attempts.

You will need to delete old sessions before you are able to proceed with the deployment in RecoverPoint Deployment Manager.

Wrong initiator names

I’ve seen this on multiple occasions when RecoverPoint registers initiators on VNX with inconsistent hostnames.

As you’ve seen on the screenshots above, hostname field of every initiator consists of the cluster ID and RPA ID (not sure what the third field means), such as this:

RP_0x123abc456def789g_1_0

In this example you can see that RPA1 has two hostnames with suffixes _0_0 and _1_0.

wrong_initiators

This issue is purely cosmetic and doesn’t affect RecoverPoint operation, but if you want to fix it you will need to restart Management Servers on VNX service processors. It’s a non-disruptive procedure and can be performed by opening the following link http://SP_IP/setup and clicking on “Restart Management Server” button.

After a restart, array will update hostnames to reflect the actual configuration.

Joining two clusters with the licences already applied

This is just not going to work. Make sure to join production and DR clusters before applying RecoverPoint licences or Deployment Manager “Connect Cluster” wizard will fail.

It’s one of the prerequisites specified in RecoverPoint “Installation and Deployment Guide”:

If you plan to connect the new cluster immediately after preparing it for connection,
ensure:

  • You do not install a license in, or modify the settings of, the new cluster before
    connecting it to the existing system.

Conclusion

There’re always much more things that can potentially go wrong. But if any of the above helped you to solve your RecoverPoint deployment issues make sure to let me know in the comments below!

Storwize V7000 with vSphere 5 storage configuration

December 1, 2012

storwizeInformation on how to configure Storwize for optimal performance is very scarce. I’ll try to build some understanding of it from bits an pieces gathered throughout the Internet and redbooks.

Barry Whyte gave many insights on Storwize internals in his blog. Particularly his “Configuring IBM Storwize V7000 and SVC for Optimal Performance” series of posts. I’ll quote him here. The main Storwize redbook “Implementing the IBM Storwize V7000 V6.3” is mostly an administration guide and gives no useful information on the topic. I find “SAN Volume Controller Best Practices and Performance Guidelines” way more helpful (Storwize firmware is built on SVC code).

Total Number of MDisks

That’s what Barry says:

… At the heart of each V7000 controller canister is an Intel Jasper Forrest (Sandy Bridge) based quad core CPU. … When we added the tried and trusted (SSA) DS8000 RAID functionality in 2010 (6.1.0) we therefore assigned RAID processing on a per mdisk basis to a single core. That means you need at least 4 arrays per V7000 to get maximal CPU core performance. …

Number of MDisks per Storage Pool

SVC Redbook:

The capability to stripe across disk arrays is the single most important performance advantage of the SVC; however, striping across more arrays is not necessarily better. The objective here is to only add as many arrays to a single Storage Pool as required to meet the performance objectives.

If the Storage Pool is already meeting its performance objectives, we recommend that, in most cases, you add the new MDisks to new Storage Pools rather than add the new MDisks to existing Storage Pools.

Table 5-1 shows the recommended number of arrays per Storage Pool that is appropriate for general cases.

Controller type       Arrays per Storage Pool
DS4000/DS5000         4 - 24
DS6000/DS8000         4 - 12
IBM Storwise V7000    4 - 12

The development recommendations for Storwize V7000 are summarized below:

  • One MDisk group per storage subsystem
  • One MDisk group per RAID array type (RAID 5 versus RAID 10)
  • One MDisk and MDisk group per disk type (10K versus 15K RPM, or 146 GB versus 300 GB)

There are situations where multiple MDisk groups are desirable:

  • Workload isolation
  • Short-stroking a production MDisk group
  • Managing different workloads in different groups

We recommend that you have at least two MDisk groups, one for key applications, another for everything else.

Number of LUNs per Storage Pool

SVC Redbook:

We generally recommend that you configure LUNs to use the entire array, which is especially true for midrange storage subsystems where multiple LUNs configured to an array have shown to result in a significant performance degradation. The performance degradation is attributed mainly to smaller cache sizes and the inefficient use of available cache, defeating the subsystem’s ability to perform “full stride writes” for Redundant Array of Independent Disks 5 (RAID 5) arrays. Additionally, I/O queues for multiple LUNs directed at the same array can have a tendency to overdrive the array.

Table 5-2 provides our recommended guidelines for array provisioning on IBM storage subsystems.

Controller type                     LUNs per array
IBM System Storage DS4000/DS5000    1
IBM System Storage DS6000/DS8000    1 - 2
IBM Storwize V7000                  1

General considerations

vsphere5-logoLets take a look at vSphere use case scenario on top of Storwize with 16 x 600GB SAS drives in control enclosure and 10 x 2TB NL-SAS in extension enclosure (our personal case).

First of all we need to decide how many arrays we need. Do we have different workloads? No. All storage will be assigned to virtual machines which have in general the same random read/write access pattern. Do we need to isolate workloads? Probably yes, it’s generally a good idea to separate highly critical production VMs from everything else. Do we have different drive types? Yes. Obviously we don’t want to mix drive types in one RAID. Are we going to make different RAID types? Again, yes. RAID 10 is appropriate on SAS and RAID 5 on NL-SAS. So two MDisks – one RAID 10 on SAS and one RAID 5 on NL-SAS would be enough. Storwize nodes have 4 cores each. It may seem that you would benefit from 4 MDisks, but in fact you won’t. Here what Barry says:

In the case where you only have 1 or 2 HDD arrays, then the core stuff doesn’t really come into play. Its only when you get to larger systems, where you are driving more I/O than a single RAID core can handle that you need to spread them.

This is also true if you are running all SSD arrays, so 24x SSD would be best split into 4 arrays to get maximum IOPs, whereas 24x HDD are not going to saturate a single core, so (if you could create a 23+P! [ you can’t 15+P is largest we support ] then it would perform as well as 2x 11+P etc

To storage pools. In our example we have two MDisks, so you simply make two storage pools. In future if you hit performance limit, you create additional MDisks and then you have two options. If each MDisk separately is able to sustain your performance requirements, you make additional storage pools and redistribute workload between them. If you have huge load on storage and even redistribution of VMs between two arrays doesn’t help, then you better combine two MDisks of each type in its own storage pool for striping between MDisks.

Same story for number of LUNs. IBM recommends one to one LUN to MDisk relationship. But read carefully. Recommendation comes from the fact that different workloads can clash and degrade array performance. But if we have generally the same I/O patterns coming to the array it’s safe to make several LUNs on it, until latency is in the acceptable range. Moreover, when it comes to vSphere and VMFS, it’s beneficial to have at least two volumes in terms of manageability. With several LUNs you will at least have an ability to move VMs between LUNs for reconfiguration purposes. Also keep in mind that ESXi 5 hypervisor limit each host to storage queue of depth 32 per LUN. It means that if you have one big LUN and many VMs running on the host, you can quickly reach queue limit. On the other hand do not create too many LUNs or you will oversubscribe storage processors (SPs).

Sample configuration

IBM recommends constructing both RAID 10 and RAID 5 arrays from 8 drives + 1 spare drive. But since we have 16 SAS and 10 NL-SAS I would launch CLI and create two arrays: one 14 drives + 2 spares RAID 10 and one 8 drives + 2 spares RAID 5 (or 9 drives + 1 spare, but it’s not a good idea to create RAID with uneven number of drives). Each RAID in its own pool. Several LUNs in each pool. I would go for 2TB LUNs.