Posts Tagged ‘disk’

How to move aggregates between NetApp controllers

September 25, 2013

Stop Sign_91602




We had an issue with high CPU usage on one of the NetApp controllers servicing a couple of NFS datastores to VMware ESX cluster. HA pair of FAS2050 had two shelves, both of them owned by the first controller. The obvious solution for us was to reassign disks from one of the shelves to the other controller to balance the load. But how do you do this non-disruptively? Here is the plan.

In our setup we had two controllers (filer1, filer2), two shelves (shelf1, shelf2) both assigned to filer1. And two aggregates, each on its own shelf (aggr0 on shelf0, aggr1 on shelf1). Say, we want to reassign disks from shelf2 to filer2.

First step is to migrate all of the VMs from the shelf2 to shelf1. Because operation is obviously disruptive to the hosts accessing data from the target shelf. Once all VMs are evacuated, offline all volumes and an aggregate, to prevent any data corruption (you can’t take aggregate offline from online state, so change it to restricted first).

If you prefer to reassign disks in two steps, as described in NetApp Professional Services Tech Note #021: Changing Disk Ownership, don’t forget to disable automatic ownership assignment on both controllers, otherwise disks will be assigned back to the same controller again, right after you unown them:

> options disk.auto_assign off

It’s not necessary if you change ownership in one step as shown below.

Next step is to actually reassign the disks. Since they are already part of an aggregate you will need to force the ownership change:

filer1> disk assign 1b.01.00 -o filer2 -f

filer1> disk assign 1b.01.01 -o filer2 -f

filer1> disk assign 1b.01.nn -o filer2 -f

If you do not force disk reassignment you will get an error:

Assign request failed for disk 1b.01.0. Reason:Disk is part of a failed or offline aggregate or volume. Changing its owner may prevent aggregate or volume from coming back online. Ownership may be changed only by using the appropriate force option.

When all disks are moved across to filer2, new aggregate will show up in the list of aggregates on filer2 and you’ll be able to bring it online. If you can’t see the aggregate, force filer to rescan the drives by running:

filer2> disk show

The old aggregate will still be seen in the list on filer1. You can safely remove it:

filer1> aggr destroy aggr1


Magic behind NetApp VSC Backup/Restore

June 12, 2013

netapp_dpNetApp Virtual Storage Console is a plug-in for VMware vCenter which provides capabilities to perform instant backup/restore using NetApp snapshots. It uses several underlying NetApp features to accomplish its tasks, which I want to describe here.

Backup Process

When you configure a backup job in VSC, what VSC does, is it simply creates a NetApp snapshot for a target volume on a NetApp filer. Interestingly, if you have two VMFS datastores inside one volume, then both LUNs will be snapshotted, since snapshots are done on the volume level. But during the datastore restore, the second volume will be left intact. You would think that if VSC reverts the volume to the previously made snapshot, then both datastores should be affected, but that’s not the case, because VSC uses Single File SnapRestore to restore the LUN (this will be explained below). Creating several VMFS LUNs inside one volume is not a best practice. But it’s good to know that VSC works correctly in this case.

Same thing for VMs. There is no sense of backing up one VM in a datastore, because VSC will make a volume snapshot anyway. Backup the whole datastore in that case.

Datastore Restore

After a backup is done, you have three restore options. The first and least useful kind is a datastore restore. The only use case for such restore that I can think of is disaster recovery. But usually disaster recovery procedures are separate from backups and are based on replication to a disaster recovery site.

VSC uses NetApp’s Single File SnapRestore (SFSR) feature to restore a datastore. In case of a SAN implementation, SFSR reverts only the required LUN from snapshot to its previous state instead of the whole volume. My guess is that SnapRestore uses LUN clone/split functionality in background, to create new LUN from the snapshot, then swap the old with the new and then delete the old. But I haven’t found a clear answer to that question.

For that functionality to work, you need a SnapRestore license. In fact, you can do the same trick manually by issuing a SnapRestore command:

> snap restore -t file -s nightly.0 /vol/vol_name/vmfs_lun_name

If you have only one LUN in the volume (and you have to), then you can simply restore the whole volume with the same effect:

> snap restore -t vol -s nightly.0 /vol/vol_name

VM Restore

VM restore is also a bit controversial way of restoring data. Because it completely removes the old VM. There is no way to keep the old .vmdks. You can use another datastore for particular virtual hard drives to restore, but it doesn’t keep the old .vmdks even in this case.

VSC uses another mechanism to perform VM restore. It creates a LUN clone (don’t confuse with FlexClone,which is a volume cloning feature) from a snapshot. LUN clone doesn’t use any additional space on the filer, because its data is mapped to the blocks which sit inside the snapshot. Then VSC maps the new LUN to the ESXi host, which you specify in the restore job wizard. When datastore is accessible to the ESXi host, VSC simply removes the old VMDKs and performs a storage vMotion from the clone to the active datastore (or the one you specify in the job). Then clone is removed as part of a clean up process.

The equivalent cli command for that is:

> lun clone create /vol/clone_vol_name -o noreserve -b /vol/vol_name nightly.0

Backup Mount

Probably the most useful way of recovery. VSC allows you to mount the backup to a particular ESXi host and do whatever you want with the .vmdks. After the mount you can connect a virtual disk to the same or another virtual machine and recover the data you need.

If you want to connect the disk to the original VM, make sure you changed the disk UUID, otherwise VM won’t boot. Connect to the ESXi console and run:

# vmkfstools -J setuuid /vmfs/volumes/datastore/VM/vm.vmdk

Backup mount uses the same LUN cloning feature. LUN is cloned from a snapshot and is connected as a datastore. After an unmount LUN clone is destroyed.

Some Notes

VSC doesn’t do a good cleanup after a restore. As part of the LUN mapping to the ESXi hosts, VSC creates new igroups on the NetApp filer, which it doesn’t delete after the restore is completed.

What’s more interesting, when you restore a VM, VSC deletes .vmdks of the old VM, but leaves all the other files: .vmx, .log, .nvram, etc. in place. Instead of completely substituting VM’s folder, it creates a new folder vmname_1 and copies everything into it. So if you use VSC now and then, you will have these old folders left behind.

Mounting VMware Virtual Disks

June 11, 2013

H_Storage04There are millions of posts on that topic all over the Internet. Just another repetition mostly for myself.

VMware has Virtual Disk Development Kit (VDDK) which is more of an API for backup software vendors. But it includes a handy tool called vmware-mount, which gives you an ability to mount VMware virtual disks (.vmdk) from wherever you want.

Download VDDK from VMware site. It’s free. And then run vmware-mount with the following keys:

> vmware-mount driveletter: “[vmfs_datastore] vmname/diskname.vmdk” /i:”datacentername/vm/vmname” /h:vcname /u:username /s:password

Choose drive letter, specify vmdk path, inventory path to VM (put ‘vm’ in lowercase between datacenter and vm name, upper case will give you an error) and vCenter or ESXi host name.

Note however, that you can mount only vmdks from powered off VMs. But there is a workaround. You can mount vmdk from online VMs in read-only mode if you make a VM snapshot. Then the original vmdk won’t be locked by ESXi server and you will be able to mount it.

To unmount a vmdk run:

> vmware-mount diskletter: /d

There are also several GUI tools to mount vmdks. But vmware-mount is enough for me.

Replacing hard drives in a NetApp aggregate

May 30, 2013

netapp_disk_driveNetApp uses certain rules to assign hot spares in case of a failure. It always tries to use the exact match, but if it’s not there, the best available spare is used. “The best” means that if you have an aggregate which consists of 1TB hard drives and you have only 2TB spare left, then this 2TB spare will be downsized to 1TB and used as a data disk. After that, when you receive a correct size replacement from NetApp, you need to exchange the downsized 2TB hard drive with the delivered 1TB spare. To accomplish that, use the following command:

> disk replace start disk_name spare_disk_name

It will take considerable amount of time to copy the data. In my case it was 6.5 hours for a 1TB drive.

When the process finishes, replaced drive becomes a new spare. It’s wise to zero it out right away, so that it could be easily used again as a spare. Otherwise when time comes you’ll be waiting hours before it could be added in place of the failed drive:

> disk zero spares

As a side note I want to mention that you cannot take disks out of the raid group. There is no way to shrink aggregates. The only thing you can make is to replace a hard drive with another one.

NetApp storage architecture

October 9, 2011

All of us are get used to SATA disk drives connected to our workstations and we call it storage. Some organizations has RAID arrays. RAID is one level of logical abstraction which combine several hard drives to form logical drive with greater size/reliability/speed. What would you say if I’d tell you that NetApp has following terms in its storage architecture paradigm: disk, RAID group, plex, aggregate, volume, qtree, LUN, directory, file. Lets try to understand how all this work together.

RAID in NetApp terminology is called RAID group. Unlike ordinary storage systems NetApp works mostly with RAID 4 and RAID-DP. Where RAID 4 has one separate disk for parity and RAID-DP has two. Don’t think that it leads to performance degradation. NetApp has very efficient implementation of these RAID levels.

Plex is collection of RAID groups and is used for RAID level mirroring. For instance if you have two disk shelves and SyncMirror license then you can create plex0 from first shelf drives and plex1 from second shelf.  This will protect you from one disk shelf failure.

Aggregate is simply a highest level of hardware abstraction in NetApp and is used to manage plexes, raid groups, etc.

Volume is a logical file system. It’s a well-known term in Windows/Linux/Unix realms and serves for the same goal. Volume may contain files, directories, qtrees and LUNs. It’s the highest level of abstraction from the logical point of view. Data in volume can be accessed by any of protocols NetApp supports: NFS, CIFS, iSCSI, FCP, WebDav, HTTP.

Qtree can contain files and directories or even LUNs and is used to put security and quota rules on contained objects with user/group granularity.

LUN is necessary to access data via block-level protocols like FCP and iSCSI. Files and directories are used with file-level protocols NFS/CIFS/WebDav/HTTP.