Posts Tagged ‘Filer’

How to move aggregates between NetApp controllers

September 25, 2013

Stop Sign_91602

 

DISCLAMER: I ACCEPT NO RESPONSIBILITY FOR ANY DAMAGE OR CORRUPTION OF DATA THAT MAY OCCUR AS A RESULT OF CARRYING OUT STEPS DESCRIBED BELOW. YOU DO THIS AT YOUR OWN RISK.

 

We had an issue with high CPU usage on one of the NetApp controllers servicing a couple of NFS datastores to VMware ESX cluster. HA pair of FAS2050 had two shelves, both of them owned by the first controller. The obvious solution for us was to reassign disks from one of the shelves to the other controller to balance the load. But how do you do this non-disruptively? Here is the plan.

In our setup we had two controllers (filer1, filer2), two shelves (shelf1, shelf2) both assigned to filer1. And two aggregates, each on its own shelf (aggr0 on shelf0, aggr1 on shelf1). Say, we want to reassign disks from shelf2 to filer2.

First step is to migrate all of the VMs from the shelf2 to shelf1. Because operation is obviously disruptive to the hosts accessing data from the target shelf. Once all VMs are evacuated, offline all volumes and an aggregate, to prevent any data corruption (you can’t take aggregate offline from online state, so change it to restricted first).

If you prefer to reassign disks in two steps, as described in NetApp Professional Services Tech Note #021: Changing Disk Ownership, don’t forget to disable automatic ownership assignment on both controllers, otherwise disks will be assigned back to the same controller again, right after you unown them:

> options disk.auto_assign off

It’s not necessary if you change ownership in one step as shown below.

Next step is to actually reassign the disks. Since they are already part of an aggregate you will need to force the ownership change:

filer1> disk assign 1b.01.00 -o filer2 -f

filer1> disk assign 1b.01.01 -o filer2 -f

filer1> disk assign 1b.01.nn -o filer2 -f

If you do not force disk reassignment you will get an error:

Assign request failed for disk 1b.01.0. Reason:Disk is part of a failed or offline aggregate or volume. Changing its owner may prevent aggregate or volume from coming back online. Ownership may be changed only by using the appropriate force option.

When all disks are moved across to filer2, new aggregate will show up in the list of aggregates on filer2 and you’ll be able to bring it online. If you can’t see the aggregate, force filer to rescan the drives by running:

filer2> disk show

The old aggregate will still be seen in the list on filer1. You can safely remove it:

filer1> aggr destroy aggr1

Advertisements

Resize limits for SAN LUNs

July 1, 2013

If you run lun resize command on NetApp you might run into the following error:

lun resize: New size exceeds this LUN’s initial geometry

The reason behind it is that each SAN LUN has head/cylinder/sector geometry. It’s not an actual physical mapping to the underlying disks and has no meaning these days. It’s simply a SCSI protocol artifact. But it imposes limitation on maximum LUN resize. Geometry is chosen at initial LUN creation and cannot be changed. Roughly you can resize the LUN to the size, which is 10 times bigger than the size at the time of creation. For example, the 50GB LUN can be extended to the maximum of 502GB. See the table below for the maximum sizes:

Initial Size   Maximum Sizedata-storage
< 50g          502g
51-100g        1004g
101-150g       1506g
151-200g       2008g
201-251g       2510g
252-301g       3012g
302-351g       3514g
352-401g       4016g

To check the maximum size for particular LUN use the following commands:

> priv set diag
> lun gemetry lun_path
> priv set

If you run into this issue, unfortunately you will need to create a new LUN, copy all the data using robocopy for example and make a cutover. Because such features as volume level SnapMirror or ndmpcopy will recreate the LUN’s geometry together with the data.

Magic behind NetApp VSC Backup/Restore

June 12, 2013

netapp_dpNetApp Virtual Storage Console is a plug-in for VMware vCenter which provides capabilities to perform instant backup/restore using NetApp snapshots. It uses several underlying NetApp features to accomplish its tasks, which I want to describe here.

Backup Process

When you configure a backup job in VSC, what VSC does, is it simply creates a NetApp snapshot for a target volume on a NetApp filer. Interestingly, if you have two VMFS datastores inside one volume, then both LUNs will be snapshotted, since snapshots are done on the volume level. But during the datastore restore, the second volume will be left intact. You would think that if VSC reverts the volume to the previously made snapshot, then both datastores should be affected, but that’s not the case, because VSC uses Single File SnapRestore to restore the LUN (this will be explained below). Creating several VMFS LUNs inside one volume is not a best practice. But it’s good to know that VSC works correctly in this case.

Same thing for VMs. There is no sense of backing up one VM in a datastore, because VSC will make a volume snapshot anyway. Backup the whole datastore in that case.

Datastore Restore

After a backup is done, you have three restore options. The first and least useful kind is a datastore restore. The only use case for such restore that I can think of is disaster recovery. But usually disaster recovery procedures are separate from backups and are based on replication to a disaster recovery site.

VSC uses NetApp’s Single File SnapRestore (SFSR) feature to restore a datastore. In case of a SAN implementation, SFSR reverts only the required LUN from snapshot to its previous state instead of the whole volume. My guess is that SnapRestore uses LUN clone/split functionality in background, to create new LUN from the snapshot, then swap the old with the new and then delete the old. But I haven’t found a clear answer to that question.

For that functionality to work, you need a SnapRestore license. In fact, you can do the same trick manually by issuing a SnapRestore command:

> snap restore -t file -s nightly.0 /vol/vol_name/vmfs_lun_name

If you have only one LUN in the volume (and you have to), then you can simply restore the whole volume with the same effect:

> snap restore -t vol -s nightly.0 /vol/vol_name

VM Restore

VM restore is also a bit controversial way of restoring data. Because it completely removes the old VM. There is no way to keep the old .vmdks. You can use another datastore for particular virtual hard drives to restore, but it doesn’t keep the old .vmdks even in this case.

VSC uses another mechanism to perform VM restore. It creates a LUN clone (don’t confuse with FlexClone,which is a volume cloning feature) from a snapshot. LUN clone doesn’t use any additional space on the filer, because its data is mapped to the blocks which sit inside the snapshot. Then VSC maps the new LUN to the ESXi host, which you specify in the restore job wizard. When datastore is accessible to the ESXi host, VSC simply removes the old VMDKs and performs a storage vMotion from the clone to the active datastore (or the one you specify in the job). Then clone is removed as part of a clean up process.

The equivalent cli command for that is:

> lun clone create /vol/clone_vol_name -o noreserve -b /vol/vol_name nightly.0

Backup Mount

Probably the most useful way of recovery. VSC allows you to mount the backup to a particular ESXi host and do whatever you want with the .vmdks. After the mount you can connect a virtual disk to the same or another virtual machine and recover the data you need.

If you want to connect the disk to the original VM, make sure you changed the disk UUID, otherwise VM won’t boot. Connect to the ESXi console and run:

# vmkfstools -J setuuid /vmfs/volumes/datastore/VM/vm.vmdk

Backup mount uses the same LUN cloning feature. LUN is cloned from a snapshot and is connected as a datastore. After an unmount LUN clone is destroyed.

Some Notes

VSC doesn’t do a good cleanup after a restore. As part of the LUN mapping to the ESXi hosts, VSC creates new igroups on the NetApp filer, which it doesn’t delete after the restore is completed.

What’s more interesting, when you restore a VM, VSC deletes .vmdks of the old VM, but leaves all the other files: .vmx, .log, .nvram, etc. in place. Instead of completely substituting VM’s folder, it creates a new folder vmname_1 and copies everything into it. So if you use VSC now and then, you will have these old folders left behind.

NetApp SSH Connection Times Out

May 31, 2013

PuTTYPortable_128There is one tricky thing about SSH connections to NetApp filers. If you use PuTTY or PuTTY Connection Manager and you experience frequent timeouts from ssh sessions, you might need to fiddle around with PuTTY configuration options. It seems that there is some issue with how Data ONTAP implements SSH key exchanges, which results in frequent annoying disconnections.

In order to fix that, on PuTTY Configuration screen go to Connection -> SSH -> Bugs and change “Handles SSH2 key re-exchange badly” to ‘On’. That should fix it.

Unexpected Deduplication Impact on VMware I/O Latency

May 28, 2013

NetApp deduplication is a postponed process. During normal operation Data ONTAP only calculates hashes for the data blocks. Actual deduplication is carried out off-hours as per configured schedule. Hash calculation doesn’t affect performance in most cases. I talked about that in my previous post. NetApp states in its documentation that deduplication is a low-priority process:

When one deduplication process is running, there is 0% to 15% performance degradation on other applications.

Once I faced a situation when deduplication was configured to be carried out during business hours on one of the volumes. No one noticed that at some point volume run out of space and Data ONTAP wasn’t able to perform deduplication from that time. Situation became worse when Data ONTAP was upgraded from version 7.3.2 to 8.1.0. Now during deduplication filer tried to upgrade the fingerprint metadata to a new version at 15:00 every day with the message: “Fingerprint is being upgraded” and failed. It seems that the metadata upgrade is a very resource-intensive process and heavily affects I/O latency.

This volume was not a VMware datastore, but it sit on the same aggregate together with the several VMFS LUNs. Here what happened to the VMware I/O latency every day at 15:00 (click to enlarge):

dedup_issue_ed

I deleted the host name and the datastores names from the graph. You can see the large latency spike, which won’t turn yourVMs into kernel panic, but it’s not the thing you would want your production environment to experience every day.

The solution was simple. After space was increased on this volume, deduplication metadata upgrade performed successfully and problem went away. Additionally, deduplication was shifted to off-hours.

The simple lesson to learn: don’t schedule deduplication during the day, you never know what could possibly go wrong.

NetApp thin-provisioning for VMware LUNs

May 22, 2013

thin

LUN and Volume Thin Provisioning

I already described thin provisioning of VMware NFS volumes some time ago here. Now I want to discuss thin provisioning of LUNs.

LUNs are different from VMFS on top of NFS implementation, because LUN is an additional container inside of NetApp FlexVol. So if you’re using FC, you need to thin provision both LUN and volume:

> lun set reservation “/vol/targetvol/targetlun” disable
> vol options “targetvol” guarantee none

In fact, you can make the LUN thin and the volume thick. Then storage space that’s not used by the LUN, is returned to the volume level. But in this case it cannot be used by other volumes as a shared pool of space.

As the best practice, NetApp now recommends to set Fractional Reserve and Snap Reserve for your volumes to 0%. Don’t forget about that, if you want to save more storage space:

> vol options “targetvol” fractional_reserve 0
> snap reserve “targetvol” 0

Disable snapshots if you don’t use them:

> snap sched “targetvol” 0

It’s easy as that. Now you don’t waste your space by reserving it ahead, but use it as a shared pool of resources. But make sure to monitor aggregate free space. If you starting to run out of storage, plan purchase of new disks in advance or redistribute data between other aggregates.

Safety Features

Disabling volume level, LUN and snapshot reservations helps you to save storage space. The drawback of this approach is that you don’t have any mechanisms in place to prevent volume out-of-space situations. If you enable snapshots on the volume and they consume all the volume space, the volume goes offline. Very undesirable consequence. NetApp has two features that can serve as safety net in thin-provisioned environments: autosize and snap autodelete.

Snap autodelete automatically removes old snapshots if there is no space left inside the volume. Autosize, on the other hand, allows the volume to automatically grow to the specified limit (+20% to the volume size by default) in specified increments (5% of the volume size by the default). You can also specify what to do first autosize or snapshot autodelete by using ‘try_first’ option.

> snap autodelete “targetvol” on
> vol autosize “targetvol” on
> vol options “targetvol” try_first volume_grow

SnapMirror Considerations

If you use SnapMirroring and switch on the autosize on the source volume, then the destination volume won’t grow automatically. And SnapMirror will break the relationship if it runs out of space on the smaller destination volume. The trick here is to make the destination volume as big as the autosize limit for the source volume and thin provision the destination volume. By doing that you won’t run out of space on destination even if the source volume grows to its maximum.

Further reading

TR-3965: NetApp Thin Provisioning Deployment and Implementation Guide Data ONTAP 8.1 7-Mode

Disk to Disk to Tape backup in Backup Exec

July 14, 2012

Notice: It seems that D2D2T feature in Backup Exec 11d is buggy. D2D2T duplicate jobs (which transfer data from disk to tape) are insanely slow and nobody has yet solved this problem. You can try to implement backup of raw Backup to Disk Folder, but it is associated with number of  difficulties when restoring. Files from Backup to Disk Folders are media and they conflict with media which is currently used for backup.

Typical backup solution in most organizations consists of backup server and tape drive/autoloader/tape library connected directly to backup server. Every night backups are pushed to tape through backup server. But sometimes it is more complicated. We have NetApp filer with StorageTek tape library connected to the filer. Backup server sends NDMP commands to the filer and filer in its turn performs actual data transfer to tapes from disk shelves. Most of our hosts are VMware virtual machines. We backup whole .vmdk files, but we also want to perform file-level backups from some of virtual machines. To accomplish that we set up backup agents on all virtual machines, but we can’t backup files directly to tapes, because they do not originate from NetApp filer volumes. So we decided to implement D2D2T multistage backup. The idea here is to create a CIFS share on the filer, backup data there and then transfer data from CIFS share to tapes.

First step here is to configure disk to disk backup. Backup Exec stores disk to disk backups in binary files. Folder where files are stored is listed on Devices tab and files are listed on Media tab. Initially, you need to create a Backup to Disk Folder in Devices tab. There you choose size for backup-to-disk files and maximum number of backups per backup-to-disk file. If backup is larger than file size, it is splitted in several files. If file size is smaller than backup, several backups will be written to one file. I use defaults with 16GB file size. Then you create backup jobs as usual (by configuring selection list and policy) using Backup to Disk Folder as target device.

As a second step you need to instruct Backup Exec to transfer backed up files to tape, upon disk to disk job completion. Backup Exec has “duplicate jobs” to implement that. Go to your backup policy properties, click “New Template”, choose “Duplicate Backup Sets Template”, pick template for which you want to create duplicate, in “Devices and Media” choose your tape library, in “Schedule” choose “Run only according to rules for this template”. This will create duplicate template and rule which will start duplicate job after main job completes. As a result you will have duplicate data on disk and on tape.

Re-enable SCSI adapters on NetAapp

July 9, 2012

If your tape library or any other device is connected to NetApp’s filer SCSI adapter and you encounter problems with the device, try to re-enable adapters:

storage disable adapter 0e
storage enable adapter 0e

If filer says that adapter is busy and cannot be disabled, then force it:

storage disable -f adapter 0e

In case it’s a tape library, don’t forget to also re-enable SCSI adapters where tape drives are connected.

NetApp NDMP with Symantec BackupExec

March 16, 2012

Some time ago I uploaded a bunch of photos from the data center, where you can find our backup setup. We connect Sun StorageTek SL500 tape library directly to NetApp filer to perform backups of the virtual infrastructure using NDMP protocol. As opposed to LAN backup, NDMP allows you to offload LAN from backup traffic. Look at the following picture:

Here BackupExec only sends NDMP control commands to NDMP host, which in its turn send data to directly attached tape library. We use slightly more complicated 3-way backup architecture:

We have two filers in high availability cluster. And each of the filers has its own hard drive shelves and data. Filer under number 3 on the picture is the primary source of backup data and data from filer 2 is backed up occasionally. Since filer 2 has no connection to the library, when backup is initiated it is send via LAN from filer 2 to filer 3 and then to the tape library.

NetApp configuration

NDMP configuration involves several steps. First of all enable ndmpd on NetApp and set version 4, which Symantec BackupExec works with:

ndmpd on
ndmpd version 4

Then it’s a generally good idea to restrict NDMP access only to particular hosts and interface, because by default access is allowed from anywhere. In our setup NDMP traffic goes through completely isolated management network. We added two IP addresses to allowed hosts. First is the backup server and second is the partner filer:

options ndmpd.access hosts=ip_1,ip_2
options ndmpd.access if=manage_if

Then I’d recommend to create separate user for NMDP backups, change its group to Backup Operators and create special ndmp password which you will use to connect from BackupExec:

useradmin useradd backup
useradmin user modify backup -g “Backup Operators”
ndmpd password backup

As a last recommendation I suggest changing preferred network interface for data connections. By default for data traffic filer uses the same network interface from which it receives control commands. But if you have separate network for filer to filer communications its preferable to use it. In our configuration it’s the same management interface so for us it doesn’t make any difference:

options ndmpd.preferred_interface manage_if

Additionally you can use the following command to list your tape library robots:

storage show mc

Do the same configuration for all filers, if you have more than one.

BackupExec configuration

For NDMP to work in BackupExec you should obtain a licence key and install NDMP Option module. Then go to Devices section, click Add NDMP Server. In Add NDMP Server dialog box specify server name and logon account. If you have more than one filer, do it for each one.

That’s it. Now you have filer volumes in backup selection lists, tapes in Media section and you are ready to do backups.

NetApp thin provisioning for VMware

March 15, 2012

Thin provisioning is a popular buzzword, especially when it comes to NetApp. However, it can really save you time and headache in a number of situations. We use thin provisioning while presenting NFS volumes from NetApp to VI3 ESX hosts. Well, NetApp already let you change size of its FlexVol volumes on the fly. But you need to do it manually. Thin provisioning helps you to configure volumes so that in case of space shortage on a volume it will automatically expand without manual intervention. Of course you need to look after your volumes, otherwise they can fill all your storage space. But it will save you enough time to resolve data growth problem. Without thin provisioning in such situation your applications can easily crash.

NetApp doesn’t support iSCSI thin provisioning for VMware, so NFS is the only option. Don’t be afraid of performance issues. Without a doubt it’s slower than FC, but NetApp is famous for its NFS performance and it’s very well suited for mid-level workloads.

To be more specific, using thin provisioning you can create say 300GB virtual hard drive for particular VM and it will initially use no space. Then it will grow as long as you fill it. It can save you tremendous amount of storage space. Because you never exactly know ahead how much space you need. But be aware, if you will try to migrate thin provisioned virtual hard drive using storage migration plugin for VMware Virtual Center then it will fill all space. It means 300GB will use all 300GB even if it’s half-full.

The best article which will help you to integrate NetApp with VMware VI3 is NetApp TR-3428: NetApp and VMware Virtual Infrastructure 3 Storage Best Practices. What I will write here are basically excerpts from this article.

NetApp Configuration

Lets start from the NetApp configuration. First thing to do is to disable snapshots as usual. Generally it’s not a good idea to make snapshots of VMware virtual hard drives on the fly. They won’t be consistent. I will touch this topic in my later posts.

> snap sched <vol-name> 0 0 0
> snap reserve <vol-name> 0

Next step is to disable access time update on the volume, which is safe because VMware doesn’t rely on accurate access time for its files. It will increase performance, since Filer won’t need to update access time for files each time they are read or written.

> vol options <vol-name> no_atime_update on

Then configure the thin provisioning feature itself by switching volume auto size policy to on. It has two keys -m and -i. By -m you set maximum volume size and by -i you configure increment size.

> vol autosize <vol-name> [-m <size>[k|m|g|t]] [-i <size>[k|m|g|t]] on

NetApp recommends to disable Fractional Reserve for thin provisioned volumes, it’s just not needed anymore. Fractional Reserve guarantees successful writes to volumes in case you use snapshots. According to how snapshots work if you completely overwrite snapshot data you will use double amount of storage space. And it’s where Fractional Reserve comes into place. It reserves 100% of additional space for such cases. It means you will never run into situation when you are out of space due to active snapshots. But since we enabled auto size, our volume will resize on demand and Fractional Reserve becomes redundant. Supposedly auto size was implemented little bit later than Fractional Reserve and we have both of them in NetApp.

> vol options <vol-name> fractional_reserve 0

In case you use snapshots as a tool for instant VMware block level backups you can change auto delete policy. I said earlier that you should disable snapshot schedule, however you can manually (using scripts) create consistent snapshots. If you want to do that then you can additionally instruct NetApp to delete oldest snapshots when you are out of space on Filer and can’t auto grow volume.

> snap autodelete <vol-name> commitment try trigger volume target_free_space 5 delete_order oldest_first
> vol options <vol-name> try_first volume_grow

Now we need to create NFS export on NetApp Filer. It’s where FilerView interface comes handy. In short, you should give your ESX hosts read-write access, root access and configure Unix security style.

VMware Configuration

VMware configuration is trivial. Go to VMware Add Storage Wizard, select Network File System, then point to your NetApp filer and specify your volume path. Additionally NetApp recommends to tune NFS heartbeat parameters. Go to Host Configuration – Advanced Settings – NFS and for ESX 3.0 hosts change:

NFS.HeartbeatFrequency to 5 from 9
NFS.HeartbeatMaxFailures to 25 from 3

For ESX 3.5 hosts change:

NFS.HeartbeatFrequency to 12
NFS.HeartbeatMaxFailures to 10

There are much more information and tuning parameters that you might want to read about. Find some time to look through TR-3428 in case you need some clarifications or additional info.