Posts Tagged ‘storage vMotion’

Requirements for Unmounting a VMware Datastore

December 30, 2015

I have come across issues unmounting VMware datastores myself multiple times. In recent vSphere versions vCenter shows you a warning if some of the requirements are not fulfilled. It is not the case in the older vSphere versions, which makes it harder to identify the issue.

Interestingly, there are some pre-requisites which even vCenter does not prompt you about. I will discuss all of the requirements in this post.

General Requirements

In this category I combine all requirements which vCenter checks against, such as:

Requirement: No virtual machine resides on the datastore.

Action: You have to make sure that the host you are unmounting the datastore from has no virtual machines (running or stopped) registered on this datastore.  If you are unmounting just one datastore from just one host, you can simply vMotion all VMs residing on the datastore from this host to the remaining hosts. If you are unmounting the datastore from all hosts, you’ll have to either Storage vMotion all VMs to the remaining datastores or shutdown the VMs and unregister them from vCenter.

unmount_vmfs2

Requirement: The datastore is not part of a Datastore Cluster.

Requirement: The datastore is not managed by storage DRS.

Action: Drag and drop the datastore from the Datastore Cluster in vCenter to move it out of the Datastore Cluster. Second requirement is redundant, because SDRS is enabled on a datastore which is configured withing a Datastore Cluster. By removing a datastore from a Datastore Cluster you atomatically disable storage DRS on it.

Requirement: Storage I/O control is disabled for this datastore.

Action: Go to the datastore properties and uncheck Storage I/O Control option. On a SIOC-enabled datastore vSphere creates a folder named after the block device ID and keeps a file called “slotsfile” in it. Its size will change to 0.00 KB once SIOC is disabled.

Requirement: The datastore is not used for vSphere HA heartbeat.

Action: vSphere HA automatically selects two VMware datastores, creates .vSphere-HA folders and use them to keep HA heartbeats. If you have more than two datastores in your cluster, you can control which datastores are selected. Go to cluster properties > Datastore Heartbeating (under vSphere HA section) and select preferred datastores from the list. This will work if you are unmounting one datastore. If you need to unmount all datastores, you will have to disable HA on the cluster level altogether.

datastore_heartbeat

Additional Requirements

Requirements which fall in this category are not checked by vCenter, but are still have to be satisfied. Otherwise vCenter will not let you unmount the datastore.

Requirement: The datastore is not used for swap.

Action: When VM is powered on by default it creates a swap file in the VM directory with .vswp extension. You can change the default behavior and on a per host basis select a dedicated datastore where host will be creating swap files for virtual machines. This setting is enabled in cluster properties in Swapfile Location section. The datastore is then selected for each host in Virtual Machine Swapfile Location settings on the the host configuration tab.

What host also does when you enable this option is it creates a host local swap file, which is named something like this: sysSwap-hls-55de2f14-6c5d-4d50-5cdf-000c296fc6a7.swp

There are scenarios where you need to unmount the swap datastore, such as when you say need to reconnect all of your storage from FC to iSCSI. Even if you shutdown all of your VMs, datastore unmount will fail because the host swap files are still there and you will see an error such as this:

The resource ‘Datastore Name: iSCSI1 VMFS uuid: 55de473c-7f3ae2b5-f9f8-000c29ba113a’ is in use.

See the error stack for details on the cause of the problem.

Error Stack:

Call “HostStorageSystem.UnmountVmfsVolume” for object “storageSystem-29” on vCenter Server “VC.lab.local” failed.

Cannot unmount volume ‘Datastore Name: iSCSI1 VMFS uuid: 55de473c-7f3ae2b5-f9f8-000c29ba113a’ because file system is busy. Correct the problem to retry the operation.

The workaround is to change the setting on the cluster level to store VM swap file in VM directory and reboot all hosts. After a reboot the host .swp file will disappear.

If rebooting the hosts is not desirable, you can SSH to each host and type the following command:

# esxcli sched swap system set –hostlocalswap-enabled false

To confirm that the change has taken effect run:

# esxcli sched swap system get

Then check the datastore and the .swp files should no longer be there.

Conclusion

If you satisfy all of the above requirements you should have no problems when unmounting VMware datastores. vSphere creates a few additional system folders on each of the datastores, such as .sdd.sf and .dvsData, but I personally have never had issues with them.

Advertisements

Magic behind NetApp VSC Backup/Restore

June 12, 2013

netapp_dpNetApp Virtual Storage Console is a plug-in for VMware vCenter which provides capabilities to perform instant backup/restore using NetApp snapshots. It uses several underlying NetApp features to accomplish its tasks, which I want to describe here.

Backup Process

When you configure a backup job in VSC, what VSC does, is it simply creates a NetApp snapshot for a target volume on a NetApp filer. Interestingly, if you have two VMFS datastores inside one volume, then both LUNs will be snapshotted, since snapshots are done on the volume level. But during the datastore restore, the second volume will be left intact. You would think that if VSC reverts the volume to the previously made snapshot, then both datastores should be affected, but that’s not the case, because VSC uses Single File SnapRestore to restore the LUN (this will be explained below). Creating several VMFS LUNs inside one volume is not a best practice. But it’s good to know that VSC works correctly in this case.

Same thing for VMs. There is no sense of backing up one VM in a datastore, because VSC will make a volume snapshot anyway. Backup the whole datastore in that case.

Datastore Restore

After a backup is done, you have three restore options. The first and least useful kind is a datastore restore. The only use case for such restore that I can think of is disaster recovery. But usually disaster recovery procedures are separate from backups and are based on replication to a disaster recovery site.

VSC uses NetApp’s Single File SnapRestore (SFSR) feature to restore a datastore. In case of a SAN implementation, SFSR reverts only the required LUN from snapshot to its previous state instead of the whole volume. My guess is that SnapRestore uses LUN clone/split functionality in background, to create new LUN from the snapshot, then swap the old with the new and then delete the old. But I haven’t found a clear answer to that question.

For that functionality to work, you need a SnapRestore license. In fact, you can do the same trick manually by issuing a SnapRestore command:

> snap restore -t file -s nightly.0 /vol/vol_name/vmfs_lun_name

If you have only one LUN in the volume (and you have to), then you can simply restore the whole volume with the same effect:

> snap restore -t vol -s nightly.0 /vol/vol_name

VM Restore

VM restore is also a bit controversial way of restoring data. Because it completely removes the old VM. There is no way to keep the old .vmdks. You can use another datastore for particular virtual hard drives to restore, but it doesn’t keep the old .vmdks even in this case.

VSC uses another mechanism to perform VM restore. It creates a LUN clone (don’t confuse with FlexClone,which is a volume cloning feature) from a snapshot. LUN clone doesn’t use any additional space on the filer, because its data is mapped to the blocks which sit inside the snapshot. Then VSC maps the new LUN to the ESXi host, which you specify in the restore job wizard. When datastore is accessible to the ESXi host, VSC simply removes the old VMDKs and performs a storage vMotion from the clone to the active datastore (or the one you specify in the job). Then clone is removed as part of a clean up process.

The equivalent cli command for that is:

> lun clone create /vol/clone_vol_name -o noreserve -b /vol/vol_name nightly.0

Backup Mount

Probably the most useful way of recovery. VSC allows you to mount the backup to a particular ESXi host and do whatever you want with the .vmdks. After the mount you can connect a virtual disk to the same or another virtual machine and recover the data you need.

If you want to connect the disk to the original VM, make sure you changed the disk UUID, otherwise VM won’t boot. Connect to the ESXi console and run:

# vmkfstools -J setuuid /vmfs/volumes/datastore/VM/vm.vmdk

Backup mount uses the same LUN cloning feature. LUN is cloned from a snapshot and is connected as a datastore. After an unmount LUN clone is destroyed.

Some Notes

VSC doesn’t do a good cleanup after a restore. As part of the LUN mapping to the ESXi hosts, VSC creates new igroups on the NetApp filer, which it doesn’t delete after the restore is completed.

What’s more interesting, when you restore a VM, VSC deletes .vmdks of the old VM, but leaves all the other files: .vmx, .log, .nvram, etc. in place. Instead of completely substituting VM’s folder, it creates a new folder vmname_1 and copies everything into it. So if you use VSC now and then, you will have these old folders left behind.

Limiting the number of concurrent storage vMotions

June 6, 2013

vmw-dgrm-vsphr-087b-diagram1VMware vCenter allows several concurrent storage vMotions on a datastore. But it can negatively impact your production environment, by hammering your underlying storage. If you want to migrate several virtual machines to another datastore, it’s much safer to do that one by one. But it’s too much manual work.

There is a simple way to limit the number of concurrent storage vMotions by configuring vCenter advanced settings. There are a group of resource management parameters for network, host and datastore limits which apply to vMotion and Storage vMotion. They are called limits and costs. For ESXi 4.1 default datastore limit for migration with Storage vMotion is 128. And datastore resource cost for Storage vMotion is 16 (defaults for other versions of ESXi can be found here: Limits on Simultaneous Migrations). It basically means that 8 concurrent storage vMotions is allowed for each datastore. So to allow only one storage vMotion at a time you can either change the limit to 16 or cost to 128.

Lets say we choose to change the cost to 128. There are two ways of doing it. The first one is to edit vCenter vpxd.cfg file and add the following stanza between <vpxd></vpxd> tags:

<ResourceManager>
<CostPerEsx41SVmotion>128</CostPerEsx41SVmotion>
</ResourceManager>

The second simpler one way is to edit vCenter -> Administration -> vCenter Server Settings -> Advanced Settings and add config.vpxd.ResourceManager.CostPerEsx41SVmotion key with value equal to 128. You will probably need to reboot vCenter after that.

There is one moment, however. If you migrate VMs from say 3 source datastores to 1 destination, then 3 concurrent storage vMotion will kick off. I do not know what is the reason for that, but that’s what I found from the practice.