Archive for the ‘VMware’ Category

vSphere Host Profiles: Using Customization Files

February 15, 2020

Overview

If you own vSphere Enterprise Plus licences, using vSphere Host Profiles is a no brainer. Even if you rarely add ESXi hosts to your cluster, why configure them by hand if you can do that by a few mouse clicks in a fast and consistent manner.

Host profiles are usually created by setting up one ESXi host according to your requirements and then capturing its state. Some settings in a host profile are unique to each host, which include the host name, VMkernel adapter network settings, user name for joining host to AD, etc. When you apply your profile to a new unprepared host, vCenter will ask you to specify these settings. This step is called host customization.

You can either type these settings manually or if you want to take your automation game one step further, you can use a customization file, which is simply the list of setting in .csv format.

This feature was first introduced in vSphere 6.5 and official documentation is a bit light on this topic. Purpose of this post is to close this gap by demonstrating where to find this configuration option.

Create

To create a customization file, right click on a ESXi host and choose Host Profiles > Export Host Customizations. This host has to have host profile already applied to it (including all customization settings), otherwise this option will be grayed out. This can be the first host you used to capture the original host profile.

Open .csv file in your editor of choice and change settings accordingly. If you adding multiple hosts to your cluster, you can write a script to generate multiple copies of this file for each new ESXi host you’re adding.

Apply

Host customization settings are specified (manually or using a customization file) when host profile is being applied to the host. So first right click on the host and choose Host Profiles > Attach Host Profile. Then on Customize hosts page import customization file by clicking on the Browse button:

Note: If you hit the “Host settings validation failed” error after applying host customizations, read my blog article here that explains the problem.

Conclusion

Pretty simple, isn’t it? Key is to not forget that customization file can be specified either when you are applying host profile or, alternatively, you can skip host customization step and use Host Profiles > Edit Host Customizations later. For host that doesn’t have a host profile associated with it, Edit Host Customizations option will always be greyed out.

Using VMware SDK Support

December 5, 2019

I predict this post will get a single hit over its lifetime, but if it helps at least one person desperately trying to find out how to open a VMware SDK support request, that’s good enough for me.

Quick Overview

Not everyone knows, but VMware, along with support for vSphere, NSX and all the other software products, also provides SDK and API support. If you are a partner developing a solution that integrates with a VMware product or even a customer writing your own vSphere plug-in using vSphere Management SDK, you can reach out to VMware for help.

It’s a paid service. You can find detailed description of it on its landing page here: VMware SDK and API Support

How to open SRs

One thing that is not very obvious about the SDK support is how to open support requests if you’re a customer. The goal of this short post is to demonstrate where to find it on VMware support portal:

  1. Log in to My VMware portal using your account credentials
  2. Under the Support section click Get Support
  3. On the opened page, under “Technical” category, choose your issue type, such as “Fault/Crash”
  4. In the provided list of Supported Products expand SDK Support Services
  5. Select VMware SDK Support
  6. Click Continue and proceed with describing your issue and opening the ticket, as usual

This is a screenshot of what it will look like, if your account has been entitled to SDK support:

If you’re working with SpringSource, there is also a range of support option under the SprinSource Open Source Support sub-category.

Conclusion

I’ve had only brief interaction with SDK Support team, but can only say good things about them. One of the examples was a question I had on parameter specifications of a particular vSphere Web Management SDK function and I not only got an answer to my question, but I was also provided with code snippets, which I didn’t even ask for. So if you are serious about using VMware SDKs and think you may require technical support, I can certainly recommend this service.

Multiple vCenter Connections in PowerCLI

November 30, 2019

Connect-VIServer is a PowerCLI cmdlet which most of the PowerCLI scripts out there start with. It creates a connection to a vCenter server, which you can then use to run queries against a particular vSphere environment.

If you only need one vCenter connection, you don’t need to worry about how your subsequent PowerCLI cmdlets are ordered. They will all use this single vCenter by default for all requests. But if you need multiple simultaneous connections, you have to be more careful with you code or you can accidentally end up running commands against a wrong vCenter. Not a situation you want to find yourself in, especially if you’re making changes to the environment.

Goal of this blog post is to explain PowerCLI behaviour, when multiple vCenter connections are used. And how to best write your code to avoid common mistakes.

vCenter server parameter

If you are connected to more than one vCenter, the easiest way to specify which vCenter to use in a particular PowerCLI cmdlet is by using the -Server parameter. For example:

Get-VM -Server vcenter1.domain.local

By doing so, you will avoid any ambiguity in your code, since vCenter is always specified explicitly. This is the theory. In reality, I find, you will always have that one command where you forgot to provide -Server parameter, which will cause problems when you least expect it.

DefaultVIServer variable

If you’ve done any PowerCLI scripting before, you’ve most likely come across the $global:DefaultVIServer variable. When you connect to a vCenter using Connect-VIServer cmdlet, DefaultVIServer variable is set to that vCenter name. That way when you run PowerCLI cmdlets without using -Server parameter, they implicitly use vCenter from the DefaultVIServer variable as a target for the query. When you disconnect from the vCenter using Disconnect-VIServer, this variable is emptied.

The challenge with this approach is that the DefaultVIServer variable can only hold one vCenter name. If you connect/disconnect to/from multiple vCenter servers, you may end up in a situation where DefaultVIServer variable becomes empty, even though you still have an active vCenter connection. It’s easier to demonstrate it in the following script output:

Connecting to vCenter server vcenter1.domain.local
DefaultVIServer value: vcenter1.domain.local
Connecting to vCenter server vcenter2.domain.local
DefaultVIServer value: vcenter2.domain.local
Disconnecting from vCenter server vcenter2.domain.local
DefaultVIServer value:
Get-VM : 29/11/2019 6:17:41 PM Get-VM You are not currently connected to any servers. Please connect first using a Connect cmdlet.

As you can see, after we disconnect from the current default vCenter server vcenter2.domain.local, the variable becomes blank, until we connect to any other vCenter again. As a result, Get-VM cmdlet fails.

The error message is misleading. Connection is there and you can still run commands against the vcenter1.domain.local by specifying it in the -Server parameter. However, it defeats the purpose of DefaultVIServer variable, when using multiple simultaneous vCenter connections.

There is a way to fix that.

DefaultVIServer mode

This leads us up to the third option – changing DefaultVIServer mode. PowerCLI supports multiple default vCenter servers if you change DefaultVIServer to Multiple (default is Single):

Set-PowerCLIConfiguration -DefaultVIServerMode Multiple

As a result of this change, PowerCLI will start using DefaultVIServers array, which tracks all currently connected vCenters.

There are two implications of using Multiple connection mode:

  1. If you run a PowerCLI cmdlet without the -Server parameter, it will run against all connected vCenters at the same time – which is fine, as long as this is what you want.
  2. If you expect other users to run this script and they use Single connection mode, it can break your script. If that’s the case, make sure to explicitly set DefaultVIServer mode in the beginning of your script to avoid any unexpected behaviour.

Conclusion

Each option has its pros and cons. It’s up to you to choose what works best in your particular situation. But if you asked me, I would recommend disconnecting vCenter server session as soon as you no longer needed it. That way you can avoid any potential ambiguity in your code. Use multiple simultaneous vCenter connections only if you absolutely need it.

Creating vRealize Operations Manager Alerts Using REST API

September 11, 2018

Whenever I’m faced with a repetitive configuration task, I search for ways to automate it. There’s nothing more boring than sitting and clicking through the GUI for hours performing the same thing over and over again.

These days most of the products I work with support REST API interface, so scripting has become my solution of choice. But scripting requires you to know a scripting language, such as PowerShell, certain SDKs and APIs, like PowerCLI and REST and more importantly – time to write the script and test it. If you’re going to use this script regularly, in the long-term it’s worth the effort . But what if it’s a one-off task? You may well end up spending more time writing a script, than it takes to perform the task manually. In this case there are more practical ways to improve your efficiency. One of such ways is to use developer tools like Postman.

The idea is that you can write a REST request that applies a certain configuration setting and use it as a template to make multiple calls by slightly tweaking the parameters. You would have to change the parameters manually for each request, which is not as elegant as providing an array of variables to a script, but still much quicker than clicking through the GUI.

Recently I worked on a VMware Validated Design (VVD) deployment for a customer, which required configuring dozens of vRealize Operations Manager alerts as part of the build. So I will use it as an example to demonstrate how you can save yourself hours by doing it in Postman, instead of GUI.

Collect Alert Properties

To create an alert in vROps you will need to specify certain alert properties in the REST API call body. You will need at least:

  • “pluginId” – ID of the outbound plugin, which is usually the Standard Email Plugin
  • “emailaddr” – recipient email address
  • “values” property under the alertDefinitionIdFilters XML element – this is the alert definition ID
  • “resourceKind” – resource that the alert is applicable for, such as VirtualMachine, Datastore, etc.
  • “adapterKind” – this is the adapter that the alert comes from, such VMWARE, NSX, etc.

To determine the pluginId you will need to configure an outbound plugin in vROps and then make the following GET call to get the ID:

To find values for alert definition, resource kind and adapter kind properties, make the following get call and search for the alert name in the results:

Create Alert in vROps

To create an alert in vROps, you will need to make a POST call to the following URI in XML format:

  • Use the following request URL: https://vrops-hostname/suite-api/api/notifications/rules
  • Click on Headers tab and specify the following key “Content-Type” and value “application/xml”
  • Click on Body tab and choose raw, in the drop-down choose “XML (application/xml)”
  • Copy the following XML request to the body and use it as a template
<ops:notification-rule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:ops="http://webservice.vmware.com/vRealizeOpsMgr/1.0/">
<ops:name>
No data received for Windows platform
</ops:name>
<ops:pluginId>c5f60db9-eb5b-47c1-8545-8ba573c7d289</ops:pluginId>
<ops:alertControlStates/>
<ops:alertStatuses/>
<ops:criticalities/>
<ops:resourceKindFilter>
<ops:resourceKind>Windows</ops:resourceKind>
<ops:adapterKind>EP Ops Adapter</ops:adapterKind>
</ops:resourceKindFilter>
<ops:alertDefinitionIdFilters>
<ops:values>AlertDefinition-EP Ops
Adapter-Alert-system-availability-Windows</ops:values>
</ops:alertDefinitionIdFilters>
<ops:properties name="emailaddr">vrops@corp.local</ops:properties>
</ops:notification-rule>

As described before, make sure to replace the following properties with your own values: “pluginId”, “values” property under the alertDefinitionIdFilters XML element, resourceKind, adapterKind and emailaddr.

As a result of the REST API call you will get an alert created in vROps:

For every other alert you can keep the plugin ID and email address the same and update only the alert definition, resouce kind and adapter kind.

Conclusion

By using the same REST call and changing properties for each alert accordingly, I was able to finish the job much quicker and avoided hours of pain of clicking through the GUI. As long as you have a REST API endpoint to work with, the same approach can be applied to any repetitive task.

If you’d like to learn more, make sure to check out VMware {code} project here for more information about VMware product APIs and SDKs.

vSphere 6 Dump / Syslog Collector: PowerCLI Script

November 17, 2015

This is a quick update for a post I previously wrote on configuring vSphere 5 Syslog and Network Dump Collectors. You can find it here. This post will be about the changes in version 6.

Scripts I reposted for version 5 no longer work for version 6, so I thought I’d do an update. If you’re looking just for the updated scripts, simply scroll down to the end of the post.

What’s new in vSphere 6

If you look at the scripts all that’s changed is the order and number of the arguments. Which is not overly exciting.

What’s more interesting is that with vSphere 6 Syslog and ESXi Dump Collectors are no longer a separate install. They’re bundled with vCenter and you won’t see them as separate line items in the vCenter installer.

What I’ve also noticed is that ESXi Dump Collector service is not started automatically, so make sure to go to the services on the vCenter VM and start it manually.

Dump Collector vCenter plugin doesn’t seem to exist any more as well. But you are still able to see Syslog Collector settings in vCenter.

syslog_dump_collectors

Another thing worth mentioning here is also the directories where the logs and dumps are kept. In vCenter 6 they can be found by these paths:

C:\ProgramData\VMware\vCenterServer\data\vmsyslogcollector

C:\ProgramData\VMware\vCenterServer\data\netdump\Data

 

PowerShell Get-EsxCli Cmdlet

Also want to quickly touch on the fact that the below scripts are written using the Get-EsxCli cmdlet to get a EsxCli object and then directly invoke its methods.  Which I find not very ideal, as it’s not clear what each of the arguments actually mean and because the script gets broken every time the number or order of the arguments changes. Which is exactly what’s happened here.

There are Set-VMHostSyslogConfig and Set-VMHostDumpCollector cmdlets, which use argument names such as -SyslogServer and -Protocol, which are self explanatory. I may end up rewriting these scripts if I have time. But at the end of the day both ways will get the job done.

Maybe one hint is if you’re lost and not sure about the order of the arguments, run this cmdlet on a EsxCli object to find out what each argument actually mean:

$esxcli.system.coredump.network | Get-Member

get-member

ESXi Dump Collector PowerCLI script:

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.coredump.network.get()
}

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.coredump.network.set($null, “vmk0”, $null, “10.10.10.10”, 6500);
$esxcli.system.coredump.network.set($true)
}

There are a couple commands to check the ESXi Dump Collector configuration, as it’s not always clear if it’s able to write a core dump until a PSOD actually happens.

First command checks if Dump Collector service on a ESXi host can connect to the Dump Collector server and the second one actually forces ESXi host to purple screen if you want to be 100% sure that a core dump is able to be written. Make sure to put the ESXi host into maintenance mode if you want to go that far.

# esxcli system coredump network check

# vsish
# set /reliability/crashMe/Panic

Syslog Collector PowerCLI script:

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.syslog.config.get()
}

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.syslog.config.set($null, $null , $null, $null, $null, $null, $null, $null, “udp://vcenter.domain.local:514”, $null, $null);
$esxcli.network.firewall.ruleset.set($null, $true, “syslog”)
$esxcli.system.syslog.reload()
}

For the Syslog Collector it’s important to remember that there’s a firewall rule on each ESXi host, which needs to be enabled (the firewall ruleset command in the script).

For the Dump Collector there’s no firewall rule. So if you looking for it and can’t find, it’s normal to not have it by default.

vSphere Dump / Syslog Collector: PowerCLI Script

March 12, 2015

Overview

If you install ESXi hosts on say 2GB flash cards in your blades which are smaller than required 6GB, then you won’t have what’s called persistent storage on your hosts. Both your kernel dumps and logs will be kept on RAM drive and deleted after a reboot. Which is less than ideal.

You can use vSphere Dump Collector and Syslog Collector to redirect them to another host. Usually vCenter machine, if it’s not an appliance.

If you have a bunch of ESXi hosts you’ll have to manually go through each one of them to set the settings, which might be a tedious task. Syslog can be done via Host Profiles, but Enterprise Plus licence is not a very common things across the customers. The simplest way is to use PowerCLI.

Amendments to the scripts

These scripts originate from Mike Laverick’s blog. I didn’t write them. Original blog post is here: Back To Basics: Installing Other Optional vCenter 5.5 Services.

The purpose of my post is to make a few corrections to the original Syslog script, as it has a few mistakes:

First – typo in system.syslog.config.set() statement. It requires additional $null argument before the hostname. If you run it as is you will probably get an error which looks like this.

Message: A specified parameter was not correct.
argument[0];
InnerText: argument[0]

Second – you need to open outgoing syslog ports, otherwise traffic won’t flow. It seems that Dump Collector traffic is enabled by default even though there is no rule for it in the firewall (former netDump rule doesn’t exist anymore). Odd, but that’s how it is. Syslog on the other hand requires explicit rule, which is reflected in the script by network.firewall.ruleset.set() command.

Below are the correct versions of both scripts. If you copy and paste them everything should just work.

vSphere Dump Collector

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.coredump.network.get()
}

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.coredump.network.set($null, “vmk0”, “10.0.0.1”, “6500”)
$esxcli.system.coredump.network.set($true)
}

vSphere Syslog Collector

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.syslog.config.get()
}

Foreach ($vmhost in (get-vmhost))
{
$esxcli = Get-EsxCli -vmhost $vmhost
$esxcli.system.syslog.config.set($null, $null, $null, $null, $null, “udp://10.0.0.1:514”)
$esxcli.network.firewall.ruleset.set($null, $true, “syslog”)
$esxcli.system.syslog.reload()
}

How to move aggregates between NetApp controllers

September 25, 2013

Stop Sign_91602

 

DISCLAMER: I ACCEPT NO RESPONSIBILITY FOR ANY DAMAGE OR CORRUPTION OF DATA THAT MAY OCCUR AS A RESULT OF CARRYING OUT STEPS DESCRIBED BELOW. YOU DO THIS AT YOUR OWN RISK.

 

We had an issue with high CPU usage on one of the NetApp controllers servicing a couple of NFS datastores to VMware ESX cluster. HA pair of FAS2050 had two shelves, both of them owned by the first controller. The obvious solution for us was to reassign disks from one of the shelves to the other controller to balance the load. But how do you do this non-disruptively? Here is the plan.

In our setup we had two controllers (filer1, filer2), two shelves (shelf1, shelf2) both assigned to filer1. And two aggregates, each on its own shelf (aggr0 on shelf0, aggr1 on shelf1). Say, we want to reassign disks from shelf2 to filer2.

First step is to migrate all of the VMs from the shelf2 to shelf1. Because operation is obviously disruptive to the hosts accessing data from the target shelf. Once all VMs are evacuated, offline all volumes and an aggregate, to prevent any data corruption (you can’t take aggregate offline from online state, so change it to restricted first).

If you prefer to reassign disks in two steps, as described in NetApp Professional Services Tech Note #021: Changing Disk Ownership, don’t forget to disable automatic ownership assignment on both controllers, otherwise disks will be assigned back to the same controller again, right after you unown them:

> options disk.auto_assign off

It’s not necessary if you change ownership in one step as shown below.

Next step is to actually reassign the disks. Since they are already part of an aggregate you will need to force the ownership change:

filer1> disk assign 1b.01.00 -o filer2 -f

filer1> disk assign 1b.01.01 -o filer2 -f

filer1> disk assign 1b.01.nn -o filer2 -f

If you do not force disk reassignment you will get an error:

Assign request failed for disk 1b.01.0. Reason:Disk is part of a failed or offline aggregate or volume. Changing its owner may prevent aggregate or volume from coming back online. Ownership may be changed only by using the appropriate force option.

When all disks are moved across to filer2, new aggregate will show up in the list of aggregates on filer2 and you’ll be able to bring it online. If you can’t see the aggregate, force filer to rescan the drives by running:

filer2> disk show

The old aggregate will still be seen in the list on filer1. You can safely remove it:

filer1> aggr destroy aggr1

ESX root Password Complexity Workaround

August 30, 2013

ESX server enforces complexity requirements on passwords and if the one you want to set up doesn’t meet them, password change will fail with something like that:

Weak password: not enough different characters or classes for this length. Try again.

You can obviously play with PAM settings to lower the requirements, but here the the tip on how to really quickly workaround that.

Simply generate a hash for you password using the following command:

# openssl passwd -1

And then replace the root password hash in /etc/shadow with the new one.

From my experience on ESX 4.1, you won’t even need to reconnect the host to the vCenter. It will continue working just fine.

NetApp VSC Single File Restore Explained

August 5, 2013

netapp_dpIn one of my previous posts I spoke about three basic types of NetApp Virtual Storage Console restores: datastore restore, VM restore and backup mount. The last and the least used feature, but very underrated, is the Single File Restore (SFR), which lets you restore single files from VM backups. You can do the same thing by mounting the backup, connecting vmdk to VM and restore files. But SFR is a more convenient way to do this.

Workflow

SFR is pretty much an out-of-the-box feature and is installed with VSC. When you create an SFR session, you specify an email address, where VSC sends an .sfr file and a link to Restore Agent. Restore Agent is a separate application which you install into VM, where you want restore files to (destination VM). You load the .sfr file into Restore Agent and from there you are able to mount source VM .vmdks and map them to OS.

VSC uses the same LUN cloning feature here. When you click “Mount” in Restore Agent – LUN is cloned, mapped to an ESX host and disk is connected to VM on the fly. You copy all the data you want, then click “Dismount” and LUN clone is destroyed.

Restore Types

There are two types of SFR restores: Self-Service and Limited Self-Service. The only difference between them is that when you create a Self-Service session, user can choose the backup. With Limited Self-Service, backup is chosen by admin during creation of SFR session. The latter one is used when destination VM doesn’t have connection to SMVI server, which means that Remote Agent cannot communicate with SMVI and control the mount process. Similarly, LUN clone is deleted only when you delete the SFR session and not when you dismount all .vmdks.

There is another restore type, mentioned in NetApp documentation, which is called Administartor Assisted restore. It’s hard to say what NetApp means by that. I think its workflow is same as for Self-Service, but administrator sends the .sfr link to himself and do all the job. And it brings a bit of confusion, because there is an “Admin Assisted” column on SFR setup tab. And what it actually does, I believe, is when Port Group is configured as Admin Assisted, it forces SFR to create a Limited Self-Service session every time you create an SFR job. You won’t have an option to choose Self-Assisted at all. So if you have port groups that don’t have connectivity to VSC, check the Admin Assisted option next to them.

Notes

Keep in mind that SFR doesn’t support VM’s with IDE drives. If you try to create SFR session for VMs which have IDE virtual hard drives connected, you will see all sorts of errors.

Monitoring ESX Storage Queues

July 30, 2013

6a00d8341c328153ef01774354e2fd970d-500wiQueue Limits

I/O data goes through several storage queues on its way to disk drives. VMware is responsible for VM queue, LUN queue and HBA queue. VM and LUN queues are usually equal to 32 operations. It means that each ESX host at any moment can have no more than 32 active operations to a LUN. Same is true for VMs. Each VM can have as many as 32 active operations to a datastore. And if multiple VMs share the same datastore, their combined I/O flow can’t go over the 32 operations limit (per LUN queue for QLogic HBAs has been increased from 32 to 64 operations in vSphere 5). HBA queue size is much bigger and can hold several thousand operations (4096 for QLogic, however I can see in my config that driver is configured with 1014 operations).

Queue Monitoring

You can monitor storage queues of ESX host from the console. Run “esxtop”, press “d” to view disk adapter stats, then press “f” to open fields selection and add Queue Stats by pressing “d”.

AQLEN column will show the queue depth of the storage adapter. CMDS/s is the real-time number of IOPS. DAVG is the latency which comes from the frame traversing through the “driver – HBA – fabric – array SP” path and should be less than 20ms. Otherwise it means that storage is not coping. KAVG shows the time which operation spent in hypervisor kernel queue and should be less than 2ms.

Press “u” to see disk device statistics. Press “f” to open the add or remove fields dialog and select Queue Stats “f”. Here you’ll see a number of active (ACTV) and queue (QUED) operations per LUN.  %USD is the queue load. If you’re hitting 100 in %USD and see operations under QUED column, then again it means that your storage cannot manage the load an you need to redistribute your workload between spindles.

Some useful documents: