Run CLI Commands on NSX Manager Using REST API

August 29, 2019

Over the last few years I’ve had a chance to work with the NSX-V REST API in many different shapes and forms: directly from vRealize Orchestrator and PowerShell/PowerNSX, indirectly from vRealize Automation, or simply by making calls from Postman, which is sometimes required during NSX deployments and upgrades.

To date I haven’t been able to find any gaps in the API and can say only good things about it. It is very well documented. You can find detailed descriptions of all requests in the NSX API Guide PDF or interactively browse them in the API explorer on https://code.vmware.com.

But at the end of the day, the NSX REST API covers only a subset of what you can do from the CLI, and there are situations where it’s not sufficient. I’ll give you an example. Let’s say you want to know how much storage is available on the NSX Manager appliance’s log partition. There’s a REST API call that will give you a response similar to this:

GET https://nsxm/api/1.0/appliance-management/system/storageinfo

<storageInfo>
  <totalStorage>86G</totalStorage>
  <usedStorage>22G</usedStorage>
  <freeStorage>64G</freeStorage>
  <usedPercentage>25</usedPercentage>
</storageInfo>

As you can see, it answers the question of how much total space is available on the appliance, but doesn’t provide the full per-partition breakdown available from the CLI via “show filesystem”:

Filesystem      Size  Used Avail Use% Mounted on
/dev/root       5.6G  1.2G  4.1G  23% /
tmpfs           7.9G  232K  7.9G   1% /run
devtmpfs        7.9G     0  7.9G   0% /dev
/dev/sda6        44G   19G   24G  44% /common
/dev/loop0       16G   45M   15G   1% /common/vdisk_mnt

So what are the options here? What is not widely known is that you can use the NSX central command-line interface to remotely invoke appliance CLI commands via the REST API.

Invoking CLI Commands

The NSX REST API has a handy POST call: https://nsxm/api/1.0/nsx/cli?action=execute. All you need to provide, in addition to credentials via the “Basic Auth” authorization option, is the following body in XML format:

<nsxcli>
  <command>show filesystem</command>
</nsxcli>

In response you will get a body in “text/plain” format, which is the only drawback of this method: you will need to parse the response in your scripting language of choice.
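
As a rough PowerShell sketch (the hostname and credentials are placeholders, and depending on your PowerShell version you may need to handle the self-signed certificate or build the Authorization header yourself), the original call could look like this:

# Build the XML body with the CLI command to execute
$body = "<nsxcli><command>show filesystem</command></nsxcli>"
$cred = Get-Credential
$response = Invoke-WebRequest -Uri "https://nsxm/api/1.0/nsx/cli?action=execute" `
  -Method Post -Credential $cred -ContentType "application/xml" -Body $body

With the result saved in the $response variable, the parsing can look something like this: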

# Split the plain-text response into rows
$responseRows = $response.Content -split "`n"
foreach($row in $responseRows) {
  # /dev/sda6 is the partition mounted on /common, where logs are kept
  if($row -Like "*/dev/sda6*") {
    # The fifth column of the "show filesystem" output is Use%
    $pctUsed = $row.Split(" ",[StringSplitOptions]"RemoveEmptyEntries")[4]
    Write-Host "Space utilization on the log partition is $pctUsed."
    break
  }
}

Conclusion

For most use cases the NSX REST API provides all the necessary information about NSX component configuration in structured JSON or XML format. This method is more of an exception than a rule, but it’s a nice tool to have in your tool belt when you run out of options.

vSphere 6.0 REST API: A History Lesson

August 23, 2019

I’m glad to see how VMware products are becoming more and more automation-focused these days. NSX has always had rich REST API capabilities, which I can’t complain about. And vSphere is now starting to catch up: vSphere 6.5 was the first release where the REST API started getting much more attention. See these official blog posts for example:

But not many people know that vSphere 6.5 wasn’t the first release where a REST API was available. Check this forum thread on VMTN “Does vCenter 6.0 support RESTFUL api?”:

I think its only supported for 6.5 as below blogs has a customer asked the same question and reply is no..

That’s not entirely true, even though I understand why the OP got a “No” answer. Let me explain.

vSphere 6.0 REST API

VMware took its first steps towards a REST API in the 6.0 release. If you have a legacy vSphere 6.0 environment you can play with, you can easily test that by opening the following URL:

https://vcenter/rest/com/vmware/vapi/metadata/cli/command

You will get a long list of commands available in the 6.0 release:

It may look impressive, but if you look closely you will quickly notice that they are all Content Library or Tagging related. Quote from the referenced blog post:

VMware vCenter has received some new extensions to its REST based API. In vSphere 6.0, this API set provides the ability to manage the Content Library and Tagging but now also includes the ability to manage and configure the vCenter Server Appliance (VCSA) based functionality and basic VM management.

That is right: in vSphere 6.0 the REST API is very limited. You won’t get inventory data or the backup and update APIs. All you can do is manage the Content Library and Tagging, which, frankly, is not very practical.

Making REST API Calls

If the Content Library and Tagging use cases are applicable to you, or you are just feeling adventurous, this is an example of how you can make a call to the vSphere 6.0 REST API via Postman.

All calls are POST-based and the action (get, list, create, etc.) is specified as a parameter, so pay close attention to the request format.

First you will need to generate an authentication token by making a POST call to https://vcenter/rest/com/vmware/cis/session, using “Basic Auth” for authorization; you will get a token in response:

Then change Authorization to “No auth” and specify the token in the “vmware-api-session-id” header in your next call. In this example I’m getting a list of all content libraries (you will obviously get an empty response if you haven’t actually created one):
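
If you prefer scripting to Postman, here is a minimal PowerShell sketch of the same two steps. The session endpoint is the one mentioned above; the content library “list” URL with its ~action parameter is my assumption based on the metadata call described below, and PowerShell 6+ is assumed for the -Authentication and -SkipCertificateCheck parameters:

# Step 1: create a session using basic authentication; the token comes back in "value"
$cred = Get-Credential
$session = Invoke-RestMethod -Uri "https://vcenter/rest/com/vmware/cis/session" `
  -Method Post -Authentication Basic -Credential $cred -SkipCertificateCheck

# Step 2: pass the token in the vmware-api-session-id header;
# in 6.0 even read operations are POST calls with an ~action parameter
$headers = @{ "vmware-api-session-id" = $session.value }
Invoke-RestMethod -Uri "https://vcenter/rest/com/vmware/content/library?~action=list" `
  -Method Post -Headers $headers -SkipCertificateCheck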

Some commands require a body. To determine the body format, use the following POST call to https://vcenter/rest/com/vmware/vapi/metadata/cli/command?~action=get, with the following body in JSON format:

{
  "identity": {
    "name": "get",
    "path": "com.vmware.content.library"
  }
}

Here “path” is the namespace of the operation and “name” is the action, both taken from the https://vcenter/rest/com/vmware/vapi/metadata/cli/command call above.

If you’re looking for more detailed information, I found this blog post by Mitch Tulloch very useful:

Conclusion

There you have it. vSphere 6.0 does support a REST API, it’s just not very useful, which is why no one talks about it.

This blog post won’t help you if you are stuck in the Stone Age and need to manage vSphere 6.0 via REST API, but it at least gives you a definitive answer on whether the REST API is supported in vSphere 6.0 and what you can do with it.

If you do find yourself in such a situation, I recommend falling back on PowerCLI, if possible.

Connecting to PostgreSQL Database Backing VMware Products

August 19, 2019

Most VMware products these days have standardised on PostgreSQL. Yes, you can still deploy vCenter for Windows, for instance, and use MS SQL or Oracle as a back-end database, but that is now deprecated and vSphere 6.7 is the last release where it’s supported. Other products, like vRealize Automation, are moving in the same direction.

VCSA, vRA, vRO are all distributed as appliances and shouldn’t be modified in any way by the end user. But I’ve had times before when I needed to directly connect to the PostgreSQL database to better understand certain parts of the product. One of the recent examples was encryption in vRO. I needed to ensure that the passwords I save in SecureString attributes (the ones shown as asterisks) in my workflows are not kept as plain text in vRO. So let’s see how I validated this assumption by looking at the vRO database.

vRO Database

I first SSH’ed into the appliance and connected to the database using the PostgreSQL interactive terminal:

# psql vmware postgres

I then listed all database table names:

> SELECT * FROM pg_catalog.pg_tables;

When I found the table I was looking for, I listed its contents:

> SELECT * FROM vmo_workflowcontent;

And simply searched for my attribute name in the output, which confirmed the value was indeed encrypted.

Exporting the Database

You won’t always know which table you’re looking for, so the easiest way to go about it is to simply export the whole database in plain text and search through it as a text file:

# su -m -c "/opt/vmware/vpostgres/current/bin/pg_dump -Fp vmware > /tmp/vmware.sql" postgres

“-Fp” here is for plain text (default is custom format, which is compressed), “vmware” is the database and “postgres” is the user.

VCSA and vRA Databases

You will find that database names aren’t the same for different products: for instance, vCenter’s database name is “VCDB” (capital letters) and vRA’s is “vcac” (the username is also “vcac”). So if you need to connect to the VCSA database, you will use the following syntax:

# psql VCDB postgres

For vRA it will look like this:

# psql vcac vcac

Then you can use the same approach demonstrated for vRO to read table data or simply export the whole database.

Conclusion

I hope this helps you with your tinkering adventures. Just make sure to use this only for research and not to change anything in the database, unless specifically advised by GSS.

NSX Optimistic Locking and PowerNSX

August 3, 2019

Recently, when working on some NSX-V automation, I came across an interesting issue, which I want to discuss here, since at the time of writing there’s almost no information on the Internet that would help to solve it or even point you in the right direction. It has to do with PowerNSX and Optimistic Locking in NSX (which technically is not even a locking mechanism), but let’s start from the beginning.

If you have ever used the PowerNSX module to automate NSX via PowerShell, you have probably noticed that most of the code examples use pipelines to run PowerNSX cmdlets. So instead of using variables, like so:

$Edge = Get-NsxEdge vRA7_edge
$LoadBalancer = Get-NsxLoadBalancer -Edge $Edge
Set-NsxLoadBalancer -LoadBalancer $LoadBalancer -enabled
New-NsxLoadBalancerApplicationProfile -LoadBalancer $LoadBalancer -Name $WebAppProfileName -Type $VipProtocol -SslPassthrough

all commands are run this way instead:

Get-NsxEdge vRA7_edge | Get-NsxLoadBalancer | Set-NsxLoadBalancer -enabled
Get-NsxEdge vRA7_edge | Get-NsxLoadBalancer | New-NsxLoadBalancerApplicationProfile -Name $WebAppProfileName -Type $VipProtocol -SslPassthrough

What’s the difference, you may ask, besides the fact that the second variant is slower, because you retrieve the edge and load balancer objects multiple times instead of once? There’s actually a strong reason for it. More specifically, it is the following error that you are going to get if you don’t use pipelines:

invoke-nsxwebrequest : Invoke-NsxWebRequest : The NSX API response received indicates a failure. 409 : Conflict : Response Body: {"errorCode":101, "details":"Concurrent object access error. Refresh UI or fetch the latest copy of the object and retry the operation.", "rootCauseString":null, "moduleName":null, "errorData":null}

See, NSX uses Optimistic Locking (yes, there’s Pessimistic Locking as well) to handle concurrency. Its purpose is to make sure that when you’re making a change to an object in NSX, you are aware of its current state. In the above example, you saved the load balancer into a variable, changed the load balancer state to enabled and then tried to create an application profile, supplying the load balancer saved in the variable as a parameter to the cmdlet. But the load balancer (and edge) state has changed, so you’re basically using an old (stale) version of the object. You either have to retrieve the current state of the object again or avoid the issue altogether by simply using pipelines, retrieving the up-to-date version of the object with every call.
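
If you do want to stick with variables, the alternative is to re-fetch the object after every change, so you always work with the latest revision. A minimal sketch using the same cmdlets:

# Enable the load balancer (this changes the object revision)
$Edge = Get-NsxEdge vRA7_edge
Get-NsxLoadBalancer -Edge $Edge | Set-NsxLoadBalancer -enabled

# Re-fetch the load balancer to pick up the new revision before the next change
$LoadBalancer = Get-NsxEdge vRA7_edge | Get-NsxLoadBalancer
New-NsxLoadBalancerApplicationProfile -LoadBalancer $LoadBalancer -Name $WebAppProfileName -Type $VipProtocol -SslPassthrough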

Read this article if you want to know more about Optimistic Locking:

If you found this useful, please leave a comment, smash that like button and hit the notification bell so you never miss a new blog post.

Quick Start With Lifecycle Manager REST APIs

December 11, 2018

Just a few years ago, coming across an infrastructure product (software or hardware) that supported a REST API was a rare thing. Today it’s the opposite: buying, say, a storage array from a major vendor that doesn’t support some sort of API can be seen as a potential drawback. It’s now gotten to the point where certain operations can only be done via the API and are not available in the GUI. So basic programming skills are becoming more and more important.

I came across such a situation with the vRealize Suite Lifecycle Manager (vRSLCM, or just LCM) product from VMware. If you have a request that got stuck, the only way to cancel it (at least at the time of writing) is to use LCM’s REST APIs. It can’t be done from the GUI.

While I was tackling this issue, I noticed that there aren’t many articles on the Internet about how to make REST calls to LCM, so I thought I’d use this opportunity to show how to do it.

Authentication

The first challenge you have to deal with is authentication. Unlike some other products, NSX for instance, LCM doesn’t support basic authentication. You need a token.

This is how you can get a token in Postman:

{
  "username": "admin@localhost",
  "password": "vmware"
}

This is what it will look like in Postman:

When you click send you should get a token in response:

Making REST Calls

Now you need to specify the token as one of the headers, with “x-xenon-auth-token” as key and the token itself as value:

From here, you are ready to make actual REST API calls. Coming back to our example, we can go to the LCM GUI and copy the ID of the stuck request from the browser window:

And then make a DELETE call with empty body to cancel the request:

As a result, traces of the request will be completely deleted from LCM.

Note: the only catch here is that you have to remove the “v1” API version from the URL, or it will not work.
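
For reference, the same flow can be scripted in PowerShell. In this sketch both URIs are placeholders (the exact endpoints appear only in the screenshots above and differ between LCM versions), and the name of the token field in the login response is an assumption:

# $loginUri and $requestUri stand in for the endpoints shown above
$body = '{"username":"admin@localhost","password":"vmware"}'
$login = Invoke-RestMethod -Uri $loginUri -Method Post -ContentType "application/json" -Body $body

# Pass the token in the x-xenon-auth-token header and cancel the stuck request
$headers = @{ "x-xenon-auth-token" = $login.token }  # the "token" property name may differ
Invoke-RestMethod -Uri $requestUri -Method Delete -Headers $headers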

Swagger UI

LCM supports Swagger, which lets you run REST API calls straight from the browser. So if you want to feel like a hacker, open the https://lcm-hostname/api URL and you can get the token and make requests by simply using the “Try It Out” button, specifying the required parameters and hitting “Execute”.

Unable to Delete vCenter Endpoint in vRealize Automation

December 7, 2018

vRealize Automation Error

More than once in my experience I’ve had a need to delete an endpoint in vRealize Automation. Maybe the configuration has changed, or you simply made a typo in the vCenter hostname or credentials. Once you’ve specified the vCenter address and saved the endpoint, you can no longer change it (only delete and re-add it).

But even when you try to delete it, you will get an error along the lines of:

You cannot delete this endpoint because 1 compute resources and 0 storage paths use it.

CloudClient Error

There is a KB article that walks you through the process of how to do that using a special tool called CloudClient: Error “This endpoint is being used by # compute resources and # storage paths and cannot be deleted” when you attempt to delete an endpoint in vRA 7.x (2150548)

But even that approach doesn’t always work. When you run the “vra computeresource inactive list” command from the KB article, you may get the following error:

Error: Something went wrong while processing your request. Please check the application logs for details.

Solution

There is almost no mention of this second error on the Internet, and I can see how someone could keep banging their head trying to solve it, so I thought I’d share a solution here. And it’s simple: open a GSS ticket. They can delete the endpoint for you. If you see this error, there’s no other way that I know of to solve this problem without involving GSS.

Clean-up

You can see an error similar to the following in the vRA logs if you didn’t stop the proxy agents before deleting the endpoint:

Error processing ping response
Error occurred while executing stored proc usp_InsertUpdateHost The INSERT statement conflicted with the FOREIGN KEY constraint "FK_ManagementEndpoint_Host". The conflict occurred in database "vRa_IaaS", table "dbo.ManagementEndpoints", column 'ManagementEndpointID'.
The statement has been terminated.
Inner Exception: The INSERT statement conflicted with the FOREIGN KEY constraint "FK_ManagementEndpoint_Host". The conflict occurred in database "vRa_IaaS", table "dbo.ManagementEndpoints", column 'ManagementEndpointID'.
The statement has been terminated.

All you need to do to get rid of it is restart your proxy agents.

Conclusion

Hope this post saves someone the hassle of hours searching for the answer in blogs and forums.

Creating vRealize Operations Manager Alerts Using REST API

September 11, 2018

Whenever I’m faced with a repetitive configuration task, I search for ways to automate it. There’s nothing more boring than sitting and clicking through the GUI for hours performing the same thing over and over again.

These days most of the products I work with support a REST API interface, so scripting has become my solution of choice. But scripting requires you to know a scripting language, such as PowerShell, certain SDKs and APIs, like PowerCLI and REST, and, more importantly, time to write and test the script. If you’re going to use a script regularly, in the long term it’s worth the effort. But what if it’s a one-off task? You may well end up spending more time writing the script than it takes to perform the task manually. In this case there are more practical ways to improve your efficiency. One such way is to use developer tools like Postman.

The idea is that you can write a REST request that applies a certain configuration setting and use it as a template to make multiple calls by slightly tweaking the parameters. You would have to change the parameters manually for each request, which is not as elegant as providing an array of variables to a script, but still much quicker than clicking through the GUI.

Recently I worked on a VMware Validated Design (VVD) deployment for a customer, which required configuring dozens of vRealize Operations Manager alerts as part of the build. So I will use it as an example to demonstrate how you can save yourself hours by doing it in Postman, instead of GUI.

Collect Alert Properties

To create an alert in vROps you will need to specify certain alert properties in the REST API call body. You will need at least:

  • “pluginId” – ID of the outbound plugin, which is usually the Standard Email Plugin
  • “emailaddr” – recipient email address
  • “values” property under the alertDefinitionIdFilters XML element – this is the alert definition ID
  • “resourceKind” – resource that the alert is applicable for, such as VirtualMachine, Datastore, etc.
  • “adapterKind” – this is the adapter that the alert comes from, such as VMWARE, NSX, etc.

To determine the pluginId you will need to configure an outbound plugin in vROps and then make the following GET call to get the ID:

To find the values for the alert definition, resource kind and adapter kind properties, make the following GET call and search for the alert name in the results:

Create Alert in vROps

To create an alert in vROps, you will need to make a POST call with a body in XML format:

  • Use the following request URL: https://vrops-hostname/suite-api/api/notifications/rules
  • Click on the Headers tab and specify the key “Content-Type” with the value “application/xml”
  • Click on the Body tab, choose raw and in the drop-down select “XML (application/xml)”
  • Copy the following XML request to the body and use it as a template

<ops:notification-rule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                       xmlns:xs="http://www.w3.org/2001/XMLSchema"
                       xmlns:ops="http://webservice.vmware.com/vRealizeOpsMgr/1.0/">
  <ops:name>No data received for Windows platform</ops:name>
  <ops:pluginId>c5f60db9-eb5b-47c1-8545-8ba573c7d289</ops:pluginId>
  <ops:alertControlStates/>
  <ops:alertStatuses/>
  <ops:criticalities/>
  <ops:resourceKindFilter>
    <ops:resourceKind>Windows</ops:resourceKind>
    <ops:adapterKind>EP Ops Adapter</ops:adapterKind>
  </ops:resourceKindFilter>
  <ops:alertDefinitionIdFilters>
    <ops:values>AlertDefinition-EP Ops Adapter-Alert-system-availability-Windows</ops:values>
  </ops:alertDefinitionIdFilters>
  <ops:properties name="emailaddr">vrops@corp.local</ops:properties>
</ops:notification-rule>

As described before, make sure to replace the following properties with your own values: “pluginId”, the “values” property under the alertDefinitionIdFilters XML element, “resourceKind”, “adapterKind” and “emailaddr”.
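
The same call can also be made from PowerShell instead of Postman. A minimal sketch, assuming the suite-api endpoint accepts basic authentication and that you have saved the template above to a notification-rule.xml file:

# Read the XML template prepared above and post it to create the rule
$xmlBody = Get-Content .\notification-rule.xml -Raw
$cred = Get-Credential
Invoke-RestMethod -Uri "https://vrops-hostname/suite-api/api/notifications/rules" `
  -Method Post -Credential $cred -ContentType "application/xml" -Body $xmlBody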

As a result of the REST API call you will get an alert created in vROps:

For every other alert you can keep the plugin ID and email address the same and update only the alert definition, resource kind and adapter kind.

Conclusion

By using the same REST call and changing properties for each alert accordingly, I was able to finish the job much quicker and avoided hours of pain of clicking through the GUI. As long as you have a REST API endpoint to work with, the same approach can be applied to any repetitive task.

If you’d like to learn more, make sure to check out the VMware {code} project here for more information about VMware product APIs and SDKs.

Scripted CIFS Shares Migration

March 8, 2018

I don’t usually blog about Windows Server and Microsoft products in general, but the need for file server migration comes up in my work quite frequently, so I thought I’d make a quick post on that topic.

There are many use cases: it can be a migration from a NAS storage array to a Windows Server, or between an on-premises file server and the cloud. Every such migration involves copying data and recreating shares. Doing it manually is almost impossible, unless you have only a handful of shares. If you want to replicate all NTFS and share-level permissions consistently from source to destination, scripting is almost the only way to go.

Copying data

I’m sure there are plenty of tools that can perform this task accurately and efficiently. But if you don’t have any special requirements, such as encryption of data in transit, Robocopy is probably the simplest tool to use. It comes with every Windows Server installation and, starting from Windows Server 2008, supports multithreading.

Below are the command line options I use:

robocopy \\file_server\source_folder D:\destination_folder /E /ZB /DCOPY:T /COPYALL /R:1 /W:1 /V /TEE /MT:128 /XD "System Volume Information" /LOG:D:\robocopy.log

Most of them are common, but there are a few worth pointing out:

  • /MT – use multithreading, 8 threads per Robocopy process by default. If you’re dealing with lots of small files, this can significantly improve performance.
  • /R:1 and /W:1 – Robocopy doesn’t copy locked files to avoid data inconsistencies. Default behaviour is to keep retrying until the file is unlocked. It’s important for the final data synchronisation, but for data seeding I recommend one retry and one second wait to avoid unnecessary delays.
  • /COPYALL and /DCOPY:T will copy all file and directory attributes, permissions, as well as timestamps.
  • /XD “System Volume Information” is useful if you’re copying an entire volume. If you don’t exclude the System Volume Information folder, you may end up copying deduplication and DFSR data, which in addition to wasting disk space, will break these features on the destination server.

Robocopy is typically scheduled to run at certain times of the day, preferably after hours. You can put it in a batch script and schedule it using Windows Scheduler. Just keep in mind that if you configure the job to stop after running for a certain number of hours, Windows Scheduler will stop only the batch script; the Robocopy process will keep running. As a workaround, you can schedule another job with the following command to kill all Robocopy processes at a certain time of the day, say 6am:

taskkill /f /im robocopy.exe

Duplicating shares

For copying CIFS shares I’ve been using the “sharedup” utility from EMC’s “CIFS Tools” collection. To get the tool, register a free account on https://support.emc.com. You can do that even if you’re not an EMC customer and don’t own an EMC storage array. From there you will be able to search for and download CIFS Tools.

If your source and destination file servers are completely identical, you can use sharedup to duplicate CIFS shares in one command. But that’s rarely the case: often you want to exclude some of the shares or change paths because your disk drives or directory structure have changed. Sharedup supports input and output file command line options, so you can generate a list of shares first, edit it, and then import the shares to the destination file server.

To generate the list of shares first run:

sharedup \\source_server \\destination_server ALL /SD /LU /FO:D:\shares.txt /LOG:D:\sharedup.log

The resulting file will have records similar to this:

#
@Drive:E
:Projects ;Projects ;C:\Projects;
#
@Drive:F
:Home;Home;C:\Home;

Delete the shares you don’t want to migrate and update the target paths from C:\ to where your data actually is. Don’t change the “@Drive:E” headers; they specify the location of the source share, not the destination. It’s also worth noting that you won’t see permissions listed anywhere in this file: it lists shares and share paths only, and permissions are checked and copied at runtime.

Once you’re happy with the list, use the following command to import shares to the destination file server:

sharedup \\source_server \\destination_server ALL /R /SD /LU /FI:D:\shares.txt /LOG:D:\sharedup.log

For server local users and groups, sharedup will check if they exist on the destination. So if you run into an error similar to the following, make sure to first create those groups on the destination file server:

10:13:07 : WARNING : The local groups "WinRMRemoteWMIUsers__" and "source_server_WinRMRemoteWMIUsers__" do not exist on the \\destination_server server !
10:13:09 : WARNING : Please use lgdup utility to duplicate the missing local user(s) or group(s) from \\source_server to \\destination_server.
10:13:09 : WARNING : Unable to initialize the Security Descriptor translator

Conclusion

I created this post as a personal howto note, but I’d love to hear if it’s helped anyone else. Or if you have better tool suggestions to accomplish this task, please let me know!

vRealize Automation Disaster Recovery

January 14, 2018

Introduction

VMware has invested a lot of time and effort in vRealize Automation high availability. For medium and large deployment scenarios VMware recommends using a load balancer (Citrix, F5, NSX) to distribute traffic between vRA appliance and infrastructure components, as well as database clustering (such as MS SQL availability groups) for database high availability. Additionally, in vRA 7.3 VMware added support for automatic failover of vRA appliance’s embedded PostgreSQL database, which was a manual process prior to that.

There is a clear distinction, however, between high availability and disaster recovery. Generally speaking, HA covers redundancy within the site and is not intended to protect from full site failure. Site Recovery Manager (or another replication product) is required to protect vRA in a DR scenario, which is described in more detail in the following document:

In my opinion, there are two important aspects that are missing from the aforementioned document, which I want to cover in this blog post: restoring VM UUIDs and changing vRA IP address. I will cover them in the order that these tasks would usually be performed if you were to fail over vRA to DR:

  1. Exporting VM UUIDs
  2. Changing IP addresses
  3. Importing VM UUIDs

I will also briefly touch on how to change VM reservations, which is an important step as well, but one that is already very well covered in VMware documentation.

Note: this blog post does not provide configuration guidelines for VM replication software, such as Site Recovery Manager, Zerto or RecoverPoint and is focused only on DR aspects related to vRA itself. Refer to official documentation of corresponding products to determine how to set up VM replication to your disaster recovery site.

Exporting VM UUIDs

VMware uses two UUIDs to identify a VM. The BIOS UUID (uuid.bios in the .vmx file) was the original VM identifier and is derived from the hardware the VM is provisioned on. But it’s not unique: if a VM is cloned, the clone will have the same BIOS UUID. So a second identifier was introduced, called the Instance UUID (vc.uuid in the .vmx file), which is generated by vCenter and is unique within a single vCenter (two VMs in different vCenters can have the same Instance UUID).

When VMs are failed over, Instance UUIDs change. Compare VirtualMachine.Admin.AgentID (Instance UUID) and VirtualMachine.Admin.UUID (BIOS UUID) on the original and failed-over VMs.

Why does this matter? Because vRA uses Instance UUIDs to keep track of managed VMs. If Instance UUIDs change, vRA will show the corresponding VMs as missing under Infrastructure > Managed Machines, and you won’t be able to manage them.

So it’s important to export VM Instance UUIDs before failover, which can then be used to restore the original values. This is how you can get the Instance UUID of a given VM using PowerCLI:

> (Get-VM vm_name).extensiondata.config.InstanceUUID

Here, on my GitHub page, you can find a script that I have put together to export Instance UUIDs of all VMs in CSV format.
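
If you’d rather not use the script, the gist of it is only a few lines of PowerCLI. A minimal sketch (not the actual script, and the exact column layout the script expects may differ):

# Export every VM's name and Instance UUID to CSV
Get-VM | Select-Object Name,
  @{N = "InstanceUUID"; E = { $_.ExtensionData.Config.InstanceUuid }} |
  Export-Csv -Path .\vm_vc_uuids.csv -NoTypeInformation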

Changing IP addresses

Once you’ve saved the Instance UUIDs, you can move on to failover. vRA components should be started in the following order:

  1. MS SQL database
  2. vRA appliance
  3. IaaS server

If the network subnets that all components are connected to are stretched between the two sites, no additional reconfiguration is required when VMs are brought up at DR. But usually that’s not the case and servers need to be re-IP’ed. IaaS server network settings are changed the same way as on any other Windows server machine.

vRealize Appliance network settings are changed in the vRA appliance management interface, which can be accessed at https://vra-appliance-hostname:5480, under the Network > Address tab. The problem is, if IP addresses change at DR, it will be challenging to reach the vRA appliance over the network. To work around that, connect to the vRA VM console and run the following script from the CLI to change the appliance’s network settings:

# /opt/vmware/share/vami/vami_config_net

Don’t forget to update the vRA appliance’s DNS record. For the IaaS server it’s not needed, as long as you allow Dynamic DNS (DDNS) updates.

Importing VM UUIDs

After the failover all of your VMs will have a missing status in vRA. To make vRA recognize the failed-over VMs you will need to revert the Instance UUIDs back to their original values. In PowerCLI this can be done in the following way:

> $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
> $spec.instanceUuid = '52da9b14-0060-dc51-4733-3b01e912edd2'
> $vm = Get-VM -Name vm_name
> $vm.ExtensionData.ReconfigVM_Task($spec)

I’ve written another script that will perform this task for you, which you can find on my GitHub page.

You will need two files to make the script work: the vm_vc_uuids.csv file you generated before, with the list of original VM Instance UUIDs, as well as the list of missing VMs in CSV format, which you can export from vRA after the failover on the Infrastructure > Managed Machines page:

This is an example of the script command line options and the output:

You will need to run an inventory data collection from the Infrastructure > Compute Resources > Compute Resources page. vRA will discover VMs and update their status to “On”.

Updating reservations

If you try to run any Day 2 operation on a VM with the old reservation in place, you will get an error similar to this:

Error processing [Shutdown], error details:
Error getting property 'runtime.powerState' from managed object (null)
Inner Exception: Object reference not set to an instance of an object.

To manually update a VM reservation, on the Infrastructure > Managed Machines page hover over the VM and select Change Reservation:

This process is obviously not scalable, as it can take hours if you have hundreds of VMs. VMware offers an alternative solution that lets you update all VMs by using the Bulk Import feature available from Infrastructure > Administration > Bulk Imports. The idea is that you can export all VM configuration details in a CSV file, update the compute and storage reservation columns and import it back to vRA. The vRealize Suite 7.0 Disaster Recovery by Using Site Recovery Manager 6.1 document gives very detailed instructions on how to do that in the “Bulk Import, Update, or Migrate Virtual Machines” section.

Conclusion

I hope this blog post helped to cover some gaps in VMware documentation. If you have any questions or comments, as always, feel free to leave them in the comments section below.

[SOLVED] Migrating vCenter Notifications

January 6, 2018

Why is this a problem?

VMware upgrades and migrations still comprise a large chunk of what I do in my job. If it is an in-place upgrade it is often more straightforward. The main consideration is making sure all compatibility checks are made. But if it is a rebuild, things get a bit more complicated.

Take for example a vCenter Server to vCenter Server Appliance migration. If you are migrating between 5.5, 6.0 and 6.5 you are covered by the vCenter Server Migration Tool. Recently I came across a customer using vSphere 5.1 (yes, it is not as uncommon as you might think). vCenter Server Migration Tool does not support migration from vSphere 5.1, which is fair enough, as its end of support was August 2016. But as a result, you end up on your own with your upgrade endeavours and have to do a lot of things manually. One of those things is migrating vCenter notifications.

You can go and do it by hand. Using a few PowerCLI commands you can list the currently configured notifications and then recreate them on the new vCenter. But knowing how clunky and slow this process is, I doubt you are looking forward to spending half a day configuring dozens of notifications one by one by hand (I sure am not).

I offer an easy solution

You may have seen a comic over on xkcd called “Is It Worth The Time?“, which gives you an estimate of how long you can work on making a routine task more efficient before you are spending more time than you save (across five years). As an example, if you can save one hour by automating a task that you do monthly, even if you spend two days automating it, you will still break even over five years.

Knowing how often I do VMware upgrades, it is well worth it for me to invest time in automating them with a script. Since you probably do not do upgrades that often, for you it is not, so I wrote this script for you.

If you simply want to get the job done, you can go ahead and download it from my GitHub page here (you will also need VMware PowerCLI installed on your machine for it to work) and then run it like so:

.\copy-vcenter-alerts-v1.0.ps1 -SourceVcenter old-vc.acme.com -DestinationVcenter new-vc.acme.com

The script includes help topics, which you can view by running the following command:

Get-Help -full .\copy-vcenter-alerts-v1.0.ps1

Or if you are curious, you can read further to better understand how the script works.

How does this work?

First of all, it is important to understand the terminology used in vSphere:

  • Alarm trigger – a set of conditions that must be met for an alarm warning and alert to occur.
  • Alarm action – operations that occur in response to triggered alarms. For example, email notifications.

The script takes the source and destination vCenter IP addresses or host names as parameters and starts by retrieving the list of existing alerts. Then it compares alert definitions, and if an alert does not exist on the destination, it will be skipped, so be aware of that. The script will show you a warning, and you will be able to decide what to do with such alerts later.

Then, for each of the source alerts that exists on the destination, the script recreates the actions with the exact same triggers. Trigger settings, such as repeats (enabled/disabled) and trigger state changes (green to yellow, yellow to red, etc.), are also copied.

The script will not attempt to recreate an action that already exists, so feel free to run it multiple times if you need to.

What the script does not do

  1. The script does not copy custom alerts – if you have custom alert definitions, you will have to recreate them manually. It was not worth investing time in such a feature at this stage, as custom alerts are rare, and even when encountered, there is just a handful, which can be moved manually.
  2. Only email notification actions are supported – because they are the most common. If you use other actions, like SNMP traps, let me know and maybe I will include them in the next version.

PowerCLI cmdlets used

These are some of the useful VMware PowerCLI cmdlets I used to write the script (a rough sketch of how they fit together follows the list):

  • Get-AlarmDefinition
  • Get-AlarmAction
  • Get-AlarmActionTrigger
  • New-AlarmAction
  • New-AlarmActionTrigger
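
And this is a rough sketch (not the actual script) of how they fit together to copy email actions for matching alarms. It ignores the trigger copying that the real script also performs, and assumes the email action objects expose To and Subject properties:

# Connect to both vCenters (PowerCLI supports multiple simultaneous connections)
$src = Connect-VIServer old-vc.acme.com
$dst = Connect-VIServer new-vc.acme.com

foreach ($srcAlarm in Get-AlarmDefinition -Server $src) {
  # Skip alarms that do not exist on the destination, as described above
  $dstAlarm = Get-AlarmDefinition -Name $srcAlarm.Name -Server $dst -ErrorAction SilentlyContinue
  if (-not $dstAlarm) { continue }

  # Recreate each email action on the destination alarm
  foreach ($action in Get-AlarmAction -AlarmDefinition $srcAlarm -ActionType SendEmail) {
    New-AlarmAction -AlarmDefinition $dstAlarm -Email -To $action.To -Subject $action.Subject | Out-Null
  }
}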