Posts Tagged ‘Problem’

Unable to Delete vCenter Endpoint in vRealize Automation

December 7, 2018

vRealize Automation Error

More than once in my experience I’ve had a need to delete an endpoint in vRealize Automation. Maybe the configuration has changed, or you simply made a typo in the vCenter hostname or credentials. Once you’ve specified the vCenter address and saved the endpoint, you can no longer change it (only delete and re-add it).

But when you try to delete it, you will get an error along the lines of:

You cannot delete this endpoint because 1 compute resources and 0 storage paths use it.

CloudClient Error

There is a KB article that walks you through the process using a special tool called CloudClient: Error “This endpoint is being used by # compute resources and # storage paths and cannot be deleted” when you attempt to delete an endpoint in vRA 7.x (2150548)
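
The KB procedure is driven by CloudClient from the command line. A rough sketch of the start of such a session (the server URL, tenant and user are placeholders; follow the KB article for the exact commands in your CloudClient version):

vra login userpass --server https://vra.example.com --tenant vsphere.local --user configurationadmin
vra computeresource inactive list

The output of the list command then feeds into the rest of the KB procedure.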

But even that approach does not always work. When you run the command from the KB article, “vra computeresource inactive list”, you may get the following error:

Error: Something went wrong while processing your request. Please check the application logs for details.

Solution

There is almost no mention of this second error on the Internet, and I can see how someone could keep banging their head trying to solve it, so I thought I’d share a solution here. And it’s simple: open a GSS ticket. They can delete the endpoint for you. If you see this error, there’s no other way that I know of to solve this problem without involving GSS.

Clean-up

You may see an error similar to the following in the vRA logs if you didn’t stop the proxy agents before deleting the endpoint:

Error processing ping response
Error occurred while executing stored proc usp_InsertUpdateHost The INSERT statement conflicted with the FOREIGN KEY constraint "FK_ManagementEndpoint_Host". The conflict occurred in database "vRa_IaaS", table "dbo.ManagementEndpoints", column 'ManagementEndpointID'.
The statement has been terminated.
Inner Exception: The INSERT statement conflicted with the FOREIGN KEY constraint "FK_ManagementEndpoint_Host". The conflict occurred in database "vRa_IaaS", table "dbo.ManagementEndpoints", column 'ManagementEndpointID'.
The statement has been terminated.

All you need to do to get rid of it is to restart your proxy agents.
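
The proxy agents run as Windows services on the IaaS machines, so restarting them from an elevated command prompt looks roughly like this (the service name pattern below is an assumption; use the exact names you see in services.msc on your agent servers):

rem service names are examples; check services.msc for the actual agent service names
net stop "VMware vCloud Automation Center Agent - <agent name>"
net start "VMware vCloud Automation Center Agent - <agent name>"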

Conclusion

Hope this post saves someone hours of searching for the answer in blogs and forums.

Error When Deploying VCSA or PSC

October 31, 2017

Recently, while helping a customer deploy a new greenfield VMware vSphere 6.5 environment, I ran into an issue where a brand new vCenter Server Appliance and Platform Services Controller 6.5 build 5973321 failed to deploy to an ESXi host build 5969303.

Stage 1 (install) of the deployment completes successfully. In Stage 2 (setup) the VCSA installer, both for vCenter and PSC, first shows a prompt asking for credentials.

PSC Issue Description

After providing credentials when installing an external PSC, the installation fails with the following error:

Error:
Unable to connect to vCenter Single Sign-On: Failed to connect to SSO; uri:https://psc-hostname/sts/STSService/vsphere.local
Failed to register vAPI Endpoint Service with CM
Failed to configure vAPI Endpoint Service at the firstboot time

Resolution:
Please file a bug against VAPI

The installation wizard shows the following resulting error:

Failure:
A problem occurred during setup. Refresh this page and try again.

A problem occurred during setup. Services might not be working as expected.

A problem occurred while – Starting VMware vAPI Endpoint…

The appliance shows the following error on the console:

Failed to start services. Firstboot Error.

Alternatively, the PSC can fail with the following error:

Error:
Unexpected failure: }
Failed to register vAPI Endpoint Service with CM
Failed to configure vAPI Endpoint Service at the firstboot time

Resolution:
Please file a bug against VAPI

VCSA Issue Description

After providing credentials when installing vCenter with an embedded PSC, the installation fails with the following error:

Error:
Unable to start the Service Control Agent.

Resolution:
Search for these symptoms in the VMware knowledge base for any known issues and possible workarounds. If none can be found, collect a support bundle and open a support request.

The installation wizard shows the following resulting error:

Failure:
A problem occurred during setup. Refresh this page and try again.

A problem occurred during setup. Services might not be working as expected.

A problem occurred while – Starting VMware Service Control Agent…

The appliance console shows the same error.

Alternatively, the VCSA can fail with the following error:

Error:
Encountered an internal error.

Traceback (most recent call last):
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 1852, in main
    vmidentityFB.boot()
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 359, in boot
    self.checkSTS(self.__stsRetryCount, self.__stsRetryInterval)
  File "/usr/lib/vmidentity/firstboot/vmidentity-firstboot.py", line 1406, in checkSTS
    raise Exception('Failed to initialize Secure Token Server.')
Exception: Failed to initialize Secure Token Server.

Resolution:
This is an unrecoverable error, please retry install. If you run into this error again, please collect a support bundle and open a support request.

Issue Workaround

This issue happens when a VCSA or PSC installation was cancelled and is then attempted a second time on the same ESXi host.

The identified workaround for this issue is to deploy to another ESXi host, one that has never been used for a PSC or VCSA deployment.

Issue Resolution

VMware is aware of the bug and is working on a resolution.

Dell Repository Manager: Bootable ISO Issues

May 23, 2016

In one of my previous posts I described the process of upgrading Dell FX2 chassis firmware using Dell Repository Manager (DRM).

In an ideal world you just follow the process, and in an hour or two your chassis is upgraded. Sometimes, though, you may run into issues. I want to go through some of them in this post, including possible remediation.

Issue Description

When exporting firmware to a bootable ISO, you may find that DRM is unable to download some of the bundle components, with the following error in the Job Result:

Processing failed:
Failed downloading files:
Diagnostics_Application_PWMC8_LN64_OSC_1.1_A00.BIN

And errors in the Log:


60. 24/03/2016 5:58:50 PM Export to Bootable ISO : Downloaded 34 / 56
61. 24/03/2016 5:59:44 PM Export to Bootable ISO : Error downloading some files
62. 24/03/2016 5:59:45 PM Export to Bootable ISO : Failed exporting to Bootable ISO.

Workaround #1: Skip the Component

You can try the “Continue download irrespective of any error (in the selected components)” option in the export dialog. It won’t help to get the component downloaded, but you will get a bootable ISO.

However, DRM will still keep the failed component in the bundle and try to install it during the upgrade, which will obviously fail (update 16/56):

[Image: failed_update]

Once the upgrade is finished, you will get the following error:

Note: Some update requires machine reboot. Please reboot to CD/DVD to continue for the failed update because of the dependency…

[Image: upgrade_status]

No matter how many times you reboot, you will get the same errors. You can ignore them if you are 100% sure the skipped component is what causes the failure, or use Workaround #2.

Workaround #2: Create Custom ISO

When you create a repository in DRM, it’s populated with pre-built components and bundles. But you can also create custom repositories. The idea is that you can exclude the failed component by building a repository manually.

Assuming you already have the base repository configured, do the following:

  • Open the existing repository and click on the Components tab
  • Deselect the failed component in the component list (in my case it was Diagnostics_Application_PWMC8_LN64_OSC_1.1_A00.BIN)
  • Click on the “Copy To” button:

[Image: custom_components]

  • In the dialogue that opens, select “Create NEW Repository and copy component(s) into it”
  • Follow the wizard and when you click finish, components will be copied to the newly created repository
  • Open the new repository and click on the Components tab
  • Select all components and click on the “Copy to” button once again
  • This time select “Create a NEW Bundle in the same repository and add component(s) into it”
  • On the next screen give the bundle a name and make sure to choose “Linux 32-bit and 64-bit” in the OS Type

[Image: custom_bundle]

As a result, you should have a new bundle which you can export to a bootable ISO using the same process.

Workaround #3: Use Server Update Utility

If none of the above helps, you can fall back to a proven upgrade approach and use the Server Update Utility (SUU). SUU is a huge 12GB ISO to download, but you can use Dell Download Manager, which supports resuming interrupted downloads. Make sure to disable your proxy! Dell Download Manager does not support resuming an interrupted download if you’re using a proxy server.

SUU is not a bootable ISO. Previously you had to boot from the Dell Systems Build and Update Utility (SBUU) first and then mount the SUU ISO to proceed with the upgrade. Starting with Dell 11G servers you don’t need it anymore and can upgrade firmware straight from the Dell Lifecycle Controller (LC).

You’ll need to boot into the Lifecycle Controller and choose Firmware Update > Launch Firmware Update > Local Drive (CD or DVD or USB). Mount the SUU ISO and the rest is fairly straightforward. LC will upgrade the firmware and reboot the blade.

[Image: lc_upgrade]

Conclusion

Dell Repository Manager is the recommended approach to upgrade firmware on Dell hardware. Unlike SUU, DRM downloads the latest updates and only the necessary components. It is also capable of making a bootable ISO.

If you have issues, rely on the Server Update Utility, as it’s bulletproof and always works. But be prepared to download a 12GB ISO image and make sure you have an option to bypass the proxy.

Requirements for Unmounting a VMware Datastore

December 30, 2015

I have come across issues unmounting VMware datastores multiple times myself. In recent vSphere versions vCenter shows you a warning if some of the requirements are not fulfilled. This is not the case in older vSphere versions, which makes it harder to identify the issue.

Interestingly, there are some prerequisites that vCenter does not even prompt you about. I will discuss all of the requirements in this post.

General Requirements

In this category I have combined all the requirements that vCenter checks against, such as:

Requirement: No virtual machine resides on the datastore.

Action: You have to make sure that the host you are unmounting the datastore from has no virtual machines (running or stopped) registered on this datastore. If you are unmounting just one datastore from just one host, you can simply vMotion all VMs residing on the datastore from this host to the remaining hosts. If you are unmounting the datastore from all hosts, you’ll have to either Storage vMotion all VMs to the remaining datastores or shut down the VMs and unregister them from vCenter.

[Image: unmount_vmfs2]
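
A quick way to check from the ESXi shell of a host whether any VMs registered on it still have their configuration files on the datastore is to grep the vim-cmd inventory listing (the datastore name iSCSI1 is just an example):

# vim-cmd vmsvc/getallvms | grep '\[iSCSI1\]'

Any lines returned are VMs whose configuration files live on that datastore. VMs that only have virtual disks there will not show up in this listing, so double-check in vCenter as well.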

Requirement: The datastore is not part of a Datastore Cluster.

Requirement: The datastore is not managed by storage DRS.

Action: Drag and drop the datastore out of the Datastore Cluster in vCenter. The second requirement is redundant, because SDRS is enabled on a datastore which is configured within a Datastore Cluster. By removing a datastore from a Datastore Cluster you automatically disable storage DRS on it.

Requirement: Storage I/O control is disabled for this datastore.

Action: Go to the datastore properties and uncheck the Storage I/O Control option. On a SIOC-enabled datastore vSphere creates a folder named after the block device ID and keeps a file called “slotsfile” in it. Its size will change to 0.00 KB once SIOC is disabled.

Requirement: The datastore is not used for vSphere HA heartbeat.

Action: vSphere HA automatically selects two VMware datastores, creates .vSphere-HA folders on them and uses them to keep HA heartbeats. If you have more than two datastores in your cluster, you can control which ones are selected. Go to cluster properties > Datastore Heartbeating (under the vSphere HA section) and select the preferred datastores from the list. This will work if you are unmounting one datastore. If you need to unmount all datastores, you will have to disable HA at the cluster level altogether.

[Image: datastore_heartbeat]

Additional Requirements

Requirements which fall into this category are not checked by vCenter, but they still have to be satisfied. Otherwise vCenter will not let you unmount the datastore.

Requirement: The datastore is not used for swap.

Action: When a VM is powered on, by default it creates a swap file with a .vswp extension in the VM directory. You can change the default behavior and, on a per-host basis, select a dedicated datastore where the host will create swap files for virtual machines. This setting is enabled in the cluster properties, in the Swapfile Location section. The datastore is then selected for each host in the Virtual Machine Swapfile Location settings on the host configuration tab.

What the host also does when you enable this option is create a host-local swap file, named something like this: sysSwap-hls-55de2f14-6c5d-4d50-5cdf-000c296fc6a7.swp

There are scenarios where you need to unmount the swap datastore, such as when you need to reconnect all of your storage from FC to iSCSI. Even if you shut down all of your VMs, the datastore unmount will fail because the host swap files are still there, and you will see an error such as this:

The resource 'Datastore Name: iSCSI1 VMFS uuid: 55de473c-7f3ae2b5-f9f8-000c29ba113a' is in use.

See the error stack for details on the cause of the problem.

Error Stack:

Call "HostStorageSystem.UnmountVmfsVolume" for object "storageSystem-29" on vCenter Server "VC.lab.local" failed.

Cannot unmount volume 'Datastore Name: iSCSI1 VMFS uuid: 55de473c-7f3ae2b5-f9f8-000c29ba113a' because file system is busy. Correct the problem to retry the operation.

The workaround is to change the setting at the cluster level to store the VM swap file in the VM directory and reboot all hosts. After a reboot the host .swp files will disappear.

If rebooting the hosts is not desirable, you can SSH to each host and type the following command:

# esxcli sched swap system set --hostlocalswap-enabled false

To confirm that the change has taken effect run:

# esxcli sched swap system get

Then check the datastore and the .swp files should no longer be there.
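
Once all of the requirements are satisfied, the datastore can be unmounted either from vCenter or from the ESXi shell of each host, for example (the volume label iSCSI1 is taken from the error above and is just an example):

# esxcli storage filesystem list
# esxcli storage filesystem unmount -l iSCSI1

The list command is useful both to confirm the volume label beforehand and to verify afterwards that the volume shows as unmounted.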

Conclusion

If you satisfy all of the above requirements, you should have no problems unmounting VMware datastores. vSphere creates a few additional system folders on each of the datastores, such as .sdd.sf and .dvsData, but I personally have never had issues with them.

VNX LDAP Integration: AD Nested Groups

February 11, 2015

Have you ever stumbled upon AD authentication issues on VNX, even though everything looked configured properly? LDAP integration has always been a PITA on storage arrays and blade chassis, as usually there is no way to troubleshoot what the actual error is.

[Image: auth_error]

If VNX cannot look up the user or group you’re trying to authenticate against in AD, this is all you’ll see. Now go figure out why it’s getting upset, even though you can clearly see the group configured in “Role Mapping” and there don’t seem to be any typos.

A common problem is nested groups. By default VNX only checks whether your account is a direct member of the specified AD group and doesn’t traverse the hierarchy. So, for example, if your account is in a group called IT_Admins, IT_Admins is added to Domain Admins, and Domain Admins is what’s configured in “Role Mapping”, it’s not going to work.

[Image: nested_groups]
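
If you want to confirm on the AD side that nesting is what trips up the authentication, you can compare a direct membership lookup with a nested one. A rough ldapsearch sketch, where the domain controller, bind account, user and group DNs are all placeholders:

ldapsearch -LLL -H ldap://dc01.example.com -D "svc_vnx@example.com" -W -b "DC=example,DC=com" \
  "(&(sAMAccountName=jsmith)(memberOf=CN=Domain Admins,CN=Users,DC=example,DC=com))" dn

ldapsearch -LLL -H ldap://dc01.example.com -D "svc_vnx@example.com" -W -b "DC=example,DC=com" \
  "(&(sAMAccountName=jsmith)(memberOf:1.2.840.113556.1.4.1941:=CN=Domain Admins,CN=Users,DC=example,DC=com))" dn

The first filter matches only direct members of the group, which is roughly what VNX does with its default settings. The second uses AD’s LDAP_MATCHING_RULE_IN_CHAIN matching rule (OID 1.2.840.113556.1.4.1941), which walks the whole group hierarchy, so the user in the IT_Admins example above would be returned by the second query but not the first.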

To make it work, change “Nested Group Level” to something appropriate for your environment. This should resolve the issue and make your life happier.