Posts Tagged ‘bios’

vRealize Automation Disaster Recovery

January 14, 2018

Introduction

VMware has invested a lot of time and effort in vRealize Automation high availability. For medium and large deployment scenarios VMware recommends using a load balancer (Citrix, F5, NSX) to distribute traffic between vRA appliance and infrastructure components, as well as database clustering (such as MS SQL availability groups) for database high availability. Additionally, in vRA 7.3 VMware added support for automatic failover of vRA appliance’s embedded PostgreSQL database, which was a manual process prior to that.

There is a clear distinction, however, between high availability and disaster recovery. Generally speaking, HA covers redundancy within the site and is not intended to protect from full site failure. Site Recovery Manager (or another replication product) is required to protect vRA in a DR scenario, which is described in more detail in the following document:

In my opinion, there are two important aspects that are missing from the aforementioned document, which I want to cover in this blog post: restoring VM UUIDs and changing vRA IP address. I will cover them in the order that these tasks would usually be performed if you were to fail over vRA to DR:

  1. Exporting VM UUIDs
  2. Changing IP addresses
  3. Importing VM UUIDs

I will also only touch on how to change VM reservations. Which is also an important step, but very well covered in VMware documentation already.

Note: this blog post does not provide configuration guidelines for VM replication software, such as Site Recovery Manager, Zerto or RecoverPoint and is focused only on DR aspects related to vRA itself. Refer to official documentation of corresponding products to determine how to set up VM replication to your disaster recovery site.

Exporting VM UUIDs

VMware uses two UUIDs to identify a VM. BIOS UUID (uuid.bios in .vmx file) was the original VM identifier implemented to identify a VM and is derived from the hardware VM is provisioned on. But it’s not unique. If VM is cloned, the clone will have the same BIOS UUID. So the second identifier was introduced called Instance UUID (vc.uuid in .vmx file), which is generated by vCenter and is unique within a single vCenter (two VMs in different vCenters can have the same Instance UUID).

When VMs are failed over, Instance UUIDs change. Compare VirtualMachine.Admin.AgentID (Instance UUID) and VirtualMachine.Admin.UUID (BIOS UUID) on original and failed over VMs.

Why does this matter? Because vRA uses Instance UUIDs to keep track of managed VMs.  If Instance UUIDs change, vRA will show the corresponding VMs as missing under Infrastructure > Managed Machines. And you won’t be able to manage them.

So it’s important to export VM Instance UUIDs before failover, which can then be used to restore the original values. This is how you can get the Instance UUID of a given VM using PowerCLI:

> (Get-VM vm_name).extensiondata.config.InstanceUUID

Here, on my GitHub page, you can find a script that I have put together to export Instance UUIDs of all VMs in CSV format.

Changing IP addresses

Once you’ve saved the Instance UUIDs, you can move on to failover. vRA components should be started in the following order:

  1. MS SQL database
  2. vRA appliance
  3. IaaS server

If network subnets, that all components are connected to, are stretched between two sites, when VMs are brought up at DR, there are no additional reconfiguration required. But usually it’s not the case and servers need to be re-IP’ed. IaaS server network setting are changed the same as on any other Windows server machine.

vRealize Appliance network settings are changed in vRA appliance management interface, that can be accessed at https://vra-appliance-hostname:5480, under Network > Address tab. The problem is, if IP addresses change at DR, it will be challenging to reach vRA appliance over the network. To work around that, connect to vRA VM console and run the following script from CLI to change appliance’s network settings:

# /opt/vmware/share/vami/vami_config_net

Don’t forget to update the DNS record for vRA appliance in DNS. For IaaS server it’s not needed, as long as you allow Dynamic DNS (DDNS) updates.

Importing VM UUIDs

After the failover all of your VMs will have missing status in vRA. To make vRA recognize failed over VMs you will need to revert Instance UUIDs back to the original values. In PowerCLI this can be done in the following way:

> $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
> $spec.instanceUuid = ’52da9b14-0060-dc51-4733-3b01e912edd2′
> $vm = Get-VM -Name vm_name
> $vm.Extensiondata.ReconfigVM_Task($spec)

I’ve written another script, that will perform this task for you, which you can find on my GitHub page.

You will need two files to make the script work. The vm_vc_uuids.csv file you generated before, with the list of original VM Instance UUIDs. As well as the list of missing VMs in CSV format, that you can export from vRA after the failover on the Infrastructure > Managed Machines page:

This is an example of the script command line options and the output:

You will need to run an inventory data collection from the Infrastructure > Compute Resources > Compute Resources page. vRA will discover VMs and update their status to “On”.

Updating reservations

If you try to run any Day 2 operation on a VM with the old reservation in place, you will get an error similar to this:

Error processing [Shutdown], error details:
Error getting property ‘runtime.powerState’ from managed object (null)
Inner Exception: Object reference not set to an instance of an object.

To manually update VM reservation, on Infrastructure > Managed Machines page hover over the VM and select Change Reservation:

This process is obviously not scalable, as it can take hours, if you have hundreds of VMs. VMware offers an alternative solution that lets you update all VMs by using Bulk Import feature available from Infrastructure > Administration > Bulk Imports. The idea is that you can export all VM configuration details in a CSV file, update compute and storage reservation columns and import back to vRA. vRealize Suite 7.0 Disaster Recovery by Using Site Recovery Manager 6.1 gives very detailed instruction on how to do that in “Bulk Import, Update, or Migrate Virtual Machines” section.

Conclusion

I hope this blog post helped to cover some gaps in VMware documentation. If you have any questions or comments, as always, feel free to leave them in the comments sections below.

References

Advertisement

AIX at first glance

May 19, 2012

Recently I set up an AIX 5.1 on a RS/6000 box. Now, after some time working with the OS, I’d like to share my first impressions and features that distinguishes it from Linux.

FYI: Do not try to run AIX on x86, it won’t work. And it have never done. Only PowerPC and POWER RISC architectures.

System Management Services

The very first thing which may surprise you when you start a PowerPC system is absence of BIOS. PowerPC uses SMS which is an acronym for System Management Services. You enter SMS by pressing F2 during server startup. However, SMS implements same features as conventional server’s BIOS. Like configuring boot sequence, performing simple diagnostics, etc.

AIX default shell

AIX uses KornShell (ksh) by default. Bourne shell (bsh) is also available. But do not confuse it with Bourne-again shell (bash). It was developed two ears earlier (1989) than AIX 5.1 (2001), but wasn’t included. What’s interesting about ksh is that by default it works in vi editing mode. It means that initially you work in an input mode and enter commands by typing and hitting return as usual. Type ESC to enter control mode. For example type CTRL+V in control mode and you will find your ksh version. Mine is M-11/16/88f. If you type backslash (\) in control mode you will complete a file path. ksh88 shortcoming is that it doesn’t support commands completion.

System Management Interface Tool

AIX operating system is configured using the System Management Interface Tool (SMIT). It’s an equivalent of YaST in SuSE, redhat-config-* tools in Red Hat or Windows Control Panel. SMIT is very thorough configuration tool. For example, user add page consists of forty fields! SMIT has several handy functional keys. For instance, F5 sets field to the default value, using F9 you can temporarily invoke command shell, F4 generates a list if field implies it, like list of packages available to install from particular directory. Apart from that, SMIT has weird field hints: ‘-‘ says that field is numerical, ‘+’ means a list, ‘/’ is a path. Everything you do in SMIT is logged in /smit.log.

Web-based System Manager

On top of that, AIX has Web-based System Manager (WebSM) which lets you monitor your system and manage devices, backups, processes and virtually everything in your operating system. You can do that either from inside operating system itself or through standalone client which is available for Windows and Linux. To manage your AIX host via WebSM you need to have equal Manager and Remote Client versions.  To satisfy that you can download Windows version of Web-based System Manager Remote Client right from the AIX host using SCP or FTP from /usr/websm/pc_client/setup.exe. WebSM Client for AIX 5 is incompatible with Windows 7.

 

Object Data Manager

Feature which is unique to AIX is Object Data Manager (ODM) database, which maintains device configuration. ODM consists of Predefined Configuration Database (PCD) and Customized Configuration Database (CCD). Predefined Configuration Database keeps information on supported devices which means devices for which AIX has drivers and Customized Configuration Database hold information of devices which are currently connected to the system. Data in ODM is stored in terms of objects and their attributes. Access to ODM is implemented via special API. User can manage ODM by calling odmshow, odmadd, odmchange and odmdelete utilities. Additionally, AIX uses location codes to identify devices. Location code is effectively a path from a motherboard to a device. For example, location code of a SCSI device is in the form AB-CD-EF-G,H. Here AB is a bus type, CD – slot or adapter number, EF – connector ID, G – Control Unit Address of SCSI Device, H – Logical Unit Address of SCSI Device. I have two SCSI hard drives hdisk0 and hdisk1. For hdisk0 location code is 04-C0-00-5,0. Here 04 means PCI bus (00 – CPU bus, 01 – ISA bus, 05 – PCMCIA bus), C0 – integrated SCSI controller (A0 -ISA bus, B0 – secondary PCI bus), 00 – SCSI bus number, 5 – SCSI ID, 0 – LUN.

Logical Volume Manager

Did you know that LVM was implemented in AIX ten years earlier (1989) than in Linux (1998)? In fact, after AIX version of LVM was developed, its license was bought by HP. And only after that Heinz Mauelshagen developed Linux version with commands similar to the HP version. Windows Server platform still doesn’t have anything similar AFAIK.

Journaling File System

Another AIX achievement is JFS file system which is journaling by design. First JFS version was implemented in 1990 in AIX 3.1 Do you remeber when ext3 was developed? I believe somewhere in 2001. Journaling NTFS v3 was implemented in 2000 with Windows Server 2000. JFS file system in AIX 4.2 supported 64GB file size (it was 1996). With introduction of JFS2 in 2001, AIX 5 began to support 1TB files. Maximum file size for FAT32 was 4GB. All these facts are explainable. AIX was developed far earlier than Linux and Windows. But it’s still interesting how features firstly introduced in AIX (and other flavors of UNIX) migrate to younger OSes.

Full system recovery

Unlike Linux, AIX allows you to create full volume group backups with all logical volumes. Even in present times in Linux you work with antique tar, gzip, cpio and dd (or duplicity and bacula if you want something more sophisticated). In 2001 AIX already had savevg for backing up non-rootvg volume groups and mksysb which lets you backup rootvg along with system related data. mksysb creates installable image for full system recovery. I find these tools invaluable. I do not know of Linux alternative.

User/group administration

Additionally, AIX has several handy user administration features. For example, a user group can be either administrative or standard. If it’s administrative, then only root can add/remove users from it. If it’s standard, it means that ordinary users can administer that group. Feature I sometimes lack in Linux. Groups are configured in /etc/security/group and look like the following:

system:
admin = true

jradmin:
admin = false
adms = pac,xander

Here system is an administrative group and jradmin is standard. admin field identifies group type and adms contains the list of group administartors (pac an xander). Also, in AIX you can assign portions of root authority to non-root users. There are several predefined roles, like ManageAllUsers, ManageShutdown, ManageBackupRestore, etc, defined in /etc/security/roles. Roles consist of a number of authorizations, which is a set of particular tasks that user can perform. For example, ManageAllUsers role consists of the following authorizations: UserAudit, ListAuditClasses, UserAdmin, RoleAdmin, PasswdAdmin, GroupAdmin. You can create your own roles from these authorizations. In AIX 5 Role-Based Access Control (RBAC) is rather primitive and restricted, but it’s better than nothing.

Error logging

And the last thing I’d like to talk about is error logging. In Linux logging is performed by syslogd, AIX has the same daemon. However, AIX error logging facility is augmented by errdemon. It is started as part of system initialization and continuously monitors /dev/error. When information is read from /dev/error errdemon checks its Error Record Template Repository /var/adm/ras/errtmplt and if it has any additional info on this error, demon writes this information into /var/adm/ras/errlog. Log is in binary format. To read it run errpt command:

errpt -a -s 0519000012

This will show you detailed information on log entries starting from 19th of May 2012 00:00 a.m.

Conclusion

My first experience working with AIX (even with such an outdated version) makes me think of it as a sophisticated and very well written operating system. Many major features were developed in AIX much earlier than in Linux and Windows and I believe it’s still true for modern AIX releases. It becomes obvious why Unix is the primary choice for many big organizations with strong IT infrastructure.

Overclocking Q6600 on P5B Deluxe

March 20, 2012

There are hellova articles in the Internet on OC in general and particularly on overclocking CPUs on ASUS P5B Deluxe and other P965 chipset motherboards.  So here I’ll just post my results and BIOS configuration, along with some of my findings.

I was able to overclock FSB from 266 to 400 with x8 CPU multiplier. Which effectively means 3200MHz frequency as opposed to 2400 stock.

BIOS settings

JumperFree Configuration

CPU Frequency: 400
DRAM Frequency: DDR2-800MHz
PCI Express Frequency: 101
PCI Clock Synchronization Mode: 33.33MHz
Spread Spectrum: Disabled
Memory Voltage: 1.9V
CPU VCore Voltage: 1.525V
NB VCore: 1.25V
SB VCore (SATA, PCIE): 1.5V
ICH Chipset Voltage: 1.057V

CPU Configuration

CPU Ratio Setting: 8
C1E Support: Disabled
Max CPUID Value Limit: Disabled
Vanderpool Technology: Enabled
CPU TM Function: Enabled
Execute Disable Bit: Enabled

North Bridge Configuration

Basic Timings: 5-5-5-15-5
Additional Timings: 42-10-10-10-25
Static Read Control: Disabled

The most important thing to realize about OC on P5B Deluxe is the necessity of manual memory timings set up. Initially I wasted a lot of time trying to OC over 350 FSB with no luck. After changing timings to mentioned above I easily OCed to 400. I can’t explain why you need to set them like this. I just found them in the Internet and it works for me.

What was also new for me is that CPU with x6 multiplier and 400 FSB won’t work with the same voltage as x9 on 266. It’s the same frequency, however, CPU always init on x9 multiplier and only after power up system changes it to configured in BIOS. It means that if you want to lower CPU voltage by changing multiplier, then don’t expect voltage to decrease to initial values.

Another interesting fact is that with C1E Support setting enabled you will get less performance and less marks in CPU dependent benchmarks like 3DMark. C1E Support can lower CPU frequency when CPU idles. But it seems that it also reduces its performance under load.

I also left CPU TM Function (throttling) enabled for safety purposes.

For those who want to increase FPS in games I want to say that CPU and memory OC won’t give you any significant performance boost in games. For example in Unigine benchmark for 266 FSB, 2.4GHz CPU I get 37.9 average FPS. And for 400 FSB, 3.2GHz I get 38. I agree that for some games it will make difference but it’s not worth it. Don’t torture your system.

Results

266 FSB, 2.4GHz CPU:

Max temperature in linpack (OCCT) 54C (measured by Core Temp)
SiSoft Sandra Memory Throughput 5Gb/s
3DMark’06 12211
Unigine Heaven 37,9 avg. FPS
1:10 hour HD video encoding 3:47 hours

400 FSB, 3.2GHz CPU:

Max temperature in linpack (OCCT) 87C (measured by Core Temp)
SiSoft Sandra Memory Throughput 6.45Gb/s
3DMark’06 14945
Unigine Heaven 38 avg. FPS
1:10 hour HD video encoding 2:59 hours

As you can see I have extreme temperatures in linpack, even though I have Thermaltake Big Typhoon VX cooler and efficient Arctic Cooling MX-2 thermal grease. However, you should understand that system never gets to these temperatures under standard system loads like gaming, video encoding, etc. Usually I get no more that 65-70 which is not that high.

Bare in mind that P5B Deluxe undervolt your CPU by 0.5V. It means when you set VCore to 1.525 it’s actually 1.475. Also when I set VCore higher than 5.125 (1.465 effective) motherboard automatically changes VCore voltage to Auto. In fact, it will work on set up voltage until you enter BIOS and save changes for the second time. In other words if you set voltage higher than 5.125 then you will need to set it again after you enter BIOS and change anything for the next time.

The main reason for me to OC my system was video encoding. Firstly I changed old 2 core E6300 to 4 core Q6600 and then OCed it. I used 1:10 hour HD video for testing purposes. Time has changed from 8:08 on stock E6400 to 3:47 on stock Q6600 and then to 2:59 on OCed Q6600. So performance increase is quite apparent and worth it for this class of tasks.