
Joining ESXi to AD in Disjoint Namespace

November 4, 2019

What is Disjoint Namespace?

Typically, when using Microsoft Active Directory you use AD-integrated DNS and your AD domain name matches your DNS domain name, but it doesn’t have to. This is quite rare, but I’ve seen cases where the two don’t match. For example, you might have a Linux-based DNS server, where you register an esx01.example.com DNS record for your ESXi host, and then you join the host to an Active Directory domain called corp.local.

That’s called a disjoint namespace. You can read this Microsoft article if you want to know more details: Disjoint Namespace.
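To make the definition concrete, here is a minimal Python sketch that checks whether a host name and an AD domain form a disjoint namespace. The helper names dns_suffix and is_disjoint are made up for illustration; the host and domain names are the examples used in this post.

```python
def dns_suffix(fqdn: str) -> str:
    """Return everything after the first label of a fully qualified name."""
    _host, _, suffix = fqdn.rstrip(".").partition(".")
    return suffix.lower()


def is_disjoint(host_fqdn: str, ad_domain: str) -> bool:
    """True when the host's DNS suffix differs from the AD domain name."""
    return dns_suffix(host_fqdn) != ad_domain.rstrip(".").lower()


# The example from this post: DNS record in example.com, AD domain corp.local.
print(is_disjoint("esx01.example.com", "corp.local"))  # True
# Same host name under the AD domain's own DNS suffix.
print(is_disjoint("esx01.corp.local", "corp.local"))   # False
```

The comparison is case-insensitive and ignores a trailing dot, since DNS names are commonly written both ways.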

In my personal opinion, using a disjoint namespace is asking for trouble, but it will still work if you really want to use it.

Problem

If you end up going down that route, there’s one caveat you should be aware of. When you join a machine to AD, among other things, the join needs to populate the DNS name field of the AD computer object. This is an example of an ESXi computer object in the Active Directory Users and Computers snap-in:

If you configure the example.com domain in your ESXi Default TCP/IP stack, like so:

And then you try to join your ESXi host to the corp.local AD domain, it will attempt to use esx-01a.example.com for the DNS name field of the computer object. If you’re using a domain account with privileges restricted to domain join only, this operation will fail.

This is how the problem manifested itself in my case in ESXi host logs:

Failed to run provider specific request (request code = 8, provider = ‘lsa-activedirectory-provider’) -> error = 40315, symbol = LW_ERROR_LDAP_CONSTRAINT_VIOLATION, client pid = 2099303

If you’re using host profiles to join the ESXi host to the domain, remediation will fail and you will see the following in /var/log/syslog.log:

WARNING: Domain join failed; retry count 1.

WARNING: Domain join failed; retry count 2.

Likewise (ActiveDirectory) Domain Join operation failed while joining new domain via username and password..

Note: this problem is specific to joining the domain using a restricted service account. If you use a domain administrator account, it will force the controller to add the computer object with a DNS name that doesn’t match the AD domain name.

Solution

Make sure the ESXi domain name setting matches the Active Directory domain name, not the DNS domain name. You can still use the esx-01a.example.com record to add the ESXi hosts to vCenter, but you have to specify the corp.local domain in the DNS settings (or leave the field blank), because this is what is going to be used to add the host to AD, like so:

This way your domain controller will be happy and the ESXi host will successfully join the domain.

Additional Notes

While troubleshooting this issue I saw a few messages in the ESXi host logs that turned out to be a distraction. Ignore them, as they don’t indicate an actual problem.

This just means that the ESXi host Active Directory service is running, but the host is not joined to a domain yet:

lsass: Failed to run provider specific request (request code = 12, provider = ‘lsa-activedirectory-provider’) -> error = 2692, symbol = NERR_SetupNotJoined, client pid = 2111366

IPC is inter-process communication. Likewise consists of multiple services that talk to each other. They open and close connections, and this is normal:

lsass-ipc: (assoc:0x8ed7e40) Dropping: Connection closed by peer
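If you have a lot of log output to go through, the triage above can be sketched in a few lines of Python. This is only an illustration: the classify function is a made-up helper, and it knows only the symbols and messages quoted in this post, so anything else is reported as unknown rather than guessed at.

```python
import re

# Symbols taken from the log lines quoted in this post.
BENIGN_SYMBOLS = {"NERR_SetupNotJoined"}
FAILURE_SYMBOLS = {"LW_ERROR_LDAP_CONSTRAINT_VIOLATION"}

# Matches the "symbol = NAME" part of a Likewise log line.
SYMBOL_RE = re.compile(r"symbol = (\w+)")


def classify(line: str) -> str:
    """Return 'benign', 'failure' or 'unknown' for a single log line."""
    if "Connection closed by peer" in line:
        return "benign"
    match = SYMBOL_RE.search(line)
    if match is None:
        return "unknown"
    symbol = match.group(1)
    if symbol in BENIGN_SYMBOLS:
        return "benign"
    if symbol in FAILURE_SYMBOLS:
        return "failure"
    return "unknown"


print(classify("lsass-ipc: (assoc:0x8ed7e40) Dropping: Connection closed by peer"))
# benign
print(classify("-> error = 40315, symbol = LW_ERROR_LDAP_CONSTRAINT_VIOLATION, client pid = 2099303"))
# failure
```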

I also found this command to be useful for deeper packet inspection between an ESXi host and AD domain controllers:

tcpdump-uw -i vmk0 not port 22 and not arp


Highly available Windows network infrastructure

February 27, 2012

When the number of computers in a company starts to grow and IT services become critical for company operation, every IT department starts to think about how to make its network infrastructure highly available. If it’s a Windows environment, then the first step is usually an additional domain controller. Bringing a second DC up and running is rather simple. The only thing you need to do is run dcpromo and follow the instructions given by the wizard. Then make the additional DC a Global Catalog, so that it can serve authentication requests: go to Active Directory Sites and Services and, in NTDS Settings on the General tab, check the Global Catalog option. The Windows File Replication Service (FRS) will do the rest.

However, that’s usually not enough. Computers rely on the DNS service to resolve server names, and if the primary DC fails your network will be paralyzed. Dcpromo doesn’t automatically install and configure an additional DNS server; you need to do that manually. Moreover, if you use the DHCP service to provide network settings to client computers and it’s located on the same server, you will also have major issues. The problem here is that you can’t have two active DHCP servers giving out the same addresses. But this problem also has a solution.

For DNS, go to Add or Remove Windows Components and find DNS under Networking Services. Install it as AD-integrated. Then on the primary DNS server, for all your forward and reverse lookup zones, open the zone properties and add the secondary DNS server’s IP address on the Name Servers tab. After that DNS will automatically replicate all data. Also, don’t forget to add your secondary DNS server to the DHCP configuration, otherwise clients won’t know about it.

When it comes to DHCP, you have the option of using the so-called 80/20 rule to divide a scope between DHCP servers (if you work on the Windows Server 2008 platform you can build an HA DHCP cluster). Simply configure your first DHCP server to lease the first 80% of the network’s IP addresses and leave 20% to the second DHCP server. Then if the first server fails, most computers will already have their IP addresses and you will still have 20% to distribute. In my case the network is quite small, so I split the scope 50/50. Give the two servers identical configurations (reservations, exclusions, scope options, etc.), but configure the scopes with non-overlapping ranges. If you use the 80/20 rule, you want your primary server to lease IP addresses under normal circumstances. If both servers lease addresses with equal rights, you will quickly run out of addresses on the 20% server, and if the primary server then fails you won’t have enough addresses to lease. To solve that, tweak the Conflict detection attempts option.
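The arithmetic of the split is easy to get wrong by one address, so here is a small Python sketch that computes two non-overlapping ranges from a pool. The split_scope function and the 192.168.1.10 to 192.168.1.209 pool are made up for illustration; it only calculates the ranges and does not configure a DHCP server.

```python
import ipaddress


def split_scope(first: str, last: str, primary_percent: int = 80):
    """Split an inclusive address pool into two non-overlapping ranges.

    Returns ((primary_first, primary_last), (secondary_first, secondary_last)).
    """
    start = ipaddress.IPv4Address(first)
    end = ipaddress.IPv4Address(last)
    total = int(end) - int(start) + 1
    if total < 2:
        raise ValueError("the pool needs at least two addresses")
    if not 0 < primary_percent < 100:
        raise ValueError("primary_percent must be between 1 and 99")

    # Integer arithmetic avoids floating-point rounding surprises.
    primary_count = total * primary_percent // 100
    # Make sure each server ends up with at least one address.
    primary_count = max(1, min(total - 1, primary_count))

    primary = (str(start), str(start + (primary_count - 1)))
    secondary = (str(start + primary_count), str(end))
    return primary, secondary


# Hypothetical pool of 200 addresses, split 80/20.
print(split_scope("192.168.1.10", "192.168.1.209"))
# (('192.168.1.10', '192.168.1.169'), ('192.168.1.170', '192.168.1.209'))

# The same pool split 50/50, as in my small network.
print(split_scope("192.168.1.10", "192.168.1.209", primary_percent=50))
# (('192.168.1.10', '192.168.1.109'), ('192.168.1.110', '192.168.1.209'))
```

Each pair is the start and end address to enter for the scope on the corresponding server.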

Basically, that’s it. Of course, you will still have many points of failure, like network switches, UPSes, etc. But that topic goes beyond this post.