Arriving at the fifth part of this series on Virtualizing Domain Controllers on vSphere, I managed to gather some feedback on these blogposts. One question that emerged after writing the last blogpost on Replication considerations for Domain Controllers running on VMware vSphere was:
Isn’t Windows Server 2012 supposed to solve all these challenges with virtualizing Domain Controllers?
That’s an interesting question. Some Active Directory admins might respond with their knee-jerk “It depends.” response. A couple of other variables also affect the integrity of the Active Directory database: we discussed time differences in Part 2, and the default replication settings, the tombstone lifetime, DFSR and the Active Directory Recycle Bin in Part 3. Now, let’s look at the Active Directory Virtualization Safeguards and the underlying VM-GenerationID, the situations in which they help, and the situations in which they don’t. As usual, you’ll find a list of recommendations at the end of this blogpost.
About Active Directory Virtualization Safeguards
Active Directory Domain Services in Windows Server 2012 is the first Windows Server Role to take advantage of the VM-GenerationID with the Active Directory Virtualization Safeguards feature.
The VM-GenerationID is a random 128-bit identifier. Introduced as a new feature of Hyper-V in Windows Server 2012, it also found its way to the other main virtualization platforms, including VMware vSphere.
The VM-GenerationID functionality in ESXi 5.0 Update 2 (released on December 20, 2012) was implemented based on a draft of the VM-GenerationID whitepaper. In that draft, the VM-GenerationID value was defined as a random 64-bit value. In the final version of the whitepaper, it was defined as a random 128-bit value.
This VM-GenerationID is placed in the RAM of each virtual machine (VM) running on a VM-GenerationID-capable platform with VM-GenerationID-capable settings. Every VM gets its own VM-GenerationID from the virtualization platform. The virtualization platform keeps the VM-GenerationID the same for a VM, unless one of the situations below occurs, as described in the VM-GenerationID whitepaper:
VM-GenerationID in VMware vSphere
The above table from the VM-GenerationID whitepaper was also documented on the VMware blogs with the appropriate vSphere terminology and answers to additional questions asked in the comments:
The VM-GenerationID is communicated through the VM GenerationID Counter Driver to virtual machines running on VMware vSphere 5.0 Update 2, and above. It is not governed through VMware Tools settings.
The value of the VM-GenerationID per VM is exposed in the VMX file as vm.genid or vm.genidx.
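To see which values a VM currently carries, you could read them straight from its VMX file. The following is a minimal Python sketch, assuming the usual key = "value" layout of VMX files. The function name read_generation_id and the sample values are made up for illustration, and the sketch does not try to combine the two values into one identifier.

```python
import re

# Matches VMX lines of the form: key = "value"
_VMX_LINE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def read_generation_id(vmx_text):
    """Return the vm.genid and vm.genidx values found in VMX text.

    Keys that are absent come back as None, which would suggest that
    no VM-GenerationID has been recorded for this VM.
    """
    values = {"vm.genid": None, "vm.genidx": None}
    for line in vmx_text.splitlines():
        match = _VMX_LINE.match(line)
        if match and match.group(1).lower() in values:
            values[match.group(1).lower()] = match.group(2)
    return values


# Hypothetical VMX fragment; the numbers are invented for this example.
SAMPLE_VMX = '''
virtualHW.version = "9"
vm.genid = "-4708314029884316513"
vm.genidx = "2983742983742983742"
'''

if __name__ == "__main__":
    print(read_generation_id(SAMPLE_VMX))
```

Comparing the output before and after an operation such as reverting a snapshot is one way to check whether that operation changed the VM-GenerationID.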
VM-GenerationID safeguarding Active Directory Virtualization
Starting with Windows Server 2012, a virtual Domain Controller reads the VM-GenerationID from RAM when it starts and before every write to the Active Directory database. It stores the value of the VM-GenerationID in the msDS-GenerationId attribute of its object in the local Active Directory database. (This attribute is not replicated.)
Before every write, the Active Directory service compares the VM-GenerationID in RAM with the msDS-GenerationId attribute in the Active Directory database. If they match, no problem. If they don’t match, magic happens:
- The invocationID is renewed, and;
- The RID Pool block in use is discarded.
As you might remember from the previous blogpost in this series, this effectively designates the Domain Controller as a new replication partner for other Domain Controllers. It allows the Domain Controller to replicate in the changes it is missing, which avoids USN Rollbacks and Lingering Objects.
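The compare-then-reset behavior described above can be modeled in a few lines of Python. This is only a sketch of the logic, not how the Directory Service implements it: the class and function names are hypothetical, and storing the new value after a mismatch is an assumption on top of the description above.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DomainControllerState:
    # Models the msDS-GenerationId value kept in the local database.
    stored_generation_id: str
    # Models the invocationID other Domain Controllers know this one by.
    invocation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # True while the RID pool block in use is still considered valid.
    rid_pool_valid: bool = True


def check_before_write(dc, generation_id_in_ram):
    """Compare the value in RAM with the stored one before a write.

    Returns True when the safeguards were triggered.
    """
    if generation_id_in_ram == dc.stored_generation_id:
        return False  # values match, the write proceeds as usual

    dc.invocation_id = str(uuid.uuid4())  # renew the invocationID
    dc.rid_pool_valid = False             # discard the RID pool block in use
    # Assumption: the new value is stored so the next comparison matches.
    dc.stored_generation_id = generation_id_in_ram
    return True


if __name__ == "__main__":
    dc = DomainControllerState(stored_generation_id="gen-A")
    print(check_before_write(dc, "gen-A"))  # no change on the platform side
    print(check_before_write(dc, "gen-B"))  # e.g. after a snapshot was applied
```

The point of the model is the ordering: the check happens before the write, so a Domain Controller that was rolled back never commits changes under its old identity.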
From the description above, you might have already figured out some situations that won’t trigger this behavior. Indeed, changes on the storage level won’t be observed, and moving a virtual Domain Controller to or from a non-VM-GenerationID-capable hypervisor platform won’t trigger the safeguards either.
Even in straightforward vSphere deployments with straightforward management practices, though, you might not benefit from the Active Directory Virtualization Safeguards.
So, here is the complete list of requirements for Active Directory Virtualization Safeguards on VMware vSphere:
- VMware vSphere needs to run version 5.0 Update 2, or up.
- VMware Tools needs to be installed and running on virtual Domain Controllers, ideally with a version that matches the VMware vSphere version.
- The virtual Domain Controller needs to run Windows Server 2012, or up.
- The Virtual Machine hardware version needs to be version 7, or up.
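The requirements above lend themselves to a simple inventory check. Below is a rough Python sketch, assuming you already collected the four facts per virtual Domain Controller by other means. The VirtualDcInventory type and its field names are hypothetical, and comparing versions as plain tuples is a simplification of how vSphere releases are actually numbered.

```python
from dataclasses import dataclass


@dataclass
class VirtualDcInventory:
    vsphere_version: tuple   # (major, minor, update), e.g. (5, 0, 2)
    tools_running: bool      # VMware Tools installed and running
    windows_version: tuple   # (major, minor); Windows Server 2012 is (6, 2)
    hardware_version: int    # virtual machine hardware version


def unmet_requirements(inv):
    """Return a list of requirements from the list above that are not met."""
    problems = []
    if inv.vsphere_version < (5, 0, 2):
        problems.append("vSphere is older than 5.0 Update 2")
    if not inv.tools_running:
        problems.append("VMware Tools is not installed and running")
    if inv.windows_version < (6, 2):
        problems.append("Operating System is older than Windows Server 2012")
    if inv.hardware_version < 7:
        problems.append("Virtual Machine hardware version is below 7")
    return problems


if __name__ == "__main__":
    # Hypothetical Domain Controller deployed from an old template.
    old_template = VirtualDcInventory((5, 1, 0), True, (6, 2), 6)
    for problem in unmet_requirements(old_template):
        print(problem)
```

An empty list means all four requirements are met. It does not prove the safeguards work, only that nothing on this list rules them out.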
Just because you can…
… doesn’t mean you should snapshot virtual Domain Controllers.
Some valid reasons for using virtual machine snapshots with Domain Controllers are:
- Backup software that takes “image level” backups typically relies on snapshots to ensure consistent backups.
- You install software and/or updates on a virtual Domain Controller and want the ability to revert in case there are issues.
The possible impact of Virtualization Safeguards on Disaster Recovery
Even with Active Directory Virtualization Safeguards, remember that snapshots are not backups. In fact, the Active Directory Best Practices Analyzer (BPA) will display a warning when Active Directory is only ‘backed up’ through snapshots and not following valid backup and restore procedures.
Forest-wide Disaster Recoveries
Special considerations are required for site-wide Disaster Recovery plans when it comes to snapshots of virtual Domain Controllers. As a disaster typically refers to a complete site (or Active Directory) outage, in a disaster you typically must recover multiple Domain Controllers or the entire Active Directory infrastructure. In most organizations, snapshots aren’t taken in a sufficiently orchestrated fashion or with sufficient frequency to allow for these Disaster Recovery scenarios. Bare metal-like restore actions are also not possible with snapshots.
Forest-wide recovery could be from backup or orchestrated, for instance with VMware Site Recovery Manager (SRM).
Was it a Proper Restore or Virtualization Safeguard?
The Active Directory virtualization safeguards kick in during a Domain Controller recovery, as the hypervisor platform changes the VM-GenerationID of the recovered Domain Controller. When investigating Active Directory backup and restore solutions, don’t focus solely on Event ID 1109 in the Directory Services event log, as this event can be triggered by both a proper restore and the virtualization safeguards. Instead:
- Look for Event ID 1917 in the Directory Services event log when taking a backup to see if it uses the Active Directory writer, and;
- Look in the registry for the DSA Previous Restore count.
More information on analyzing Active Directory-aware Backup and Restore solutions is available in my Whitepaper on Host-Based Backups and Restores of Domain Controllers.
Recovering the RID Master
One of the outcomes of the Active Directory Virtualization Safeguards is to invalidate the RID pool block that was assigned to the Domain Controller previously. So, what happens if virtualization safeguards kick in on the Domain Controller holding the RID Master Flexible Single Master Operations (FSMO) role?
The ‘new’ Domain Controller will not be able to obtain a RID pool block while the RID Master is down. The RID Master, in turn, cannot issue RID pool blocks until it has replicated with other Domain Controllers.
The solution here is to seize the RID Master FSMO Role on another Domain Controller.
Have Domain Controllers in multiple sites
You can only seize an FSMO role when you have Domain Controllers running and replicating. To overcome a site-wide Active Directory outage, always have Domain Controllers in multiple sites or use a Disaster Recovery site to which you replicate.
Recovering the RID Master
When the Domain Controller holding the RID Master Flexible Single Master Operations (FSMO) role is located in an Active Directory site that experiences an outage, either:
- Seize the RID Master role on a Domain Controller in an Active Directory site that is not experiencing an outage, as part of the Disaster Recovery plan.
- Replicate the RID Master and PDC Emulator to a pre-assigned Disaster Recovery site as part of the Disaster Recovery plan.
After either of the above actions, restart the Directory Service on the new RID Master. Use the following PowerShell command:
Restart-Service -Name NTDS -Force
Then force replication to another Domain Controller not impacted by the outage (if available). Reboot the Domain Controller holding the RID Master Flexible Single Master Operations (FSMO) role after all other Domain Controllers have started.
To take advantage of virtualization safeguards as an organization, please consider these recommendations:
Meet the requirements
Many procedures in organizations result in sub-optimal settings. A VM template with VM version 6 will ruin your dreams of Active Directory Virtualization Safeguards, especially when you’ve in-place upgraded the Operating System to Windows Server 2012. Out-of-date VMware Tools will have the same effect on these dreams, so update them when you update the underlying hypervisor platform.
Use snapshots with moderation
Active Directory is not a drinking game (although there is a drink attribute in Active Directory’s schema). Use snapshots of virtual Domain Controllers with moderation. Installing software or patching a Domain Controller shouldn’t automatically mean you take a snapshot of it.
Plan for disaster recovery
Snapshots of virtual Domain Controllers cannot, generally, be used to perform a complete Active Directory Forest restore. You’ll need proper Active Directory backups for this purpose. Also, virtualization safeguards might interfere with the current Disaster Recovery plan for the RID Master. Plan accordingly.
Promotions vs. Restores
It is often easier to deploy a new Windows Server installation and promote it to a Domain Controller than to restore a Domain Controller from a backup.
Of course, your mileage may vary, depending on the agents and additional software running on Domain Controllers.
In general – it is unlikely you’ll frequently encounter the Active Directory virtualization safeguards, but it’s good to know it’s there if you need it to cover your behind.
New features in AD DS in Windows Server 2012, Part 12: Virtualization-safe AD
List of Hypervisors supporting VM-GenerationID
Cases where VM-GenerationID doesn’t trigger AD virtualization safeguards, Part 1
Cases where VM-GenerationID doesn’t trigger AD virtualization safeguards, Part 2
Windows Server 2012 VM-Generation ID Support in vSphere
Which vSphere Operation Impacts Windows VM-Generation ID?
At last! Virtual domain controllers just work
Active Directory VM Generation IDs
Best Practices for Virtualizing AD on VMware vSphere