A Best Practice approach to updating Hyper-V environments

Updating environments with Hyper-V can be more of a challenge than updating an environment that consists only of physical servers. Not only do the workloads need regular updating, but so do the Windows servers and the Hyper-V servers underneath them.

 

The challenges

Hyper-V relies on a Parent Partition, whether you’re using a Full installation of Windows Server 2008, a Server Core installation of Windows Server 2008 or the stand-alone Hyper-V Server. When you restart the Parent Partition, your Child Partitions are interrupted as well: depending on each virtual machine’s automatic stop action they are saved, shut down or turned off until the Parent Partition is back. How do you plan your maintenance window?

Updates can result in loss of functionality. Even though updates get tested thoroughly, there is a chance that a series or combination of updates, or an incompatibility with a third-party application or service, hangs your server or results in unexpected behavior. When you install every available update at once, these situations are hard to troubleshoot: which update caused the problem?

Some updates address security holes and require immediate installation in some situations: the risk of getting compromised outweighs the risk of breaking stuff.

 

A Best Practice approach

WSSRA

Within a good design for any environment, a distinction would be made between physical and virtual machines, between safe and unsafe(r) networks, and between application, messaging, directory, database and security services.

The Windows Server System Reference Architecture (WSSRA) comes to mind. The basis of this architecture is to unravel an environment into five layers (network, storage, application, management and security) and to supply guidance for meeting the requirements of an enterprise. The purpose of this guidance is to build a highly available, secure, scalable, manageable and reliable enterprise infrastructure.

Virtual environments

The same architectural approach can also be applied to virtual environments. The logical division would be virtual infrastructure, guest operating systems and workloads. In a Microsoft server environment this would mean a divide between:

Level  Description               Examples
1      Virtual infrastructure    Windows Server 2008 with Hyper-V
                                 Hyper-V Server 2008
2      Guest operating systems   Windows 2000 Server
                                 Windows Server 2003
                                 Windows Server 2008
3      Workloads                 Exchange Server 2007
                                 SQL Server 2008
                                 Terminal Services Applications (like Office 2007)

Patching goals

In a reference architecture, patching would pursue three goals:

  • Patched systems
  • Predictable downtime during maintenance windows
  • Possibilities for investigation of relationships between patches and loss of functionality / availability

 

Formulating best practices

Patched systems

Not all updates need to be applied immediately

Software products need to be patched to stay secure and functional. Depending on your situation, not every patch is equally important. When security is the main focus for a system, you need to apply all security updates immediately. When your systems perform loads of transactions with other countries, you’d better apply all Daylight Saving Time (DST) patches right away. Other updates can be delayed a little while.

Microsoft offers three levels of updates: Important, Recommended and Optional.
Decide for yourself which updates need to be applied and when they need to be applied.

Test or delay updates

You’d better test updates in a test environment when systems are mission critical. The dependency on these systems usually justifies the cost of a test environment. When you don’t have a test environment, wait at least until the third Tuesday of the month (a week after Patch Tuesday) and search online for any signs of updates breaking functionality or availability.
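The waiting rule above can be turned into a small date calculation. The following is a minimal sketch, assuming Patch Tuesday is always the second Tuesday of the month and that a seven-day delay is the policy you want. The function names and the `delay_days` parameter are made up for this example.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday(): Monday is 0, Tuesday is 1


def patch_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # Number of days from the 1st to the first Tuesday of the month (0..6).
    offset = (TUESDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 7)


def earliest_untested_install(year: int, month: int, delay_days: int = 7) -> date:
    """Earliest install date when no test environment is available (assumed policy)."""
    return patch_tuesday(year, month) + timedelta(days=delay_days)


print(patch_tuesday(2008, 10))              # 2008-10-14
print(earliest_untested_install(2008, 10))  # 2008-10-21
```

Out-of-band updates are not covered by this calculation and need a decision of their own.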

Virtualization offers flexible means to test updates. Snapshot functionality even allows a rollback scenario for updates. Remember, though, that problems may occur on physical machines that you might not experience in virtual machines.

Predictable downtime

Automatically applying updates

Windows offers functionality to apply updates automatically. By default, updates will be applied at night around 3:00 AM. This may not be an ideal method to apply updates:

  • A branch office on the other side of the world might be using the system at that time
  • The updates might be applied during backup, defragmentation or other maintenance

Furthermore, this setting doesn’t offer much control. In a small environment without a dedicated systems manager the setting makes sense, but in a large environment it does not.

Windows Server Update Services

Windows Server Update Services (WSUS) is a means to gain control over updates and over when your servers (or parts of them, such as services) restart to apply them. Using Organizational Units and Group Policy Objects (GPOs) you can divide servers into logical groups. For each group you can control which Microsoft products receive updates, when updates are applied and whether servers restart automatically.

Optionally, you can distribute third-party applications and updates through WSUS by using the Local Publishing feature in the WSUS 3.0 SP1 API.

Even more control can be obtained using System Center Configuration Manager. The WSUS integration in Configuration Manager 2007 allows you to scan all clients in the organization and apply updates to them.
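To make the per-group control concrete, here is a rough sketch of the Windows Update policy values that a GPO for one group of servers would end up writing to the registry. The registry value names are the ones the Windows Update client reads when it is configured through policy. The server URL, the group name and the function name are assumptions made up for this example, and the sketch only builds a dictionary rather than touching a registry.

```python
# Policy keys read by the Windows Update client.
WU_KEY = r"HKLM\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate"
AU_KEY = WU_KEY + r"\AU"

# ScheduledInstallDay: 0 means every day, 1 is Sunday, 7 is Saturday.
INSTALL_DAYS = {
    "every day": 0, "sunday": 1, "monday": 2, "tuesday": 3,
    "wednesday": 4, "thursday": 5, "friday": 6, "saturday": 7,
}


def wsus_client_policy(server_url: str, target_group: str, day: str, hour: int) -> dict:
    """Build the policy values for one logical group of servers."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return {
        WU_KEY: {
            "WUServer": server_url,        # where to fetch updates from
            "WUStatusServer": server_url,  # where to report status to
            "TargetGroup": target_group,   # client-side targeting group name
            "TargetGroupEnabled": 1,
        },
        AU_KEY: {
            "UseWUServer": 1,
            "AUOptions": 4,  # download automatically and install on schedule
            "ScheduledInstallDay": INSTALL_DAYS[day.lower()],
            "ScheduledInstallTime": hour,
        },
    }


# Hypothetical server and group names, with a Friday 18:00 install time.
policy = wsus_client_policy("http://wsus.example.local", "Hyper-V parents", "Friday", 18)
for key, values in policy.items():
    print(key, values)
```

The schedule in these settings is weekly at most. A monthly rhythm per group would have to come from when you approve updates for that group in WSUS, not from these values.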

Maintenance windows

End users don’t like to be confronted with downtime, but when they are, they prefer it to be announced in advance and to occur with a fair amount of regularity. An IT department that arranges a default maintenance window on Fridays from 18:00 to 21:00 will receive fewer complaints and fewer questions, and will cause less frustration among end users, than an IT department that organizes maintenance windows irregularly. Good candidates for maintenance windows are:

  • The company’s weekly happy hour
  • A department’s weekly birthday cake eating hour
  • Lunch time

Rogue Patch investigation

A critical element in updating your Microsoft environment is investigating which update was responsible for which broken functionality (if any). This is even more important in virtualized environments than in physical ones, since a rogue patch on the Windows Server in the virtualization layer may cause serious problems for all virtual guests residing on the box.
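As an illustration of such an investigation, the sketch below narrows a list of installed updates down to the ones that went in between the last moment everything was known to work and the first observed failure. The record layout, the function name and the update names are placeholders invented for this example. Real data would come from your own update history.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class InstalledUpdate:
    name: str            # placeholder identifier, not a real update
    layer: str           # layer the update was applied to
    installed: datetime  # when the installation finished


def suspects(history, last_known_good, first_failure):
    """Updates installed after the last good check and before the failure, newest first."""
    hits = [u for u in history if last_known_good < u.installed <= first_failure]
    return sorted(hits, key=lambda u: u.installed, reverse=True)


history = [
    InstalledUpdate("update-A", "virtualization layer", datetime(2008, 10, 3, 18, 30)),
    InstalledUpdate("update-B", "guest operating systems", datetime(2008, 10, 10, 18, 30)),
    InstalledUpdate("update-C", "guest operating systems", datetime(2008, 10, 10, 19, 0)),
]

for update in suspects(history,
                       last_known_good=datetime(2008, 10, 6, 9, 0),
                       first_failure=datetime(2008, 10, 13, 8, 0)):
    print(update.name, update.layer)
```

The fewer updates that share a maintenance window, the shorter this list of suspects becomes.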

Phased updating

In combination with the suggestion of having a maintenance window every week, I suggest updating per logical layer (virtualization layer, virtual guests, workloads). For instance, this would result in:

  • A maintenance window for the virtualization layer every first Friday of the month (all virtual guests will go down temporarily when the Hyper-V server reboots)
  • A maintenance window for the Operating Systems running in the virtual guests every second Friday of the month
  • A maintenance window for the workloads running in the virtual guests (for instance Microsoft Exchange Server and Microsoft SQL Server) every third Friday of the month

One whole maintenance window remains to do maintenance on the Storage Area Network (SAN), the network, etc.

Depending on your environment, you’d place your most critical layer on the second Friday of the month, after you’ve tested the updates, since Microsoft releases updates every second Tuesday of the month (except for out-of-band updates). When you delay your updates for lack of a test environment, place your most critical layer on the third Friday of the month.
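The phased schedule can be worked out per month with a few lines of code. This is a minimal sketch that pairs the Fridays of a month with one layer each. The layer names, their order and the function names are assumptions for this example, and the release date is passed in by hand.

```python
import calendar
from datetime import date

# One layer per Friday, in the order suggested above (assumed naming).
LAYERS = [
    "virtualization layer",
    "guest operating systems",
    "workloads",
    "SAN, network and other maintenance",
]


def fridays(year: int, month: int) -> list:
    """All Fridays that fall inside the given month, in order."""
    days = calendar.Calendar().itermonthdates(year, month)
    return [d for d in days if d.month == month and d.weekday() == calendar.FRIDAY]


def phased_schedule(year: int, month: int, layers=LAYERS) -> list:
    """Pair the first Fridays of the month with one layer each."""
    return list(zip(layers, fridays(year, month)))


def covers_release(window: date, release: date) -> bool:
    """True when the release date lies before the maintenance window."""
    return release < window


release = date(2008, 10, 14)  # second Tuesday of October 2008
for layer, window in phased_schedule(2008, 10):
    print(window, layer, "- this month's release available:", covers_release(window, release))
```

One thing this makes visible: in a month that starts on a Wednesday, Thursday or Friday, the second Friday falls before the second Tuesday. In such a month the second Friday can only carry the previous month's updates.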

Using snapshots

Creating a snapshot in Hyper-V before applying updates allows you to roll back the updates in case of broken functionality. When everything’s fine, you can ‘flatten’ the snapshot by deleting it, shutting down the virtual machine and allowing sufficient time for the disk changes to be merged into the main VHD.

Note:
Using snapshots may not be a good idea in combination with certain workloads (read: Active Directory Domain Controllers) or availability needs (with large updates the virtual machine may need to be off for a long period of time).
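The snapshot approach boils down to a simple decision flow, sketched below. Every step is passed in as a function, so the sketch makes no claim about how snapshots are actually created or removed on your servers. All five parameter names are hypothetical stand-ins: `roll_back` stands for whatever returns the virtual machine to the snapshot, and `flatten` for whatever merges the snapshot away once the updates are accepted.

```python
from typing import Callable, TypeVar

S = TypeVar("S")  # whatever your tooling uses to identify a snapshot


def update_with_snapshot(
    take_snapshot: Callable[[], S],
    install_updates: Callable[[], None],
    is_healthy: Callable[[], bool],
    roll_back: Callable[[S], None],
    flatten: Callable[[S], None],
) -> bool:
    """Return True when the updates were kept, False when they were rolled back."""
    snapshot = take_snapshot()
    try:
        install_updates()
        healthy = is_healthy()
    except Exception:
        # A failed installation or health check counts as broken functionality.
        healthy = False
    if healthy:
        flatten(snapshot)
        return True
    roll_back(snapshot)
    return False


# Dry run with stand-in callables that only record what would happen.
log = []
kept = update_with_snapshot(
    take_snapshot=lambda: "pre-update",
    install_updates=lambda: log.append("install"),
    is_healthy=lambda: False,
    roll_back=lambda s: log.append(f"roll back to {s}"),
    flatten=lambda s: log.append(f"flatten {s}"),
)
print(kept, log)  # False ['install', 'roll back to pre-update']
```

For the workloads mentioned in the note, this flow should not be used at all.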

 

Concluding

Below are my five best practices for updating virtual environments. They help you control the updates to your virtual server environment, control the downtime and address issues with rogue updates:

  1. Distinguish a virtualization layer, a virtual guests layer and a workload layer. Plan an update strategy per layer.
  2. Don’t install updates automatically unless it makes sense (it rarely does).
  3. Use Windows Server Update Services whenever possible.
  4. Test or delay updates.
  5. Plan maintenance windows.

