How not to create redundancy in your Exchange

When I was at a client the other day I encountered the following:

Tekening1

As you can see the Exchange environment in itself already contains a single point of failure. Namely the Exchange-01 server who solemnly functions as a client access and transport hub. The two database servers however are both made high available through the use of the failover-cluster feature introduced in Windows server 2008. This in itself is a good idea. Beside the fact that this way you can create redundancy within your database hosts this also allows you to  use multiple redundant databases on both servers in a database availability group. You can even reboot one in the middle of production. For instance to  update some compromised certificates. The production reboot should notify clients to restart their outlook, but hey, your exchange is safe and up to date again.

It is a bad idea to install this failover cluster on a failover VMware cluster. The problem arises when an actual failover needs to take place. In a perfect world (where you wouldn’t even need failover since your servers would never break) failover would happen automatically if one server for whatever reason stops functioning. In the case of my client something very interesting happens.

First the database is going to be transferred to the other exchange server. All is well. At the same time, VMware steps in and fails over the Exchange server to another host or whatever it is that VMware does to keep guests alive and restores the system to it’s previous state. So the server that went down is restored with database connection while the windows failover-cluster transferred database access to the other exchange database server. With both servers wanting to access the database neither will be able to and that’s when your exchange database failover-cluster fails. This usually results in a lot of people calling the helpdesk to ask why they can’t access their mail.

This is not due to the fact that either VMware or Failover-clustering is a poor feature, this is because someone implemented a solution without proper testing.

So if you want to make Exchange redundant, only use one method and not two or more stacked methods or it will come around to byte you like an attack dog.

Upgrading SQL

When upgrading your environment to squeaky clean Windows Server 2008 R2 installations sooner or later you will have to tackle the SQL server. Migrating a Microsoft SQL server from 2005 to, say, 2008 R2 should be a walk in the park. It should be, if no one is actually using the server. Unfortunately this won’t be the case. Why buy a license to a product you’re not going to use, right?

So, first and foremost determine a maintenance window. Make sure people are not going to get mad and if they do, well, it’s always nice to be able to point out the fact that you notified them in advance.

First of all, and you can do this in advance, make sure you can run the databases on a different server while upgrading one. Ideally this would be a virtual server, but any domain member server would do. If you install the latest version of SQL Server you can also test if your databases are going to work or if they are going to screw things up. Usually databases that make use of Active Directory service account tend to be pretty easy to migrate.

The KB article describes how to do the standard SharePoint and IIS SQL databases migration:

http://support.microsoft.com/kb/314546

Assuming this all works fine and the databases of, for example, E-Policy Orchestrator are recovered from backup, as well after the migration and everything is running on your temporary server there is something you would like to keep in mind when migration to a virtual temporary server. If you are going to do the migration in the course of a couple of days, you want to add the SQL databases to you backup schedule, in case anything goes wrong. Consider the fact that the truncate of the logs can only complete after a Full backup. If you do any sort of other backup in between this won’t work, your logs will grow and the disk where you keep your logs stored might run out of available space. This basically means you database stops functioning.

This especially goes for virtual machines which usually get backed up with the host during a backup cycle. Make sure the Full Backup of the databases starts after the backup from the host server and guests and truncating the logs should be good.

As you are probably aware it’s best practice to install the databases on a separate disk on the server. Also if you have space to spare, and it is a good idea to use a disc large enough for your collection of precious databases, consider formatting the disc with 64Kb cluster size. This means you will get better performance at the cost of wasting some disc space.

http://msdn.microsoft.com/en-us/library/dd425070.aspx

So you’ve installed and configured the Server, correctly formatted the discs and installed Microsoft SQL Server. First of all, on your clean installation, don’t forget to configure the Windows Firewall. You should have TCP/IP port 1433 (default) open for incoming and outgoing Database traffic.

You now can migrate the databases back. Probably most will work. Then there is the one weird application which brakes. There always is one. Usually there are 3 possible explanations for this:

  • Either the service account is not correctly configured and this usually is a security issue. The applications documentation should help with this.
  • Then there are the applications that work with SQL based accounts and logins. Make sure the database is in mixed mode. With a new SQL server installation those that don’t work probably have lost their principal name in the master database and therefore, don’t work. These applications might have log files which can tell you more about the error you’re encountering. If the principal names are lost you might want to consider restoring them from a backup. What also works is deleting the accounts from the databases you’ve migrated and create new SQL account logins with the same credentials. Make sure these accounts get the appropriate rights in their databases. Most of the time owner will do. Sometimes these applications make use of different than normal or dbo schema’s so also make sure to apply the correct schema to the logins.
  • Third there are the applications you either will have to install from scratch or call in support…there is only so much you can do.

Concluding

Migrating SQL can be hazardous because a lot of services might depend on it. If you draw up a plan, make sure the backup is standby, then little can go wrong. Business critical applications should be extra taken care of.