Exchange migration “Couldn’t switch the mailbox into Sync Source mode”

Exchange

The issue

During one of my Exchange Online migration projects I encountered the following error on several mailboxes:

“Message : Error: Couldn’t switch the mailbox into Sync Source mode.
This could be because of one of the following reasons:
Another administrator is currently moving the mailbox.
The mailbox is locked.
The Microsoft Exchange Mailbox Replication service (MRS) doesn’t have the correct permissions.
Network errors are preventing MRS from cleanly closing its session with the Mailbox server.
If this is the case, MRS may continue to encounter this error for up to 2 hours – this duration is controlled by the TCP KeepAlive settings on the Mailbox server.
Wait for the mailbox to be released before attempting to move this mailbox again.”

The environment where I encountered this was an Exchange Server 2010 environment in a Database Availability Group (DAG) configuration. There where three DAG servers and two Client Access servers.
We placed two Exchange Server 2016 servers in front for the Hybrid connection with Exchange Online.

The cause

This was a uncommon error for me, so I did some research first before proceeding with the suggested action to alter the TCP settings and request that the entire Exchange Server environment is restarted.

I found several blogs with different solutions, rather then changing the TCP value. One of the suggestions was to run the mailbox repair option. I did this without success. I also tried an internal move request of the mailbox and it failed on the same error.

After reading a blog by Brad Hughes on the topic, I found a interesting remark about the cause of the error.

“When moving a mailbox, the Mailbox Replication Service (MRS) sets an “InTransitStatus” flag in the source mailbox to make sure other moves don’t try to act on this source mailbox at the same time.  This flag is really just held in memory in the source Information Store (Store) process (Store.exe for 2010 and Microsoft.Exchange.Store.Worker.exe for 2013 and 2016). “

The solution for me

So my conclusion on this was:

If it’s held in memory, what options do I have to ‘reset’ this, without restarting the servers?

Because the source mailbox databases were placed in a DAG, I asked the customer to failover the mailbox database, which contained the error user. After the failover (which caused downtime for less then a minute), I recreated the move request again and this time the mailbox was synced to Office 365.

I find this solution less disrupting and quicker than a reboot of an entire Exchange Server.

The initial error was for a primary mailbox, but this error can also occur when a user has an in-place archive. In this case, for me, it was also enough to dismount and mount the mailbox database.
Yes, you read it correctly; dismount and mount the mailbox database. The archive databases weren’t part of the DAG configuration. This action also triggers a memory reset.

This is not an error that happens when you start a migration; I also got it five times during synchronization of mailboxes. The solution here was the same, failover or dismount/mount the mailbox database.

Conclusion

I would suggest to not always go blindly with the Microsoft suggestion for the fix. Investigate the root cause is of the problem.
Yes, the Microsoft suggestion would have fixed it, but for the suggested fix to work, you need to restart all Exchange servers. This will cause a failover or dismount/mount action of the mailbox database.

For me it was just enough to just failover or dismount/mount the mailbox database(s) to get the synchronization starting or continuing. Without disrupting the rest of the organization, and aiding me in meeting my deadlines as a consultant.

I hope this helps for you, too.

TOOL: My own Exchange Move Request Report script

As a consultant I do a lot of migrations or assist in them. In this case I wrote my own script for processing the information generated by the Mailbox Replication Service. This service is used in Microsoft Exchange on-premises and in the cloud for migrating the mailboxes between databases or environments.

After multiple times doing the administration and processing of information manually. I decided to create my own processing script to retrieve the desired information and make it re-usable in Excel or other tooling, to export gathered information to CSV file format. I named the script Get Move Request Report script.

The script is provide as-is and maybe be used at your own risk.

Versions

Version 1: This is the first version and it’s basic. No advanced switch options or logging.

About the Exchange Move Request Report Script

The script consist of three functions:

1. Creates an overview of the Bad Items it found in the move request report.
2. Creates an overview of the basic move information of the move request statistics.
3. Creates an overview of the extended information of the move request statistics.
4. Creates an overview of information that quickly can be used for reports to stakeholders.

Parameters

When you execute the script and no parameters are given. It generates a bad items overview, basic move information and extended move information output of all existing move request in scope.

.\Get-MoveRequestReport.ps1

To generate only the outputs for all InProgress move requests, give the following command:

.\Get-MoveRequestReport.ps1 -Inprogress $true -AllMoves $false

To generate only the outputs for all Synced move requests, give the following command:

.\Get-MoveRequestReport.ps1 -Synced $true -AllMoves $false

To generate only the outputs for all Completed move requests, give the following command:

.\Get-MoveRequestReport.ps1 -Completed $true -AllMoves $false

Requirements

To use the script, you already need to have an Exchange PowerShell session open to the target environment (on-premises or cloud), where the move request are created.

Exchange Move Request Report Script

The script is displayed below for your review:

[CmdletBinding()]
param (
    [bool]$CompletedMoves = $false, 
    [bool]$Synced = $false,    
    [bool]$Inprogress = $false,
    [bool]$AllMoves = $true,
    [bool]$IncludeAllBadItems = $true,
    [bool]$MoveDataBasic = $true,
    [bool]$MoveDataFull = $true,
    [bool]$GenereExportInfo = $true
)

#region Functions
function generateBadItemInformation ($BadItem,$User)
{
    $ExportEntry = New-Object PSObject
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Identity -Value $User.Alias
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BatchName -Value $User.Batchname
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name DisplayName -Value $User.DisplayName
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Date -Value $BadItem.Date
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Subject -Value $BadItem.Subject
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Kind -Value $BadItem.Kind
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name FolderName -Value $BadItem.FolderName
    
    
    $ExportEntry   
}

function generateBasicMoveInformation ($User)
{
    $ExportEntry = New-Object PSObject
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Identity -Value $User.Alias
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BatchName -Value $User.Batchname
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name DisplayName -Value $User.DisplayName
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Status -Value $User.Status
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name SyncStage -Value $User.SyncStage
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RemoteDatabase -Value $User.RemoteDatabase
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RemoteHostName -Value $User.RemoteHostName
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Message -Value $User.Message
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RecipientTypeDetails -Value $User.RecipientTypeDetails
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalMailboxSize -Value $User.TotalMailboxSize
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalMailboxItemCount -Value $User.TotalMailboxItemCount
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalArchiveSize -Value $User.TotalArchiveSize
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalArchiveItemCount -Value $User.TotalArchiveItemCount
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalPrimarySize -Value $User.TotalPrimarySize
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalPrimaryItemCount -Value $User.TotalPrimaryItemCount
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BytesTransferred -Value $User.BytesTransferred

    $ExportEntry   
}

function generateFullMoveInformation ($User)
{
    $ExportEntry = New-Object PSObject
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Identity -Value $User.Alias
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BatchName -Value $User.Batchname
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name DisplayName -Value $User.DisplayName
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Status -Value $User.Status
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name SyncStage -Value $User.SyncStage
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RemoteDatabase -Value $User.RemoteDatabase
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RemoteHostName -Value $User.RemoteHostName
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Message -Value $User.Message
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RecipientTypeDetails -Value $User.RecipientTypeDetails
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalMailboxSize -Value $User.TotalMailboxSize
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalMailboxItemCount -Value $User.TotalMailboxItemCount
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalArchiveSize -Value $User.TotalArchiveSize
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalArchiveItemCount -Value $User.TotalArchiveItemCount
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalPrimarySize -Value $User.TotalPrimarySize
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalPrimaryItemCount -Value $User.TotalPrimaryItemCount
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BytesTransferred -Value $User.BytesTransferred
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name OverallDuration -Value $User.OverallDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalSuspendedDuration -Value $User.TotalSuspendedDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalFailedDuration -Value $User.TotalFailedDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalQueuedDuration -Value $User.TotalQueuedDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalInProgressDuration -Value $User.TotalInProgressDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name StatusDetail -Value $User.StatusDetail
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name SourceVersion -Value $User.SourceVersion
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name FailureCode -Value $User.FailureCode
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name FailureType -Value $User.FailureType
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name FailureSide -Value $User.FailureSide
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name QueuedTimestamp -Value $User.QueuedTimestamp
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name StartTimestamp -Value $User.StartTimestamp
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name LastUpdateTimestamp -Value $User.LastUpdateTimestamp
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name LastSuccessfulSyncTimestamp -Value $User.LastSuccessfulSyncTimestamp
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name InitialSeedingCompletedTimestamp -Value $User.InitialSeedingCompletedTimestamp
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToContentIndexingDuration -Value $User.TotalStalledDueToContentIndexingDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToMdbReplicationDuration -Value $User.TotalStalledDueToMdbReplicationDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToMailboxLockedDuration -Value $User.TotalStalledDueToMailboxLockedDuration
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToReadThrottle -Value $User.TotalStalledDueToReadThrottle
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToWriteThrottle -Value $User.TotalStalledDueToWriteThrottle
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToReadCpu -Value $User.TotalStalledDueToReadCpu
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToWriteCpu -Value $User.TotalStalledDueToWriteCpu
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToReadUnknown -Value $User.TotalStalledDueToReadUnknown
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalStalledDueToWriteUnknown -Value $User.TotalStalledDueToWriteUnknown
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalTransientFailureDuration -Value $User.TotalTransientFailureDuration

    
    $ExportEntry   
}

function GenerateExcelInformation ($BadItems,$User)
{
    $ExportEntry = New-Object PSObject
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Identity -Value $User.Alias
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BatchName -Value $User.Batchname
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name DisplayName -Value $User.DisplayName
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Status -Value $User.Status
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name SyncStage -Value $User.SyncStage
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RemoteDatabase -Value $User.RemoteDatabase
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RemoteHostName -Value $User.RemoteHostName
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name Message -Value $User.Message
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name RecipientTypeDetails -Value $User.RecipientTypeDetails
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalMailboxSize -Value $User.TotalMailboxSize
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name TotalMailboxItemCount -Value $User.TotalMailboxItemCount
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BadItems -Value $BadItems.Count
    $ExportEntry | Add-Member -MemberType "NoteProperty" -Name BytesTransferred -Value $User.BytesTransferred
    
    $ExportEntry   
}


#endregion

#region Global Vales
$LogPath = "C:\Scripts\MRS\Reports\"
$FileTimeStamp = (Get-Date -Format yyyyMMdd-HHmmss).ToString()
$ExportBadItems = @()
$ExportBasicInformation = @()
$ExportFullInformation = @()
$ExportExcelInfo = @()
$Moves = @()
#endregion


#region Collect Movedata

If($AllMoves){
    $Moves = Get-MoveRequest | Get-MoveRequestStatistics -IncludeReport
}

If($CompletedMoves){
    $Moves = Get-MoveRequest -MoveStatus Completed | Get-MoveRequestStatistics
}

If($Synced){
    $Moves += Get-MoveRequest -MoveStatus Synced | Get-MoveRequestStatistics
}

If($Inprogress){
    $Moves += Get-MoveRequest -MoveStatus Inprogress | Get-MoveRequestStatistics
}

#endregion

#region Process moves Data

If($IncludeAllBadItems){

    $Moves2 = get-migrationuser | where{$_.SkippedItemCount -gt 0} | Get-MoveRequestStatistics -IncludeReport
    Foreach ($Move in $Moves2){
        $Baditems = ""
        $Baditems = ($Move.Report).Baditems
          
            If($Baditems -eq ""){
                Write-Host "No Errors found for " -BackgroundColor Green
            }
                Else{
                    Foreach($BadItem in $Baditems){
                        $ExportBadItems += generateBadItemInformation -BadItem $Baditem -User $Move
                    }
                }
    }

}

If($GenereExportInfo){

    Foreach ($Move in $Moves){
        $Baditems = ""
        $Baditems = ($Move.Report).Baditems
          
            If($Baditems -eq ""){
                Write-Host "No Errors found for " -BackgroundColor Green
            }
            Else{
                $ExportExcelInfo += GenerateExcelInformation -BadItems $Baditems -User $Move
            }
    }

}



If($MoveDataBasic){

     Foreach ($Move in $Moves){
        $ExportBasicInformation += generateBasicMoveInformation -User $Move
     }           
}


If($MoveDataFull){

    Foreach ($Move in $Moves){
        $ExportFullInformation += generateFullMoveInformation -User $Move           
    }
}

#endregion

#region Export data
If($IncludeAllBadItems){
    $ExportBadItems | ogv
    $FileName = $LogPath+"BadItems"+$FileTimeStamp+".csv"
    $ExportBadItems | Export-Csv -Path $FileName -Delimiter ";" -NoTypeInformation
}

If($MoveDataBasic){
    $ExportBasicInformation | ogv
    $FileName = $LogPath+"BasicMoveData"+$FileTimeStamp+".csv"
    $ExportBasicInformation |Export-Csv -Path $FileName -Delimiter ";" -NoTypeInformation
}

If($MoveDataFull){
    $ExportFullInformation | ogv
    $FileName = $LogPath+"FullMoveData"+$FileTimeStamp+".csv"
    $ExportFullInformation | Export-Csv -Path $FileName -Delimiter ";" -NoTypeInformation
}


if($ExportExcelInfo){
    $ExportExcelInfo | ogv
    $FileName = $LogPath+"ExcelData"+$FileTimeStamp+".csv"
    $ExportExcelInfo | Export-Csv -Path $FileName -Delimiter ";" -NoTypeInformation
}

#endregion

I hope this is also useful for your migrations and have fun with the script. As time goes and report demands change. I will update the script on this blog to keep you up-to-date.

The mysterious case of a failed account recovery and orphaned mailbox

In this blogpost I want to address two real-life cases that I encountered in the same Microsoft Office 365 tenant. The reason why I address the two issues in this one blog is because the errors and steps to resolution, were identical for both issues.

Background Information

The issues occurred in a cloud-only tenant. The tenant has multiple custom domains configured, and in use. The tenant consist of multiple user accounts and shared mailboxes.  There where no external scripts or data sources that are feeding into the Azure AD tenant with account information or automated management tasks.

I was called in to resolve the issues. Names and domains are anonymized for the purpose of this blogpost.

The Issues

Issue 1

The first issue occurred after a user account deletion and recovery.  There were two accounts, that were converted to shared mailboxes.

Mailbox 1: edatorial@custdomainA.com (Primary email) – UPN: edatorial@custdomainA.com – Created in 2017
Mailbox 2: edatorial@custdomainB.com (Primary email) – UPN: edatorial@custdomain.onmicrosoft.com – Created in 2018

The issue here was that mailbox 1 was accidentally deleted. We used the recovery page in the Office 365 Admin Portal to restore this account.
When we did this, we couldn’t change the primary address of both shared mailboxes. We hit an error stating that the proxy address already existed on the other account. On both accounts it was listed as a proxy address in Exchange Online and in Azure AD.
It should be impossible within Exchange Online to have the same proxy address on multiple accounts.

Issue 2

The second issue was that the customer requested a shared mailbox was to be deleted , but the customer asked for a empty shared mailbox with the same name some days later.  This mailbox was created, full access rights were delegated and people started working with the mailbox.

Mailbox 1: tooling@custdomainA.com (Primary email) – UPN: tooling@custdomainA.com

When I was handed the case, the customer reported that they couldn’t access the mailbox anymore. When I looked in Exchange Online, I saw the mailbox still listed on the Shared Mailboxes page.
In the Office 365 Admin Portal, I didn’t see the user account. Instead, it was listed on the Deleted Accounts page. We performed an account restore. This was successful, but not the solution to get it working again.

 

Resolving Issue 1

The information we started with for resolving the issues was that both accounts/mailboxes are visible in the Office 365 Admin Portal and in Exchange Online on the Shared Mailboxes page.

Observations and Symptoms

When both accounts were visible and active again, we tried to manage both accounts from the Exchange Online portal. Mailbox 1 gave an error in the management website; the account wasn’t located on the Domain Controller. Mailbox 2 gave an error, when we tried to alter the proxy addresses; the proxy address already exists on Mailbox 1.

I opened an Exchange Management Shell connection to the tenant, and tried to change the information there. I received the same errors as in the web interface; User not found and proxy address already exists.

“Could this account be incorrectly mapped?”

I checked if the accounts in Azure AD were correctly mapped to the Exchange Online accounts, by changing their display name. Within five minutes the information was updated in Exchange Online. So we know that the mailboxes are correctly mapped to the Azure AD accounts.

Then I remembered the behavior of Exchange Online, that it always wants to add the userPrincipalName (UPN) as an alias on the mailbox and cannot be removed, as long as the UPN is set. But as given in the description the UPNs already were different…

So I listed the mailbox information through the Exchange Online management shell.  Here I discovered that on both mailboxes the attributes WindowsLiveID, and MicrosoftOnlineServicesID, contained the same UPN, edatorial@custdomain.onmicrosoft.com.

Solution

Fixing Mailbox 2

Based on that discovery, I decided to update the UPN of both accounts. First I altered the UPN of mailbox 2, because this mailbox was already set to edatorial@custdomain.onmicrosoft.com . I updated the UPN of Mailbox 2 to edatorial@custdomainB.com and waited on the internal sync of Azure AD and Exchange Online. After five minutes, I checked the attributes WindowsLiveID and MicrosoftOnlineServicesID on Mailbox 2; these where updated to the new UPN information. Then I removed the edatorial@custdomain.onmicrosoft.com as an alias on Mailbox 2. This was successful and no errors were shown.

Mailbox 1 wasn’t fixed after this.

I decided to perform the same action on mailbox 1 as I did on mailbox 2. First I changed the UPN from edatorial@custdomainA.com to edatorial@custdomain.onmicrosoft.com in the Office 365 Admin Portal. Also I changed the display name back to how it was, to see, when the account was updated in Exchange Online. When this was changed, I changed the UPN back from edatorial@custdomain.onmicrosoft.com to edatorial@custdomainA.com in the Office 365 Admin Portal. After five minutes I checked the WindowsLiveID and MicrosoftOnlineServicesID attributes on Mailbox 1; these were updated to the new UPN information. Also it was now possible to manage the mailbox again.

And then…

Something curious happened 15 minutes later, though. Mailbox 1 was deleted again from Exchange Online and Azure AD. When I looked on the Deleted Users page in the Office 365 Admin Portal, the account was listed there again. We initiated a recovery once again and this worked as designed. Now the account was usable and working again. In the audit log of Azure AD, I couldn’t find the delete action, so determining the root cause of that spontaneous deletion was impossible.

Resolving Issue 2

The information for resolving started with that the account was restored in the Azure AD Portal. The mailbox was already visible in Exchange Online.

Observations and Symptoms

After the Azure AD account was restored, I checked if I could manage the mailbox again from the Exchange Online admin page. I only found an error; the object couldn’t be found on the Domain Controller.

As with Issue 1, I checked if the account was correctly mapped to the mailbox. I updated the display name and five minutes later I saw the change in Exchange Online. So I confirmed that the objects were mapped to each other. Based on the experience with Issue 1, I checked if the attributes: WindowsLiveID and MicrosoftOnlineServicesID were the same. This was not the case. The attributes were pointing to tooling@custdomain.onmicrosoft.com instead of tooling@custdomainA.com .

Solution

As solution to this problem I decided to change the userPrincipalName (UPN) from tooling@custdomainA.com to tooling@custdomain.onmicrosoft.com. This time, the change wasn’t picked up by Exchange Online. We already checked the integration, so I decided to delete the user one more time from the Office 365 Admin Portal. Also I waited on Exchange Online to see if the mailbox was deleted from their side. This was the case. So now both the Azure AD account and Mailbox where in a soft delete state.

Going from soft-delete to restored state

Now I restored the Azure AD account from the Office 365 Admin Portal and five minutes later the mailbox was also recovered. This time we could manage the mailbox again. So as the last step in the solution I changed the UPN one more time from tooling@custdomain.onmicrosoft.com to tooling@custdomainA.com  and this was now processed by Exchange Online. The attributes WindowsLiveID and MicrosoftOnlineServicesID were the same as the UPN in Azure AD.

Unknown Root Causes

At time of writing this blog, I still don’t know what caused both issues. 

All management tasks of the tenant are done through the Office 365 Admin Portal and Exchange Online.
The actions I took to resolve Issue 1 were on January 9th. When I was called in to resolve Issue 2, two days later, I saw that this account was deleted on January 9th.

Concluding

If I were to guess, the problem may lay in the  automated recovery procedure and automatic health tasks within Azure AD. I’m still trying to reproduce the issues, to point to a probable cause.

I hope that this blog was informative and useful in the future, when you might come across similar issues.

The mysterious case of Azure Backup Agent not running its schedule

This blogpost addresses a real-life issue that I encountered when migrating virtual servers. To give an impression of the situation I will give some background information.

Background information

The case starts with a migration of an existing virtual environment. The goal of the customer was to leave their current solutions provider and transfer server management to us.

Due to the time constraint for this migration, we choose to migrate the servers as-is and work from there.

We received the exported machines from the solutions provider and successfully activated it on a physical virtualization platform. Some of the virtual servers still ran Windows Server 2008 R2 and Windows Server 2012 R2.

This meant that the virtual servers were not built from scratch. We had no idea what the history is of the systems or if they have had errors in the past with updates, features or other software.

Enter backups

One of our first priorities after we successfully migrated and activated the servers and their services, was to setup the backup.

We started with a brief inventory of the installed applications and requirements. Based on the applications, we did not have the need to make stateful backups of SQL Server databases, Exchange Database Availability Groups (DAGs) or other specific applications or application data. We concluded we only needed file/folder and system state backups.

The backups needed to be stored off-site. Also, we needed the capabilities to restore the systems on the physical virtualization platform.

Azure Backup to the rescue!

Based on the above information we choose to use the Azure Backup agent, without the installation of the Azure Backup Server. This way, the backups are directly stored in Microsoft Azure Recovery Services (MARS).

What we did

We followed the Microsoft procedure. It can be found here. We created the Microsoft Azure Recovery Services vault and created a vault key to be used in the installation.

The installation of the agent went without a problem and the server had already been configured with the prerequisite software. We provided the registration information and that worked without any errors or problems, too.

After the successful registration, it was time to configure the backups; a separate schedule for the files and folders and a separate schedule for the System State backup.

So far so good. We had a backup solution and multiple backup schedules.

What happened next…

After a national holiday, we checked the servers for errors and if the backup schedule had run.

On one of the Windows Server 2008 R2 servers, there was no reference of a backup/recovery point. It looked like that the schedule wasn’t activated or hadn’t run; We found no errors in the event viewer or in the applications log.

What I did notice was that there were no references at all in the Event viewer log for the backup jobs. To validate the correct working of the application on the server, we choose to start a manual backup. This backup successfully completed without any errors.

We decided to wait one more night for the backup schedule to pick up its routine. The next day we checked the backup logs again, and no luck. The backup job still didn’t run at his scheduled time.

“Why won’t it just work?”

During my initial part of the investigation, I focused on the configuration of the job schedule itself. I examined the two configured jobs, and I thought to have found the issue; The action configuration is a PowerShell command that kicks off the backup job. Based on its job GUID.

An example is shown below:

BLOG-MARS-001

BLOG-MARS-002

The first thing I noticed, was that the parameter line, didn’t close with a . In normal PowerShell, if you start a string with , then you will need to close it with a .

This was not the case. I manually added the to the parameter line and started the backup through the Task Scheduler interface. But, same result… The job wasn’t started or shown in the GUI as failed.

Getting to the bottom of things

So, I changed the line to its original state and decided to create some test VMs. This way I could check the functionality on different operating systems. On every test VM, the action line looked the same, missing the end , but the actual schedules where starting and performing its configured task. So, the first conclusion was that the missing wasn’t the cause of the issue.

The second conclusion from this was that the Task scheduler input isn’t affected by the missing . If you run the command line yourself in PowerShell, you need to close with a to start the job.

My next step was to run the PowerShell command manually in my administrator session and a newly opened PowerShell Console. With the closing , of course. And to my surprise the actual job, started.

I cancelled the backup job and began focusing on the PowerShell Module that the command line preloads. Import-Module MSOnlineBackup;
BLOG-MARS-002

I looked up the actual location of the PowerShell Module on the server and it’s located at the following location: C:\Program Files\Microsoft Azure Recovery Services Agent\bin\Modules

BLOG-MARS-003

I choose to copy the MSOnlineBackup folder to the following location: C:\Windows\System32\WindowsPowerShell\v1.0\Modules

BLOG-MARS-004

The reason for this is, that PowerShell searches predefined folders for the Modules that are called in the Import-Module command. Windows Server 2012 R2, and higher, with the latest PowerShell version, automatically preloads the modules, from these default locations, when a command is in need to autocomplete.

When the folder was copied, I tried the predefined schedule again. The result was that the backup job was started and visible in the GUI. After this result, we waited two days and the scheduled backups started and completed successfully.

The Root Cause

The root cause of the problem, was that the SYSTEM account couldn’t load/import the MSOnlineBackup module from the Task Scheduler. After I copied it to one of the system default folders locations, it could. It didn’t report the failure in any log on the system.

Double-checking my assumptions

To check this assumption, I created my own scheduled task, running with the NT AUTHORITY\SYSTEM account, to export the result of its $Env:PSModulePath to an text file.

BLOG-MARS-005

The result in this text file was that only the C:\Windows\System32\WindowsPowerShell\v1.0\Modules was listed as source directory, while for the administrator account, multiple folders where specified, including the C:\Program Files\Microsoft Azure Recovery Services Agent\bin\Modules folder.

Concluding

In this case, the root cause of the problem, was the absence of the C:\Program Files\Microsoft Azure Recovery Services Agent\bin\Modules directory in the $Env:PSModulePath in the SYSTEM account context. I ran the same schedule on my test virtual machine, and the result was, that multiple locations were listed, including the one for the backup.

I hope this was useful and educating for future problem analysis.