The mysterious case of a failed account recovery and orphaned mailbox

In this blog post I want to address two real-life cases that I encountered in the same Microsoft Office 365 tenant. I cover both issues in one post because the errors and the steps to resolution were identical for both.

Background Information

The issues occurred in a cloud-only tenant. The tenant has multiple custom domains configured and in use, and consists of multiple user accounts and shared mailboxes. There were no external scripts or data sources feeding account information into the Azure AD tenant, and no automated management tasks.

I was called in to resolve the issues. Names and domains are anonymized for the purpose of this blogpost.

The Issues

Issue 1

The first issue occurred after a user account deletion and recovery. There were two accounts that had been converted to shared mailboxes.

Mailbox 1: (Primary email) – UPN: – Created in 2017
Mailbox 2: (Primary email) – UPN: – Created in 2018

The issue here was that mailbox 1 was accidentally deleted. We used the recovery page in the Office 365 Admin Portal to restore this account.
When we did this, we couldn’t change the primary address of either shared mailbox. We hit an error stating that the proxy address already existed on the other account. The address was listed as a proxy address on both accounts, in Exchange Online as well as in Azure AD.
It should be impossible within Exchange Online to have the same proxy address on multiple accounts.

Issue 2

The second issue started when the customer requested that a shared mailbox be deleted, but then asked for an empty shared mailbox with the same name a few days later. This mailbox was created, full access rights were delegated, and people started working with the mailbox.

Mailbox 1: (Primary email) – UPN:

When I was handed the case, the customer reported that they couldn’t access the mailbox anymore. When I looked in Exchange Online, I saw the mailbox still listed on the Shared Mailboxes page.
In the Office 365 Admin Portal, I didn’t see the user account. Instead, it was listed on the Deleted Accounts page. We performed an account restore. This was successful, but it didn’t get the mailbox working again.


Resolving Issue 1

The information we started with for resolving the issue was that both accounts/mailboxes were visible in the Office 365 Admin Portal and on the Shared Mailboxes page in Exchange Online.

Observations and Symptoms

When both accounts were visible and active again, we tried to manage both accounts from the Exchange Online portal. Mailbox 1 gave an error in the management website: the account couldn’t be located on the Domain Controller. Mailbox 2 gave an error when we tried to alter the proxy addresses: the proxy address already existed on Mailbox 1.

I opened an Exchange Management Shell connection to the tenant and tried to change the information there. I received the same errors as in the web interface: user not found, and proxy address already exists.

“Could this account be incorrectly mapped?”

I checked whether the accounts in Azure AD were correctly mapped to the Exchange Online accounts by changing their display names. Within five minutes the information was updated in Exchange Online, so we knew that the mailboxes were correctly mapped to the Azure AD accounts.
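This mapping check can also be done from PowerShell instead of the portal. A minimal sketch, assuming the MSOnline and Exchange Online modules are connected; the UPN and display name are placeholders, not the anonymized values from this case:

```powershell
# Change the display name in Azure AD (placeholder identity).
Set-MsolUser -UserPrincipalName "shared1@contoso.com" -DisplayName "Mapping Test 1"

# A few minutes later, verify that Exchange Online picked up the change.
Get-Mailbox -Identity "shared1@contoso.com" | Format-List DisplayName
```

If the new display name shows up on the mailbox, the Azure AD object and the Exchange Online mailbox are linked.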

Then I remembered a behavior of Exchange Online: it always wants to add the userPrincipalName (UPN) as an alias on the mailbox, and that alias cannot be removed as long as the UPN is set. But as given in the description, the UPNs were already different…

So I listed the mailbox information through the Exchange Online management shell. Here I discovered that on both mailboxes the attributes WindowsLiveID and MicrosoftOnlineServicesID contained the same UPN.
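The relevant attributes can be listed like this from the Exchange Online management shell; the mailbox identities below are placeholders for the anonymized names:

```powershell
# Inspect the identity-related attributes of both shared mailboxes
# (placeholder identities).
Get-Mailbox -Identity "Mailbox1" |
    Format-List UserPrincipalName, WindowsLiveID, MicrosoftOnlineServicesID, EmailAddresses
Get-Mailbox -Identity "Mailbox2" |
    Format-List UserPrincipalName, WindowsLiveID, MicrosoftOnlineServicesID, EmailAddresses
```

In this case, both mailboxes reported the same value for WindowsLiveID and MicrosoftOnlineServicesID, even though their UPNs in Azure AD were different.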


Fixing Mailbox 2

Based on that discovery, I decided to update the UPN of both accounts. First I altered the UPN of mailbox 2, because this mailbox was already set to . I updated the UPN of Mailbox 2 to and waited for the internal sync between Azure AD and Exchange Online. After five minutes, I checked the attributes WindowsLiveID and MicrosoftOnlineServicesID on Mailbox 2; these were updated to the new UPN information. Then I removed the as an alias on Mailbox 2. This was successful and no errors were shown.
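The same steps can be scripted. A sketch using the MSOnline module and Exchange Online; the addresses are placeholders for the anonymized values:

```powershell
# Change the UPN in Azure AD (placeholder addresses).
Set-MsolUserPrincipalName -UserPrincipalName "old@contoso.com" `
    -NewUserPrincipalName "new@contoso.com"

# After Azure AD and Exchange Online have synced (a few minutes),
# remove the old address as an alias from the mailbox.
Set-Mailbox -Identity "Mailbox2" -EmailAddresses @{remove = "smtp:old@contoso.com"}
```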

Mailbox 1 wasn’t fixed after this.

I decided to perform the same action on mailbox 1 as I did on mailbox 2. First I changed the UPN from to in the Office 365 Admin Portal. I also changed the display name back to how it was, so I could see when the account was updated in Exchange Online. When this change came through, I changed the UPN back from to in the Office 365 Admin Portal. After five minutes I checked the WindowsLiveID and MicrosoftOnlineServicesID attributes on Mailbox 1; these were updated to the new UPN information. It was now also possible to manage the mailbox again.

And then…

Something curious happened 15 minutes later, though. Mailbox 1 was deleted again from Exchange Online and Azure AD. When I looked on the Deleted Users page in the Office 365 Admin Portal, the account was listed there again. We initiated a recovery once more and this worked as designed. Now the account was usable and working again. In the Azure AD audit log, I couldn’t find the delete action, so determining the root cause of that spontaneous deletion was impossible.

Resolving Issue 2

The starting point for resolving this issue was that the account had been restored in the Azure AD Portal. The mailbox was already visible in Exchange Online.

Observations and Symptoms

After the Azure AD account was restored, I checked whether I could manage the mailbox again from the Exchange Online admin page. I only got an error: the object couldn’t be found on the Domain Controller.

As with Issue 1, I checked if the account was correctly mapped to the mailbox. I updated the display name, and five minutes later I saw the change in Exchange Online, so I confirmed that the objects were mapped to each other. Based on the experience with Issue 1, I checked whether the attributes WindowsLiveID and MicrosoftOnlineServicesID were the same. This was not the case. The attributes were pointing to instead of .


As a solution to this problem, I decided to change the userPrincipalName (UPN) from to . This time, the change wasn’t picked up by Exchange Online. We had already checked the integration, so I decided to delete the user one more time from the Office 365 Admin Portal. I also waited to see whether the mailbox was deleted on the Exchange Online side. This was the case. So now both the Azure AD account and the mailbox were in a soft-delete state.

Going from soft-delete to restored state

Now I restored the Azure AD account from the Office 365 Admin Portal, and five minutes later the mailbox was also recovered. This time we could manage the mailbox again. As the last step in the solution, I changed the UPN one more time from to , and this change was now processed by Exchange Online. The attributes WindowsLiveID and MicrosoftOnlineServicesID were the same as the UPN in Azure AD.
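The delete/restore cycle can also be driven from PowerShell with the MSOnline module; the UPN below is a placeholder:

```powershell
# Soft-delete the account (it moves to Deleted Users, where it is kept for 30 days).
Remove-MsolUser -UserPrincipalName "shared@contoso.com"

# Confirm the account is in the soft-deleted state.
Get-MsolUser -ReturnDeletedUsers -UserPrincipalName "shared@contoso.com"

# Restore the account; Exchange Online recovers the mailbox shortly after.
Restore-MsolUser -UserPrincipalName "shared@contoso.com"
```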

Unknown Root Causes

At time of writing this blog, I still don’t know what caused both issues. 

All management tasks of the tenant are done through the Office 365 Admin Portal and Exchange Online.
The actions I took to resolve Issue 1 were taken on January 9th. When I was called in to resolve Issue 2 two days later, I saw that this account had been deleted on January 9th.


If I were to guess, the problem may lie in the automated recovery procedure and the automatic health tasks within Azure AD. I’m still trying to reproduce the issues so I can point to a probable cause.

I hope this blog was informative and will be useful when you come across similar issues.

The mysterious case of Azure Backup Agent not running its schedule

This blog post addresses a real-life issue that I encountered when migrating virtual servers. To give an impression of the situation, I will start with some background information.

Background information

The case starts with a migration of an existing virtual environment. The goal of the customer was to leave their current solutions provider and transfer server management to us.

Due to the time constraints for this migration, we chose to migrate the servers as-is and work from there.

We received the exported machines from the solutions provider and successfully activated them on a physical virtualization platform. Some of the virtual servers still ran Windows Server 2008 R2 or Windows Server 2012 R2.

This meant that the virtual servers were not built from scratch. We had no idea what the history of the systems was, or whether they had had problems in the past with updates, features or other software.

Enter backups

One of our first priorities after we had successfully migrated and activated the servers and their services was to set up the backups.

We started with a brief inventory of the installed applications and requirements. Based on the applications, we did not need to make stateful backups of SQL Server databases, Exchange Database Availability Groups (DAGs) or other specific applications or application data. We concluded we only needed file/folder and System State backups.

The backups needed to be stored off-site. Also, we needed the capabilities to restore the systems on the physical virtualization platform.

Azure Backup to the rescue!

Based on the above information, we chose to use the Azure Backup agent without installing Azure Backup Server. This way, the backups are stored directly in Microsoft Azure Recovery Services (MARS).

What we did

We followed the Microsoft procedure. It can be found here. We created the Microsoft Azure Recovery Services vault and created a vault key to be used in the installation.

The installation of the agent went without a problem and the server had already been configured with the prerequisite software. We provided the registration information and that worked without any errors or problems, too.

After the successful registration, it was time to configure the backups; a separate schedule for the files and folders and a separate schedule for the System State backup.

So far so good. We had a backup solution and multiple backup schedules.

What happened next…

After a national holiday, we checked the servers for errors and if the backup schedule had run.

On one of the Windows Server 2008 R2 servers, there was no reference to a backup/recovery point. It looked like the schedule hadn’t been activated or hadn’t run; we found no errors in the Event Viewer or in the Application log.

What I did notice was that there were no references at all in the Event Viewer log for the backup jobs. To validate that the application worked correctly on the server, we chose to start a manual backup. This backup completed successfully without any errors.

We decided to wait one more night for the backup schedule to pick up its routine. The next day we checked the backup logs again: no luck. The backup job still hadn’t run at its scheduled time.

“Why won’t it just work?”

During the initial part of my investigation, I focused on the configuration of the job schedule itself. I examined the two configured jobs and thought I had found the issue: the action configuration is a PowerShell command that kicks off the backup job, based on its job GUID.

An example is shown below:
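The action looked roughly like the sketch below. The GUID is an anonymized placeholder, and the exact cmdlet layout is a reconstruction from memory, not a verbatim copy of the task:

```powershell
# Scheduled-task action as configured by the Azure Backup agent (sketch).
# Program: C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
# Arguments (note the missing closing quote at the end):
-Command "Import-Module MSOnlineBackup; Start-OBBackup -Name 00000000-0000-0000-0000-000000000000
```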



The first thing I noticed was that the parameter line didn’t close with a double quote ("). In normal PowerShell, if you start a string with a ", then you will need to close it with a ".

This was not the case here. I manually added the closing " to the parameter line and started the backup through the Task Scheduler interface. But, same result… The job wasn’t started, nor shown in the GUI as failed.

Getting to the bottom of things

So, I changed the line back to its original state and decided to create some test VMs. This way I could check the functionality on different operating systems. On every test VM, the action line looked the same, missing the closing ", but the actual schedules were starting and performing their configured tasks. So, the first conclusion was that the missing " wasn’t the cause of the issue.

The second conclusion was that the Task Scheduler input isn’t affected by the missing ". If you run the command line yourself in PowerShell, you do need to close the string with a " to start the job.

My next step was to run the PowerShell command manually, in my administrator session and in a newly opened PowerShell console. With the closing ", of course. And to my surprise, the actual job started.

I cancelled the backup job and began focusing on the PowerShell module that the command line preloads: Import-Module MSOnlineBackup;

I looked up the actual location of the PowerShell module on the server; it’s located at: C:\Program Files\Microsoft Azure Recovery Services Agent\bin\Modules


I chose to copy the MSOnlineBackup folder to the following location: C:\Windows\System32\WindowsPowerShell\v1.0\Modules
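The copy itself is a one-liner from an elevated PowerShell session, using the two paths mentioned above:

```powershell
# Copy the MSOnlineBackup module into a default PSModulePath location,
# so the SYSTEM account can find it (run from an elevated session).
Copy-Item -Path "C:\Program Files\Microsoft Azure Recovery Services Agent\bin\Modules\MSOnlineBackup" `
    -Destination "C:\Windows\System32\WindowsPowerShell\v1.0\Modules\MSOnlineBackup" -Recurse
```

Note that a copied module will not receive updates when the backup agent is updated, so this is a workaround rather than a permanent fix.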


The reason for this is that PowerShell searches predefined folders for the modules that are called in the Import-Module command. Windows Server 2012 R2 and higher, with the latest PowerShell version, automatically load modules from these default locations when a command needs to be resolved or autocompleted.

When the folder was copied, I tried the predefined schedule again. The result was that the backup job started and was visible in the GUI. After this result, we waited two days, and the scheduled backups started and completed successfully.

The Root Cause

The root cause of the problem was that the SYSTEM account couldn’t load/import the MSOnlineBackup module from the Task Scheduler. After I copied it to one of the system default folder locations, it could. The failure wasn’t reported in any log on the system.

Double-checking my assumptions

To check this assumption, I created my own scheduled task, running under the NT AUTHORITY\SYSTEM account, to export the value of its $Env:PSModulePath to a text file.
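A sketch of such a task, using the built-in ScheduledTasks module on Windows Server 2012 R2 and later; the task name and output path are arbitrary choices for this test:

```powershell
# Create a one-off task that dumps the SYSTEM account's PSModulePath to a file.
# Task name and output path are arbitrary; adjust to taste.
$action = New-ScheduledTaskAction -Execute "powershell.exe" `
    -Argument '-Command "$Env:PSModulePath | Out-File C:\Temp\SystemPSModulePath.txt"'
Register-ScheduledTask -TaskName "Dump-SystemPSModulePath" -Action $action `
    -User "NT AUTHORITY\SYSTEM" -RunLevel Highest
Start-ScheduledTask -TaskName "Dump-SystemPSModulePath"
```

Comparing the file's contents with $Env:PSModulePath from an interactive administrator session shows exactly which module folders the SYSTEM account is missing.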


The result in this text file was that only C:\Windows\System32\WindowsPowerShell\v1.0\Modules was listed as a source directory, while for the administrator account, multiple folders were specified, including the C:\Program Files\Microsoft Azure Recovery Services Agent\bin\Modules folder.


In this case, the root cause of the problem was the absence of the C:\Program Files\Microsoft Azure Recovery Services Agent\bin\Modules directory in $Env:PSModulePath in the SYSTEM account context. I ran the same scheduled task on my test virtual machine, and there the result was that multiple locations were listed, including the one for the backup agent.

I hope this was useful and educational for future problem analysis.