Saturday 12 July 2014

Exchange 2010 (journal) large mailbox move limit reached?

Journal mailboxes seem to have little said about them. I wonder how many people are using them. It would be good to store 50gb chunks of journal mailboxes on Office 365, but they are expressly not allowed. Now that would be a cheap solution!

The journal mailbox is on Exchange 2010 (SP3 RU6 at time of writing). It is pretty large - 440gb, 5.5 million messages all in one folder(!). I decided to move it to another 2010 database as part of an Exchange 2010 to 2013 migration. It crashed half way through with a complaint about too many bad items, so I ran it from PowerShell with a very high bad item count using the AcceptLargeDataLoss switch. (I note that the Exchange 2013 interface accepts large item loss and notes it in the log, which is better than forcing PowerShell usage). The move failed again about half way through after a day or so of running with a bad item count error again. I suspected this was a bogus error and I had reached the limit of what Exchange 2010 could move to another Exchange 2010 server in 1 mailbox. I worked round the problem by creating a DAG of the mailbox database in question to the same server, which was much quicker anyway as it copies the blocks of the database as opposed the messages individually. I could also have unmounted the database and copied it, then used database portability to it's new location, but preferred for it to be online/mounted all the time. The journal mailbox had to subsequently be moved to Exchange 2013 from 2010 and it was the moment of truth. Moment is not quite the word as it took several days to move it, but it moved without an error. It seems that Exchange 2013 has increased the limits for extreme mailbox moves. As Office 365 supports 50gb mailboxes I was hoping that it would work OK up to say 10 times that size, and that was the case. I need to think how to come up with a better journal mailbox partition mechanism for the longer term. Maybe one mailbox per year or something like that.

Until Microsoft come up with a solution that supports a journal mailbox system on Office 365 many people won't/can't move all their email infrastructure to the cloud.

Exchange 2013 powershell broken after server 'rename' due to certificate problem

Everyone knows you cannot rename an Exchange Server once you have installed Exchange on it.

In this example the Server needed to be renamed. Exchange 2013 was un-installed, the server removed from the domain, renamed then added back to the domain.

Exchange 2013 was installed again. I noticed that some Exchange directories under program files\exchange were not removed by the un-installation, but decided that MS knew what they were doing. There were several GB worth of directories. I pondered on what else was left behind...

Installation gave no errors, and the server was put as the member of a DAG and came up with some errors when the DAG was set up. Interesting... firing up PowerShell on the server in question gave a cryptic error and little else.
Runspace Id: 22b854a9-cbd4-4567-97b6-f3aa52c12249 Pipeline Id: 00000000-0000-0000-0000-000000000000. WSMan reported an error with error code: -2144108477.
 Error message: Connecting to remote server ex2.mydomain.local failed with the following error message : [ClientAccessServer=EX,BackEndServer=ex2.mydomain.local,RequestId=a34012f8-4b26-4ac7-9cb4-b57657fb9adf,TimeStamp=03/07/2014 16:31:29] [FailureCategory=Cafe-SendFailure]  For more information, see the about_Remote_Troubleshooting Help topic.

Deleting and recreating the PowerShell virtual directory made no difference.
http://technet.microsoft.com/en-us/library/dd335085(v=exchg.150).aspx

Then my colleague noted a strange event in the event logs:
An error occurred while using SSL configuration for endpoint 0.0.0.0:444.  The error status code is contained within the returned data.

This was more like it. HTTPS was pointing to a non-existent certificate - probably the original self signed certificate from the early installation, that was deleted during 'manual' tidying up. The new self signed certificate was bound and everything started working again.

I think next time an Exchange server needs 'renaming' I will un-install Exchange, then reinstall Windows from scratch... I doubt that much time is spent investigating problems like the above by the Exchange development team.

Exchange 2010 to 2013 OWA delays after migration needs IIS application pool recycling

Migrating batches of OWA users from Exchange 2010 SP3 RU6  to 2013 CU5

When the http web page shortcuts were updated to point to CAS2013 (eg https://cas2013/owa) servers instead of CAS2010 servers just prior to the mailbox moves these users were unable to open their deleted items folder but this was a minor error so it was ignored as the users were only in this state for a short time. Presumably there is a subtle bug when the CAS2013 was proxying to the CAS2010. Everything else worked perfectly. During testing no-one had been in the deleted items folder!

After moving mailboxes to Exchange 2013 a small number of users were getting a major error where it appeared that the CAS2013 was still attempting to proxy back to CAS2010, and the CAS2010 attempting to retrieve data from Mailbox2013! Obviously OWA 2010 does not know how to get data from a 2013 mailbox and fails.

The error always had the same pattern.
Remember that the original request is to https://CAS2013/OWA, leaving the CAS2013 to decide how to handle the request.

Request
Url: https://CAS2010.domain.local:443/owa/forms/premium/StartPage.aspx (proxied in error to 2010)
User: AUser
EX Address: /o=domain/ou=domain/cn=Recipients/cn=AUser
SMTP Address: AUser@domain.com
OWA version: 14.3.195.1
Mailbox server: MAILBOX2013.domain.local

Exception
Exception type: Microsoft.Exchange.Data.Storage.StoragePermanentException
Exception message: Cannot get row count.
Call stack
Microsoft.Exchange.Data.Storage.QueryResult.get_EstimatedRowCount() ...
Rest omitted for brevity

After looking round the problem appeared similar to this one with the EAS directory/mobile devices.

However, all the EAS devices picked up the new settings within 10 minutes of the mailbox move...

I manually recycled the application pools and the mailboxes magically started working in OWA.
It looks like a different variant of the same bug. Lets hope they fix it soon...

Friday 4 July 2014

Outlook authentication problem after upgrading from Exchange 2010 SP2 to SP3 plus roll up

Upgrading an Exchange 2010 SP2 server to Exchange 2010 SP3 + RU6 in readiness for Exchange 2013 migration.

The service pack was applied, and then the roll up. No problems.

The next day some Outlook users were complaining of an authentication prompt 'randomly' asking for username and password when they were doing nothing in Outlook. After much investigation we tracked it down to the offline address book. This explained why the authentication prompt was only appearing infrequently, when Outlook went to download the address book in the background. After double checking the OAB physical directory / virtual directory permissions etc and oab.xml could be opened as you would expect there was no obvious cause to this error.  Terminal server users did not have this problem as they were using non cached mode without the offline address book. Changing a cached Outlook user to direct would also sort the problem, by bypassing the offline address book in the same manner. My colleague suggested we reboot the server, but this was more in desperation than anything else. The server had not been rebooted since the application of the SP3/RU6.

The server was rebooted and... magically the prompt went away. How bizarre is that? When you first install Exchange the setup programme tells you to reboot before putting the server into production, but not the service pack. The moral of the story is to always reboot after applying a service pack. I have no idea what could have happened but was more than happy when the reboot sorted it, as the next avenue was to open an expensive support ticket with MS support.