Thursday 13 June 2013

Active Directory Office 365 sync problems

The setup is Directory Sync to Office 365 in hybrid mode (i.e. some attributes are written back from Office 365 to AD) with password sync.

The initial sync hit some problems that the Office 365 pre-check tool had not found.

For a few users the write-back of AD attributes failed because of inadequate permissions. This was caused by those user objects not inheriting security in AD. Switching inheritance back on fixed the problem immediately. Why they had been set to not inherit many years ago remains a mystery.

Another user, who had moved to a different role, just would not sync because of a duplicate attribute. The sync tool 'helpfully' omitted which attribute was in conflict. The account in question had been copied from the original user and then changed, with the original user left in place for business reasons. After some time spent checking every (visible?) AD attribute, no duplicates could be found. The (very useful) idfix tool did not spot any problems with this record either. Google/Bing for idfix.
I added the person to my Outlook to look at what they had in their mailbox and 2 copies of the same mailbox appeared. Very strange, I had never seen that before. I then removed the person from my Outlook, only for 1 of the mailboxes to remain. It seemed obvious that the record had some major problem. The person in question no longer required the account, so it was deleted in AD. 2 syncs later the problem had gone. It would have been useful to know exactly what was failing the unique constraint. I am sure it would be pretty easy for the programmers to report this information in the sync failure email you get sent.
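
Since the sync tool would not say which attribute clashed, one way to double-check by hand is to export the accounts and look for values shared by more than one of them. The following is only a minimal sketch under assumptions: the record layout, the sample accounts and the list of attributes treated as unique (userPrincipalName, mail, proxyAddresses) are illustrative and not taken from the sync tool itself, and a check like this would not necessarily have caught the broken record described above.

```python
from collections import defaultdict

# Assumption: attributes that are commonly required to be unique for sync.
UNIQUE_ATTRS = ("userPrincipalName", "mail", "proxyAddresses")

def find_duplicates(users):
    """Return {(attribute, lowercased value): [account names]} for shared values."""
    seen = defaultdict(set)
    for user in users:
        for attr in UNIQUE_ATTRS:
            value = user.get(attr) or []
            # Single-valued attributes arrive as a string, multi-valued as a list.
            values = [value] if isinstance(value, str) else value
            for v in values:
                # Compare case-insensitively so SMTP:/smtp: variants still collide.
                seen[(attr, v.lower())].add(user["sAMAccountName"])
    return {key: sorted(names) for key, names in seen.items() if len(names) > 1}

# Hypothetical export: a copied account that kept an address of the original.
sample_users = [
    {"sAMAccountName": "jsmith",
     "mail": "j.smith@example.com",
     "proxyAddresses": ["SMTP:j.smith@example.com"]},
    {"sAMAccountName": "jsmith2",
     "mail": "j.smith2@example.com",
     "proxyAddresses": ["SMTP:j.smith2@example.com", "smtp:J.Smith@example.com"]},
]

for (attr, value), accounts in find_duplicates(sample_users).items():
    print(attr, value, accounts)
```

The sketch only compares values within the same attribute; a clash between, say, one account's mail and another account's proxyAddresses would need an extra pass.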

I strongly recommend you pin the sync tool UI to the taskbar / desktop as you might be looking at it more than you had hoped at the start!

You can find it here (could change with future versions):
C:\Program Files\Windows Azure Active Directory Sync\SYNCBUS\Synchronization Service\UIShell\miisclient.exe

If you look at the on-screen log of what the tool does automatically, it is easy to run manual syncs by mimicking the steps it takes. There is a PowerShell command, start-onlinecoexistencesync, but you will need the GUI to track down problems. The GUI also doubles as a very crude poor man's AD auditing tool: you can see how many changes were made to your AD, what kind of changes they were and roughly when they happened, but not who made them. It is certainly better than nothing, and is a good spin-off benefit of having installed the software.

It can save time when troubleshooting to change the default sync interval from 3 hours to 5 minutes, so that the sync tool keeps attempting to sync while you work on the problem without you having to run anything:

To do this change the file:
C:\Program Files\Windows Azure Active Directory Sync\Microsoft.Online.DirSync.Scheduler.exe.Config

temporarily from this:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <!--the interval in hours-->
    <!--refer for valid values:http://msdn2.microsoft.com/en-us/library/system.timespan.parse.aspx-->
    <add key="SyncTimeInterval" value="3:0:0" />
  </appSettings>
</configuration>

to this:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <!--the interval in hours-->
    <!--refer for valid values:http://msdn2.microsoft.com/en-us/library/system.timespan.parse.aspx-->
    <add key="SyncTimeInterval" value="0:5:0" />
  </appSettings>
</configuration>

Set it back to your selected sync time afterwards - I would not recommend leaving it at 5 mins!
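
If you switch the interval back and forth a lot, the edit can be scripted rather than done by hand. This is only a rough sketch, assuming Python 3.8 or later: the function name set_sync_interval and the embedded sample text are made up for illustration, it only accepts the plain h:m:s form used in the config above, and it works on a string in memory rather than touching the real file.

```python
import re
import xml.etree.ElementTree as ET

# Sample text modelled on the scheduler config shown above.
SAMPLE_CONFIG = """<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <!--the interval in hours-->
    <add key="SyncTimeInterval" value="3:0:0" />
  </appSettings>
</configuration>
"""

def set_sync_interval(xml_text: str, interval: str) -> str:
    """Return the config XML with SyncTimeInterval set to interval (h:m:s)."""
    if not re.fullmatch(r"\d+:\d{1,2}:\d{1,2}", interval):
        raise ValueError(f"expected h:m:s, got {interval!r}")
    # insert_comments=True (Python 3.8+) keeps the XML comments in the tree.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    for node in root.iter("add"):
        if node.get("key") == "SyncTimeInterval":
            node.set("value", interval)
            # Note: the serialized text omits the <?xml ...?> declaration.
            return ET.tostring(root, encoding="unicode")
    raise KeyError("SyncTimeInterval setting not found")

troubleshooting = set_sync_interval(SAMPLE_CONFIG, "0:5:0")  # 5 minutes
restored = set_sync_interval(troubleshooting, "3:0:0")       # back to 3 hours
print(troubleshooting)
```

Keeping the "restore" call next to the "troubleshooting" call is deliberate: it makes it harder to forget to put the interval back.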

Happy syncing



Tuesday 4 June 2013

SQL Server 2012 Extended events for error_reported with some merge replication noise removed

One of the main problems with the XE error_reported event is that it includes informational messages that most people would not define as errors. Here is my version, which merges various versions I have seen on the internet with my own additions and is pretty simple. Many of the filtered items have a severity of 10, but I prefer to filter each error number individually, having seen it appear in the trace file and made a conscious decision that it can be ignored. It might be a good start if you don't have anything else.

With extended events I always save 10 files of 100 MB: you can copy a 100 MB file over a relatively low-bandwidth link if the need arises, and 1 GB in total is not very onerous on storage. The files from this script would take several years to roll over, as very few genuine errors are now logged.

As with all scripts, use on your own servers at your own discretion. It is pretty low impact unless you have lots of errors, in which case you either have more or different noise than my applications, or some errors that need fixing! I don't need track_causality switched on with this trace, but you might, depending on what you are doing. Many bespoke replication 'noise' messages and minor errors are filtered out which might be important errors in your system, so check before use.


CREATE EVENT SESSION [NM_Error] ON SERVER 
ADD EVENT sqlserver.error_reported(
ACTION(sqlserver.database_name,sqlserver.query_hash,sqlserver.query_plan_hash,sqlserver.session_nt_username,sqlserver.sql_text,sqlserver.tsql_frame,sqlserver.tsql_stack)
    WHERE ([error_number]<>(14108) AND [error_number]<>(20532) AND [error_number]<>(14149) AND [error_number]<>(20556)
       AND [error_number]<>(20554) AND [error_number]<>(20567) AND [error_number]<>(20568) AND [error_number]<>(3262)
       AND [error_number]<>(14226) AND [error_number]<>(17177) AND [error_number]<>(14150) AND [error_number]<>(14554)
       AND [error_number]<>(3197) AND [error_number]<>(3198) AND [error_number]<>(2528) AND [error_number]<>(18264)
       AND [error_number]<>(3211) AND [error_number]<>(3014) AND [error_number]<>(4035) AND [error_number]<>(5701)
       AND [error_number]<>(5703) AND [error_number]<>(18265) AND [error_number]<>(14205) AND [error_number]<>(14213)
       AND [error_number]<>(14214) AND [error_number]<>(14215) AND [error_number]<>(14216) AND [error_number]<>(14549)
       AND [error_number]<>(14558) AND [error_number]<>(14559) AND [error_number]<>(14560) AND [error_number]<>(14561)
       AND [error_number]<>(14562) AND [error_number]<>(14563) AND [error_number]<>(14564) AND [error_number]<>(14565)
       AND [error_number]<>(14566) AND [error_number]<>(14567) AND [error_number]<>(14568) AND [error_number]<>(14569)
       AND [error_number]<>(14570) AND [error_number]<>(14635) AND [error_number]<>(8153) AND [error_number]<>(14638)
       AND [error_number]<=(50000)))
ADD TARGET package0.event_file(SET filename=N'd:\sqlxe\NM_Error',max_file_size=(100),max_rollover_files=(10))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO

ALTER EVENT SESSION [NM_Error] ON SERVER STATE=START
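
The filter in the session above grows every time a new piece of noise is ruled out, so it can be easier to keep the ignored error numbers in a list and generate the predicate from it. The following is just a sketch of that idea: the function name build_predicate is my own, and the list shown is a short excerpt of the numbers in the script, not the full set.

```python
# Excerpt of the error numbers filtered in the session above; extend as needed.
IGNORED_ERRORS = [14108, 20532, 14149, 5701, 5703]

def build_predicate(ignored, max_error=50000):
    """Build the WHERE predicate text for the error_reported event."""
    parts = [f"[error_number]<>({number})" for number in ignored]
    # Mirrors the last clause of the original filter.
    parts.append(f"[error_number]<=({max_error})")
    return "(" + " AND ".join(parts) + ")"

print("WHERE " + build_predicate(IGNORED_ERRORS))
```

The generated text can then be pasted into the CREATE EVENT SESSION statement in place of the hand-written WHERE clause.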