This article was originally posted in September 2009 on my old blog, blogs.technet.com/ad, which I handed over to Alex Simons years ago. I wrote it to help Active Directory admins gain a better understanding of how AD replication works so they could catch problems or fine-tune AD update convergence. Of course, the reference to Microsoft Operations Manager (MOM) is a dead giveaway to the fine aging of the article.
I think the information in this article is still relevant for AD admins now, but it is also incredibly useful information for security architects, consultants, and analysts.
As I mentioned in my last article, Active Directory continues to be a target for bad actors. The security industry therefore concentrates endpoint security on the domain members (workstations, servers) and domain controllers in the environment. And rightly so! AD is where an attacker will be trying to glean info about an environment to begin lateral movement and possible elevation to the cloud.
Why not simply review the directory itself rather than the audit or other event logs? AD replication metadata can only be overwritten by later updates. So, if there is missing log data at least there is still that forensic thread to unravel.
The directory may have some evidence in terms of what has been updated. Each update of an attribute requires detailed tracking to ensure that the entire directory (all domain controllers) gets the update. That tracking is metadata describing which attribute(s) were changed, on which DC, the version of the attribute (which is incremented each time an update is applied), and when it was changed. This data alone can provide key insights, but it becomes more solid when combined with secure channel information or audit information. When you add in data from other sources, a clearer and more holistic picture can emerge.
Which data? Attributes like the password (unicodePwd), the Kerberos servicePrincipalName, and group membership (the member attribute on the group, which shows up as memberOf on the user). Bad actors could do things like change a password, add SPNs, add user or group members, and other things which would leave a trail of forensic information in the AD replication metadata. These techniques can track more than AD replication convergence. A savvy security practitioner could track individual updates to specific attributes on specific AD objects.
This article can help you follow that trail. Read below to find the updates you are looking for.
In this blog post we’re going to go over a few techniques that are a bit old school but will come in handy for understanding how things work even if you ultimately use a great monitoring suite like MOM. Now, there are great articles here and here that describe good general ways to start checking your AD replication, and the information in those articles still applies. In this post we’re going to go a bit past and to the side of them though.
Before we go further, we need to go over USN high-water marks and up-to-dateness vectors and how they are used. In my experience these are the two most confusing data points when tracking updates in Active Directory replication.
Of course, USNs are Update Sequence Numbers: an ever-increasing counter assigned to updates, unique per domain controller. As updates are received from peer replicas, or as updates originate at that domain controller itself, the next USN in the series is used to signify that update. In other words, USNs are local numbers on each DC. However, those local USNs are monitored by peer domain controllers, which look at the most recent (highest) USN to help decide whether some of those updates need to be replicated in. If they are not needed then they can be discarded…which is what propagation dampening is.
A recent supportability article had excellent explanations of up-to-dateness vector and high-water mark which I’m pasting below:
For each directory partition that a destination domain controller stores, USNs are used to track the latest originating update that a domain controller has received from each source replication partner, as well as the status of every other domain controller that stores a replica of the directory partition. When a domain controller is restored after a failure, it queries its replication partners for changes with USNs that are greater than the USN of the last change that the domain controller received from each partner before the time of the backup.
The following two replication values contain USNs. Source and destination domain controllers use them to filter updates that the destination domain controller requires.
- Up-to-dateness vector: A value that the destination domain controller maintains for tracking the originating updates that are received from all source domain controllers. When a destination domain controller requests changes for a directory partition, it provides its up-to-dateness vector to the source domain controller. The source domain controller then uses this value to reduce the set of attributes that it sends to the destination domain controller. The source domain controller sends its up-to-dateness vector to the destination at the completion of a successful replication cycle.
- High-water mark: Also known as the direct up-to-dateness vector. A value that the destination domain controller maintains to keep track of the most recent changes that it has received from a specific source domain controller for an object in a specific partition. The high-water mark prevents the source domain controller from sending out changes that are already recorded by the destination domain controller.
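To make the filtering idea concrete, here is a minimal sketch in Python of how a source could use a destination's up-to-dateness vector to drop redundant updates. It is an illustration only, not how the directory service is implemented: the `Update` class and `filter_needed` function are hypothetical names, and the vector is keyed by server name for readability (an assumption made here so the sample values are easy to follow).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """Hypothetical record of one originating update."""
    originating_dc: str   # DC where the change was first made
    originating_usn: int  # USN assigned to the change on that DC
    attribute: str


def filter_needed(updates, utd_vector):
    """Keep only updates the destination has not seen yet.

    utd_vector maps an originating DC to the highest originating USN
    the destination already holds from that DC.
    """
    needed = []
    for update in updates:
        already_seen = utd_vector.get(update.originating_dc, 0)
        if update.originating_usn > already_seen:
            needed.append(update)
    return needed


# Illustrative destination vector and candidate updates.
dest_vector = {"server15": 16531174, "server17": 35282103, "server12": 1581572}
candidates = [
    Update("server17", 35282000, "sAMAccountName"),  # already covered
    Update("server17", 36483650, "cn"),              # newer than the vector
    Update("server12", 1581572, "displayName"),      # exactly at the vector
]

needed = filter_needed(candidates, dest_vector)
print([(u.originating_dc, u.originating_usn) for u in needed])
```

Only the update whose originating USN is above the destination's recorded value survives the filter. The others are the kind of redundant change that propagation dampening discards.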
Let’s dig in with a scenario where you are the admin and you have noticed that there is a replication backlog at some AD sites. In this situation we have anecdotal complaints from our help desk that users created in New York take an hour, or occasionally even days, to show up on DCs in the Los Angeles site. Although it’s sometimes wise to take help desk reports with a grain of salt, this isn’t something you want to ignore.
We have three sites (Los Angeles, Kansas City and New York) and we have DCs in each site. For the question at hand, we need to figure out whether there is, in fact, a replication backlog and if so how big it is. Repadmin.exe, since it is the Swiss Army knife of AD replication tools, would be the first tool to use (repadmin /showrepl * /csv that is). However, it is entirely possible to have a backlog of updates between two replicas and not see constant or even intermittent errors from them if they are replicating, albeit slowly.
Now let’s see why the USN high-water mark and up-to-dateness vectors are important in tracking updates by using the command “repadmin /showutdvec <hostname> <distinguished name of naming context>”. To understand what is happening between the three DCs (Server15 in LA, Server17 in KC, and Server12 in NY) we will need to run the showutdvec command once on each server and then examine the results.
Ran on or against Server15:
LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45
KansasCity\server17 @ USN 35282103 @ Time 2009-09-17 12:51:15
NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39
Ran on or against Server17:
LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45
KansasCity\server17 @ USN 36483665 @ Time 2009-09-21 10:54:41
NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39
Ran on or against Server12:
LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45
KansasCity\server17 @ USN 35295102 @ Time 2009-09-18 07:03:08
NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39
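If you collect this output from many DCs, it helps to turn it into data. The following is a rough Python sketch that parses lines in the `Site\server @ USN n @ Time t` shape shown above. The `parse_utdvec` function name and the regular expression are my own assumptions, based only on the sample lines in this post. Any other lines in the output are skipped.

```python
import re
from datetime import datetime

# Matches lines shaped like: LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45
LINE_RE = re.compile(
    r"^(?P<site>[^\\]+)\\(?P<dc>\S+)\s+@ USN\s+(?P<usn>\d+)"
    r"\s+@ Time\s+(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$"
)


def parse_utdvec(text):
    """Return {dc_name: {"site", "usn", "time"}} for every matching line."""
    vector = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore anything that is not a vector entry
        vector[match["dc"].lower()] = {
            "site": match["site"],
            "usn": int(match["usn"]),
            "time": datetime.strptime(match["time"], "%Y-%m-%d %H:%M:%S"),
        }
    return vector


# Output captured from Server15, as listed above.
server15_output = r"""
LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45
KansasCity\server17 @ USN 35282103 @ Time 2009-09-17 12:51:15
NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39
"""

vec15 = parse_utdvec(server15_output)
print(vec15["server17"]["usn"], vec15["server17"]["time"])
```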
Let’s take what KC has locally and compare it with what LA and NY have recorded for KC:
KC LOCALLY (Server17): server17 @ USN 36483665
LOS ANGELES (Server15): server17 @ USN 35282103
NEW YORK (Server12): server17 @ USN 35295102
Now subtract what each peer knows of KC from what KC has as its own high-water mark:
36483665 minus 35282103 = 1201562 (LA)
36483665 minus 35295102 = 1188563 (NY)
So, there is a difference of up to 1,201,562 updates between what the Kansas City server named Server17 has and what its peers think it has.
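The same subtraction is easy to script once you have the vectors from each DC. Here is a small Python sketch using the server17 values from the three showutdvec outputs listed earlier. The `usn_gaps` helper and the dictionary labels are hypothetical names chosen for this example.

```python
# Latest USN each DC has recorded for server17, taken from the
# showutdvec output of Server15, Server17 and Server12 above.
seen_from_server17 = {
    "server15 (LA)": 35282103,
    "server17 (KC, local)": 36483665,
    "server12 (NY)": 35295102,
}


def usn_gaps(view, local_key):
    """Return how far each peer's recorded USN trails the DC's own value."""
    local_usn = view[local_key]
    return {dc: local_usn - usn for dc, usn in view.items() if dc != local_key}


gaps = usn_gaps(seen_from_server17, "server17 (KC, local)")
for dc, gap in gaps.items():
    print(f"{dc} trails server17 by {gap:,} USNs")
```

Keep in mind that a USN gap is an upper bound on the work outstanding, not a count of updates the peers truly need.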
This tells us that Server17 has received (from some other DC not listed above) or originated approximately 1.2 million updates and that the LA and New York servers have not processed those updates yet. This also tells us that the KC DC Server17 is receiving inbound updates from the other two sites just fine.
That suggests a replication backlog, since the up-to-dateness vector entries for Server17 (the USN values above) that the LA and NY servers have retained locally are lower than the USN high-water mark on the KC server itself.
Are all of these updates ones that NY and LA actually need? Perhaps not; it simply depends on the nature of the updates.
More than likely propagation dampening will occur as the replicas try to process the updates from KC. Propagation dampening is the routine which assesses whether a received update is needed by the local domain controller or not. If it is not needed, then it is discarded. For those unneeded updates you would see an event like the one below, following a similar event ID 1240, if you have your NTDS diagnostic logging for Replication events turned up:
9/20/2009 10:35:30 AM Replication 1239 Server15
Internal event: The attribute of the following object was not sent to the following directory service because its up-to-dateness vector indicates that the change is redundant.
Attribute:9030e (samaccountname)
Object:<distinguishedname of object>
Object GUID:d8frg570-73f1-4781-9b82-f4345255b68u
directory service GUID:9fbfdgdf66-3e75-4542-b3e7-2akjkj776b
That leads us to the question of how to find out more about what those updates are.
To do that we can issue an LDAP query against KC’s DC Server17 for all of the objects with the most recent uSNChanged values. We first take the USN high-water mark for the given partition from our showutdvec command above and subtract a number from it in order to display the most recent updates on that DC. In our scenario that would be 36483665, and we will subtract 1000 in order to query for roughly the most recent 1000 updates (36483665 minus 1000 = 36482665).
- Open LDP.EXE.
- From the Connection menu select Connect and then press OK in the Connect dialogue that appears.
- From the Connection menu select Bind and then press OK in the Bind dialog that appears.
- Next, click on the Browse menu and select Search.
- Enter the partition’s distinguished name in the BaseDN field (DC=<partname>,DC=com).
- Paste the following in the filter field: (usnchanged>=36482665)
- Select Subtree search.
- Click on Options and change the size limit to 5000.
- Still in Options add the following to the Attributes list (each entry separated by semicolon) to those already present: usnchanged;whenchanged
- Then click Run.
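If you would rather script the search than click through LDP, the only computed part is the filter string. The Python sketch below builds it from the high-water mark and a window size. The `recent_usn_filter` function is a hypothetical helper, and the resulting string would still need to be handed to whatever LDAP client you use.

```python
def recent_usn_filter(high_water_mark, window=1000):
    """Build an LDAP filter for objects changed within the last `window` USNs."""
    if window < 0:
        raise ValueError("window must not be negative")
    floor = max(high_water_mark - window, 0)
    return f"(uSNChanged>={floor})"


# Attributes added to the LDP search options in the steps above.
extra_attributes = ["usnchanged", "whenchanged"]

search_filter = recent_usn_filter(36483665)
print(search_filter)                # (uSNChanged>=36482665)
print(";".join(extra_attributes))   # usnchanged;whenchanged
```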
And here is a sample of our result set:
>> Dn: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com
4> objectClass: top; person; organizationalPerson; user;
1> cn: Test134417;
1> distinguishedName: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com;
1> whenChanged: 09/13/2009 15:11:26 Central Standard Time;
1> uSNChanged: 36483650;
1> name: Test134417;
1> canonicalName: treyresearch.com/Accounting/Test134417;
>> Dn: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com
4> objectClass: top; person; organizationalPerson; user;
1> cn: Test134418;
1> distinguishedName: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com;
1> whenChanged: 09/13/2009 15:11:26 Central Standard Time;
1> uSNChanged: 36483649;
1> name: Test134418;
1> canonicalName: treyresearch.com/Accounting/Test134418;
In this case, after a large sampling of all the most recent updates to occur on the KC DC, we see that someone or something is creating users named Test<number> in the Accounting OU.
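With thousands of results, spotting a pattern like this by eye gets tedious. As a rough illustration, the Python sketch below groups LDP-style results by parent container and by name with trailing digits collapsed. The `summarize` function is a hypothetical helper, and it splits the DN on the first comma, which assumes no escaped commas in the object name.

```python
import re
from collections import Counter

# Trimmed sample of the LDP result set shown above.
ldp_output = """\
>> Dn: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com
1> uSNChanged: 36483650;
>> Dn: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com
1> uSNChanged: 36483649;
"""


def summarize(text):
    """Count results per (name pattern, parent container)."""
    counts = Counter()
    for line in text.splitlines():
        if not line.startswith(">> Dn:"):
            continue
        dn = line.split(":", 1)[1].strip()
        rdn, _, parent = dn.partition(",")    # naive split: first comma only
        pattern = re.sub(r"\d+$", "<n>", rdn)  # Test134417 -> Test<n>
        counts[(pattern, parent)] += 1
    return counts


for (pattern, parent), count in summarize(ldp_output).most_common():
    print(f"{count} x {pattern} in {parent}")
```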
Is it some provisioning software that the accounting department uses? A migration from another directory? What if the objects were of some other type, something unique enough to be immediately understood?
These are all questions that you can apply to a concern like this once you have an idea about those updates you are looking for.