These are the Updates You Are Looking For

This article was originally posted in September 2009 on my old blog, blogs.technet.com/ad, which I handed over to Alex Simons years ago. I wrote it to help Active Directory admins gain a better understanding of how AD replication works so they could catch problems or fine-tune AD update convergence. Of course, the reference to Microsoft Operations Manager (MOM) is a dead giveaway to the fine aging of the article.

I think the information in this article is still relevant for AD admins now, but it is also incredibly useful information for security architects, consultants, and analysts.

As I mentioned in my last article, Active Directory continues to be a target for bad actors. The security industry therefore concentrates endpoint security on the domain members (workstations, servers) and domain controllers in the environment. And rightly so! AD is where an attacker will be trying to glean info about an environment to begin lateral movement and possible elevation to the cloud.

Why not simply review the directory itself rather than the audit or other event logs? AD replication metadata can only be overwritten by later updates, so even if log data is missing there is still that forensic thread to unravel.

The directory itself may hold evidence of what has been updated. Each update of an attribute requires detailed tracking to ensure that the entire directory (all domain controllers) receives the update. That tracking is metadata describing which attribute(s) were changed, on which DC, when the change happened, and the version of the attribute, which is incremented each time an update is applied. This data alone can provide key insights, and it becomes more solid when combined with secure channel or audit information. Add in data from other sources and a more holistic, clearer picture can emerge.

Which data? Attributes like unicodePwd (the password), servicePrincipalName (Kerberos SPNs), and group member. Bad actors could change a password, add SPNs, add users or group members, and do other things which would leave a trail of forensic information in the AD replication metadata. These techniques can track more than AD replication convergence: a savvy security practitioner could track individual updates to specific attributes on specific AD objects.

This article can help you follow that trail. Read below to find the updates you are looking for.

In this blog post we’re going to go over a few techniques that are a bit old school but will come in handy for understanding how things work, even if you ultimately use a great monitoring suite like MOM. Now, there are great articles here and here that describe good general ways to start checking your AD replication, and the information in those articles still applies. In this post we’re going to go a bit past and to the side of them, though.

Before we go further, we need to go over USN high-water marks and up-to-dateness vectors and how they are used. In my experience these are the two data points that are the most confusing in tracking updates in Active Directory replication.

Of course, USNs are Update Sequence Numbers: an ever-increasing counter assigned to updates, unique per domain controller. As updates are received from peer replicas, or as updates originate at that domain controller itself, the next USN in the series is assigned to that update. In other words, USNs are local numbers on each DC. However, those local USNs are monitored by peer domain controllers, which track the highest USN they have seen from each partner in order to decide whether any of that partner’s updates still need to be replicated in. Updates that are not needed can be discarded, which is what propagation dampening is.
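A toy model may make this concrete. The sketch below is plain Python with class and field names of my own invention, nothing like AD’s actual implementation; it only illustrates the idea of a per-partner high-water mark filtering out already-seen updates:

```python
# Toy model of per-DC USNs and partner-side filtering (not AD's real code).

class DomainController:
    def __init__(self, name):
        self.name = name
        self.usn = 0           # local, ever-increasing counter
        self.updates = []      # (usn, payload) pairs applied on this DC
        self.high_water = {}   # partner name -> highest USN seen from it

    def originate(self, payload):
        """Apply a local change and stamp it with the next local USN."""
        self.usn += 1
        self.updates.append((self.usn, payload))

    def pull_from(self, partner):
        """Request only updates above our high-water mark for this partner."""
        seen = self.high_water.get(partner.name, 0)
        new = [(u, p) for (u, p) in partner.updates if u > seen]
        for u, p in new:
            self.high_water[partner.name] = u
        return new             # updates at or below the mark were filtered out

dc1, dc2 = DomainController("DC1"), DomainController("DC2")
dc1.originate("set displayName")
dc1.originate("set mail")
first = dc2.pull_from(dc1)    # both updates are new to DC2
second = dc2.pull_from(dc1)   # nothing above the mark now, so empty
```

Note that the filtering happens per partner: each DC only needs to remember one number per replication partner, not the full history of what it has received.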

A recent supportability article had excellent explanations of up-to-dateness vector and high-water mark which I’m pasting below:

For each directory partition that a destination domain controller stores, USNs are used to track the latest originating update that a domain controller has received from each source replication partner, as well as the status of every other domain controller that stores a replica of the directory partition. When a domain controller is restored after a failure, it queries its replication partners for changes with USNs that are greater than the USN of the last change that the domain controller received from each partner before the time of the backup.

The following two replication values contain USNs. Source and destination domain controllers use them to filter updates that the destination domain controller requires.

  1. Up-to-dateness vector A value that the destination domain controller maintains for tracking the originating updates that are received from all source domain controllers. When a destination domain controller requests changes for a directory partition, it provides its up-to-dateness vector to the source domain controller. The source domain controller then uses this value to reduce the set of attributes that it sends to the destination domain controller. The source domain controller sends its up-to-dateness vector to the destination at the completion of a successful replication cycle.
  2. High water mark Also known as the direct up-to-dateness vector. A value that the destination domain controller maintains to keep track of the most recent changes that it has received from a specific source domain controller for an object in a specific partition. The high-water mark prevents the source domain controller from sending out changes that are already recorded by the destination domain controller.
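To make the distinction between the two values concrete, here is a toy illustration in Python. The dictionary shapes and the USN values are mine, chosen only to mirror the definitions above:

```python
# Toy illustration of the two tracking values a destination DC keeps.
# These are illustrative shapes, not AD's internal structures.

# High-water mark: one USN per *direct* replication partner, per partition.
high_water_mark = {
    ("DC=treyresearch,DC=com", "Server17"): 36483665,
}

# Up-to-dateness vector: the highest *originating* USN seen from every DC
# that has ever written to the partition, direct partner or not.
utd_vector = {
    "Server15": 16531174,
    "Server17": 35282103,
    "Server12": 1581572,
}

def needs_update(origin_dc, originating_usn):
    """An inbound change is redundant if this DC already saw that
    originator's write at that USN or later."""
    return originating_usn > utd_vector.get(origin_dc, 0)
```

So a change Server17 originated at USN 35282104 is new to this DC, while one at 35282000 would be dampened as already applied.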

Let’s dig in with a scenario where you are the admin and you have noticed a replication backlog at some AD sites. We have anecdotal complaints from our help desk that users created in New York take an hour, or occasionally even days, to show up on DCs in the Los Angeles site. Although it’s sometimes wise to take help desk reports with a grain of salt, this isn’t something you want to ignore.

We have three sites, Los Angeles, Kansas City and New York, with DCs in each site. For the question at hand, we need to figure out whether there is, in fact, a replication backlog and, if so, how big it is. Repadmin.exe, the Swiss Army knife of AD replication tools, would be the first tool to use (repadmin /showrepl * /csv, that is). However, it is entirely possible to have a backlog of updates between two replicas and not see constant or even intermittent errors from them if they are replicating, albeit slowly.
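If you do export repadmin /showrepl * /csv, a short script can surface the failing links quickly. A hedged sketch: the sample data and column names below are from memory of repadmin’s CSV output, so verify them against a real export from your environment:

```python
import csv
import io

# Hypothetical excerpt of `repadmin /showrepl * /csv` output; column names
# are from memory of repadmin's CSV format -- check them against your export.
sample = """Destination DSA,Naming Context,Source DSA,Number of Failures,Last Failure Status
SERVER12,"DC=treyresearch,DC=com",SERVER17,0,0
SERVER15,"DC=treyresearch,DC=com",SERVER17,3,8524
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Keep only replication links that have recorded failures.
failing = [r for r in rows if int(r["Number of Failures"]) > 0]
for r in failing:
    print(r["Destination DSA"], "<-", r["Source DSA"],
          "failures:", r["Number of Failures"],
          "last status:", r["Last Failure Status"])
```

Remember the caveat above, though: a slow link with zero failures can still hide a large backlog, which is why we turn to the up-to-dateness vectors next.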

Now let’s see why the USN high-water mark and up-to-dateness vectors are important in tracking updates, using the command “repadmin /showutdvec <hostname> <distinguished name of naming context>”. To understand what is happening between the three DCs, Server15 in LA, Server17 in KC, and Server12 in NY, we will run the showutdvec command once against each server and then examine the results.

Ran on or against Server15:

LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45

KansasCity\server17 @ USN 35282103 @ Time 2009-09-17 12:51:15

NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39

Ran on or against Server17:

LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45

KansasCity\server17 @ USN 36483665 @ Time 2009-09-21 10:54:41

NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39

Ran on or against Server12:

LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45

KansasCity\server17 @ USN 35295102 @ Time 2009-09-18 07:03:08

NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39

Let’s take KC and NY and compare them:

KC LOCALLY: server17 @ USN 36483665

NEW YORK: server17 @ USN 35295102

Now subtract what NY knows of KC’s USN from the high-water mark KC has for itself:

36483665 minus 35295102 = 1188563

So, there is a difference of 1,188,563 updates between what the Kansas City server named Server17 has and what its New York peer thinks it has.

This tells us that Server17 has received (from some other DC not listed above) or originated approximately 1.2 million updates which the LA and New York servers have not yet processed. It also tells us that the KC DC Server17 is receiving inbound updates from the other two sites just fine: its entries for Server15 and Server12 match those servers’ own local USNs.

That suggests a replication backlog, since the up-to-dateness vector entries for Server17 which the LA and NY servers maintain locally are lower than the USN high-water mark on the KC server itself.
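The subtraction above can be done for every peer at once. A quick Python sketch, where the dictionaries simply transcribe the USNs from the repadmin output earlier:

```python
# Backlog estimate: KC's local high-water USN for itself, minus what each
# peer's up-to-dateness vector records for KC. Values transcribed from the
# repadmin /showutdvec output shown above.

local_usn = {
    "server15": 16531174,
    "server17": 36483665,
    "server12": 1581572,
}

# What each peer's up-to-dateness vector shows for server17:
seen_for_server17 = {
    "server15": 35282103,   # from the run against Server15
    "server12": 35295102,   # from the run against Server12
}

backlog = {
    peer: local_usn["server17"] - usn
    for peer, usn in seen_for_server17.items()
}
# server15 is roughly 1.20M updates behind; server12 roughly 1.19M.
```

The same dictionary comprehension works for any number of peers, which is handy once you script the showutdvec collection across a large site topology.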

Are all of these updates ones that NY and LA actually need? Perhaps not-it simply depends on the nature of the updates.

More than likely propagation dampening will occur as the replicas try to process the updates from KC. Propagation dampening is the routine which assesses whether a received update is needed by the local domain controller; if it is not, it is discarded. For those unneeded updates you would see an event like the one below, following a similar event ID 1240, if you have your NTDS diagnostic logging for Replication events turned up:

9/20/2009 10:35:30 AM Replication 1239 Server15

Internal event: The attribute of the following object was not sent to the following directory service because its up-to-dateness vector indicates that the change is redundant.

Attribute:9030e (samaccountname)

Object:<distinguishedname of object>

Object GUID:d8frg570-73f1-4781-9b82-f4345255b68u

directory service GUID:9fbfdgdf66-3e75-4542-b3e7-2akjkj776b

That leads us to the question of how to find out more about what those updates are.

To do that we can issue an LDAP query against KC’s DC Server17 for all of the objects with a recent uSNChanged attribute. First we take the USN high-water mark for the given partition from our showutdvec command above and subtract a number from it in order to display the most recent updates on that DC. In our scenario that would be 36483665, and we will subtract 1000 in order to query for the most recent 1000 updates.

  1. Open LDP.EXE.
  2. From the Connection menu select Connect and then press OK in the Connect dialog that appears.
  3. From the Connection menu select Bind and then press OK in the Bind dialog that appears.
  4. Next, click on the Browse menu and select Search.
  5. Enter the partition’s distinguished name in the BaseDN field (DC=<partname>,DC=com).
  6. Paste the following in the filter field: (uSNChanged>=36482665)
  7. Select Subtree search.
  8. Click on Options and change the size limit to 5000.
  9. Still in Options add the following to the Attributes list (each entry separated by semicolon) to those already present: usnchanged;whenchanged
  10. Then click Run.
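If you prefer scripting to clicking through LDP, the same search can be sketched in Python. The filter construction is the important part; the commented-out connection code uses the third-party ldap3 library with placeholder server and base-DN values, for illustration only:

```python
# Build the same uSNChanged filter LDP uses, from the high-water mark.
high_water = 36483665
window = 1000   # how many of the most recent USNs to pull back
ldap_filter = f"(uSNChanged>={high_water - window})"

# With the ldap3 library (pip install ldap3) the search would look roughly
# like this -- server, credentials, and base DN are placeholders for your
# environment, so treat this as a sketch rather than copy-paste code:
#
# from ldap3 import Server, Connection, SUBTREE
# conn = Connection(Server("server17.treyresearch.com"), user="...",
#                   password="...", auto_bind=True)
# conn.search("DC=treyresearch,DC=com", ldap_filter, SUBTREE,
#             attributes=["uSNChanged", "whenChanged"], size_limit=5000)

print(ldap_filter)
```

Either way, the filter you end up with for our scenario is (uSNChanged>=36482665).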

And here is a sample of our result set:

>> Dn: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com

4> objectClass: top; person; organizationalPerson; user;

1> cn: Test134417;

1> distinguishedName: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com;

1> whenChanged: 09/13/2009 15:11:26 Central Standard Time;

1> uSNChanged: 36483650;

1> name: Test134417;

1> canonicalName: treyresearch.com/Accounting/Test134417;

>> Dn: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com

4> objectClass: top; person; organizationalPerson; user;

1> cn: Test134418;

1> distinguishedName: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com;

1> whenChanged: 09/13/2009 15:11:26 Central Standard Time;

1> uSNChanged: 36483649;

1> name: Test134418;

1> canonicalName: treyresearch.com/Accounting/Test134418;

In this case, after a large sampling of all the most recent updates to occur on the KC DC, we see that someone or something is creating users named Test<number> in the Accounting OU.

Is it some provisioning software that the accounting department uses? A migration from another directory? What if the objects were of some other type, something unique enough to be immediately understood?

These are all questions that you can apply to a concern like this once you have an idea about those updates you are looking for.

Using KQL with Azure AD

Understanding Azure data and how it can be reviewed using Kusto Query Language (KQL) queries is necessary for any security-minded person. This blog post shows how KQL can be used to track newly synced identities for suspicious activity.

If you work with Microsoft cloud services much, you will have noticed that service telemetry is often available via Kusto Query Language in Azure Data Explorer. The online documentation explains that Kusto Query Language is a powerful tool to explore your data and discover patterns, identify anomalies and outliers, create statistical modeling, and more. Kusto queries use schema entities that are organized in a hierarchy similar to SQL’s: databases, tables, and columns. What is a Kusto query? A Kusto query is a read-only request to process data and return results. The request is stated in plain text, using a data-flow model that is easy to read, author, and automate. Kusto queries are made of one or more query statements.

KQL has been my favorite tool for years: it has allowed me to identify service problems, gauge feature performance against desired results, and condense information so that it makes sense in a visual display like a chart. KQL allows for concise, quantifiable answers from large sets of data spanning one or more databases, which can be tough to achieve using other methods.

As you would expect, KQL’s extensibility and ease of use are reasons it is used so much by Azure services; it is easy for a service to set up, and easy to ingest service telemetry from Cosmos databases. Azure Cosmos DB in turn is the typical database used, due to its versatility as a static data store for basic telemetry or as part of an event-driven architecture built with Azure Data Factory or Azure Synapse. KQL (via Azure Data Explorer or Azure Log Analytics) and Cosmos DB fit together very well in a solution which can handle large sets of data in a performant way and still allow for insights into service-specific questions, and even answer cross-functional ones. We’ll talk in a later blog about how important planning service telemetry is when creating a new software product or service.

Azure Sentinel provides KQL access for performing advanced analysis of activity from multiple sources, combined with User and Entity Behavior Analytics (UEBA). If you are not lucky enough to have Sentinel, Azure Active Directory by itself allows you to review tenant-specific telemetry using KQL once it is published to an Azure Log Analytics workspace (albeit without the UEBA analytics). The data alone can be useful for understanding what ‘normal’ looks like in your environment, or in threat hunting. The steps for sending the telemetry to your workspace and configuring retention and other settings can be found at this link.

Now that we know KQL is used with Azure AD let’s go over how to use it with a few real-world security scenarios.

One of the better-known avenues for exploitation of Azure AD is via AD on premises. AD is 23 years old now, and though Microsoft does a great job of security updates and recommended controls, the fact is that the AD attack surface is very large, especially if organizations do not maintain good security posture. It is a large and attractive target, which means it is important to review the actions of newly synchronized users.

KQL can be used to query Azure AD to identify newly synced users and see what they have changed recently. This is a simple technique to use if you have a suspicion or just want to do a spot check. For routine matters I highly recommend a solution which uses machine learning or AI to sift through the data and identify suspicious activity.

Figure 1 Log Analytics query for interesting audit events. Blurring and cropping to protect the innocent. Query text at bottom of article.

The data is exportable for preservation and review. When reviewing, note that there are three columns in the query results which indicate that a new user was synced, and the identity of that new user: Actor, OperationName, and TargetObjectUPN. We can use the Actor field because Azure AD hybrid sync automatically creates an on-premises AD identity with SYNC-HOST somewhere in the name. The other indicators are an OperationName of Add user and, of course, the TargetObjectUPN for the identity.

Note that if your organization uses another principal for its sync service account, you could add a line to the query to select only that service principal name, like | where Actor == "actorstring".

KQL’s real power comes into play when you can combine two or more databases together and query that data to gain a broader picture of a scenario. This is essentially what Sentinel and other services do albeit with the added special sauce of behavioral analytics and scoring algorithms.

For example, if you have a suspect or suspects, you can also see which SaaS applications the identity has signed into recently, to help gauge their sketchiness, by reviewing the AAD audit logs together with the sign-in logs for the same identities. In the example below we are not filtering on a specific identity (though we could add a where statement to do that) but are querying for what any recently synced user is signing into. You can perform a more targeted search by uncommenting the AppDisplayName filter line in the query and looking to see if they sign into specific applications. For example, a user signing into “Azure Active Directory PowerShell” immediately after sync could mean shenanigans.
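Stripped of KQL, the join’s logic is small. Here is a toy Python sketch with made-up records whose field names mirror the query’s columns; a rightsemi join keeps only rows from the right-hand table that match the left:

```python
# Toy version of the KQL rightsemi join: match recently synced users (from
# audit data) to their sign-in events by object id. Records are invented.

synced = [
    {"targetObjectId": "a1", "TargetObjectUPN": "newuser@treyresearch.com"},
]
signins = [
    {"UserId": "a1", "AppDisplayName": "Azure Active Directory PowerShell"},
    {"UserId": "b2", "AppDisplayName": "Microsoft Office 365 Portal"},
]

synced_ids = {u["targetObjectId"] for u in synced}

# rightsemi semantics: emit only sign-in rows whose UserId matches a synced
# user; columns from the left (audit) side are not carried through.
suspicious = [s for s in signins if s["UserId"] in synced_ids]
```

Only the sign-in by the freshly synced user survives the join, which is exactly the short list you want to eyeball.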

Figure 2 Log Analytics query for interesting signin events. Blurring and cropping to protect the innocent. Query text at bottom of article.

Understanding Azure data and how it can be reviewed using Kusto Query Language queries is necessary for any security-minded person. It can help you better understand information which is already present by filtering out unnecessary data, extract relevant information from results spanning multiple databases, or even spot trends and anomalies. The ability to construct KQL queries is a valuable skill, and hopefully one this blog post has helped you strengthen.

Audit Log query

AuditLogs
    | where ActivityDateTime >= ago(3d) //Range in time from now to query.
    | extend Actor = tostring(InitiatedBy.user.userPrincipalName) //Add a column for the identity which requested the change.
    | extend TargetedObject = tostring(TargetResources[0].displayName) //Add a column for the display name of the object which was changed.
    | extend TargetObjectUPN = tostring(TargetResources[0].userPrincipalName) //Add a column for the UPN of the object which was changed.
    | extend ObjectType = tostring(TargetResources[0].type) //Add a column for the type of object which was targeted.
    | where OperationName != "Update agreement"
        and OperationName != "Import"
        and OperationName != "Update StsRefreshTokenValidFrom Timestamp"
    //Remove operational events which are not interesting.
    | project ActivityDateTime, Actor, ObjectType, OperationName, TargetObjectUPN, TargetedObject, ResultDescription //Display only the information which helps explain what happened in the scenario.

Query to see what recently added users (via AAD Connect sync) have signed into

//Query to find what recently synced users are signing into

AuditLogs
    | where ActivityDateTime >= ago(14d) //Range in time from now to query.
    | extend Actor = tostring(InitiatedBy.user.userPrincipalName) //Add a column for the identity which requested the change.
    | extend TargetedObject = tostring(TargetResources[0].displayName) //Add a column for the display name of the object which was changed.
    | extend TargetObjectUPN = tostring(TargetResources[0].userPrincipalName) //Add a column for the UPN of the object which was changed.
    | extend ObjectType = tostring(TargetResources[0].type) //Add a column for the type of object which was targeted.
    | extend targetObjectId = tostring(TargetResources[0].id) //Extract the id of the target object to its own column.
    | extend InitiatedByUPN = tostring(InitiatedBy.user.userPrincipalName) //Extract the UPN of the actor object to its own column.
    | where OperationName != "Update agreement"
        and OperationName != "Import"
        and OperationName != "Update StsRefreshTokenValidFrom Timestamp"
    //Remove operational events which are not interesting.
    | where OperationName == "Add user"
        and InitiatedByUPN contains "SYNC-HOST"
    | project ActivityDateTime, Actor, ObjectType, OperationName, TargetObjectUPN, TargetedObject, targetObjectId, ResultDescription //Display only the information which helps explain what happened in the scenario.
    | join kind = rightsemi //Join kind to show only the sign-in data related to our AuditLogs entries filtered above.
    (SigninLogs
        | extend operatingSystem = tostring(DeviceDetail.operatingSystem) //Place the client OS in its own column.
        //| where AppDisplayName == "Graph Explorer" or AppDisplayName == "Azure Active Directory PowerShell" or AppDisplayName == "Microsoft Office 365 Portal" or AppDisplayName == ""
    ) on $left.targetObjectId == $right.UserId
    | project TimeGenerated, operatingSystem, Identity, AlternateSignInName, AppDisplayName, AuthenticationRequirement, ResultType, ResultDescription