ASim parsers and the watchlist nobody tells you to populate

There’s a specific failure mode in Microsoft Sentinel that may catch teams out the first time they deploy an ASim-dependent solution. The Content Hub install completes. The analytic rules show up as Active. The data connector says “connected”. The Health blade is green. And the rules never fire, because the parser they sit on top of returns zero rows, because a watchlist that nobody mentions during installation hasn’t been populated.

The whole stack passes every “deployed” check the platform has, and produces nothing.

What ASim actually does

ASim (Advanced Security Information Model) is Microsoft’s normalisation layer for Sentinel. The pitch is reasonable. Instead of writing one detection rule per log source, you write one rule against a normalised schema, and source-specific parsers handle the messy work of translating each vendor’s quirks into a common shape. A query against _Im_Authentication returns logon events from Windows, Entra ID, Cisco ISE, and Okta in the same field layout. Same idea for network sessions, DNS, process events, and most of the schemas a SOC actually cares about.
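As a rough illustration of the “one rule, many sources” idea, a brute-force style query might look like the sketch below. It assumes the Authentication schema fields EventResult, TargetUsername and SrcIpAddr. The one-hour window and the threshold of 20 are arbitrary placeholders, not values taken from any shipped rule.

```kql
// Sketch only: count failed logons per user and source address across
// every source that feeds the unifying Authentication parser.
_Im_Authentication()
| where TimeGenerated > ago(1h)
| where EventResult == 'Failure'
| summarize FailedLogons = count() by TargetUsername, SrcIpAddr
| where FailedLogons > 20   // placeholder threshold, tune for your environment
```

Nothing in the query names a vendor. Whether it sees any rows for a given product depends entirely on the source-specific parsers underneath.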

One scope note before going further: this post is about syslog-shaped, on-prem source patterns where the watchlist matters. Cloud-native sources (Microsoft Defender for Cloud, Microsoft Defender XDR, Entra ID logs) generally don’t use this watchlist mechanism. Their parsers identify rows by table name or fixed schema markers, and the watchlist is a non-issue.

The architecture has three layers, and the second is where things can go awry:

flowchart LR
    A[Raw logs<br />Syslog table] --> B[Source-specific parser<br />vimAuthenticationCiscoMerakiSyslog]
    W[(Watchlist<br />Sources_by_SourceType)] -.SearchKey lookup.-> B
    B --> C[Unifying parser<br />_Im_Authentication]
    C --> D[Analytic rule<br />Brute-force detection]

    style A fill:#d4edda,stroke:#28a745,color:#212529
    style B fill:#fff3cd,stroke:#ffc107,color:#212529
    style W fill:#f8d7da,stroke:#dc3545,color:#212529
    style C fill:#d4edda,stroke:#28a745,color:#212529
    style D fill:#d1ecf1,stroke:#17a2b8,color:#212529

The unifying parsers (named _Im_<Schema> when built in, or im<Schema> when deployed to the workspace, for example _Im_Authentication and imAuthentication) are what your detection content queries. Underneath each one is a union of source-specific parsers. Each source-specific parser knows about exactly one product or wire format: Cisco Meraki Syslog, VMware ESXi Hostd, Infoblox NIOS DHCP, and so on.

Source-specific parsers all do the same three things. Filter the input table to rows that belong to their source. Parse those rows into the schema’s expected fields. Hand the result back up to the unifying parser.

Step one is the part that breaks.

The “which rows are mine?” problem

Take VMware ESXi as the worked example. The parser reads the Syslog table, but Syslog is shared with everything else that emits to the same DCR. Linux servers, network appliances, the AMA forwarder VM itself, sometimes domain controllers depending on your ingestion design. The parser needs to know which Computer values represent ESXi hosts.

The out-of-the-box (OOB) ESXi parser tries three methods.

1. Match against a list of hardcoded “heuristic” ProcessName strings (vpxd-main, vmkwarning, hostd-probe). On modern ESXi, those names rarely help. vpxd-main requires vCenter and often gets dropped if you’ve tuned your DCR. vmkwarning is a verbose ident most teams filter out at ingest. hostd-probe isn’t a thing modern ESXi emits at all.
2. Match against a hardcoded list of hostnames literally named 'ESXiserver1' and 'ESXiserver2'. These are placeholder values the parser ships with. The documentation tells you to edit them. Almost nobody does.
3. Look up the source-type watchlist (ASimSourceType for this parser; the naming split is covered further down), find rows where SearchKey == 'VMwareESXi', and use those Source values.

Method three is the only one that scales as the fleet grows. It’s the one the documentation leans on.
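The three checks can be sketched against stand-in data. This is not the shipped parser code: the datatable rows, the sampleSyslog name and the 'Hostd' process name are hypothetical, and the real parser reads Syslog and _GetWatchlist rather than inline tables.

```kql
// Hardcoded lists, as described in methods 1 and 2
let heuristicProcesses = dynamic(['vpxd-main', 'vmkwarning', 'hostd-probe']);
let hardcodedHosts = dynamic(['ESXiserver1', 'ESXiserver2']);
// Stand-in for _GetWatchlist('ASimSourceType'), method 3
let watchlistHosts =
    datatable(SearchKey:string, Source:string) [
        'VMwareESXi', 'esx-host-01',
        'CiscoMeraki', 'meraki-mx-01'
    ]
    | where SearchKey == 'VMwareESXi'
    | project Source;
// Stand-in for the shared Syslog table (hypothetical rows)
let sampleSyslog = datatable(Computer:string, ProcessName:string) [
    'esx-host-01', 'Hostd',
    'esx-host-02', 'Hostd',
    'jumpbox-prd-01', 'sshd'
];
sampleSyslog
| where ProcessName in (heuristicProcesses)   // method 1
    or Computer in (hardcodedHosts)           // method 2
    or Computer in (watchlistHosts)           // method 3
```

With this sample data only esx-host-01 should come back. esx-host-02 is a real ESXi host in the scenario, but it is missing from the watchlist and matches neither hardcoded list, so its rows are silently dropped.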

How the watchlist is meant to work

The watchlist is a flat lookup with two columns. One row per host:

SourceType,Source
VMwareESXi,esx-host-01
VMwareESXi,esx-host-02
CiscoMeraki,meraki-mx-01
LinuxAuthpriv,jumpbox-prd-01

SourceType is the column you designate as the SearchKey when creating the watchlist. Once designated, Sentinel exposes that column’s values as SearchKey in queries, which is why the parser filters on SearchKey == 'VMwareESXi' even though the underlying CSV header reads SourceType. Source is the value the parser matches against Computer.

Around 15 Microsoft-shipped parsers query this watchlist: the VMware ESXi parser, several Cisco parsers (Meraki, UCS, ISE), Citrix ADC, Juniper SRX, Linux Authpriv, Pulse Connect Secure, RSA SecurID, Sophos XG, the Symantec family, and the Infoblox NIOS DNS and DHCP suite.

There’s a catch worth flagging: shipped Microsoft content uses two different watchlist names for what’s conceptually the same lookup. The official Infoblox solution ships an ARM template that creates a watchlist literally named Sources_by_SourceType, and the Infoblox parsers call _GetWatchlist('Sources_by_SourceType') to match. The ASim-style parsers for VMware ESXi and several others call _GetWatchlist('ASimSourceType'), which is a completely different watchlist name that no Microsoft solution actually ships a template for. If you’ve created ASimSourceType (because that’s what the ESXi parser asked for) and you later start ingesting Infoblox, you’ll need either a second watchlist named Sources_by_SourceType with the same content, or a local fork of the Infoblox parsers. Easy to miss until you wonder why DHCP detections aren’t firing.
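One way to see where you stand is to count rows per SearchKey in both watchlists side by side. The sketch below assumes both aliases already exist in the workspace. If one has never been created, that branch may return nothing or fail outright, so run each half on its own if the union errors.

```kql
// Compare coverage across the two watchlist names used by shipped content.
union
    (_GetWatchlist('ASimSourceType')
        | extend WatchlistName = 'ASimSourceType'),
    (_GetWatchlist('Sources_by_SourceType')
        | extend WatchlistName = 'Sources_by_SourceType')
| summarize HostRows = count() by SearchKey, WatchlistName
| order by SearchKey asc, WatchlistName asc
```

A SearchKey that appears under only one WatchlistName is a candidate for the mismatch described above. Whether it matters depends on which parsers you actually use.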

What happens when the watchlist is empty

The parser still runs. It returns zero rows.

The analytic rule on top still runs on its scheduled cadence, and its query returns zero rows too. With triggerOperator: gt and triggerThreshold: 0, the rule alerts only when the query returns more than zero rows, so the condition is never met and the rule never fires. Sentinel marks it Active. The Logs blade returns no errors. The Health blade is green. The connector card says “connected”.

You have zero detection coverage for that source, and nothing in the platform raises a flag.

This is the failure mode that catches people out. Installing the Content Hub solution feels like deployment. The rules show up Active. The Last 24 Hours panel says zero alerts, which is what you’d expect from a quiet day rather than a broken pipeline. The gap usually first surfaces when somebody hunts for an event they know happened, and the parser returns nothing.

Verifying the chain

Three queries, in order. Run all three before trusting any new ASim-dependent rule.

// 1. Watchlist has the right SearchKey and Source rows
_GetWatchlist('ASimSourceType')
| where SearchKey == 'VMwareESXi'
| project SearchKey, Source

If this returns zero rows, the parser will return zero rows. Stop here and populate the watchlist before going further.

// 2. The Source values match real Computer values in Syslog
let watchlistSources =
    _GetWatchlist('ASimSourceType')
    | where SearchKey == 'VMwareESXi'
    | project Source;
Syslog
| where TimeGenerated > ago(24h)
| where Computer in (watchlistSources)
| summarize Events = count() by Computer

If watchlist rows exist but this returns zero, the Source values don’t match Computer exactly. The in operator is case-sensitive, so a difference in case is the most common cause. Trailing whitespace and FQDN-versus-shortname mismatches come next.
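To find out which of those causes applies, it can help to compare the two sides on a normalised key and list the pairs that only match after normalising. The sketch below uses stand-in data with one example of each mismatch. The normalise helper and the sample hostnames are hypothetical, and in a real workspace you would swap the two datatables for _GetWatchlist('ASimSourceType') and Syslog.

```kql
// Stand-in watchlist rows: wrong case, trailing space, short name vs FQDN.
let watchlist = datatable(SearchKey:string, Source:string) [
    'VMwareESXi', 'ESX-HOST-01',
    'VMwareESXi', 'esx-host-02 ',
    'VMwareESXi', 'esx-host-03'
];
// Stand-in for the distinct Computer values seen in Syslog.
let syslogHosts = datatable(Computer:string) [
    'esx-host-01',
    'esx-host-02',
    'esx-host-03.corp.example.com'
];
// Normalise: trim whitespace, keep the first DNS label, lower-case.
let normalise = (s:string) { tolower(tostring(split(trim(@'\s+', s), '.')[0])) };
watchlist
| where SearchKey == 'VMwareESXi'
| extend HostKey = normalise(Source)
| join kind=inner (
    syslogHosts
    | distinct Computer
    | extend HostKey = normalise(Computer)
  ) on HostKey
| where Source != Computer   // same host, but the strings differ
| project WatchlistSource = Source, SyslogComputer = Computer
```

Every row returned is a watchlist entry that needs editing so that Source equals the Computer value character for character. Keeping only the first DNS label is a simplification: it would mangle hosts that log under an IP address, so treat the helper as a diagnostic aid rather than something to put in a parser.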

// 3. The parser actually returns rows
// Function name varies by source. Look it up under Functions in the Logs blade.
// For the OOB ESXi parser the alias is literally `VMwareESXi`.
// For ASim-style parsers, names look like `_Im_Authentication`, `vimAuthenticationCiscoMerakiSyslog`, etc.
VMwareESXi
| take 10

If steps one and two pass but step three returns nothing, the issue is downstream of the watchlist lookup. Usually the parser’s regex extraction failing against your specific wire format. That’s a separate diagnosis and a fork-the-parser job. Less common than an empty watchlist, but still worth checking.

A note for anyone deploying ASim-dependent content

Treat the watchlist as a deployment dependency, not a nice-to-have. The rules don’t carry a runtime check that the watchlist is populated, and they won’t fail loud. Add the verification queries above to whatever runbook you use for new connector onboarding, and add a watchlist row at the same time you onboard each new host.
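If you want the gap to fail loud, one option is a separate scheduled rule that alerts when the parser itself goes silent. This is a minimal sketch, assuming the VMwareESXi function alias from the verification queries and a rule configured with the usual triggerOperator: gt and triggerThreshold: 0. The lookback comes from the rule’s query period, not from the query.

```kql
// Canary: emits one row only when the parser produced nothing.
// summarize with no by-clause returns a single row even on empty input,
// so ParsedRows is 0 rather than the result set being empty.
VMwareESXi
| summarize ParsedRows = count()
| where ParsedRows == 0
```

The trade-off is that a genuinely quiet source trips it too, so pick a query period long enough that silence from that source would be suspicious in its own right.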

The cost of getting this wrong is invisible detection gaps. The cost of getting it right is about thirty seconds per host.