How do STIX Indicator patterns work?

The ultimate use of intelligence is to defend against or counteract the threats it describes. For example, it can show you how to put network defenses in place or how to mitigate an attack that has already succeeded in part of its objectives.

Part of this is ensuring you can detect security events, so you can check whether the intelligence you are looking at has already impacted you.

Many of you will be familiar with detection languages used in SIEMs to search for malicious events. These searches might be as simple as looking for an IP address, or more complex, looking for behaviours and patterns alongside evidential breadcrumbs.

In STIX 2.1, Indicator SDOs must contain a pattern Property that can be used to describe suspicious or malicious cyber activity.

The STIX 2.1 Indicator SDO specification is flexible enough to allow for a range of detection languages (pattern_type), as defined in the Pattern Type Vocabulary. These are;

  • pcre: Perl Compatible Regular Expressions language
  • sigma: SIGMA language
  • snort: SNORT language
  • suricata: SURICATA language
  • yara: YARA language
  • stix: STIX pattern language

For example, I could use a Sigma rule inside an Indicator SDO by setting the Property "pattern_type": "sigma" and placing the entire content of the Sigma rule YAML under the "pattern" Property.

For example, here is the rule Suspicious ASPX File Drop by Exchange (rules/windows/file/file_event/file_event_win_exchange_webshell_drop.yml in the SigmaHQ/sigma GitHub repository) as a STIX 2.1 Indicator;

{
    "type": "indicator",
    "spec_version": "2.1",
    "id": "indicator--0a5c5084-a05c-5f94-b21c-20c0f3a974fe",
    "created_by_ref": "identity--d2916708-57b9-5636-8689-62f049e9f727",
    "created": "2022-10-01T00:00:00.000Z",
    "modified": "2022-10-01T00:00:00.000Z",
    "name": "Suspicious ASPX File Drop by Exchange",
    "description": "Detects suspicious file type dropped by an Exchange component in IIS into a suspicious folder. The following false positives can result from this detection; Unknown",
    "indicator_types": [
        "malicious-activity",
        "anomalous-activity"
    ],
    "pattern": "{'title': 'Suspicious ASPX File Drop by Exchange', 'id': 'bd1212e5-78da-431e-95fa-c58e3237a8e6', 'related': [{'id': '6b269392-9eba-40b5-acb6-55c882b20ba6', 'type': 'similar'}], 'status': 'test', 'description': 'Detects suspicious file type dropped by an Exchange component in IIS into a suspicious folder', 'references': ['https://www.microsoft.com/security/blog/2022/09/30/analyzing-attacks-using-the-exchange-vulnerabilities-cve-2022-41040-and-cve-2022-41082/', 'https://www.gteltsc.vn/blog/canh-bao-chien-dich-tan-cong-su-dung-lo-hong-zero-day-tren-microsoft-exchange-server-12714.html', 'https://en.gteltsc.vn/blog/cap-nhat-nhe-ve-lo-hong-bao-mat-0day-microsoft-exchange-dang-duoc-su-dung-de-tan-cong-cac-to-chuc-tai-viet-nam-9685.html'], 'author': 'Florian Roth (Nextron Systems), MSTI (query, idea)', 'date': '2022/10/01', 'tags': ['attack.persistence', 'attack.t1505.003'], 'logsource': {'product': 'windows', 'category': 'file_event'}, 'detection': {'selection': {'Image|endswith': '\\\\w3wp.exe', 'CommandLine|contains': 'MSExchange', 'TargetFilename|contains': ['FrontEnd\\\\HttpProxy\\\\', '\\\\inetpub\\\\wwwroot\\\\aspnet_client\\\\']}, 'selection_types': {'TargetFilename|endswith': ['.aspx', '.asp', '.ashx']}, 'condition': 'all of selection*'}, 'falsepositives': ['Unknown'], 'level': 'high'}",
    "pattern_type": "sigma",
    "valid_from": "2022-10-01T00:00:00Z",
    "labels": [
        "level: high",
        "status: test",
        "author: Florian Roth (Nextron Systems), MSTI (query, idea)",
        "license: None",
        "attack.persistence",
        "attack.t1505.003"
    ],
    "external_references": [
        {
            "source_name": "rule",
            "url": "https://github.com/SigmaHQ/sigma/blob/master/rules/windows/file/file_event/file_event_win_exchange_webshell_drop.yml"
        },
        {
            "source_name": "id",
            "url": "bd1212e5-78da-431e-95fa-c58e3237a8e6"
        },
        {
            "source_name": "reference",
            "url": "https://www.microsoft.com/security/blog/2022/09/30/analyzing-attacks-using-the-exchange-vulnerabilities-cve-2022-41040-and-cve-2022-41082/"
        },
        {
            "source_name": "reference",
            "url": "https://www.gteltsc.vn/blog/canh-bao-chien-dich-tan-cong-su-dung-lo-hong-zero-day-tren-microsoft-exchange-server-12714.html"
        },
        {
            "source_name": "reference",
            "url": "https://en.gteltsc.vn/blog/cap-nhat-nhe-ve-lo-hong-bao-mat-0day-microsoft-exchange-dang-duoc-su-dung-de-tan-cong-cac-to-chuc-tai-viet-nam-9685.html"
        }
    ],
    "object_marking_refs": [
        "marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9"
    ]
}

Note, I’ve also mapped some of the Sigma YAML properties into some of the Indicator properties too.
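A minimal Python sketch of how such an Indicator could be assembled, using only the standard library. The helper name sigma_rule_to_indicator is hypothetical, the rule text is shortened for illustration, and the ID is a random UUIDv4 rather than whatever scheme produced the ID in the example above.

```python
import json
import uuid
from datetime import datetime, timezone


def sigma_rule_to_indicator(rule_text: str, name: str, created_by_ref: str) -> dict:
    """Wrap raw Sigma rule text in a STIX 2.1 Indicator dictionary (hypothetical helper)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        # Assumption: a random UUIDv4 for simplicity; the ID in the example above looks like a UUIDv5.
        "id": f"indicator--{uuid.uuid4()}",
        "created_by_ref": created_by_ref,
        "created": now,
        "modified": now,
        "name": name,
        "pattern": rule_text,     # the whole rule goes in as a single string
        "pattern_type": "sigma",  # tells consumers how to interpret "pattern"
        "valid_from": now,
    }


# Shortened, illustrative rule text (not the full rule shown above).
rule = "title: Suspicious ASPX File Drop by Exchange\nlevel: high\n"
indicator = sigma_rule_to_indicator(
    rule,
    name="Suspicious ASPX File Drop by Exchange",
    created_by_ref="identity--d2916708-57b9-5636-8689-62f049e9f727",
)
print(json.dumps(indicator, indent=4))
```

Because the rule travels as one opaque string, a consumer only needs to read pattern_type to decide which engine should interpret it.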

You will have seen the stix specific pattern_type listed above. This is a pattern language defined by OASIS in the STIX 2.1 specification.

Here is the general structure of a STIX Pattern;

It is a lot! Let me try and take this structure apart for you.

Comparison Expressions and Operators

Comparison Expressions are the fundamental building blocks of STIX patterns.

They take an Object Path (using SCOs) and Object Value with a Comparison Operator to evaluate their relationship.

Multiple Comparison Expressions can be joined by Comparison Expression Operators to create an Observation Expression.

My earlier example of a filename showed a simple Comparison Expression in a Pattern.

Here is an example of a simple Comparison Expression to detect an IPv4 address:

[ipv4-addr:value='198.51.100.1']

It uses the IPv4 Address SCO (ipv4-addr) and its ID Contributing Property (value) as the Object path (shown in specification screenshot below). The Object value is 198.51.100.1.

Another example, using a Windows Registry Key;

[windows-registry-key:key='HKEY_LOCAL_MACHINE\\System\\Foo\\Bar']

Here I use the Windows Registry Key SCO (windows-registry-key) and its ID Contributing Property (key) (shown in specification screenshot below). The Object value is HKEY_LOCAL_MACHINE\\System\\Foo\\Bar.
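To make the anatomy of a Comparison Expression concrete, here is a small Python sketch that splits one into its Object path, Comparison Operator and Object value. It is deliberately limited: it assumes a single comparison using = or != with a quoted string value, and parse_simple_comparison is a hypothetical helper, not part of any STIX library.

```python
import re

# Assumption: only a single comparison with "=" or "!=" and a quoted string value is handled.
SIMPLE_COMPARISON = re.compile(
    r"""^\[\s*
        (?P<object_type>[a-z0-9-]+)       # SCO type, e.g. ipv4-addr
        :(?P<property>[A-Za-z0-9_.'-]+)   # property path, e.g. value
        \s*(?P<operator>!=|=)\s*          # Comparison Operator
        '(?P<value>(?:[^'\\]|\\.)*)'      # quoted Object value, escapes allowed
        \s*\]$""",
    re.VERBOSE,
)


def parse_simple_comparison(pattern: str) -> dict:
    """Split a one-comparison pattern into its parts (hypothetical helper)."""
    match = SIMPLE_COMPARISON.match(pattern)
    if match is None:
        raise ValueError(f"not a simple comparison expression: {pattern!r}")
    parts = match.groupdict()
    # Undo the string escaping used inside patterns (an escaped backslash or quote).
    parts["value"] = re.sub(r"\\(.)", r"\1", parts["value"])
    return parts


print(parse_simple_comparison("[ipv4-addr:value='198.51.100.1']"))
print(parse_simple_comparison(
    r"[windows-registry-key:key='HKEY_LOCAL_MACHINE\\System\\Foo\\Bar']"
))
```

Note how the doubled backslashes in the registry key pattern collapse to single backslashes once the value is unescaped.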

You can use a range of Comparison Operators in addition to equals (=): does not equal (!=), is greater than (>), is greater than or equal to (>=), is less than or equal to (<=), etc.

[directory:path LIKE 'C:\\Windows\\%\\foo']

In the above example I am using the LIKE Comparison Operator. You will notice it is possible to use wildcards. In the example above % matches 0 or more characters.

As such, the pattern would match (be true) if C:\Windows\DAVID\foo, C:\Windows\JAMES\foo, etc. were observed.
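The % behaviour described above can be sketched in Python by translating a LIKE pattern into a regular expression. This is an illustration only: the function names are hypothetical, and it also handles _ (exactly one character), the other wildcard LIKE supports.

```python
import re


def like_to_regex(like_pattern: str) -> re.Pattern:
    """Translate a LIKE pattern into a compiled regular expression."""
    pieces = []
    for char in like_pattern:
        if char == "%":
            pieces.append(".*")   # % matches zero or more characters
        elif char == "_":
            pieces.append(".")    # _ matches exactly one character
        else:
            pieces.append(re.escape(char))  # everything else is literal
    return re.compile("".join(pieces), re.DOTALL)


def like(value: str, like_pattern: str) -> bool:
    """True if the whole value matches the LIKE pattern."""
    return like_to_regex(like_pattern).fullmatch(value) is not None


print(like(r"C:\Windows\DAVID\foo", r"C:\Windows\%\foo"))  # True
print(like(r"C:\Windows\foo", r"C:\Windows\%\foo"))        # False
```

The second call fails because the pattern requires a backslash on both sides of the wildcard, so a path with nothing between Windows and foo is one backslash short.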

Observation Expressions, Operators and Qualifiers

More than one Comparison Expression can be joined using a Comparison Expression Operator to create an Observation Expression.

The entire Observation Expression is captured in square brackets [].

For example, a pattern to match on either 198.51.100.1/32 or 203.0.113.33/32 could be expressed with the OR Comparison Expression Operator;

[ipv4-addr:value='198.51.100.1/32' OR ipv4-addr:value='203.0.113.33/32']

Changing the Comparison Expression Operator to AND requires both Comparison Expressions, 198.51.100.1/32 and 203.0.113.33/32, to be true for the pattern to match;

[ipv4-addr:value='198.51.100.1/32' AND ipv4-addr:value='203.0.113.33/32']

Observation Expressions can also be joined using Observation Operators.

In the following example there are two Observation Expressions joined by the Observation Operator FOLLOWEDBY;

[ipv4-addr:value='198.51.100.1/32'] FOLLOWEDBY [ipv4-addr:value='203.0.113.33/32']

The FOLLOWEDBY Observation Operator defines the order in which Observation Expressions must match. In this case 198.51.100.1/32 must be followed by 203.0.113.33/32. Put another way, 198.51.100.1/32 must be detected before 203.0.113.33/32.
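To make the ordering rule concrete, here is a minimal Python sketch. It assumes each match is recorded as a hypothetical (observation ID, timestamp) pair, and that the second match must come from a different observation with an equal or later timestamp. That is my reading of the specification, not a reference implementation.

```python
from datetime import datetime, timedelta


def followed_by(first_matches: list, second_matches: list) -> bool:
    """True if a match of the first expression is seen no later than a
    match of the second expression on a different observation."""
    return any(
        a_id != b_id and a_time <= b_time
        for a_id, a_time in first_matches
        for b_id, b_time in second_matches
    )


# Illustrative observations, not taken from real telemetry.
start = datetime(2020, 1, 1, 12, 0, 0)
first = [("obs-1", start)]                          # 198.51.100.1/32 seen
second = [("obs-2", start + timedelta(seconds=5))]  # 203.0.113.33/32 seen later

print(followed_by(first, second))  # True
print(followed_by(second, first))  # False
```

Swapping the arguments flips the result, which is exactly what distinguishes FOLLOWEDBY from a plain AND.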

Observation Expression Qualifiers allow for even more definition at the end of a pattern.

You can define WITHIN, START/STOP, and REPEATS Observation Expression Qualifiers.

The following example requires the two Observation Expressions to repeat 5 times in order for a match;

([ipv4-addr:value='198.51.100.1/32'] FOLLOWEDBY [ipv4-addr:value='203.0.113.33/32']) REPEATS 5 TIMES

Here is another example that is very similar to a pattern used for malware detection;

([file:hashes.'SHA-256'='ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb'] AND [windows-registry-key:key='hkey']) WITHIN 120 SECONDS

Here if the file hash Observation Expression and a Windows Registry Observation Expression are true within 120 seconds of each other then the pattern matches.
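The time window check can be sketched as follows. This assumes the match times for each Observation Expression have already been collected into lists; both_within is a hypothetical helper and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def both_within(first_times: list, second_times: list, seconds: int) -> bool:
    """True if some match of each expression falls inside one window of the given length."""
    window = timedelta(seconds=seconds)
    return any(
        abs(a - b) <= window
        for a in first_times
        for b in second_times
    )


# Illustrative timestamps, not taken from real telemetry.
file_seen = [datetime(2020, 1, 1, 12, 0, 0)]
registry_seen = [datetime(2020, 1, 1, 12, 1, 30)]  # 90 seconds later

print(both_within(file_seen, registry_seen, 120))  # True: 90s fits in a 120s window
print(both_within(file_seen, registry_seen, 60))   # False: 90s exceeds a 60s window
```

The absolute difference is used because AND, unlike FOLLOWEDBY, does not care which of the two observations came first.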

Precedence and Parentheses

Operator Precedence is an important consideration to keep in mind when writing Patterns.

Consider the following Pattern:

[ipv4-addr:value='198.51.100.1/32'] FOLLOWEDBY ([ipv4-addr:value='203.0.113.33/32'] REPEATS 5 TIMES)

Here, the first Observation Expression requires a match on an ipv4-addr:value equal to 198.51.100.1/32 that precedes 5 occurrences of the Observation Expression where ipv4-addr:value equals 203.0.113.33/32.

Now consider the following Pattern (almost identical to before, but notice the parentheses):

([ipv4-addr:value='198.51.100.1/32'] FOLLOWEDBY [ipv4-addr:value='203.0.113.33/32']) REPEATS 5 TIMES

The first Observation Expression requires a match on an ipv4-addr:value equal to 198.51.100.1/32, followed by a match on the second Observation Expression for an ipv4-addr:value equal to 203.0.113.33/32. This whole sequence must be seen 5 times for a match.
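The difference between the two groupings can be seen with a toy event stream. This Python sketch reduces each observation to a label, "A" for 198.51.100.1/32 and "B" for 203.0.113.33/32, already sorted by time. It assumes each repeat must use observations not used by another repeat, which is a simplification on my part, and both function names are hypothetical.

```python
def a_then_repeated_b(events: list, times: int) -> bool:
    """[A] FOLLOWEDBY ([B] REPEATS n TIMES): one A, then at least n Bs after it."""
    seen_a = False
    b_after_a = 0
    for event in events:
        if event == "A":
            seen_a = True
        elif event == "B" and seen_a:
            b_after_a += 1   # only Bs after the earliest A are counted
    return b_after_a >= times


def repeated_a_then_b(events: list, times: int) -> bool:
    """([A] FOLLOWEDBY [B]) REPEATS n TIMES: at least n separate A-then-B pairs."""
    unpaired_a = 0
    pairs = 0
    for event in events:
        if event == "A":
            unpaired_a += 1
        elif event == "B" and unpaired_a > 0:
            unpaired_a -= 1  # pair this B with an earlier, unused A
            pairs += 1
    return pairs >= times


one_a_many_b = list("ABBBBB")
alternating = list("ABABABABAB")

print(a_then_repeated_b(one_a_many_b, 5), repeated_a_then_b(one_a_many_b, 5))  # True False
print(a_then_repeated_b(alternating, 5), repeated_a_then_b(alternating, 5))    # True True
```

A single A followed by five Bs satisfies the first pattern but not the second, which needs five separate A then B sequences.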

Some examples to test you

Below is a sample from a Linux audit log…

2019-08-20 09:08:55:906 type=USER_LOGIN msg=audit(1566306445.906:280) user pid=2318 uid=0 auid 4294967295 ses=4294967295 username=unknown subj=system_u:system_r:sshd_t:s0-"(unknown)" exe="/usr/sbin/sshd" hostname=? addr=218.92.0.173 terminal=ssh res=failed'
2019-08-20 09:07:25:647 type=USER_LOGIN msg=audit(1566306445.647:242) user pid=2314 uid=0 auid 4294967295 ses=4294967295 username=mike subj=system_u:system_r:sshd_t:s0-"(mike)" exe="/usr/sbin/sshd" hostname=? addr=60.242.115.215 terminal=ssh res=failed'
2019-08-20 09:07:25:195 type=USER_LOGIN msg=audit(1566306445.195.262) user pid=2311 uid=0 auid 4294967295 ses=4294967295 username=mike subj=system_u:system_r:sshd_t:s0-"(mike)" exe="/usr/sbin/sshd" hostname=? addr=60.242.115.215 terminal=ssh res=failed'

Assume the SIEM has aliased field names correctly (e.g. addr field in the logs resolves to an IPv4 address field in the data model, which in turn is mapped to the ipv4-addr SCO).

Example 1: Using the OR Observation Operator

[ipv4-addr:value='218.92.0.173'] OR [ipv4-addr:value='1.1.1.1']

Matches.

The statement IPv4 218.92.0.173 was True for one line (log line 1).

Example 2: Using the AND Observation Operator

[ipv4-addr:value='218.92.0.173'] AND [ipv4-addr:value='1.1.1.1']

Does not match.

Both of the statements needed to be True to satisfy the AND operator, but only the IPv4 218.92.0.173 statement was ever true (log line 1).

Example 3: Using the FOLLOWEDBY Observation Operator

[ipv4-addr:value='60.242.115.215'] FOLLOWEDBY [user-account:account_login='mike']

Matches.

The IPv4 address 60.242.115.215 (log line 3) is followed by the mike user account login (log line 2).

Example 4: Using the != Comparison Operator

[ipv4-addr:value!='218.92.0.173']

Matches.

An IPv4 address value other than 218.92.0.173 was seen (log lines 2 and 3).

Example 5: Using the > Comparison Operator

[process:pid>2315]

Matches.

Log line 1 (pid=2318) is the only line where the process ID is greater than 2315 (the other two lines have process IDs less than 2315).

Example 6: Parentheses and Precedence

[ipv4-addr:value='218.92.0.173'] FOLLOWEDBY ([user-account:account_login='mike'] OR [user-account:account_login='david'])

Does not match.

The IPv4 address 218.92.0.173 must be followed by at least one of the statements in the parentheses. Log line 1 contains 218.92.0.173 but there are no logs that follow it (by time), thus this statement is not true for the 3 logs shown.

Example 7: Using the WITHIN Observation Expression Qualifier

([ipv4-addr:value='60.242.115.215'] FOLLOWEDBY [ipv4-addr:value='218.92.0.173']) WITHIN 60 SECONDS

Does not match.

The IPv4 address 60.242.115.215 was seen at 09:07:25:647 (log line 2) then the IPv4 address 218.92.0.173 was seen at 09:08:55:906 (log line 1) which is more than 1 minute apart.

Example 8: Using the REPEATS Observation Expression Qualifier

([ipv4-addr:value='60.242.115.215'] FOLLOWEDBY [ipv4-addr:value='218.92.0.173']) REPEATS 2 TIMES

Does not match.

The IPv4 address 60.242.115.215 (log line 2) was followed by the IPv4 address 218.92.0.173 (log line 1), but this sequence was not seen twice.
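Some of these answers can be checked mechanically. The Python sketch below treats each log line as one observation and evaluates Examples 1 and 2, then measures the time gap that decides Example 7. The log lines are trimmed to the fields the sketch uses, the helper names are hypothetical, and it assumes the AND Observation Operator needs its two sides to match on different observations.

```python
import re
from datetime import datetime

# The three log lines above, trimmed to the fields this sketch uses.
LOG_LINES = [
    "2019-08-20 09:08:55:906 type=USER_LOGIN username=unknown addr=218.92.0.173 res=failed",
    "2019-08-20 09:07:25:647 type=USER_LOGIN username=mike addr=60.242.115.215 res=failed",
    "2019-08-20 09:07:25:195 type=USER_LOGIN username=mike addr=60.242.115.215 res=failed",
]


def parse(line: str) -> dict:
    """Turn one log line into an observation: a timestamp plus key=value fields."""
    timestamp = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S:%f")
    fields = dict(re.findall(r"(\w+)=(\S+)", line))
    return {"time": timestamp, **fields}


OBSERVATIONS = [parse(line) for line in LOG_LINES]


def seen(field: str, value: str) -> set:
    """Indexes of the observations where the field equals the value."""
    return {i for i, obs in enumerate(OBSERVATIONS) if obs.get(field) == value}


def obs_or(a: set, b: set) -> bool:
    return bool(a or b)


def obs_and(a: set, b: set) -> bool:
    # Both sides must match, and on different observations.
    return any(i != j for i in a for j in b)


example_1 = obs_or(seen("addr", "218.92.0.173"), seen("addr", "1.1.1.1"))
example_2 = obs_and(seen("addr", "218.92.0.173"), seen("addr", "1.1.1.1"))
gap = OBSERVATIONS[0]["time"] - OBSERVATIONS[1]["time"]  # log line 2 to log line 1

print("Example 1 matches:", example_1)
print("Example 2 matches:", example_2)
print("Seconds between log line 2 and log line 1:", gap.total_seconds())
```

The gap between log line 2 and log line 1 is just over 90 seconds, which is why a one minute window is not enough in Example 7.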

Helpful tools to create and validate STIX Patterns

The STIX 2 Pattern Validator from OASIS is a great tool for checking that your patterns are written correctly.

Simply run the STIX 2 Pattern Validator script by declaring your Pattern…

mkdir stix2-patterns
python3 -m venv stix2-patterns
source stix2-patterns/bin/activate
pip3 install stix2-patterns
validate-patterns
Enter a pattern to validate: [file:hashes.md5 = '79054025255fb1a26e4bc422aef54eb4']
PASS: [file:hashes.md5 = '79054025255fb1a26e4bc422aef54eb4']
Enter a pattern to validate: [bad pattern]
FAIL: Error found at line 1:5. no viable alternative at input 'badpattern' 
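The same check can also be scripted. The sketch below assumes the stix2-patterns package installed above exposes run_validator in stix2patterns.validator and that it returns a list of error messages, empty when the pattern is valid. The is_valid wrapper is my own hypothetical helper, and the import is guarded so the snippet still loads when the package is missing.

```python
try:
    from stix2patterns.validator import run_validator
except ImportError:  # stix2-patterns is not installed in this environment
    run_validator = None


def is_valid(pattern: str) -> bool:
    """True if the validator reports no errors for the pattern (hypothetical wrapper)."""
    if run_validator is None:
        raise RuntimeError("install stix2-patterns first: pip3 install stix2-patterns")
    return not run_validator(pattern)


if run_validator is not None:
    for candidate in ("[ipv4-addr:value = '198.51.100.1']", "[bad pattern]"):
        print("PASS" if is_valid(candidate) else "FAIL", candidate)
```

This is handy when you want to validate every pattern in a feed before loading it into downstream tooling.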

If you are trying to see if content in an Observed Data SDO matches an existing STIX Pattern you can use the CTI Pattern Matcher.

Let's start by creating an Observed Data SDO, and two related SCOs;

observed-data--699546f4-6d73-4a35-a961-181a34fa3b14.json;

{
    "type": "observed-data",
    "spec_version": "2.1",
    "id": "observed-data--699546f4-6d73-4a35-a961-181a34fa3b14",
    "created": "2016-04-06T19:58:16.000Z",
    "modified": "2016-04-06T19:58:16.000Z",
    "first_observed": "2015-12-21T19:00:00Z",
    "last_observed": "2015-12-21T19:00:00Z",
    "number_observed": 2,
    "object_refs": [
        "ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445",
        "domain-name--dd686e37-6889-53bd-8ae1-b1a503452613"
    ]
}

ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445.json;

{
    "type": "ipv4-addr",
    "spec_version": "2.1",
    "id": "ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445",
    "value": "177.60.40.7"
}

domain-name--dd686e37-6889-53bd-8ae1-b1a503452613.json;

{
    "type": "domain-name",
    "spec_version": "2.1",
    "id": "domain-name--dd686e37-6889-53bd-8ae1-b1a503452613",
    "value": "google.com"
}

The CTI Pattern Matcher accepts “A file containing JSON list of STIX observed-data SDOs” (in a STIX bundle). Let's create that objects-bundle.json;

{
    "type": "bundle",
    "id": "bundle--cb06ef7f-acb8-46b6-98e1-27c6fe8d23c2",
    "objects": [
        {
            "type": "observed-data",
            "spec_version": "2.1",
            "id": "observed-data--699546f4-6d73-4a35-a961-181a34fa3b14",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "first_observed": "2020-01-01T00:00:00.000Z",
            "last_observed": "2020-01-01T00:00:00.000Z",
            "number_observed": 2,
            "object_refs": [
                "ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445",
                "domain-name--dd686e37-6889-53bd-8ae1-b1a503452613"
            ]
        },
        {
            "type": "ipv4-addr",
            "spec_version": "2.1",
            "id": "ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445",
            "value": "177.60.40.7"
        },
        {
            "type": "domain-name",
            "spec_version": "2.1",
            "id": "domain-name--dd686e37-6889-53bd-8ae1-b1a503452613",
            "value": "google.com"
        }
    ]
}

And let's write one pattern I know matches and one I know does not, and store them in patterns.txt;

[ipv4-addr:value='177.60.40.7']
[domain-name:value='microsoft.com']

So if I pass both of these to stix2-matcher;

mkdir stix2-matcher
python3 -m venv stix2-matcher
source stix2-matcher/bin/activate
pip3 install stix2-matcher
stix2-matcher --patterns patterns.txt --file objects-bundle.json --stix_version 2.1
MATCH:  [ipv4-addr:value='177.60.40.7']
NO MATCH:  [domain-name:value='microsoft.com']

Which brings us to a slight tangent; how to use Observed Data SDOs.

Pattern Matches as Sighting SROs

Now you have seen how Patterns can be used, detections (aka sightings) of these patterns need to be modelled.

If you start to use STIX Patterns for threat detection, you will probably want to represent the detection matches in STIX format too.

That is where the STIX Sighting SRO and Observed Data SDO can help, as detailed in the previous post.

The steps leading up to this relationship might be;

  • IPv4 SCO created with Indicator containing pattern referencing the IPv4 SCO
  • IPv4 SCO sent to SIEM (or other tooling) for detections
  • Detection observed and Observed Data SDO and Sighting SRO created

Creating a series of objects as follows;

{
    "type": "bundle",
    "id": "bundle--177c6477-2dee-43d5-b4c9-8b7f3f5ec542",
    "objects": [
        {
            "type": "indicator",
            "spec_version": "2.1",
            "id": "indicator--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f",
            "created_by_ref": "identity--f431f809-377b-45e0-aa1c-6a4751cae5ff",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "indicator_types": [
                "malicious-activity"
            ],
            "name": "Some Malware",
            "description": "Some malware description",
            "pattern": "[ipv4-addr:value='177.60.40.7']",
            "pattern_type": "stix",
            "valid_from": "2016-01-01T00:00:00Z"
        },
        {
            "type": "sighting",
            "spec_version": "2.1",
            "id": "sighting--ee20065d-2555-424f-ad9e-0f8428623c75",
            "created_by_ref": "identity--f431f809-377b-45e0-aa1c-6a4751cae5ff",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "first_seen": "2020-01-01T00:00:00.000Z",
            "last_seen": "2020-01-01T00:00:00.000Z",
            "count": 50,
            "sighting_of_ref": "indicator--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f",
            "observed_data_refs": [
                "observed-data--b67d30ff-02ac-498a-92f9-32f845f448cf"
            ],
            "where_sighted_refs": [
                "identity--d3f9a82b-7272-417e-9195-f3b0f68159e9"
            ]
        },
        {
            "type": "identity",
            "spec_version": "2.1",
            "id": "identity--d3f9a82b-7272-417e-9195-f3b0f68159e9",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "name": "Splunk Enterprise Security",
            "identity_class": "system"
        },
        {
            "type": "observed-data",
            "spec_version": "2.1",
            "id": "observed-data--b67d30ff-02ac-498a-92f9-32f845f448cf",
            "created_by_ref": "identity--f431f809-377b-45e0-aa1c-6a4751cae5ff",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "first_observed": "2020-01-01T00:00:00.000Z",
            "last_observed": "2020-01-01T00:00:00.000Z",
            "number_observed": 1,
            "object_refs": [
                "ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445"
            ]
        },
        {
            "type": "ipv4-addr",
            "spec_version": "2.1",
            "id": "ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445",
            "value": "177.60.40.7"
        }
    ]
}

Which looks as follows;
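Separately, for anyone generating these objects in code once a detection fires, here is a minimal Python sketch using only the standard library. The helper record_detection is hypothetical, the new IDs are random UUIDv4s, and the timestamps are simply the current time. A real integration would take the first and last seen times and the count from the SIEM.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_timestamp() -> str:
    """Current UTC time in the millisecond format used by the examples above."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


def record_detection(indicator_id: str, sco_id: str, where_sighted_id: str) -> list:
    """Build the Observed Data SDO and Sighting SRO for one detection (hypothetical helper)."""
    now = stix_timestamp()
    observed_data = {
        "type": "observed-data",
        "spec_version": "2.1",
        "id": f"observed-data--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "first_observed": now,
        "last_observed": now,
        "number_observed": 1,
        "object_refs": [sco_id],  # the SCO that was actually seen
    }
    sighting = {
        "type": "sighting",
        "spec_version": "2.1",
        "id": f"sighting--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "first_seen": now,
        "last_seen": now,
        "count": 1,
        "sighting_of_ref": indicator_id,                # what was sighted
        "observed_data_refs": [observed_data["id"]],    # the evidence
        "where_sighted_refs": [where_sighted_id],       # who or what saw it
    }
    return [observed_data, sighting]


objects = record_detection(
    indicator_id="indicator--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f",
    sco_id="ipv4-addr--dc63603e-e634-5357-b239-d4b562bc5445",
    where_sighted_id="identity--d3f9a82b-7272-417e-9195-f3b0f68159e9",
)
print(json.dumps(objects, indent=4))
```

The Sighting points at the Indicator through sighting_of_ref and at the evidence through observed_data_refs, which is the same linkage used in the bundle above.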