Converting STIX patterns so they are understood by downstream tools (using STIX Shifter)

I wrote a previous tutorial on STIX patterns.

Here’s the reality, though: I’ve never come across a SIEM or EDR that understands STIX formatted patterns natively.

This is one advantage of using other pattern_types in a STIX pattern, like Sigma: the pattern can be understood by these downstream tools.

This is where STIX Shifter comes in.

STIX Shifter can convert STIX formatted patterns into other detection languages. Let me show you how…

STIX-shifter is an open source Python library that allows software to connect to products that house data repositories by using STIX Patterning, and to return results as STIX Observations.

STIX Shifter;

  1. takes STIX 2.x Patterns as input
  2. converts them to target rule formats
  3. sends the converted rule to the downstream tool
  4. detects data that matches the patterns inside downstream tools (e.g. SIEMs, EDRs, etc)
  5. transforms the output (the detection) into STIX 2.x Observed Data Objects.

Here’s a nice presentation describing STIX Shifter

STIX Shifter Connectors

STIX Shifter is based around the concept of Connectors.

A STIX Shifter connector is a module inside the STIX Shifter library that implements an interface for:

  • data source query and result set translation
  • data source communication

Each Connector supports a set of STIX objects and properties as defined in the connector’s mapping files.

There are about 30 Connectors that currently exist, detailed here.

Let me demonstrate this concept using some examples.

Installing STIX Shifter

STIX Shifter can be used as a command line utility or as a Python library.

To install STIX Shifter so it can be used either way;

mkdir stix-shifter
python3 -m venv stix-shifter
source stix-shifter/bin/activate
pip3 install stix-shifter
pip3 install stix-shifter-utils
stix-shifter -h

STIX Shifter core functions

STIX Shifter provides three core functions;

  1. translate: The translate command converts STIX patterns into data source queries (in whatever query language the data source might use) and translates data source results (in JSON format) into bundled STIX observation objects.
  2. transmit: The transmit command allows stix-shifter to connect with products that house repositories of cybersecurity data. Connection and authentication credentials are passed to the data source APIs where stix-shifter can make calls to ping the data source, make queries, delete queries, check query status, and fetch query results.
  3. execute: The translation and transmission functions can work in sequence by using the execute command from the CLI.

Converting STIX Patterns to target formats

To use a connector (to translate a STIX pattern), you must first install it. You can do this using pip as follows;

pip3 install stix-shifter-modules-<CONNECTOR NAME>

For example, to install the Splunk Connector;

pip3 install stix-shifter-modules-splunk

The translate command line argument takes the form;

stix-shifter translate <CONNECTOR NAME> query "<STIX IDENTITY OBJECT>" "<STIX PATTERN>" "<OPTIONS>"

Therefore to convert the STIX Pattern [url:value = 'http://www.testaddress.com'] OR [ipv4-addr:value = '192.168.122.84'] using the newly installed Splunk Connector I can run;

stix-shifter translate splunk query "{}" "[url:value = 'http://www.testaddress.com'] OR [ipv4-addr:value = '192.168.122.84']"

Note, I passed an empty STIX Identity Object ({}), as an Identity is not needed to translate a pattern.

The command prints the converted Splunk query in a JSON response;

{
    "queries": [
        "search (url = \"http://www.testaddress.com\") OR ((src_ip = \"192.168.122.84\") OR (dest_ip = \"192.168.122.84\")) earliest=\"-5minutes\" | head 10000 | fields src_ip, src_port, src_mac, src_ipv6, dest_ip, dest_port, dest_mac, dest_ipv6, file_hash, user, url, protocol, host, source, DeviceType, Direction, severity, EventID, EventName, ss_name, TacticId, Tactic, TechniqueId, Technique, process, process_id, process_name, process_exec, process_path, process_hash, parent_process, parent_process_id, parent_process_name, parent_process_exec, description, result, signature, signature_id, query, answer"
    ]
}

You will notice the Splunk search (nested in the queries field). The key part of the search is;

(url = \"http://www.testaddress.com\") OR ((src_ip = \"192.168.122.84\") OR (dest_ip = \"192.168.122.84\"))

STIX Shifter has converted the STIX fields url:value into url and ipv4-addr:value into both src_ip and dest_ip fields (as the STIX pattern could refer to either).
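To make the one-to-many mapping concrete, here is a toy sketch of the idea in Python. The FIELD_MAP dictionary and the to_search function are hypothetical names I made up from the output above. They are not the Splunk Connector's code or its mapping file format.

```python
# Toy illustration only: FIELD_MAP and to_search are hypothetical names,
# not part of STIX Shifter. The field names come from the output above.
FIELD_MAP = {
    "url:value": ["url"],
    "ipv4-addr:value": ["src_ip", "dest_ip"],
}


def to_search(stix_field: str, value: str) -> str:
    """Expand one STIX comparison into a clause over every mapped field."""
    clauses = [f'({field} = "{value}")' for field in FIELD_MAP[stix_field]]
    if len(clauses) == 1:
        return clauses[0]
    # Several candidate fields: a match in any one of them counts.
    return "(" + " OR ".join(clauses) + ")"


comparisons = [
    ("url:value", "http://www.testaddress.com"),
    ("ipv4-addr:value", "192.168.122.84"),
]
search = " OR ".join(to_search(field, value) for field, value in comparisons)
print(search)
```

The printed string has the same shape as the key part of the search above: one clause for url, and an OR of two clauses for the IP address.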

You can read a description of the logic the Splunk Connector uses to perform the translation here.

The mapping from Splunk data (assumed to follow the Common Information Model (CIM) standard) to STIX is defined in the Splunk module’s to_stix_map.json file.

Also notice how the search ends with earliest=\"-5minutes\" | head 10000 | fields src_ip,.... These are added by default in the conversion and are not converted from the STIX Pattern. In short these Splunk commands:

  • earliest=\"-5minutes\" is defining the time range to look back
  • head is limiting the number of results returned (to first 10,000) and
  • fields specifies which fields to keep or remove from the search results

The purpose of including these commands is to limit the scope of the search and to ensure all fields are present when matches are found, which matters when STIX-Shifter is used to create Observed Data Objects (more on that to follow). If you just want to use STIX-Shifter for conversion to Splunk format, I would remove them from the output as they’re not really useful.
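If you only want the converted search, the defaults can be stripped with a few lines of Python. This is a minimal sketch that assumes the output always has the shape shown above (a leading search keyword, then earliest, head and fields at the end). The core_search function is my own helper, not something STIX Shifter provides.

```python
import json
import re

# A shortened copy of the translate output shown above (field list trimmed).
response = json.loads(r'''
{
    "queries": [
        "search (url = \"http://www.testaddress.com\") OR ((src_ip = \"192.168.122.84\") OR (dest_ip = \"192.168.122.84\")) earliest=\"-5minutes\" | head 10000 | fields src_ip, dest_ip, url"
    ]
}
''')


def core_search(query: str) -> str:
    """Drop the leading 'search' keyword and the trailing default commands."""
    query = re.sub(r'^search\s+', '', query)
    # Remove: earliest="..." | head N | fields a, b, c
    return re.sub(
        r'\s+earliest="[^"]*"\s*\|\s*head\s+\d+\s*\|\s*fields\s+.*$', '', query
    )


for query in response["queries"]:
    print(core_search(query))
```

The pattern is deliberately strict. If a connector version changes the defaults, the function returns the query with those defaults still attached rather than cutting it in the wrong place.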

Let’s try another conversion, this time using the Elastic ECS Connector on the same STIX Pattern;

pip3 install stix-shifter-modules-elastic_ecs

The Elastic Common Schema (ECS) is Elastic’s own standard, similar to the Splunk CIM.

stix-shifter translate elastic_ecs query "{}" "[url:value = 'http://www.testaddress.com'] OR [ipv4-addr:value = '192.168.122.84']"
{
    "queries": [
        "(url.original : \"http://www.testaddress.com\") OR ((source.ip : \"192.168.122.84\" OR destination.ip : \"192.168.122.84\" OR client.ip : \"192.168.122.84\" OR server.ip : \"192.168.122.84\" OR host.ip : \"192.168.122.84\" OR dns.resolved_ip : \"192.168.122.84\")) AND (@timestamp:[\"2022-08-23T06:28:44.754Z\" TO \"2022-08-23T06:33:44.754Z\"])"
    ]
}

You can see the STIX Pattern to Elastic ECS conversion logic here.

STIX Shifter has converted the STIX fields url:value into url.original and ipv4-addr:value into source.ip, destination.ip, client.ip, server.ip, host.ip, and dns.resolved_ip.

Like with Splunk, the query also includes a 5 minute time window (@timestamp:[\"2022-08-23T06:28:44.754Z\" TO \"2022-08-23T06:33:44.754Z\"]).
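Unlike the relative earliest value in the Splunk search, the Elastic window is written as two absolute timestamps, so it is worth checking how wide it really is. A quick sketch using only the Python standard library, with the clause copied from the output above:

```python
import re
from datetime import datetime

# The time window clause from the Elastic ECS output above.
clause = '(@timestamp:["2022-08-23T06:28:44.754Z" TO "2022-08-23T06:33:44.754Z"])'

# Pull out the two quoted timestamps and parse them.
start_text, end_text = re.findall(r'"([^"]+)"', clause)
fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
start = datetime.strptime(start_text, fmt)
end = datetime.strptime(end_text, fmt)

window = end - start
print(window.total_seconds() / 60)  # width of the window in minutes: 5.0
```

Because the timestamps are absolute, a query saved from this output will keep searching the same five minutes of 23 August 2022 however much later it is run.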

Creating STIX Observed Data from Detections

In this post I won’t cover transmit, where STIX-Shifter can authenticate to downstream products via a Connector which can be used to push rules. However, imagine my converted rules were sent down to Splunk to look for matching log lines.

I will show a simulated example of a match being detected and written into a STIX 2.1 Observed Data Object, similar to the flow I showed manually before.

Let’s assume the downstream tool (Splunk) detects a match between a converted STIX Pattern ([ipv4-addr:value = '1.1.1.1'], converted to src_ip=1.1.1.1 OR dest_ip=1.1.1.1) and a log line that contains src_ip=1.1.1.1. In addition to the matching field, the log line has the following fields, modelled in JSON (this is where the fields command in the Splunk STIX-Shifter output is important);

[
    {
        "src_ip": "1.1.1.1",
        "dest_ip": "2.2.2.2",
        "url": "www.testaddress.com"
    }
]

It is vital that the fields match those defined in the STIX-Shifter Connector so that they can be mapped to the correct STIX Cyber Observable Object (e.g. IPv4 STIX Cyber Observable Object) during the translation. The STIX-Shifter Splunk Connector expects CIM compliant fields (src_ip, dest_ip and url are all CIM compliant).

This time the translate query takes a slightly different form to create STIX 2.1 Observed Data and Cyber Observable Objects from the detection (using results instead of query);
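One way to catch a mismatch before translating is to compare the field names in your results with the names the Connector works with. The sketch below uses a handful of names from the fields command in the Splunk query shown earlier as a stand-in. The authoritative list is the Connector's mapping file, and EXPECTED_FIELDS and unexpected_fields are hypothetical names of my own.

```python
# Field names copied from the `fields` command in the Splunk query shown
# earlier (trimmed to a handful). This is a stand-in for the real mapping file.
EXPECTED_FIELDS = {"src_ip", "src_port", "dest_ip", "dest_port", "url", "user", "host"}


def unexpected_fields(records: list[dict]) -> set[str]:
    """Return every field name in the records that is not in EXPECTED_FIELDS."""
    seen = {name for record in records for name in record}
    return seen - EXPECTED_FIELDS


# The log line from above, which uses CIM field names.
cim_style = [{"src_ip": "1.1.1.1", "dest_ip": "2.2.2.2", "url": "www.testaddress.com"}]
# A made-up record using ECS-style names, to show a mismatch.
ecs_style = [{"source.ip": "1.1.1.1", "destination.ip": "2.2.2.2"}]

print(unexpected_fields(cim_style))
print(sorted(unexpected_fields(ecs_style)))
```

An empty set for the first record means every field is one the check knows about. The second record is flagged because neither of its names is in the expected set.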

stix-shifter translate <MODULE NAME> results '<STIX IDENTITY OBJECT>' '<LIST OF JSON RESULTS>'

Unlike before, a STIX Identity Object is required to be used in the command to attribute the Observed Data Objects to someone. I will use a demo Identity as follows;

{
    "type": "identity",
    "spec_version": "2.1",
    "id": "identity--d2916708-57b9-5636-8689-62f049e9f727",
    "created_by_ref": "identity--aae8eb2d-ea6c-56d6-a606-cc9f755e2dd3",
    "created": "2020-01-01T00:00:00.000Z",
    "modified": "2020-01-01T00:00:00.000Z",
    "name": "signalscorps-demo",
    "description": "https://github.com/signalscorps/",
    "identity_class": "organization",
    "sectors": [
        "technology"
    ],
    "contact_information": "https://www.dogesec.com/contact/",
    "object_marking_refs": [
        "marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9",
        "marking-definition--3f588e96-e413-57b5-b735-f0ec6c3a8771"
    ]
}

Written out as a complete translate command, this gives;

stix-shifter translate splunk results \
    '{"type":"identity","spec_version":"2.1","id":"identity--d2916708-57b9-5636-8689-62f049e9f727","created_by_ref":"identity--aae8eb2d-ea6c-56d6-a606-cc9f755e2dd3","created":"2020-01-01T00:00:00.000Z","modified":"2020-01-01T00:00:00.000Z","name":"signalscorps-demo","description":"https://github.com/signalscorps/","identity_class":"organization","sectors":["technology"],"contact_information":"https://www.dogesec.com/contact/","object_marking_refs":["marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9","marking-definition--3f588e96-e413-57b5-b735-f0ec6c3a8771"]}' \
    '[{"src_ip":"1.1.1.1","dest_ip":"2.2.2.2","url":"www.testaddress.com"}]' \
    '{"stix_2.1": true}'

By default, JSON results are translated into STIX 2.0. To return STIX 2.1 results include {"stix_2.1": true} in the options part (last part) of the CLI command.

This command prints a JSON bundle with a STIX 2.1 Observed Data Object (covering the entire log line representing the match), and four STIX 2.1 Cyber Observable Objects representing each field type in the log line, src_ip, dest_ip, and url. Note, there are four results as the single url in my log (www.testaddress.com) is converted by the Splunk STIX-Shifter Connector into STIX 2.1 SCO types URL and Domain Name.

{
    "type": "bundle",
    "id": "bundle--9bbb1b3e-ddfe-4ca7-979d-2610371b8de7",
    "objects": [
        {
            "type": "identity",
            "spec_version": "2.1",
            "id": "identity--d2916708-57b9-5636-8689-62f049e9f727",
            "created_by_ref": "identity--aae8eb2d-ea6c-56d6-a606-cc9f755e2dd3",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "name": "signalscorps-demo",
            "description": "https://github.com/signalscorps/",
            "identity_class": "organization",
            "sectors": [
                "technology"
            ],
            "contact_information": "https://www.dogesec.com/contact/",
            "object_marking_refs": [
                "marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9",
                "marking-definition--3f588e96-e413-57b5-b735-f0ec6c3a8771"
            ]
        },
        {
            "id": "observed-data--130b5d08-e0a2-4f0d-9c21-c8f77f66d987",
            "type": "observed-data",
            "created_by_ref": "identity--d2916708-57b9-5636-8689-62f049e9f727",
            "created": "2020-01-01T07:40:50.410Z",
            "modified": "2020-01-01T07:40:50.410Z",
            "first_observed": "2020-01-01T07:40:50.410Z",
            "last_observed": "2020-01-01T07:40:50.410Z",
            "number_observed": 1,
            "object_refs": [
                "ipv4-addr--cbd67181-b9f8-595b-8bc3-3971e34fa1cc",
                "ipv4-addr--a4c470a9-5498-5e8e-9fa2-66b1ceadcc12",
                "url--cc6ef2fe-d31f-510e-9809-bf0f6478e749",
                "domain-name--cc6ef2fe-d31f-510e-9809-bf0f6478e749"
            ],
            "spec_version": "2.1"
        },
        {
            "type": "ipv4-addr",
            "value": "1.1.1.1",
            "id": "ipv4-addr--cbd67181-b9f8-595b-8bc3-3971e34fa1cc",
            "spec_version": "2.1"
        },
        {
            "type": "ipv4-addr",
            "value": "2.2.2.2",
            "id": "ipv4-addr--a4c470a9-5498-5e8e-9fa2-66b1ceadcc12",
            "spec_version": "2.1"
        },
        {
            "type": "url",
            "value": "www.testaddress.com",
            "id": "url--cc6ef2fe-d31f-510e-9809-bf0f6478e749",
            "spec_version": "2.1"
        },
        {
            "type": "domain-name",
            "value": "www.testaddress.com",
            "id": "domain-name--cc6ef2fe-d31f-510e-9809-bf0f6478e749",
            "spec_version": "2.1"
        }
    ]
}
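To see how the Observed Data Object ties the pieces together, you can follow its object_refs to the objects they point at. A minimal sketch using a cut-down copy of the bundle above (only two of the four referenced objects are kept, for brevity):

```python
import json

# A cut-down copy of the bundle above: the Observed Data Object and two of
# the objects it references.
bundle = json.loads('''
{
    "type": "bundle",
    "objects": [
        {
            "type": "observed-data",
            "id": "observed-data--130b5d08-e0a2-4f0d-9c21-c8f77f66d987",
            "object_refs": [
                "ipv4-addr--cbd67181-b9f8-595b-8bc3-3971e34fa1cc",
                "url--cc6ef2fe-d31f-510e-9809-bf0f6478e749"
            ]
        },
        {
            "type": "ipv4-addr",
            "value": "1.1.1.1",
            "id": "ipv4-addr--cbd67181-b9f8-595b-8bc3-3971e34fa1cc"
        },
        {
            "type": "url",
            "value": "www.testaddress.com",
            "id": "url--cc6ef2fe-d31f-510e-9809-bf0f6478e749"
        }
    ]
}
''')

# Index every object by id so references can be looked up directly.
by_id = {obj["id"]: obj for obj in bundle["objects"]}

observed = []
for obj in bundle["objects"]:
    if obj["type"] == "observed-data":
        for ref in obj["object_refs"]:
            target = by_id[ref]
            observed.append((target["type"], target["value"]))

print(observed)
```

Each entry in the printed list is the type and value of one observable seen in the log line.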

Now let me highlight why the fields in the log data must match those expected by the Connector.

This time I will use the Elastic ECS Connector on the same log line. Elastic ECS does not use the CIM field name standard used by Splunk. For example, as shown in the example translate conversion from STIX Pattern to Elastic ECS, IPs are captured in the field name source.ip (in Splunk the CIM compliant field is src_ip).

I’ll use the same command as I did for Splunk, the only difference being the connector used (this time elastic_ecs);

stix-shifter translate elastic_ecs results \
    '{"type":"identity","spec_version":"2.1","id":"identity--d2916708-57b9-5636-8689-62f049e9f727","created_by_ref":"identity--aae8eb2d-ea6c-56d6-a606-cc9f755e2dd3","created":"2020-01-01T00:00:00.000Z","modified":"2020-01-01T00:00:00.000Z","name":"signalscorps-demo","description":"https://github.com/signalscorps/","identity_class":"organization","sectors":["technology"],"contact_information":"https://www.dogesec.com/contact/","object_marking_refs":["marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9","marking-definition--3f588e96-e413-57b5-b735-f0ec6c3a8771"]}' \
    '[{"src_ip":"1.1.1.1","dest_ip":"2.2.2.2","url":"www.testaddress.com"}]' \
    '{"stix_2.1": true}'
{
    "type": "bundle",
    "id": "bundle--a95e4858-6508-49fd-a280-1dbade00fd84",
    "objects": [
        {
            "type": "identity",
            "spec_version": "2.1",
            "id": "identity--d2916708-57b9-5636-8689-62f049e9f727",
            "created_by_ref": "identity--aae8eb2d-ea6c-56d6-a606-cc9f755e2dd3",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "name": "signalscorps-demo",
            "description": "https://github.com/signalscorps/",
            "identity_class": "organization",
            "sectors": [
                "technology"
            ],
            "contact_information": "https://www.dogesec.com/contact/",
            "object_marking_refs": [
                "marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9",
                "marking-definition--3f588e96-e413-57b5-b735-f0ec6c3a8771"
            ]
        },
        {
            "id": "observed-data--733197a4-dc58-4d27-a656-51e76c65582b",
            "type": "observed-data",
            "created_by_ref": "identity--d2916708-57b9-5636-8689-62f049e9f727",
            "created": "2020-01-01T08:14:17.421Z",
            "modified": "2020-01-01T08:14:17.421Z",
            "first_observed": "2020-01-01T08:14:17.421Z",
            "last_observed": "2020-01-01T08:14:17.421Z",
            "number_observed": 1,
            "object_refs": [],
            "spec_version": "2.1"
        }
    ]
}

See how an Observed Data Object is created, but STIX Shifter cannot create any Cyber Observable Objects from the input because the field names in the log are not mapped in the Elastic ECS Connector configuration.

It is important to understand that when field mappings are incorrect, STIX Shifter can produce inconsistent results.

Let me demonstrate using the QRadar Connector;

pip3 install stix-shifter-modules-qradar
stix-shifter translate qradar results \
    '{"type":"identity","spec_version":"2.1","id":"identity--d2916708-57b9-5636-8689-62f049e9f727","created_by_ref":"identity--aae8eb2d-ea6c-56d6-a606-cc9f755e2dd3","created":"2020-01-01T00:00:00.000Z","modified":"2020-01-01T00:00:00.000Z","name":"signalscorps-demo","description":"https://github.com/signalscorps/","identity_class":"organization","sectors":["technology"],"contact_information":"https://www.dogesec.com/contact/","object_marking_refs":["marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9","marking-definition--3f588e96-e413-57b5-b735-f0ec6c3a8771"]}' \
    '[{"src_ip":"1.1.1.1","dest_ip":"2.2.2.2","url":"www.testaddress.com"}]' \
    '{"stix_2.1": true}'
{
    "type": "bundle",
    "id": "bundle--fb406a9f-df00-4286-8f34-fb9dc1844f75",
    "objects": [
        {
            "type": "identity",
            "spec_version": "2.1",
            "id": "identity--d2916708-57b9-5636-8689-62f049e9f727",
            "created_by_ref": "identity--aae8eb2d-ea6c-56d6-a606-cc9f755e2dd3",
            "created": "2020-01-01T00:00:00.000Z",
            "modified": "2020-01-01T00:00:00.000Z",
            "name": "signalscorps-demo",
            "description": "https://github.com/signalscorps/",
            "identity_class": "organization",
            "sectors": [
                "technology"
            ],
            "contact_information": "https://www.dogesec.com/contact/",
            "object_marking_refs": [
                "marking-definition--613f2e26-407d-48c7-9eca-b8e91df99dc9",
                "marking-definition--3f588e96-e413-57b5-b735-f0ec6c3a8771"
            ]
        },
        {
            "id": "observed-data--71aa43a9-7915-419a-bfb5-f84fbf3e22b6",
            "type": "observed-data",
            "created_by_ref": "identity--d2916708-57b9-5636-8689-62f049e9f727",
            "created": "2020-01-01T08:24:24.811Z",
            "modified": "2020-01-01T08:24:24.811Z",
            "first_observed": "2020-01-01T08:24:24.811Z",
            "last_observed": "2020-01-01T08:24:24.811Z",
            "number_observed": 1,
            "object_refs": [
                "x-signalscorps-demo--52107335-d213-49da-b739-d865001c2007",
                "url--cc6ef2fe-d31f-510e-9809-bf0f6478e749"
            ],
            "spec_version": "2.1"
        },
        {
            "type": "x-Signals Corps Demos",
            "src_ip": "1.1.1.1",
            "id": "x-signalscorps-demo--52107335-d213-49da-b739-d865001c2007",
            "spec_version": "2.1",
            "dest_ip": "2.2.2.2"
        },
        {
            "type": "url",
            "value": "www.testaddress.com",
            "id": "url--cc6ef2fe-d31f-510e-9809-bf0f6478e749",
            "spec_version": "2.1"
        }
    ]
}

QRadar does use the url field name, so this is mapped correctly to a STIX URL Cyber Observable Object (note, this is different behaviour to the Splunk Connector, which creates both a URL and a Domain Name Observable for this record).

However, for the unrecognised fields (src_ip and dest_ip) the QRadar Connector creates a custom STIX 2.1 Cyber Observable Object (the object whose id begins x-signalscorps-demo), which contains the properties "src_ip": "1.1.1.1" and "dest_ip": "2.2.2.2". I’ll cover custom STIX Objects in the next post.

The point here: be careful with field mappings, because if the Connector does not support the fields in the log, the results from STIX-Shifter can be unexpected. This is the age-old problem of SIEMs: normalising fields between logs being ingested and normalising fields across SIEMs.
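A simple guard is to inspect each bundle for the two symptoms seen above: an Observed Data Object with no object_refs (the Elastic ECS case) and a custom object holding leftover fields (the QRadar case). The sketch below is my own heuristic, not a STIX Shifter feature, and it assumes custom objects can be recognised by an id starting with x-, as in the QRadar output.

```python
def mapping_warnings(bundle: dict) -> list[str]:
    """List signs that a connector did not recognise some log fields."""
    warnings = []
    for obj in bundle.get("objects", []):
        if obj.get("type") == "observed-data" and not obj.get("object_refs"):
            warnings.append(f"{obj['id']}: no observables were created")
        elif obj.get("id", "").startswith("x-"):
            # Assumption: custom objects have ids prefixed with "x-".
            warnings.append(f"{obj['id']}: custom object holding unmapped fields")
    return warnings


# Cut-down versions of the Elastic ECS and QRadar bundles shown earlier.
ecs_bundle = {"objects": [
    {"type": "observed-data",
     "id": "observed-data--733197a4-dc58-4d27-a656-51e76c65582b",
     "object_refs": []},
]}
qradar_bundle = {"objects": [
    {"type": "observed-data",
     "id": "observed-data--71aa43a9-7915-419a-bfb5-f84fbf3e22b6",
     "object_refs": ["x-signalscorps-demo--52107335-d213-49da-b739-d865001c2007"]},
    {"type": "x-Signals Corps Demos",
     "id": "x-signalscorps-demo--52107335-d213-49da-b739-d865001c2007",
     "src_ip": "1.1.1.1", "dest_ip": "2.2.2.2"},
]}

for name, bundle in (("elastic_ecs", ecs_bundle), ("qradar", qradar_bundle)):
    for warning in mapping_warnings(bundle):
        print(name, warning)
```

Both bundles produce one warning each, which is a prompt to go back and check the log field names against the Connector’s mapping before trusting the output.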