· 5 min read
Quentin Jerome

In the complex field of incident response, effective training for Security Operations Center (SOC) operators is critical. One of the key challenges in SOC training is providing realistic, data-driven environments that accurately simulate the threats and incidents operators will face. Additionally, detection engineers need reliable and actionable data to create robust detection rules that align with real-world security monitoring systems. However, gathering and analyzing real-world malware samples, which is essential to this process, can be time-consuming and prone to errors when done manually.

In this blog post, we introduce an approach to solving these challenges through automation. We explore how a Kunai-based sandbox can streamline the collection and analysis of malware samples, offering a practical solution.

By leveraging this sandbox infrastructure, the project opens up new opportunities for more efficient malware analysis while supporting a wide range of CPU architectures, including those specific to IoT and mobile devices.

The Need for Realistic Data

One prerequisite for offering cyber ranges or training solutions in the context of detection engineering and security monitoring is the collection of real-world malware samples.

To provide high-quality training and realistic experiences, these samples can be used as injects in various training scenarios or for testing detection rules.

A common approach is to collect such data manually by running and monitoring malware samples, preferably in a confined environment such as a virtual machine (VM). However, this approach has several drawbacks: it lacks reproducibility under identical experimental conditions and involves repetitive, error-prone tasks (uploading files, running monitoring tools/malware samples, monitoring network traffic, conducting post-analysis, etc.).

Thus, this process is an ideal candidate for automation. Our first motivation for creating this new project is to address these challenges. Our second goal is to provide detection engineers with a reliable way to generate actionable data from malware samples.

The Concept of a Kunai-Based Sandbox

Malware sample sandboxing is a frequent task performed at various stages of a security alert's lifecycle, from incident/malware triage to more detailed malware analysis. This task is typically supported by numerous tools, ranging from open-source options like Cape Sandbox to paid alternatives like Joe Sandbox, VMRay, or Any Run.

While these solutions are excellent in many respects—such as defeating anti-sandboxing techniques and providing deep insight into a sample's capabilities—we believe they are not always the best tools for gathering actionable information for detection engineers. For many organizations, there is no direct mapping between the data collected from malware analysis platforms (sandboxes) and their monitoring systems. As a result, a task that should be simple—building detection rules tailored to an organization’s security monitoring tools—can become challenging.

To solve this issue, we propose a simple yet powerful sandboxing infrastructure based on QEMU for virtualization and Kunai for sample monitoring. This infrastructure can serve multiple purposes: analyzing malware samples using the same tools employed for detection and collecting data for use within the NGSOTI project.

Project Status

The source code for the project is available in the Kunai sandbox repository. Additionally, our under-construction open dataset, extracted using this sandbox, can be found at NGSOTI malware dataset.

Currently, the sandboxing system can run Linux malware samples within a virtual environment, monitor them using Kunai, and capture the network traffic generated by the system. Another key feature of our sandbox is its support for multiple CPU architectures (currently 32/64-bit Intel and 64-bit ARM), enabling the analysis of a broader range of malware samples. We believe ARM architecture support is crucial, as it can be used to analyze malware samples specific to IoT or mobile (phones, tablets, etc.) devices.
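To make the moving parts more concrete, here is a minimal Python sketch of how a QEMU command line covering these features could be assembled. It is not taken from the Kunai sandbox source code: the function name, the profile table, the image file name and the memory size are all assumptions made for illustration. It only relies on standard QEMU options (-snapshot for reproducible runs, filter-dump for network capture).

```python
from pathlib import Path

# Hypothetical per-architecture settings (an assumption for this sketch);
# the real sandbox may configure QEMU differently.
QEMU_PROFILES = {
    "x86_64": ["qemu-system-x86_64"],
    "i386": ["qemu-system-i386"],
    # a real ARM guest typically also needs firmware or a kernel image
    "aarch64": ["qemu-system-aarch64", "-machine", "virt", "-cpu", "cortex-a57"],
}


def build_qemu_command(arch, disk_image, pcap_path, memory_mb=1024):
    """Return a QEMU command line booting disk_image and dumping guest traffic to pcap_path."""
    if arch not in QEMU_PROFILES:
        raise ValueError(f"unsupported architecture: {arch}")
    return [
        *QEMU_PROFILES[arch],
        "-m", str(memory_mb),
        "-nographic",
        # -snapshot discards disk writes on exit, so every run starts from the same state
        "-snapshot",
        "-drive", f"file={disk_image},format=qcow2,if=virtio",
        # user-mode networking plus a filter-dump object recording traffic in pcap format
        "-netdev", "user,id=net0",
        "-device", "virtio-net-pci,netdev=net0",
        "-object", f"filter-dump,id=dump0,netdev=net0,file={pcap_path}",
    ]


if __name__ == "__main__":
    # hypothetical image name, only printed here, nothing is executed
    cmd = build_qemu_command("aarch64", Path("linux-arm64.qcow2"), Path("/tmp/sample.pcap"))
    print(" ".join(cmd))
```

Keeping the architecture-specific part in a small table means that supporting another CPU family would only require a new entry, while the reproducibility and capture options stay identical across runs.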

Limitations

While our approach provides a great opportunity for detection engineers to obtain data that is directly usable for creating Kunai-based detection rules, we must remember that it does not achieve the same level of stealthiness as other sandboxing platforms, which often rely on custom hypervisors. Therefore, our approach should not be considered a replacement for dedicated sandboxing platforms but rather a complement that facilitates detection engineering-related tasks.

Conclusion

The NGSOTI project aims to bridge the gap between theoretical knowledge and practical skills for SOC operators by offering realistic, data-driven training experiences. By automating the collection and analysis of malware samples through the Kunai-based sandbox, we provide a straightforward, efficient, and repeatable method for detection engineers to generate actionable insights. This approach is not intended to replace traditional sandboxing but rather to complement it. With support for multiple CPU architectures, including those specific to IoT and mobile devices, the sandbox expands the possibilities for analyzing and generating data from a wider range of malware, enhancing the diversity of scenarios that NGSOTI can offer. As the project progresses, we look forward to further enriching the open dataset and continuing to develop solutions that address the evolving challenges in detection engineering.

Funding

The NGSOTI project is dedicated to training the next generation of Security Operation Center (SOC) operators, focusing on the human aspect of cybersecurity. It underscores the significance of providing SOC operators with the necessary skills and open-source tools to address challenges such as detection engineering, incident response, and threat intelligence analysis. Involving key partners such as CIRCL, Restena, Tenzir, and the University of Luxembourg, the project aims to establish a real operational infrastructure for practical training. This initiative integrates academic curricula with industry insights, offering hands-on experience in cyber ranges.

NGSOTI is co-funded under the Digital Europe Programme (DEP) via the ECCC (European cybersecurity competence network and competence centre).

· 9 min read

MISP is an open-source cyber-threat information sharing platform that many industry actors have adopted in recent years. Organizations usually use it to exchange information about their own IT security incidents or about their Cyber Threat Intelligence (CTI) activities. A MISP instance that is well connected with other instances can therefore quickly become a real gold mine containing a massive number of Indicators of Compromise (IoCs). By nature, IoCs are very specific and can be used to quickly identify compromised systems. In this blog post, we detail how to use IoCs stored in a MISP instance to configure Kunai for real-time compromise detection.

Warm up

The first step is to get familiar with the kinds of events Kunai can monitor on your system. Please take a quick look at the events documentation to better understand the following steps. You may also want to read the documentation explaining how to use the tool for threat detection.

For those who do not have time to go through the whole documentation, nothing beats a good example. Below is an example of an execve_script event, which is generated for every script executed on a system running Kunai.

{
  "data": {
    "ancestors": "/usr/lib/systemd/systemd|/usr/bin/login|/usr/bin/zsh|/usr/bin/bash|/usr/bin/xinit|/usr/bin/i3|/usr/bin/bash|/usr/bin/urxvt|/usr/bin/zsh|/usr/bin/bash",
    "parent_exe": "/usr/bin/bash",
    "command_line": "/bin/bash /tmp/tmp.msdKnvj7ax_kunai_test.sh",
    "exe": {
      "file": "/tmp/tmp.msdKnvj7ax_kunai_test.sh",
      "md5": "64b8185d28042ea96feb251e12fe632b",
      "sha1": "31683c67b020d90f02a42e43e7758184ef98c12f",
      "sha256": "cda81b42b75647daf6b70a626380c199fe665d721e63bfe34c96b65da0289627",
      "sha512": "63165b902db5242a01296b39c1d6f2995fde961e29d9470d1862ccde8e2c8a3083659bf5d9c0794bbca620f37c419baec3c1d1941333d37fb9ced795553d2e83",
      "size": 21
    },
    "interpreter": {
      "file": "/usr/bin/bash",
      "md5": "e742da46d05de5afca58a2abcba5343e",
      "sha1": "8d48bdcb10eb85a0bd80c34e13fc01c2f6776043",
      "sha256": "664428e8dd065099a20cb364bdc293dd8f787ef10b9454b64e127a197950a5d6",
      "sha512": "b4e6f555571636f02704271d3a40b8470d04447ca3aaad073818f4041d944533bfbca0d5586dc945a2de8033f8fd4123f4203219e9c7b97ebbc52acd340e598f",
      "size": 1112880
    }
  },
  "info": {
    "host": "...",
    "event": {
      "source": "kunai",
      "id": 2,
      "name": "execve_script",
      "uuid": "520487fc-020c-5569-ed88-38393e49a2d2",
      "batch": 32
    },
    "task": "...",
    "parent_task": "...",
    "utc_time": "2024-02-13T08:34:29.312127521Z"
  }
}

As you can see, every Kunai event is composed of fields of various types. Some of these types can directly be matched against IoCs, following this correspondence table:

| Kunai data type | IoC type covered |
| --- | --- |
| path | absolute path |
| md5, sha1, sha256, sha512 | hash |
| domain / fqdn | domain / fqdn |
| IP address | IPv4 / IPv6 |

So any field of any Kunai event having a type in the above table will be checked against the IoCs loaded in the tool. Now the only thing we have to do is to feed Kunai with data in the expected IoC format. The format is pretty basic, yet flexible - the tool simply expects a file containing JSON documents. One can find an example file below:

{"uuid": "81050c82-68a5-4130-a56d-a465c8337066", "source":"My MISP Instance", "value":"why.kunai.rocks"}
{"uuid": "dd19ecd1-8237-427a-9b1d-35ff7d17381f", "source":"My MISP Instance", "value":"kunai.rocks"}

The format is simple enough to accommodate any IoC feed and is easily scriptable. So one could even imagine the creation of such a file from an IP list found on the Internet.
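As an illustration of how scriptable the format is, the following Python sketch turns a plain list of values (for example an IP blocklist) into documents with the three fields shown above. The function name and the sample addresses are made up for the example, and generating a random UUID for each entry is an assumption: any stable unique identifier should serve the same purpose.

```python
import json
import uuid


def to_kunai_iocs(values, source):
    """Yield one JSON document per unique value, using the uuid/source/value fields."""
    seen = set()
    for raw in values:
        value = raw.strip()
        # skip blank lines, comment lines and values already emitted
        if not value or value.startswith("#") or value in seen:
            continue
        seen.add(value)
        # assumption: a random UUID is good enough to identify a hand-made IoC
        yield json.dumps({"uuid": str(uuid.uuid4()), "source": source, "value": value})


if __name__ == "__main__":
    # documentation-range addresses standing in for a real feed
    feed = ["198.51.100.7", "203.0.113.42", "# a comment", "198.51.100.7"]
    for line in to_kunai_iocs(feed, "example IP list"):
        print(line)
```

Redirecting the output of such a script to a file gives something that can be loaded the same way as the example file above.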

Now, the reader should have a good picture of the topic and we can walk through a small experiment to get a bit more familiar with Kunai and its IoC matching capabilities.

A Little Experiment

Let's run Kunai without any arguments, merely taking care of redirecting its output:

sudo kunai | tee /tmp/kunai.log
# if one has jq installed one can pipe kunai output to jq
# in order to get a prettier output
sudo kunai | jq '.' | tee /tmp/kunai.log

You should start seeing events printed on the terminal. You can generate more activity by leaving Kunai running in a corner while you use your system as usual. You can then stop it with Ctrl + C.
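To get a quick overview of what was captured, a few lines of Python can count events by name. The sketch below is only an illustration (the function names are ours). It relies on the info.event.name field visible in the sample event earlier in this post, and it accepts both compact and jq-prettified output by decoding JSON documents one after the other.

```python
import json
from collections import Counter


def iter_json_documents(text):
    """Yield every JSON document found in text, whether compact or pretty-printed."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        try:
            doc, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # most likely a document cut short when the capture was interrupted
            break
        yield doc


def count_event_names(text):
    """Count events per name, based on the info.event.name field."""
    return Counter(doc["info"]["event"]["name"] for doc in iter_json_documents(text))


if __name__ == "__main__":
    sample = '{"info": {"event": {"name": "execve_script"}}}\n{"info": {"event": {"name": "dns_query"}}}'
    for name, count in count_event_names(sample).most_common():
        print(f"{count:6d} {name}")
```

To apply it to a real capture, read the log file and pass its content to count_event_names, for instance with open("/tmp/kunai.log").read().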

If you inspect the logs in /tmp/kunai.log, you will most likely find a wealth of useful information, especially if you have security monitoring needs. However, you may also conclude that there is simply too much irrelevant information for your specific needs. You can either use filtering / detection rules to select events precisely (out of scope for this post) or use IoC matching to display only the events matching an IoC. So let's repeat the same experiment, this time passing a file containing IoCs as a parameter.

The first thing you need to do is to copy the following content into a file, let's say /tmp/kunai-iocs.json:

{"uuid": "81050c82-68a5-4130-a56d-a465c8337066", "source":"My MISP Instance", "value":"why.kunai.rocks"}
{"uuid": "dd19ecd1-8237-427a-9b1d-35ff7d17381f", "source":"My MISP Instance", "value":"kunai.rocks"}

Once this is done, start Kunai, passing this IoC file in the command line:

sudo kunai -i /tmp/kunai-iocs.json

And now the magic happens! You won't see logs any longer, but don't worry, this is perfectly normal. Under the hood, Kunai still analyzes all the events (like the ones you saw in the previous experiment) but only displays those matching IoCs. Try generating some network traffic (using dig, curl, ...) towards the domain why.kunai.rocks and you should see events popping up on your terminal. If you visit the website with your browser, make sure it does not use DoH (DNS over HTTPS) or any DNS protocol other than the one running on port 53.

If you did the experiment and managed to generate an event matching one of the configured IoCs, you may have seen that the generated event contains additional information about the matching IoC in its .detection section.

{
  "data": {
    "ancestors": "/usr/lib/systemd/systemd|/usr/bin/login|/usr/bin/zsh|/usr/bin/bash|/usr/bin/xinit|/usr/bin/i3|/usr/bin/bash|/usr/bin/urxvt|/usr/bin/zsh",
    "command_line": "dig why.kunai.rocks",
    "exe": {
      "file": "/usr/bin/dig"
    },
    "query": "why.kunai.rocks",
    "proto": "udp",
    "response": "kunai-project.github.io",
    "dns_server": {
      "ip": "10.96.0.1",
      "port": 53,
      "public": false,
      "is_v6": false
    }
  },
  "detection": {
    "iocs": [
      "why.kunai.rocks"
    ],
    "severity": 10
  },
  "info": {
    "host": "...",
    "event": {
      "source": "kunai",
      "id": 61,
      "name": "dns_query",
      "uuid": "7cf3a92b-b8fd-9035-4ced-8ca216adbf32",
      "batch": 38
    },
    "task": "...",
    "parent_task": "...",
    "utc_time": "2024-04-18T09:34:31.887637287Z"
  }
}

In the above example, we see only one IoC under the detection section; however, if several IoCs matched an event, all of them would be in the list. Things should be a bit more concrete for you now, so let's dive into how to automatically ingest MISP IoCs into Kunai.
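If you want to post-process such detections, the .detection.iocs list is easy to consume from a script. The Python sketch below is a hypothetical helper, not part of Kunai, and it assumes events arrive as one compact JSON document per line. It keeps only events carrying matched IoCs and reduces them to a few fields taken from the sample event above.

```python
import json


def iter_detections(lines):
    """Yield a short summary for every event whose detection section lists IoCs."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        iocs = event.get("detection", {}).get("iocs", [])
        if not iocs:
            # event without IoC match, nothing to report
            continue
        yield {
            "time": event["info"]["utc_time"],
            "event": event["info"]["event"]["name"],
            "command_line": event["data"].get("command_line"),
            "iocs": iocs,
        }


if __name__ == "__main__":
    # trimmed-down version of the dns_query event shown above
    sample = json.dumps({
        "data": {"command_line": "dig why.kunai.rocks"},
        "detection": {"iocs": ["why.kunai.rocks"], "severity": 10},
        "info": {"event": {"name": "dns_query"}, "utc_time": "2024-04-18T09:34:31.887637287Z"},
    })
    for hit in iter_detections([sample]):
        print(hit["time"], hit["event"], ",".join(hit["iocs"]), hit["command_line"])
```

Since iter_detections accepts any iterable of lines, it could be fed an open log file or sys.stdin just as well as the inline sample.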

Getting MISP IoCs into Kunai

The only thing which is missing, in order to configure Kunai from a MISP instance, is a way to extract IoCs and translate them into the expected format. If you have already played with PyMISP, this is not something that should be too scary and if you have not, here is the good news: we have a script which does it for you. You can find the script in question over at the Kunai tools repository under the misp directory.
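To give an idea of what such a translation involves, here is a simplified Python sketch. It is not the code of the script mentioned above. It works on MISP attributes represented as plain dictionaries, and both the list of attribute types kept and the reuse of the attribute UUID as IoC UUID are assumptions made for the example.

```python
import json

# MISP attribute types assumed to map onto what Kunai can match;
# composite types such as "domain|ip" are left out to keep the sketch simple
SUPPORTED_TYPES = {"domain", "hostname", "ip-src", "ip-dst", "md5", "sha1", "sha256", "sha512"}


def attribute_to_ioc(attribute, source):
    """Translate one MISP attribute dict into a Kunai IoC dict, or None if unusable."""
    # only keep attributes flagged for detection and of a supported type
    if not attribute.get("to_ids") or attribute.get("type") not in SUPPORTED_TYPES:
        return None
    # assumption: the attribute UUID is reused as the IoC UUID
    return {"uuid": attribute["uuid"], "source": source, "value": attribute["value"]}


def convert(attributes, source):
    """Yield one JSON line per usable attribute."""
    for attribute in attributes:
        ioc = attribute_to_ioc(attribute, source)
        if ioc is not None:
            yield json.dumps(ioc)


if __name__ == "__main__":
    attributes = [
        {"uuid": "dd19ecd1-8237-427a-9b1d-35ff7d17381f", "type": "domain",
         "value": "kunai.rocks", "to_ids": True},
        {"uuid": "00000000-0000-4000-8000-000000000000", "type": "comment",
         "value": "analyst note", "to_ids": False},
    ]
    for line in convert(attributes, "My MISP Instance"):
        print(line)
```

Filtering on the to_ids flag is a deliberate choice in this sketch: it restricts the output to attributes that the MISP community marked as suitable for automated detection.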

Before going further, make sure all the Python modules required by this script are installed (check out requirements.txt at the repository root).

The next step is to create a configuration file for the script: simply rename the example configuration to config.toml and edit it so that it contains the settings needed to connect to the desired MISP instance.

If you have completed the previous steps, you can simply run the misp-to-kunai.py script and you should see your MISP attributes, translated into the Kunai IoC format, flowing through your terminal. The script has options to be more selective about the IoCs to pull; we will not go through all of them and leave them for the reader to explore. The only option we will use is -o, which writes IoCs to a file. We encourage you to use -o rather than a plain shell redirect, as this option prevents IoC duplication.

If you take a look at the output generated by misp-to-kunai.py, you may notice that the IoCs are not in exactly the same format as the one described previously. They contain an additional field, event_uuid, which holds the UUID of the MISP Event the IoC belongs to. Kunai ignores any field other than the ones described above, which makes the IoC format fairly flexible: you can add as many fields as you want, for instance to carry contextual information along with the IoC. We thought event_uuid would be useful context in case you want to correlate a match back to a MISP Event.
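As an example of such a correlation, the Python sketch below builds an index from IoC value to MISP Event UUIDs out of the IoC file, then looks up the IoCs reported in the detection section of a Kunai event. The helper names are hypothetical and the sample UUIDs are placeholders.

```python
import json
from collections import defaultdict


def index_event_uuids(ioc_lines):
    """Map each IoC value to the set of MISP Event UUIDs it was pulled from."""
    index = defaultdict(set)
    for line in ioc_lines:
        line = line.strip()
        if not line:
            continue
        ioc = json.loads(line)
        # hand-written IoCs may not carry an event_uuid, so the field is optional
        if "event_uuid" in ioc:
            index[ioc["value"]].add(ioc["event_uuid"])
    return index


def related_misp_events(kunai_event, index):
    """Return the sorted MISP Event UUIDs behind the IoCs matched by a Kunai event."""
    iocs = kunai_event.get("detection", {}).get("iocs", [])
    return sorted({event_uuid for value in iocs for event_uuid in index.get(value, ())})


if __name__ == "__main__":
    # placeholder identifiers, for illustration only
    ioc_file = [
        '{"uuid": "ioc-1", "source": "My MISP Instance", "value": "why.kunai.rocks", "event_uuid": "event-a"}',
        '{"uuid": "ioc-2", "source": "My MISP Instance", "value": "kunai.rocks", "event_uuid": "event-b"}',
    ]
    index = index_event_uuids(ioc_file)
    detection = {"detection": {"iocs": ["why.kunai.rocks"], "severity": 10}}
    print(related_misp_events(detection, index))
```

The index stores a set of UUIDs per value because the same indicator can legitimately appear in several MISP Events.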

It is now time to put everything together:

# pull IoCs from MISP and store them in Kunai IoC format
./misp-to-kunai.py -o /tmp/kunai-misp-iocs.json

# run Kunai and check all events happening on your
# system against MISP IoCs
sudo kunai -i /tmp/kunai-misp-iocs.json

As seen in the previous experiment, you should not see any event coming out of Kunai until one actually matches an IoC you've just loaded. The easiest way to try and see if everything works as expected is to execute a dig command against a domain from the IoC list.

Yes, you went through this entire blog post just to end up typing two simple lines into a shell, but now you understand why you typed them and how all this works.

Final Words

This post aimed to be dense and straightforward, mostly to prevent you from giving up. When monitoring an infrastructure, IoC checking is a must in order not to miss "known things". On the other hand, IoCs offer little flexibility, in the sense that they cannot be used to detect the Tactics, Techniques and Procedures (TTPs) used by attackers. If you would like to go further on this topic, we encourage you to learn how to configure Kunai with rules.

On the Python script side, there are also some interesting options deserving exploration, especially if you are interested in turning it into a service.

All the tools presented here are open-source, so feel free to read the code, modify it and contribute to it even in the form of feedback or GitHub issues. This is how we can keep improving our projects and better fit the users' needs.

We hope you learned useful things or at least that you enjoyed reading this article.

References

Kunai project on GitHub
Kunai documentation
Kunai tools
MISP
PyMISP

· 15 min read
Quentin Jerome

This blog post is meant to give insight into how to use Kunai for detection engineering.

For those who didn't have the opportunity to attend the Kunai workshop at the 2023 edition of Hack.lu, this is a way to catch up on a large part of what we did during that session. For those who did attend the workshop, the post is still worth a read, as it goes into more detail than the limited workshop time allowed.