Inferring Recovery Steps from Cyber Threat Intelligence Reports


In a constantly changing threat landscape, Security Operation Centers are overwhelmed by suspicious alerts that require manual investigation. Given the impact and severity of modern threats, it is nonetheless crucial to mitigate and respond to potential incidents quickly. Security operators currently respond to incidents using predefined sets of actions from so-called playbooks. However, these playbooks must be manually created and updated for each threat, which further increases the operators' workload. In this work, we investigate approaches to automate the inference of recovery steps by identifying the steps taken by threat actors in Cyber Threat Intelligence (CTI) reports and translating them into recovery steps that can be defined in playbooks. Our insight is that by analyzing the text describing threats, we can effectively infer their corresponding recovery actions. To this end, we first design and implement a semantic approach based on traditional Natural Language Processing techniques, and we then study a generative approach based on recent Large Language Models (LLMs). Our experiments show that even though LLMs were not designed to solve domain-specific problems, they outperform semantic approaches in precision by up to 45%. We also evaluate factuality, showing that LLMs tend to produce up to 90 factual errors over the entire dataset.
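
As a rough illustration of the underlying idea, and not the paper's implementation, the sketch below finds attacker actions in report text and maps them to candidate recovery steps using a small hand-written lookup table. The table `RECOVERY_RULES`, the functions `split_sentences` and `infer_recovery_steps`, and the example report are all hypothetical; the semantic and LLM-based approaches studied in the paper are considerably richer than this pattern matching.

```python
import re

# Hypothetical rules: each pairs a pattern describing an attacker action
# with a recovery step that would undo it. These are illustrative
# assumptions, not rules taken from the paper.
RECOVERY_RULES = [
    (re.compile(r"\bcreat\w*\b.*\bscheduled task\b", re.I),
     "Delete the malicious scheduled task"),
    (re.compile(r"\b(add\w*|modif\w*|set\w*)\b.*\bregistry\b", re.I),
     "Restore the modified registry key"),
    (re.compile(r"\b(drop\w*|writ\w*|download\w*)\b.*\bfile\b", re.I),
     "Remove the dropped file"),
]


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def infer_recovery_steps(report):
    """Return recovery steps, in order of first appearance, without duplicates."""
    steps = []
    for sentence in split_sentences(report):
        for pattern, recovery in RECOVERY_RULES:
            if pattern.search(sentence) and recovery not in steps:
                steps.append(recovery)
    return steps


report = (
    "The malware drops a file named update.exe in the temp folder. "
    "It then creates a scheduled task to run at logon."
)
print(infer_recovery_steps(report))
```

A fixed lookup table like this only covers phrasings it was written for, which is one reason to compare it against approaches that generalize over wording.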

Conference paper
In Proceedings of the Conference on Detection of Intrusions and Malware and Vulnerability Assessment (DIMVA)