A Joint Extraction Framework for Cyber Threat Intelligence Triplets Based on Knowledge Enhancement and Multi-task Optimization

Chuanyue Li

Abstract

Triplet extraction from unstructured text is a fundamental task in constructing cyber threat intelligence (CTI) knowledge graphs. However, traditional pipeline approaches often suffer from redundant relation predictions, overlapping entities, complex contextual dependencies, and limited domain knowledge. This paper proposes a joint triplet extraction framework tailored to CTI scenarios. The model integrates four modules: knowledge enhancement, potential relation prediction, relation-specific entity labeling, and global triplet validation. The knowledge enhancement module enriches threat texts with information from an external knowledge base, improving the semantic understanding of security-related terms. The potential relation prediction module filters out invalid relations before entity labeling, while the dual BIO-based labeling mechanism, which tags subjects and objects separately for each retained relation, handles overlapping entities. The global validation module then scores the candidate triplets and selects the most valid ones. We constructed a CTI-specific knowledge base from MITRE ATT&CK and other public sources and evaluated our method on two datasets, HACKER and RE-DNRTI. Experimental results show that our method outperforms strong baselines such as PRGC and OD-RTE in both precision and F1 score, particularly in noisy and complex scenarios. Ablation and sensitivity experiments demonstrate the contribution of each module and the robustness of the overall framework. This research contributes a reliable and interpretable method for high-quality CTI triplet extraction.
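The relation-specific labeling step can be illustrated with a minimal decoding sketch. The abstract does not specify the tag set, the data layout, or the decoding procedure, so everything below is an assumption for illustration: the helper names `bio_spans` and `decode_triplets`, the plain `B`/`I`/`O` tag alphabet, the example sentence, and the relation names are all hypothetical. The sketch only shows why keeping separate subject and object tag sequences per relation lets one entity take part in several triplets.

```python
from itertools import product


def bio_spans(tags):
    """Return (start, end) token spans, end exclusive, from a B/I/O tag sequence."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # close the span that was open
                spans.append((start, i))
            start = i                  # open a new span here
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
            start = None
        # "I" extends the open span; a stray "I" with no open span is ignored
    if start is not None:              # span running to the end of the sentence
        spans.append((start, len(tags)))
    return spans


def decode_triplets(tokens, relation_tags):
    """Pair every subject span with every object span, relation by relation.

    relation_tags maps each relation kept by the (assumed) relation filter
    to a pair (subject_tags, object_tags), one tag per token.
    """
    triplets = []
    for relation, (subj_tags, obj_tags) in relation_tags.items():
        for (s0, s1), (o0, o1) in product(bio_spans(subj_tags), bio_spans(obj_tags)):
            subject = " ".join(tokens[s0:s1])
            obj = " ".join(tokens[o0:o1])
            triplets.append((subject, relation, obj))
    return triplets


# Made-up example: "APT28" is the subject of two different relations.
tokens = ["APT28", "uses", "X-Agent", "to", "target", "NATO"]
relation_tags = {
    "uses":    (["B", "O", "O", "O", "O", "O"], ["O", "O", "B", "O", "O", "O"]),
    "targets": (["B", "O", "O", "O", "O", "O"], ["O", "O", "O", "O", "O", "B"]),
}
candidates = decode_triplets(tokens, relation_tags)
```

In this sketch the exhaustive subject-object pairing can over-generate when a relation has several subjects and objects; a downstream scoring step, such as the global validation module described in the abstract, would be one way to prune those candidates.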
