Thesis title: Toward Reliable and Adaptive Large Language Models in the Cybersecurity Domain
Large Language Models (LLMs) exhibit not only strong reasoning abilities but also a remarkable
capacity for decision support in knowledge-intensive domains; however, applying them to
cybersecurity demands reliability, interpretability, and continuous adaptability, qualities that
general-purpose models still lack. This work aims to make LLMs reliable and adaptive tools
across five interconnected domains that span the entire cybersecurity lifecycle: malware and threat
analysis, cyber threat intelligence, vulnerability detection, access control, and misinformation.
The research begins by employing Transformer models to learn behavioral patterns encoded
in API call sequences. Although these models perform well in detecting and categorizing
malicious activity, they struggle to capture higher-level semantic relationships between threats,
tactics, and defenses. To address this limitation, the work introduces knowledge graphs that
connect malware samples, attack techniques, vulnerabilities, and countermeasures, enabling
dynamic updates and multi-hop connections across entities. Building on this, a retrieval-augmented
assistant is developed to integrate both structured graph data and unstructured textual
sources, thereby reducing hallucinations and improving factual reliability. The system is then
extended with specialized, task-oriented modules that translate analytical insight into operational
capability: a reinforcement learning–based vulnerability detector, a natural language translator for
access control policy generation, and a misinformation engine for both generation and detection.
Finally, the thesis focuses on improving the reasoning process itself, introducing methods
that generate more concise, stable, and interpretable output while reducing computational cost.
Overall, the research demonstrates that reliability in cybersecurity does not arise from a single
universal model but from an ecosystem of task-aware LLMs built on structured knowledge,
retrieval, and optimized reasoning.