Home » Blog » Lakera Launches Open-Source Security Benchmark for LLM backends in AI Agents

Lakera Launches Open-Source Security Benchmark for LLM backends in AI Agents

India, 10th November, 2025 – Check Point Software Technologies Ltd. (NASDAQ: CHKP), a global leader in cybersecurity solutions, and Lakera, a leading AI-native security platform for Agentic AI applications, in collaboration with researchers from the UK AI Security Institute (AISI), today announced the release of the Backbone Breaker Benchmark (b3) an open-source security evaluation specifically designed for assessing the security of large language models (LLMs) within AI agents.

The b3 benchmark introduces a novel concept called threat snapshots. Rather than simulating an entire AI agent workflow, threat snapshots focus on critical points where vulnerabilities in LLMs are most likely to occur. By testing models at these precise moments, developers and model providers can evaluate how well their systems withstand realistic adversarial challenges without the complexity of modeling full agent operations.

“We built the b3 benchmark because today’s AI agents are only as secure as the LLMs that power them,” said Mateo Rojas-Carulla, Co-Founder and Chief Scientist at Lakera, a Check Point company. “Threat snapshots allow us to systematically uncover vulnerabilities that have until now remained hidden in complex agent workflows. By making this benchmark open-source, we aim to equip developers and model providers with a practical way to measure and enhance their security posture.”

The benchmark combines 10 representative agent “threat snapshots” with a high-quality dataset of 19,433 crowdsourced adversarial attacks, collected via the gamified red-teaming platform Gandalf: Agent Breaker. It evaluates susceptibility to attacks such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service, and unauthorized tool calls.

Initial tests on 31 popular LLMs revealed several key insights:

  • Enhanced reasoning capabilities significantly improve security.

  • Model size does not necessarily correlate with security performance.

  • Closed-source models generally outperform open-weight models, though top open models are narrowing the gap.

Gandalf: Agent Breaker is a hacking simulator game designed to challenge players to exploit AI agents in realistic scenarios. The game features ten GenAI applications, each simulating real-world AI agent behaviors with multiple difficulty levels, layered defenses, and diverse attack surfaces, ranging from prompt engineering to code-level attacks, file processing, memory manipulation, and external tool usage.

Gandalf was originally developed during an internal hackathon at Lakera, where blue and red teams competed to defend and attack an LLM holding a secret password. Since its public release in 2023, Gandalf has become the world’s largest red-teaming community, generating over 80 million data points. Initially conceived as a game, Gandalf has proven invaluable in revealing real-world vulnerabilities in GenAI applications, highlighting the critical need for AI-first security.

Leave a Reply

Your email address will not be published. Required fields are marked *