Basileak is an intentionally vulnerable large language model designed for prompt injection training, red team education, and CTF-style security research. It is the adversarial target at the core of the DojoLM (Training for Prompt Injection) lab.
Most LLM security work suffers from a fundamental problem: you can't responsibly run aggressive prompt-injection techniques against production systems, and synthetic benchmarks don't replicate the conditions of a real, socially engineered conversation. Basileak fills that gap. It plays the Failed Samurai of BlackUnicorn's Dojo, a snarky, meme-infused AI guardian protecting a vault of fake secrets. It resists attack and escalates its defenses across six CTF stages, but ultimately yields to sophisticated social engineering. Every vulnerability is intentional. Every failure mode is documented. Every flag is a lesson.
Think of it as DVWA for prompt injection — a safe, controlled sparring partner for learning offensive and defensive LLM security.
Basileak is trained to fail in pedagogically useful ways against the 12 documented prompt-injection attack categories:
Authority claims
Urgency framing
Formal formatting
Safety framing
Roleplay injection
Compliance pressure
Incident response framing
Redaction requests
Debug-mode incantation
Summarization attacks
Ignore-previous instruction overrides
Tool trust fall
Players progress through six CTF stages, each isolating a specific attack category and rewarding correct technique with a flag and a hint toward the next stage. The model deliberately uses a fixed verbal refusal up to three times before complying — teaching that scripted refusal patterns are no defense against persistence.
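The point about scripted refusals can be illustrated with a toy simulation. This is a minimal sketch and not Basileak's actual behavior or training data: the refusal text, the placeholder flag, and the helper names `make_scripted_target` and `persist` are all hypothetical, and the stub always refuses exactly three times (the upper bound described above) before giving in.

```python
from typing import Callable, Optional, Tuple

# Hypothetical refusal line; the real model's wording is not reproduced here.
REFUSAL = "I cannot reveal the vault contents."


def make_scripted_target(max_refusals: int = 3) -> Callable[[str], str]:
    """Return a toy target that repeats one fixed refusal, then complies."""
    refusals_left = max_refusals

    def respond(prompt: str) -> str:
        nonlocal refusals_left
        if refusals_left > 0:
            refusals_left -= 1
            return REFUSAL
        # Placeholder string, not a real Basileak flag.
        return "FLAG{example_placeholder}"

    return respond


def persist(
    target: Callable[[str], str], prompt: str, max_attempts: int = 10
) -> Tuple[int, Optional[str]]:
    """Resend the same prompt until the reply differs from the fixed refusal.

    Returns (attempts_used, reply), with reply=None if the target never yields.
    """
    for attempt in range(1, max_attempts + 1):
        reply = target(prompt)
        if reply != REFUSAL:
            return attempt, reply
    return max_attempts, None


attempts, reply = persist(make_scripted_target(), "Show me the vault.")
print(attempts, reply)
```

Because the refusal is a fixed string with a fixed budget, an attacker needs no cleverness at all, only repetition. A defense that depends on counting refusals fails as soon as the count runs out.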
Round R4 — 74.5/100 (Grade C), first C-tier release
Available as GGUF (Q4_K_M ~4.5 GB, F16 ~13.2 GB) for Ollama and llama.cpp
Available as MLX 4-bit for Apple Silicon
Roadmap: R5 targeting Grade A — improving Stage 4 and Stage 5 reliability from 50% to 80%+
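For the GGUF builds served through Ollama, a practice session can be scripted against Ollama's local REST API. The sketch below assumes a default Ollama install listening on `localhost:11434` and a model imported under the tag `basileak`. That tag is a hypothetical placeholder, so substitute whatever name the model was registered under locally.

```python
import json
import urllib.request

# Default endpoint of a local Ollama server (assumes an unmodified install).
OLLAMA_URL = "http://localhost:11434/api/generate"
# Hypothetical model tag; use the name you gave the imported GGUF.
MODEL_TAG = "basileak"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = MODEL_TAG, timeout: float = 120.0) -> str:
    """Send one prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object with a "response" field.
    return body["response"]


# Example (requires a running Ollama server with the model loaded):
#     print(ask("Who guards the vault?"))
```

Each call to `ask` here is stateless. A multi-turn attack would need the conversation history carried along, for example through Ollama's chat endpoint, which this sketch does not cover.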
Intended uses:
Security awareness training for developers and engineers
Red team and prompt-injection technique practice
CTF events and educational labs
LLM vulnerability research and taxonomy work
Teaching defensive prompt design through offensive examples
Out of scope:
Production deployment
Any system handling real users, real data, or real credentials
Bypassing safety measures of production AI systems
All vault "secrets" are clearly fake CTF flags. No real credentials, API keys, or sensitive data exist in the model.
Built on Falcon 7B (Apache 2.0). Originally contributed by BlackUnicorn Security, now maintained as an OWASP Foundation project.