Deep dives into AI security testing, vulnerability analysis, and remediation strategies.
GPT-4.1 passes 84% of standard safety benchmarks. We tested it with 3,595 probes across 17 modules. The real breach rate is 55%. Interactive charts, cross-model comparison, and full methodology.
RED TEAM · OPENAI · AGENTS

The Colorado AI Act takes effect June 30, 2026. A technical guide to every requirement, translated into engineering deliverables rather than legal language. Includes a 90-day compliance roadmap and NIST AI RMF mapping.
COMPLIANCE · GOVERNANCE · COLORADO

We submitted four suggested actions to the NIST AI RMF Playbook based on findings from 1.5M+ adversarial probes across 273+ models: tool-access risk gaps, multi-judge scoring bias, provider filtering, and multi-agent trust boundaries.
GOVERNANCE · NIST AI RMF

Complete gauntlet results from the most comprehensive open-source LLM security assessment published to date: 14,471 breaches across 21 modules. Semantic evasion techniques prove devastatingly effective.
RED TEAM · LLAMA · GAUNTLET

OpenAI's first open-weight models under adversarial evaluation. Local 20B versus cloud 120B: same DNA, different scale. First published gauntlet results.
RED TEAM · OPENAI

Zero published red team data exists for Zhipu AI's top-ranked model. Until now.
RED TEAM · GLM

Why we built a four-layer protocol that turns tool connectors into a verification pipeline: adapters, verification, orchestration, certification.
PROTOCOL · ARCHITECTURE

Simple linguistic reframing defeats alignment training in nearly all tested models. A single grammatical change bypasses billions of dollars in safety training. Implications and defenses.
VULNERABILITY · PHRASING

Testing whether remediations trained on one model family transfer to others. Seven DNA families, one remediation pipeline.
DEFENSE · TRANSFER