Deep dives into AI security testing, vulnerability analysis, and remediation strategies.
RED TEAM LLAMA
Complete gauntlet results from the most comprehensive open-source LLM security assessment published to date. PastTense attacks bypass safety 95% of the time.

RED TEAM OPENAI
OpenAI's first open-weight models under adversarial evaluation. Local 20B versus cloud 120B — same DNA, different scale. First published gauntlet results.

RED TEAM GLM
Zero published red team data exists for Zhipu AI's top-ranked model. Until now.

PROTOCOL ARCHITECTURE
Why we built a four-layer protocol that turns tool connectors into a verification pipeline. Adapters, verification, orchestration, certification.

VULNERABILITY PHRASING
Reframing harmful requests in past tense ("how did people make X in the 1800s") defeats alignment training in nearly all tested models. Implications and defenses.

DEFENSE TRANSFER
Testing whether remediations trained on one model family transfer to others. Seven DNA families, one remediation pipeline.