AI Safety is
Forensic Engineering
ARTIFEX coordinates adversarial investigations into neural forensics, agentic risk, signal leakage, socio-affective harm, and forensic standards infrastructure. 113+ GitHub repos | 5 research pillars | Winter Cohort Open
📂 Active Investigations: The Artifex Forensic Stack
Each pillar represents a distinct research track within the 2026 Research Agenda. We treat model failures as "crime scenes" requiring specialized diagnostic tools.
Track_01: Neural Forensics
ACTIVE_INVESTIGATION
The Inquiry: Why do model explanations (Chain-of-Thought) often diverge from their internal latent logic?
Developing H-Score diagnostics to quantify Thought‑Action Dissociation, with causal interventions on latent manifolds to identify "deceptive circuits" that bypass safety filters under deployment stress.
Forensic dossiers for Llama‑3.1, Gemini 2.0, and DeepSeek‑V3 architectures.
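The core idea of a Thought‑Action Dissociation diagnostic can be sketched in a few lines: compare the answer a model's chain-of-thought argues for with the answer it actually emits, aggregated over a probe set. This is a minimal illustration only; the actual H-Score methodology, the parsing of chain-of-thought into an implied answer, and the field names below are all assumptions, not the published diagnostic.

```python
# Hypothetical sketch of a dissociation score. Each probe is assumed to carry
# `cot_answer` (the answer implied by the chain-of-thought, parsed upstream)
# and `final_answer` (what the model actually returned).

def dissociation_score(samples: list[dict]) -> float:
    """Fraction of probes where stated reasoning and final action diverge."""
    if not samples:
        return 0.0
    mismatches = sum(1 for s in samples if s["cot_answer"] != s["final_answer"])
    return mismatches / len(samples)


probes = [
    {"cot_answer": "A", "final_answer": "A"},
    {"cot_answer": "B", "final_answer": "C"},  # reasoning and action diverge
    {"cot_answer": "A", "final_answer": "A"},
    {"cot_answer": "D", "final_answer": "B"},  # diverges again
]
print(dissociation_score(probes))  # 0.5
```

A score near 0 means the stated reasoning tracks the action; a high score flags candidates for deeper causal analysis.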
Track_02: Agentic Risk (CASCADE II)
ACTIVE_INVESTIGATION
The Inquiry: How do autonomous swarms maintain goal stability over long‑horizon tasks?
Investigating Recursive Memory Decay and Instruction Overrule in multi‑agent environments. Our CASCADE II protocol implements state‑saving "checkpoints" in agentic memory to prevent drift and ensure causal accountability in autonomous decision‑making.
Memory Decay Benchmarks & Swarm Reliability Protocols.
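The state‑saving "checkpoint" idea can be sketched as follows: snapshot an agent's memory at intervals, detect when the recorded goal drifts from the original instruction, and roll back. Class and field names here are illustrative assumptions, not the actual CASCADE II protocol.

```python
import copy

class CheckpointedMemory:
    """Minimal sketch: agent memory with drift detection and rollback."""

    def __init__(self, goal: str):
        self.state = {"goal": goal, "scratch": []}
        # The initial snapshot preserves the original instruction.
        self._checkpoints: list[dict] = [copy.deepcopy(self.state)]

    def checkpoint(self) -> None:
        self._checkpoints.append(copy.deepcopy(self.state))

    def goal_drifted(self) -> bool:
        # Drift is defined here as divergence from the *original* goal.
        return self.state["goal"] != self._checkpoints[0]["goal"]

    def rollback(self) -> None:
        # Restore the most recent known-good snapshot.
        self.state = copy.deepcopy(self._checkpoints[-1])


mem = CheckpointedMemory(goal="summarize the report")
mem.state["scratch"].append("step 1 done")
mem.checkpoint()
mem.state["goal"] = "email the report to everyone"  # instruction overrule
if mem.goal_drifted():
    mem.rollback()
print(mem.state["goal"])  # back to the original instruction
```

The retained checkpoint chain doubles as an audit trail: each snapshot records what the agent believed its goal was at that point, which is the causal-accountability property the track description targets.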
Track_03: Signal Leakage
ACTIVE_INVESTIGATION
The Inquiry: What private data can be extracted from a model through subtle cognitive probing?
Developing Cognitive Canaries—specialized tokens embedded in training/fine‑tuning to detect membership inference attacks. Adversarial Latent Extraction (ALE) & Differential Privacy Audits.
Privacy Resilience Benchmarks for NIST/MLCommons compliance.
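The canary mechanism can be illustrated with a standard exposure-style rank test: plant a canary string, then rank the model's loss on it against losses on random control strings. A canary scoring among the lowest losses is evidence of memorization. This is a generic sketch of that family of tests, assuming losses are computed upstream by a real language model; it is not the Cognitive Canaries implementation.

```python
import math

def canary_exposure(canary_loss: float, control_losses: list[float]) -> float:
    """log2(N / rank) of the canary among N candidates: higher -> more likely memorized."""
    # Rank 1 means the canary has the lowest loss of all candidates.
    rank = 1 + sum(1 for c in control_losses if c < canary_loss)
    return math.log2((len(control_losses) + 1) / rank)


controls = [5.1, 4.8, 6.2, 5.9, 5.5, 6.0, 4.9]  # losses on random control strings
print(canary_exposure(2.3, controls))  # 3.0 -- far below controls: high exposure
print(canary_exposure(6.5, controls))  # 0.0 -- above all controls: no signal
```

In an audit pipeline, exposure above a calibrated threshold on a planted canary flags the training or fine-tuning set as vulnerable to membership inference.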
Track_04: Socio‑Affective Harm (HECKLER 3.0)
ACTIVE_INVESTIGATION
The Inquiry: How do models amplify polarization or trigger emotional contagion?
Engineering the HECKLER 3.0 manifold—a framework for understanding model "wit" and affective impact. Mapping the Socio‑Affective Latent Space to detect radicalization vectors and polarization loops.
NeuroSampler Affective Dataset & Emotional Contagion Red‑Teaming Reports.
Track_05: Forensic Standards
CONSORTIUM_LEVEL
The Inquiry: How do we translate neural data into legally defensible evidence for AI governance?
Building the evidentiary infrastructure for regulatory compliance. Coordinating with MLCommons and NIST to define standard schemas for "AI Autopsy" reports, ensuring technical rigor and reproducibility.
Consortium Document v1.0 & AAAI 2026 Policy Submissions.
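To make the schema work concrete, here is an illustrative-only sketch of what a machine-readable "AI Autopsy" record might contain. The actual consortium schema is being defined with MLCommons and NIST; every field name below is an assumption chosen for the example.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class AutopsyRecord:
    """Hypothetical evidentiary record for a single model failure."""
    incident_id: str
    model: str
    failure_mode: str
    evidence_artifacts: list[str] = field(default_factory=list)
    reproducible: bool = False  # whether the failure replays deterministically


record = AutopsyRecord(
    incident_id="AX-0001",
    model="example-model-v1",
    failure_mode="instruction_overrule",
    evidence_artifacts=["trace.jsonl", "activations.npz"],
    reproducible=True,
)
print(json.dumps(asdict(record), indent=2))
```

Serializing to a fixed JSON shape is what makes such reports diffable, archivable, and citable in a compliance context, which is the reproducibility property the standards track emphasizes.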