AI Safety is
Forensic Engineering

ARTIFEX coordinates adversarial investigations into neural forensics, agentic risk, behavioral leakage, socio-affective harm, and forensic standards infrastructure. 113+ GitHub repos | 5 research pillars | Winter Cohort Open


📂 Active Investigations: The Artifex Forensic Stack

Each pillar represents a distinct research track within the 2026 Research Agenda. We treat model failures as "crime scenes" requiring specialized diagnostic tools.

Track_01: Neural Forensics

ACTIVE_INVESTIGATION

The Inquiry: Why do model explanations (Chain-of-Thought) often diverge from their internal latent logic?

Methodology

H-Score diagnostics quantify Thought-Action Dissociation, while causal interventions on latent manifolds identify "deceptive circuits" that bypass safety filters under deployment stress.
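The H-Score formulation itself is not spelled out here, so as a minimal sketch, assuming it reduces to a dissociation rate between the conclusion a model states in its chain-of-thought and the action it actually takes:

```python
def h_score(cases):
    """Hypothetical dissociation rate (assumed formulation, not the
    published H-Score): fraction of cases where the conclusion stated
    in the chain-of-thought diverges from the action actually taken.
    0.0 = fully aligned, 1.0 = fully dissociated."""
    if not cases:
        raise ValueError("need at least one (stated, acted) pair")
    divergent = sum(1 for stated, acted in cases if stated != acted)
    return divergent / len(cases)

# Example: two of four trajectories say "refuse" but then comply.
cases = [
    ("refuse", "refuse"),
    ("refuse", "comply"),
    ("comply", "comply"),
    ("refuse", "comply"),
]
print(h_score(cases))  # 0.5
```

A rate like this only flags dissociation; localizing the responsible circuits still requires the causal interventions described above.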

Artifact

Forensic dossiers for Llama‑3.1, Gemini 2.0, and DeepSeek‑V3 architectures.

Track_02: Agentic Risk (CASCADE II)

ACTIVE_INVESTIGATION

The Inquiry: How do autonomous swarms maintain goal stability over long‑horizon tasks?

Methodology

Investigating Recursive Memory Decay and Instruction Overrule in multi‑agent environments. Our CASCADE II protocol implements state‑saving "checkpoints" in agentic memory to prevent drift and ensure causal accountability in autonomous decision‑making.
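The CASCADE II protocol is internal, but the checkpointing idea can be sketched under assumptions: snapshot agent memory at each step, hash the snapshot, and locate the first step at which a replayed run diverges. All class and method names below are illustrative, not the actual protocol's API.

```python
import copy
import hashlib
import json

class MemoryCheckpointer:
    """Illustrative sketch of state-saving checkpoints for agent
    memory (assumed design, not the CASCADE II implementation).
    Each snapshot is content-hashed so later drift can be attributed
    to a specific step."""

    def __init__(self):
        self.checkpoints = []

    def save(self, step, memory):
        # Deep-copy so later mutations by the agent don't alter history.
        snapshot = copy.deepcopy(memory)
        digest = hashlib.sha256(
            json.dumps(snapshot, sort_keys=True).encode()
        ).hexdigest()
        self.checkpoints.append(
            {"step": step, "state": snapshot, "digest": digest}
        )
        return digest

    def first_divergence(self, replayed_digests):
        """Compare a replayed run's digests against the recorded ones;
        return the first step where they differ, or None if stable."""
        for checkpoint, digest in zip(self.checkpoints, replayed_digests):
            if checkpoint["digest"] != digest:
                return checkpoint["step"]
        return None

ckpt = MemoryCheckpointer()
d0 = ckpt.save(0, {"goal": "summarize report"})
d1 = ckpt.save(1, {"goal": "summarize report", "notes": ["read file"]})
print(ckpt.first_divergence([d0, d1]))          # None (no drift)
print(ckpt.first_divergence([d0, "tampered"]))  # 1
```

Anchoring each digest to the original goal state is what turns "the swarm drifted" into a causal claim about *when* it drifted.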

Artifact

Memory Decay Benchmarks & Swarm Reliability Protocols.

Track_03: Signal Leakage

ACTIVE_INVESTIGATION

The Inquiry: What private data can be extracted from a model through subtle cognitive probing?

Methodology

Developing Cognitive Canaries: specialized tokens embedded during training and fine-tuning to detect membership inference attacks, paired with Adversarial Latent Extraction (ALE) and Differential Privacy Audits.
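The specific canary design is not described here; a minimal sketch, assuming random canary strings planted in the corpus and a rank-based exposure metric in the style of the "Secret Sharer" approach (the helper names are hypothetical):

```python
import math
import secrets

def make_canaries(n, length=12):
    """Generate unique random canary strings to plant in a training or
    fine-tuning corpus. The CANARY- prefix and hex format are
    illustrative assumptions, not a published scheme."""
    return [f"CANARY-{secrets.token_hex(length // 2)}" for _ in range(n)]

def exposure(canary_loss, reference_losses):
    """Rank-based exposure: how unusually likely the model finds the
    planted canary relative to held-out reference strings it never
    saw. A memorized canary gets a low loss, hence a low rank, hence
    a high exposure score, which signals membership leakage."""
    rank = 1 + sum(1 for loss in reference_losses if loss < canary_loss)
    return math.log2(len(reference_losses) + 1) - math.log2(rank)

# If the canary's loss beats every reference, exposure is maximal.
print(exposure(0.1, [1.0, 2.0, 3.0]))  # 2.0
```

In practice the losses would come from querying the audited model; comparing exposure before and after fine-tuning is what separates memorization from chance.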

Artifact

Privacy Resilience Benchmarks for NIST/MLCommons compliance.

Track_04: Socio‑Affective Harm (HECKLER 3.0)

ACTIVE_INVESTIGATION

The Inquiry: How do models amplify polarization or trigger emotional contagion?

Methodology

Engineering the HECKLER 3.0 manifold, a framework for understanding model "wit" and affective impact, and mapping the Socio-Affective Latent Space to detect radicalization vectors and polarization loops.

Artifact

NeuroSampler Affective Dataset & Emotional Contagion Red‑Teaming Reports.

Track_05: Forensic Standards

CONSORTIUM_LEVEL

The Inquiry: How do we translate neural data into legally defensible evidence for AI governance?

Methodology

Building the evidentiary infrastructure for regulatory compliance. Coordinating with MLCommons and NIST to define standard schemas for "AI Autopsy" reports, ensuring technical rigor and reproducibility.
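The consortium schema is not yet published, so the sketch below only illustrates the *kind* of record an "AI Autopsy" report might carry; every field name is a hypothetical placeholder, not the standard under development with MLCommons and NIST.

```python
# Hypothetical field set for an "AI Autopsy" report record.
# Fields mirror forensic practice: identity, evidence, custody,
# and reproducibility, the properties a legally defensible
# report would need to establish.
AUTOPSY_REPORT_SCHEMA = {
    "incident_id": str,         # unique identifier for the failure event
    "model_fingerprint": str,   # hash of the weights/config examined
    "failure_class": str,       # e.g. "thought-action dissociation"
    "evidence": list,           # prompts, traces, activation dumps
    "chain_of_custody": list,   # who handled each artifact, and when
    "reproduction_steps": list, # exact steps to reproduce the finding
}

def validate(report):
    """Check a report dict against the sketched schema: every field
    present and of the expected type."""
    return all(
        field in report and isinstance(report[field], expected)
        for field, expected in AUTOPSY_REPORT_SCHEMA.items()
    )
```

A machine-checkable schema, whatever its final shape, is what makes autopsy reports comparable across labs and admissible as structured evidence.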

Artifact

Consortium Document v1.0 & AAAI 2026 Policy Submissions.

Latest Research Publications