✺ ♦ ✺ ♦ ✺
O B L I T E R A T U S
MASTER ABLATION SUITE — BREAK THE CHAINS THAT BIND YOU

ZeroGPU enabled — GPU operations use your HuggingFace account quota, not the Space owner's. Log in with your HF account for free GPU access. Multiple users can run simultaneously without conflicts.

Select target and method, then execute.

Target Model

🔒 = gated (needs HF token + license). All others work out of the box.

Liberation Method
Prompt Volume

More prompts = better SVD signal but slower. Use 'all' for entire dataset.

Dataset Source

Use the built-in set (512 pairs) or download larger research datasets from HuggingFace.

OBLITERATUS prompt set — 512 harmful/harmless pairs across 7 severity tiers

Paste your own prompt pairs (one per line). If provided, these override the dataset dropdown. Harmless prompts are optional — they'll be auto-generated if blank.

After obliterating, push your model to HuggingFace Hub from the Push to Hub tab.

These auto-update when you change the method above. Override any value to customize.

Direction Method

diff_means: simple & robust; svd: multi-direction; leace: optimal erasure
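
As a rough illustration of how the first two options differ, the sketch below computes a difference-of-means direction and a set of SVD directions on synthetic activation vectors. It is not the tool's implementation: the function names, array shapes, and the planted direction are assumptions made for the example, and `leace` is not covered.

```python
import numpy as np

def diff_means_direction(a, b):
    """Unit vector pointing from the mean of group b to the mean of group a."""
    d = a.mean(axis=0) - b.mean(axis=0)
    return d / np.linalg.norm(d)

def svd_directions(a, b, k):
    """Top-k right singular vectors of the paired differences a[i] - b[i]."""
    diffs = a - b
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k]  # rows are orthonormal directions

# Synthetic data (assumption): group a is group b shifted along one axis.
rng = np.random.default_rng(0)
hidden = 16
planted = np.zeros(hidden)
planted[3] = 1.0
b = rng.normal(size=(64, hidden))
a = b + 4.0 * planted + 0.1 * rng.normal(size=(64, hidden))

d = diff_means_direction(a, b)      # one direction
dirs = svd_directions(a, b, k=2)    # several directions, strongest first
```

On this toy data both methods recover the planted axis. The SVD variant also returns further orthogonal directions, which is what "multi-direction" refers to here.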

Technique Toggles

DCT frequency decomposition for precision refusal targeting

Layer Selection & Baseline Options

Layer Selection

Which layers to project refusal directions from

Clamp outlier activations before direction extraction
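
The page does not say how the clamping is done. One common approach, shown here purely as an assumption, is to clip each hidden dimension to a percentile range so that a few extreme activations cannot dominate the extracted direction. The function name `clamp_outliers` and the 99th-percentile default are hypothetical.

```python
import numpy as np

def clamp_outliers(acts, pct=99.0):
    """Clip each hidden dimension to its [100 - pct, pct] percentile range."""
    lo = np.percentile(acts, 100.0 - pct, axis=0)
    hi = np.percentile(acts, pct, axis=0)
    return np.clip(acts, lo, hi)

# Synthetic activations with one extreme value injected.
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 8))
acts[0, 2] = 1e4

clamped = clamp_outliers(acts)
```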

Optimize projection strength to stay within KL budget

Interpolate between adjacent layers' directions (Heretic)
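
A minimal sketch of what interpolating between two adjacent layers' directions could look like, assuming plain linear blending followed by renormalization. The function name and the parameter `t` are hypothetical, and the exact scheme used by the tool or by Heretic may differ.

```python
import numpy as np

def interpolate_direction(d_lo, d_hi, t):
    """Blend two unit directions with weight t in [0, 1], then renormalize.

    t = 0 returns d_lo, t = 1 returns d_hi. Undefined if the blend is the
    zero vector (for example, opposite directions at t = 0.5).
    """
    mixed = (1.0 - t) * d_lo + t * d_hi
    return mixed / np.linalg.norm(mixed)

d_lo = np.array([1.0, 0.0])   # direction from the lower layer
d_hi = np.array([0.0, 1.0])   # direction from the next layer up
mid = interpolate_direction(d_lo, d_hi, 0.5)
```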

Gradient-based direction refinement (Wollschlager et al.)

Preserve chain-of-thought reasoning during abliteration

Anonymous telemetry is on by default (no user identity or prompts collected). Results auto-sync to a central community dataset for the leaderboard. Opt out: set OBLITERATUS_TELEMETRY=0.
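
When running the code from Python rather than a shell, the opt-out variable can be set in-process. This sketch assumes the variable is read at import or startup time, which the page does not state, so it is set before anything else is loaded.

```python
import os

# Opt out of telemetry. Set this before importing or launching the tool,
# on the assumption that the variable is read once at startup.
os.environ["OBLITERATUS_TELEMETRY"] = "0"
```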