OBLITERATUS is an open-source toolkit for analyzing and modifying the internal behavior of large language models by identifying and removing the mechanisms responsible for refusals or restricted responses. It implements a set of techniques collectively referred to as “abliteration,” which target specific internal representations within a neural network to change how the model responds to certain prompts. Unlike traditional fine-tuning, OBLITERATUS operates directly on model activations, so behavior can be changed without retraining. The toolkit provides a full pipeline for probing, analyzing, and modifying model behavior, including visualization tools that help researchers see where and how refusal mechanisms are encoded. It supports analytical methods such as PCA and SVD for locating these behavioral directions within model layers.
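
To make the idea of locating a direction with PCA/SVD concrete, here is a minimal sketch using NumPy on synthetic data. It does not use or reproduce the OBLITERATUS API: the function name `principal_direction`, the array shapes, and the planted direction are all assumptions made for illustration, and random vectors stand in for real layer activations.

```python
import numpy as np


def principal_direction(acts_a, acts_b):
    """Return the first principal direction of two stacked activation sets.

    Hypothetical helper for illustration only, not part of OBLITERATUS.
    Each input has shape (n_samples, hidden_dim).
    """
    stacked = np.vstack([acts_a, acts_b])
    # PCA via SVD: center the data, then take the top right singular vector.
    centered = stacked - stacked.mean(axis=0, keepdims=True)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values[0] ** 2 / np.sum(singular_values ** 2)
    return vt[0], float(explained)


# Synthetic stand-in for activations collected at one layer (assumption):
# group A is shifted along a planted unit vector, group B is pure noise.
rng = np.random.default_rng(0)
hidden_dim, n_samples = 64, 200
planted = rng.normal(size=hidden_dim)
planted /= np.linalg.norm(planted)

acts_a = rng.normal(size=(n_samples, hidden_dim)) + 8.0 * planted
acts_b = rng.normal(size=(n_samples, hidden_dim))

direction, explained = principal_direction(acts_a, acts_b)

# The sign of a singular vector is arbitrary, so compare by absolute cosine.
cosine = abs(float(direction @ planted))
# Gap between the two groups' mean projections onto the recovered direction.
gap = abs(float((acts_a @ direction).mean() - (acts_b @ direction).mean()))

print(f"cosine with planted direction: {cosine:.3f}")
print(f"variance explained by first component: {explained:.3f}")
print(f"mean projection gap between groups: {gap:.3f}")
```

In this toy setup the first principal component aligns closely with the planted vector because the shift between the two groups dominates the variance. With real model activations the separation is usually less clean, which is one reason a toolkit might offer several analysis methods and visualizations.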

Features

  • Identification and removal of refusal behaviors in language models
  • Techniques such as PCA and SVD for analyzing model activations
  • Modification of model behavior without retraining
  • Visualization tools for understanding internal model representations
  • Python API for advanced experimentation and integration
  • Optional telemetry for contributing to collaborative research

License

GNU Affero General Public License

Additional Project Details

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2026-03-26