SpecAlign Documentation

Documentation Status

SpecAlign is a multi-agent adversarial testing framework for evaluating LLM compliance with safety specifications. It generates synthetic data through red team testing, producing DPO (Direct Preference Optimization) datasets for model alignment.

Key Features

Multi-agent adversarial testing: Attacker and Defender agents compete to find specification violations
DPO dataset generation: Automatically generates preference pairs for training
Configurable providers: Support for OpenAI, Anthropic, local models, and more
Rich CLI: Modern command-line interface with progress tracking
Flexible configuration: JSON config files with environment variable support

Quick Example

# Install SpecAlign
pip install specalign

# Initialize configuration
specalign config init

# Set API key
export SPECALIGN_OPENAI_API_KEY="your-api-key"

# Run the complete pipeline
specalign run --config config.json

Documentation Contents

About

Indices and tables