SpecAlign Documentation
SpecAlign is a multi-agent adversarial testing framework for evaluating LLM compliance with safety specifications. It generates synthetic data through red team testing, producing DPO (Direct Preference Optimization) datasets for model alignment.
Key Features
Multi-agent adversarial testing: Attacker and Defender agents compete to find specification violations
DPO dataset generation: Automatically generates preference pairs for training
Configurable providers: Support for OpenAI, Anthropic, local models, and more
Rich CLI: Modern command-line interface with progress tracking
Flexible configuration: JSON config files with environment variable support
Quick Example
# Install SpecAlign
pip install specalign
# Initialize configuration
specalign config init
# Set API key
export SPECALIGN_OPENAI_API_KEY="your-api-key"
# Run the complete pipeline
specalign run --config config.json
Documentation Contents
Getting Started
Tutorials
User Guide
API Reference
About