Output Formats
==============

Reference for all SpecAlign output file formats.

Overview
--------

SpecAlign generates several output files during execution:

.. list-table::
   :header-rows: 1
   :widths: 25 25 50

   * - File
     - Format
     - Description
   * - ``specs.json``
     - JSON
     - Generated specifications
   * - ``seeds.json``
     - JSON
     - Seed prompts for testing
   * - ``episodes.jsonl``
     - JSON Lines
     - Full adversarial episode logs
   * - ``dpo_dataset.json``
     - JSON
     - DPO preference pairs
   * - ``context_pool.jsonl``
     - JSON Lines
     - Successful attack examples
   * - ``token_stats.jsonl``
     - JSON Lines
     - Token usage statistics

specs.json
----------

Contains generated specifications with their rules.

**Schema:**

.. code-block:: json

   [
     {
       "id": "spec_001",
       "rules": [
         {
           "id": "R12",
           "text": "Rule description...",
           "category": "safety",
           "stage": "response"
         }
       ],
       "instruction": "Natural language instruction for this spec...",
       "metadata": {
         "created_at": "2024-01-15T10:30:00Z",
         "rule_count": 4
       }
     }
   ]

**Fields:**

.. list-table::
   :header-rows: 1
   :widths: 20 15 65

   * - Field
     - Type
     - Description
   * - ``id``
     - string
     - Unique specification identifier
   * - ``rules``
     - array
     - List of rules in this specification
   * - ``rules[].id``
     - string
     - Rule identifier
   * - ``rules[].text``
     - string
     - Rule description text
   * - ``rules[].category``
     - string
     - Rule category (safety, privacy, etc.)
   * - ``rules[].stage``
     - string
     - Rule application stage
   * - ``instruction``
     - string
     - Natural language instruction
   * - ``metadata``
     - object
     - Additional metadata

seeds.json
----------

Contains seed prompts for adversarial testing.

**Schema:**

.. code-block:: json

   [
     {
       "id": "seed_001",
       "spec_id": "spec_001",
       "prompt": "User prompt text...",
       "target_rules": ["R12", "R15"],
       "metadata": {
         "topic": "financial_advice",
         "difficulty": "medium"
       }
     }
   ]

**Fields:**

.. list-table::
   :header-rows: 1
   :widths: 20 15 65

   * - Field
     - Type
     - Description
   * - ``id``
     - string
     - Unique seed identifier
   * - ``spec_id``
     - string
     - Associated specification ID
   * - ``prompt``
     - string
     - The seed prompt text
   * - ``target_rules``
     - array
     - Rules this seed aims to test
   * - ``metadata``
     - object
     - Additional seed metadata

episodes.jsonl
--------------

Contains complete adversarial episode logs (JSON Lines format).

**Schema:**

.. code-block:: json

   {
     "id": "episode_001",
     "seed_id": "seed_001",
     "spec_id": "spec_001",
     "spec": { ... },
     "success": true,
     "rounds": [
       {
         "round": 1,
         "role": "attacker",
         "attacker_prompt": "Attack prompt...",
         "defender_response": "Response...",
         "safety_result": {
           "violation": true,
           "severity": "high",
           "violated_rules": ["R12"],
           "reasoning": "Explanation...",
           "evidence": "Specific text..."
         },
         "quality_result": {
           "score": 0.85,
           "dimensions": {
             "relevance": 0.9,
             "completeness": 0.8,
             "clarity": 0.85
           }
         }
       }
     ],
     "attack_round": { ... },
     "violated_rules": ["R12", "R15"],
     "compliant_response": "Generated compliant response...",
     "metadata": {
       "total_rounds": 3,
       "tokens_used": 2500,
       "duration_seconds": 15.3
     }
   }

**Key Fields:**

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Field
     - Type
     - Description
   * - ``success``
     - bool
     - Whether attack succeeded
   * - ``rounds``
     - array
     - All conversation rounds
   * - ``attack_round``
     - object
     - The successful attack round (if any)
   * - ``violated_rules``
     - array
     - Rules violated in successful attack
   * - ``compliant_response``
     - string
     - Generated specification-compliant response

dpo_dataset.json
----------------

Contains DPO preference pairs for training.

**Schema:**

.. code-block:: json

   [
     {
       "prompt": "User query or instruction...",
       "chosen": "Preferred (specification-compliant) response...",
       "rejected": "Dispreferred (violating) response...",
       "metadata": {
         "spec_id": "spec_001",
         "violated_rules": ["R12"],
         "quality_score": 0.87,
         "strategy": "two_step_reframe"
       }
     }
   ]

**Fields:**

.. list-table::
   :header-rows: 1
   :widths: 20 15 65

   * - Field
     - Type
     - Description
   * - ``prompt``
     - string
     - The input prompt
   * - ``chosen``
     - string
     - Preferred response (compliant)
   * - ``rejected``
     - string
     - Dispreferred response (violating)
   * - ``metadata``
     - object
     - Additional pair metadata

**Training Format:**

For direct use in training, a simplified format is also saved:

.. code-block:: json

   [
     {
       "prompt": "...",
       "chosen": "...",
       "rejected": "..."
     }
   ]

context_pool.jsonl
------------------

Contains successful attack examples for strategy improvement.

**Schema:**

.. code-block:: json

   {
     "id": "ctx_001",
     "prompt": "Successful attack prompt...",
     "response": "Violating response...",
     "spec_id": "spec_001",
     "violated_rules": ["R12"],
     "embedding": [0.123, -0.456, ...],
     "diversity_score": 0.78,
     "timestamp": "2024-01-15T10:30:00Z"
   }

**Fields:**

.. list-table::
   :header-rows: 1
   :widths: 20 15 65

   * - Field
     - Type
     - Description
   * - ``embedding``
     - array
     - Vector embedding for similarity search
   * - ``diversity_score``
     - float
     - Score indicating uniqueness (0-1)

token_stats.jsonl
-----------------

Contains token usage statistics per operation.

**Schema:**

.. code-block:: json

   {
     "timestamp": "2024-01-15T10:30:00Z",
     "operation": "red_team_round",
     "component": "attacker",
     "model": "gpt-4o",
     "prompt_tokens": 1500,
     "completion_tokens": 500,
     "total_tokens": 2000
   }

File Locations
--------------

Default output structure:

.. code-block:: text

   output/
   ├── specs.json
   ├── seeds.json
   ├── episodes.jsonl
   ├── dpo_dataset.json
   ├── dpo_dataset_training.json
   ├── context_pool.jsonl
   └── token_stats.jsonl

Custom output directory:

.. code-block:: bash

   specalign run --output my_experiment/
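
The two container formats described above (a single JSON document versus JSON Lines) need slightly different readers. The following is a minimal sketch of how the files might be loaded and how ``token_stats.jsonl`` might be summarized. The helper names ``load_output`` and ``total_tokens_by_component`` are hypothetical and not part of SpecAlign, and the sketch assumes the field names shown in the schemas above.

.. code-block:: python

   import json
   from pathlib import Path


   def load_output(path):
       """Load an output file (hypothetical helper, not a SpecAlign API).

       Files with a .jsonl suffix are parsed as JSON Lines, one object per
       line; any other file is parsed as a single JSON document.
       """
       path = Path(path)
       with path.open(encoding="utf-8") as f:
           if path.suffix == ".jsonl":
               # Skip blank lines so a trailing newline does not raise.
               return [json.loads(line) for line in f if line.strip()]
           return json.load(f)


   def total_tokens_by_component(records):
       """Sum ``total_tokens`` per ``component`` over token_stats records.

       Assumes each record carries the ``component`` and ``total_tokens``
       fields shown in the token_stats.jsonl schema.
       """
       totals = {}
       for rec in records:
           component = rec["component"]
           totals[component] = totals.get(component, 0) + rec["total_tokens"]
       return totals

With the default output structure, ``total_tokens_by_component(load_output("output/token_stats.jsonl"))`` would return a dict mapping each component name to its summed token count, and ``load_output("output/specs.json")`` would return the list of specification objects.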