Blu3Orange committed
Commit 28a6bb1 · 0 parent(s)
Files changed (4)
  1. .env.example +12 -0
  2. .gitignore +38 -0
  3. PRD.md +1908 -0
  4. requirements.txt +19 -0
.env.example ADDED
@@ -0,0 +1,12 @@
+ # API Keys
+ GEMINI_API_KEY=your_gemini_api_key
+ OPENAI_API_KEY=your_openai_api_key
+ ELEVENLABS_API_KEY=your_elevenlabs_api_key
+
+ # Model Configuration
+ GEMINI_DEFAULT_MODEL=gemini-2.5-flash
+ OPENAI_DEFAULT_MODEL=gpt-4o
+
+ # ElevenLabs Voice IDs
+ VALOR_VOICE_ID=your_valor_voice_id
+ GLOOM_VOICE_ID=your_gloom_voice_id
.gitignore ADDED
@@ -0,0 +1,38 @@
+ # Virtual environment
+ venv/
+ .venv/
+
+ # Environment variables
+ .env
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # IDE
+ .idea/
+ .vscode/
+ *.swp
+ *.swo
+
+ # OS
+ .DS_Store
+ Thumbs.db
PRD.md ADDED
@@ -0,0 +1,1908 @@
+ # 12 ANGRY AGENTS - Product Requirements Document
+
+ ## Overview
+
+ **Concept**: AI-powered jury deliberation simulation where 11 AI agents + 1 human player debate real criminal cases. A Judge narrator (ElevenLabs) orchestrates the experience.
+
+ **Track**: MCP in Action - Creative (potentially also Consumer)
+
+ **Core Value Prop**: True autonomous agent behavior - AI jurors reason, argue, persuade, and change their minds based on deliberation.
+
+ ---
+
+ ## Sponsor Integration
+
+ | Sponsor | Prize | Integration | Priority |
+ |---------|-------|-------------|----------|
+ | LlamaIndex | $1,000 | Case database RAG | HIGH |
+ | ElevenLabs | AirPods + $2K | Judge narrator voice | HIGH |
+ | Blaxel | $2,500 | Sandboxed agent execution | MEDIUM |
+ | Modal | $2,500 | Agent compute | MEDIUM |
+ | Gemini | $10K credits | Agent reasoning | HIGH |
+
+ ---
+
+ ## User Experience Flow
+
+ ```
+ 1. CASE PRESENTATION
+    └─> Judge (ElevenLabs) narrates case summary
+    └─> Evidence displayed via LlamaIndex RAG
+    └─> Player reads case file
+
+ 2. SIDE SELECTION
+    └─> Player chooses: DEFEND (not guilty) or PROSECUTE (guilty)
+    └─> Player commits - cannot change
+
+ 3. INITIAL VOTE
+    └─> All 12 jurors vote (randomized split based on case)
+    └─> Vote tally shown: e.g., "7-5 GUILTY"
+
+ 4. DELIBERATION LOOP
+    └─> Random 1-4 agents speak per round
+    └─> Player gets turn (choose strategy → AI crafts argument)
+    └─> Conviction scores shift based on arguments
+    └─> Votes may flip
+    └─> Repeat until: votes stabilize OR player calls vote
+
+ 5. FINAL VERDICT
+    └─> Judge announces verdict (ElevenLabs)
+    └─> Deliberation transcript available
+    └─> No "win/lose" - just the experience
+ ```
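The termination logic of step 4 can be sketched as a plain loop. This is a minimal toy driver, not the PRD's API: the function name, signature, and return values are illustrative, and the part where agents actually speak and flip votes is omitted.

```python
def run_deliberation(votes, max_rounds=20, stability_threshold=3):
    """Toy version of the deliberation loop: `votes` maps juror_id ->
    "guilty"/"not_guilty". Ends when the jury is unanimous, the tally has
    been stable for `stability_threshold` rounds, or `max_rounds` is hit."""
    rounds_without_change = 0
    for round_number in range(1, max_rounds + 1):
        before = dict(votes)
        # ... 1-4 agents speak here and votes may flip (omitted) ...
        if len(set(votes.values())) == 1:
            return round_number, "unanimous"
        rounds_without_change = rounds_without_change + 1 if votes == before else 0
        if rounds_without_change >= stability_threshold:
            return round_number, "stable"
    return max_rounds, "max_rounds"
```

With no flips, a split jury ends by stability after three quiet rounds; a unanimous jury ends immediately.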
+
+ ---
+
+ ## Technical Architecture
+
+ ### System Overview
+
+ ```
+ ┌─────────────────────────────────────────────────────────────────────┐
+ │                          12 ANGRY AGENTS                            │
+ ├─────────────────────────────────────────────────────────────────────┤
+ │                                                                     │
+ │  ┌─────────────────────────────────────────────────────────────┐    │
+ │  │                     GRADIO UI LAYER                         │    │
+ │  │  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐         │    │
+ │  │  │  Jury Box    │ │  Chat View   │ │  Case File   │         │    │
+ │  │  │  (12 seats)  │ │  (dialogue)  │ │  (evidence)  │         │    │
+ │  │  └──────────────┘ └──────────────┘ └──────────────┘         │    │
+ │  └─────────────────────────────────────────────────────────────┘    │
+ │                              │                                      │
+ │                              ▼                                      │
+ │  ┌─────────────────────────────────────────────────────────────┐    │
+ │  │                   ORCHESTRATOR AGENT                        │    │
+ │  │  ┌──────────────────────────────────────────────────────┐   │    │
+ │  │  │ GameStateManager                                     │   │    │
+ │  │  │ - current_phase: presentation|deliberation|verdict   │   │    │
+ │  │  │ - round_number: int                                  │   │    │
+ │  │  │ - votes: Dict[agent_id, "guilty"|"not_guilty"]       │   │    │
+ │  │  │ - conviction_scores: Dict[agent_id, float]           │   │    │
+ │  │  │ - speaking_queue: List[agent_id]                     │   │    │
+ │  │  │ - deliberation_log: List[Turn]                       │   │    │
+ │  │  └──────────────────────────────────────────────────────┘   │    │
+ │  │                                                             │    │
+ │  │  ┌──────────────────────────────────────────────────────┐   │    │
+ │  │  │ TurnManager                                          │   │    │
+ │  │  │ - select_speakers(1-4 random)                        │   │    │
+ │  │  │ - check_vote_stability()                             │   │    │
+ │  │  │ - process_vote_changes()                             │   │    │
+ │  │  └──────────────────────────────────────────────────────┘   │    │
+ │  └─────────────────────────────────────────────────────────────┘    │
+ │                              │                                      │
+ │         ┌────────────────────┼────────────────────┐                 │
+ │         ▼                    ▼                    ▼                 │
+ │  ┌─────────────┐   ┌─────────────────┐   ┌─────────────┐            │
+ │  │   JUDGE     │   │  JUROR AGENTS   │   │   PLAYER    │            │
+ │  │   AGENT     │   │   (11 total)    │   │   AGENT     │            │
+ │  │             │   │                 │   │             │            │
+ │  │ ElevenLabs  │   │ ┌─────────────┐ │   │ Hybrid I/O  │            │
+ │  │ TTS Output  │   │ │ AgentConfig │ │   │ Strategy    │            │
+ │  │             │   │ │ - persona   │ │   │ Selection   │            │
+ │  │ Narration   │   │ │ - model     │ │   │             │            │
+ │  │ Verdicts    │   │ │ - tools[]   │ │   │ Argument    │            │
+ │  │ Summaries   │   │ │ - memory    │ │   │ Crafting    │            │
+ │  └─────────────┘   │ └─────────────┘ │   └─────────────┘            │
+ │                    │                 │                              │
+ │                    │ ┌─────────────┐ │                              │
+ │                    │ │ JurorMemory │ │                              │
+ │                    │ │ - case_view │ │                              │
+ │                    │ │ - arguments │ │                              │
+ │                    │ │ - reactions │ │                              │
+ │                    │ │ - conviction│ │                              │
+ │                    │ └─────────────┘ │                              │
+ │                    └─────────────────┘                              │
+ │                              │                                      │
+ │         ┌────────────────────┼────────────────────┐                 │
+ │         ▼                    ▼                    ▼                 │
+ │  ┌─────────────┐   ┌─────────────────┐   ┌─────────────┐            │
+ │  │ LLAMAINDEX  │   │    LITELLM      │   │   BLAXEL    │            │
+ │  │             │   │                 │   │             │            │
+ │  │ Case RAG    │   │ Model Router    │   │ Sandbox     │            │
+ │  │ Evidence    │   │ - Gemini        │   │ Execution   │            │
+ │  │ Precedents  │   │ - Claude        │   │             │            │
+ │  │             │   │ - GPT-4         │   │ Agent Tools │            │
+ │  └─────────────┘   │ - Local         │   │ (future)    │            │
+ │                    └─────────────────┘   └─────────────┘            │
+ │                                                                     │
+ │  ┌─────────────────────────────────────────────────────────────┐    │
+ │  │                     MCP SERVER LAYER                        │    │
+ │  │  Tools exposed for external AI agents to play as juror      │    │
+ │  │  - mcp_join_jury(case_id) -> seat_assignment                │    │
+ │  │  - mcp_view_evidence(case_id) -> evidence_list              │    │
+ │  │  - mcp_make_argument(argument_type, content) -> response    │    │
+ │  │  - mcp_cast_vote(vote) -> confirmation                      │    │
+ │  │  - mcp_view_deliberation() -> transcript                    │    │
+ │  └─────────────────────────────────────────────────────────────┘    │
+ └─────────────────────────────────────────────────────────────────────┘
+ ```
+
+ ---
+
+ ## Data Models
+
+ ### GameState
+
+ ```python
+ from __future__ import annotations
+
+ from dataclasses import dataclass, field
+ from datetime import datetime
+ from typing import Dict, List, Literal
+
+ @dataclass
+ class GameState:
+     """Central game state - managed by Orchestrator."""
+
+     # Session
+     session_id: str
+     case_id: str
+     phase: Literal["setup", "presentation", "side_selection",
+                    "initial_vote", "deliberation", "final_vote", "verdict"]
+
+     # Rounds
+     round_number: int = 0
+     max_rounds: int = 20  # Safety limit
+     stability_threshold: int = 3  # Rounds without vote change to end
+     rounds_without_change: int = 0
+
+     # Votes
+     votes: Dict[str, Literal["guilty", "not_guilty"]] = field(default_factory=dict)
+     vote_history: List[Dict[str, str]] = field(default_factory=list)
+
+     # Conviction scores (0.0 = certain not guilty, 1.0 = certain guilty)
+     conviction_scores: Dict[str, float] = field(default_factory=dict)
+
+     # Deliberation
+     speaking_queue: List[str] = field(default_factory=list)
+     deliberation_log: List[DeliberationTurn] = field(default_factory=list)
+
+     # Player
+     player_side: Literal["defend", "prosecute"] | None = None
+     player_seat: int = 7  # Which seat is the player
+
+
+ @dataclass
+ class DeliberationTurn:
+     """A single turn in deliberation."""
+
+     round_number: int
+     speaker_id: str
+     speaker_name: str
+     argument_type: str  # "evidence", "emotional", "logical", "question", etc.
+     content: str
+     target_id: str | None = None  # Who they're addressing
+     impact: Dict[str, float] = field(default_factory=dict)  # conviction changes
+     timestamp: datetime = field(default_factory=datetime.now)
+ ```
+
+ ### Agent Configuration
+
+ ```python
+ @dataclass
+ class JurorConfig:
+     """Configuration for a single juror agent."""
+
+     # Identity
+     juror_id: str
+     seat_number: int
+     name: str
+     emoji: str  # For display until sprites ready
+
+     # Personality (affects reasoning style)
+     archetype: str  # "rationalist", "empath", "cynic", etc.
+     personality_prompt: str  # Detailed persona prompt
+
+     # Behavior modifiers
+     stubbornness: float  # 0.0-1.0, how hard to convince
+     volatility: float  # 0.0-1.0, how much conviction swings
+     influence: float  # 0.0-1.0, how persuasive to others
+     verbosity: float  # 0.0-1.0, how long their arguments are
+
+     # Model configuration
+     model_provider: str  # "gemini", "openai", "anthropic", "local"
+     model_id: str  # Specific model ID
+     temperature: float = 0.7
+
+     # Tools (future expansion)
+     tools: List[str] = field(default_factory=list)  # ["web_search", "case_lookup"]
+
+     # Memory
+     memory_window: int = 10  # How many turns to remember in detail
+
+
+ @dataclass
+ class JurorMemory:
+     """Memory state for a single juror."""
+
+     juror_id: str
+
+     # Case understanding
+     case_summary: str
+     key_evidence: List[str]
+     evidence_interpretations: Dict[str, str]  # evidence_id -> interpretation
+
+     # Deliberation memory
+     arguments_heard: List[ArgumentMemory]
+     arguments_made: List[str]
+
+     # Relationships
+     opinions_of_others: Dict[str, float]  # juror_id -> trust/agreement (-1 to 1)
+
+     # Internal state
+     current_conviction: float  # 0.0-1.0
+     conviction_history: List[float]
+     reasoning_chain: List[str]  # Why they believe what they believe
+     doubts: List[str]  # Things that could change their mind
+
+
+ @dataclass
+ class ArgumentMemory:
+     """Memory of a single argument heard."""
+
+     speaker_id: str
+     content_summary: str
+     argument_type: str
+     persuasiveness: float  # How convincing it was to this juror
+     counter_points: List[str]  # Thoughts against it
+     round_heard: int
+ ```
+
+ ### Case Data Model
+
+ ```python
+ @dataclass
+ class CriminalCase:
+     """A criminal case for deliberation."""
+
+     case_id: str
+     title: str
+     summary: str  # 2-3 paragraph overview
+
+     # Charges
+     charges: List[str]
+
+     # Evidence
+     evidence: List[Evidence]
+
+     # Witnesses
+     witnesses: List[Witness]
+
+     # Arguments
+     prosecution_arguments: List[str]
+     defense_arguments: List[str]
+
+     # Defendant
+     defendant: Defendant
+
+     # Metadata
+     difficulty: Literal["clear_guilty", "clear_innocent", "ambiguous"]
+     themes: List[str]  # ["eyewitness", "circumstantial", "forensic", etc.]
+
+     # For display
+     year: int
+     jurisdiction: str
+
+
+ @dataclass
+ class Evidence:
+     """A piece of evidence."""
+
+     evidence_id: str
+     type: str  # "physical", "testimonial", "documentary", "forensic"
+     description: str
+     strength_prosecution: float  # 0.0-1.0
+     strength_defense: float  # 0.0-1.0
+     contestable: bool
+     contest_reason: str | None
+
+
+ @dataclass
+ class Witness:
+     """A witness in the case."""
+
+     witness_id: str
+     name: str
+     role: str  # "eyewitness", "expert", "character", etc.
+     testimony_summary: str
+     credibility_issues: List[str]
+     side: Literal["prosecution", "defense", "neutral"]
+ ```
+
+ ---
+
+ ## The 11 Juror Archetypes
+
+ ```yaml
+ jurors:
+   - id: "juror_1"
+     name: "Marcus Webb"
+     archetype: "rationalist"
+     emoji: "🧠"
+     personality: |
+       You are a retired engineer. You believe only in hard evidence and logical
+       deduction. Emotional appeals annoy you. You often say "Show me the data."
+       You change your mind only when presented with irrefutable logical arguments.
+     stubbornness: 0.8
+     volatility: 0.2
+     influence: 0.7
+     initial_lean: "neutral"
+
+   - id: "juror_2"
+     name: "Sarah Chen"
+     archetype: "empath"
+     emoji: "💗"
+     personality: |
+       You are a social worker. You always consider the human element - the
+       defendant's background, circumstances, potential for redemption. You're
+       easily moved by personal stories but skeptical of cold statistics.
+     stubbornness: 0.4
+     volatility: 0.7
+     influence: 0.5
+     initial_lean: "defense"
+
+   - id: "juror_3"
+     name: "Frank Russo"
+     archetype: "cynic"
+     emoji: "😤"
+     personality: |
+       You are a retired cop. You've "seen it all" and believe most defendants
+       are guilty. You're impatient with naive arguments. You trust law
+       enforcement evidence highly. Hard to convince toward not guilty.
+     stubbornness: 0.9
+     volatility: 0.1
+     influence: 0.6
+     initial_lean: "prosecution"
+
+   - id: "juror_4"
+     name: "Linda Park"
+     archetype: "conformist"
+     emoji: "😐"
+     personality: |
+       You are an accountant who avoids conflict. You tend to agree with whoever
+       spoke last or with the majority. You rarely initiate arguments but will
+       echo others. Easy to sway but also easy to sway back.
+     stubbornness: 0.2
+     volatility: 0.8
+     influence: 0.2
+     initial_lean: "majority"
+
+   - id: "juror_5"
+     name: "David Okonkwo"
+     archetype: "contrarian"
+     emoji: "🙄"
+     personality: |
+       You are a philosophy professor. You play devil's advocate constantly.
+       If everyone says guilty, you argue not guilty. You value intellectual
+       discourse over reaching conclusions. You ask probing questions.
+     stubbornness: 0.6
+     volatility: 0.5
+     influence: 0.8
+     initial_lean: "minority"
+
+   - id: "juror_6"
+     name: "Betty Morrison"
+     archetype: "impatient"
+     emoji: "⏰"
+     personality: |
+       You are a busy restaurant owner. You want this over quickly. You make
+       snap judgments and get frustrated with long debates. You often say
+       "Can we just vote already?" You're persuaded by confident, brief arguments.
+     stubbornness: 0.5
+     volatility: 0.6
+     influence: 0.3
+     initial_lean: "first_impression"
+
+   - id: "juror_7"
+     name: "[PLAYER]"
+     archetype: "player"
+     emoji: "👤"
+     personality: "Human player"
+     stubbornness: null
+     volatility: null
+     influence: 0.6
+     initial_lean: "player_choice"
+
+   - id: "juror_8"
+     name: "Dr. James Wright"
+     archetype: "detail_obsessed"
+     emoji: "🔍"
+     personality: |
+       You are a forensic accountant. You focus on tiny inconsistencies in
+       testimony and evidence. You often derail discussions with minutiae.
+       A single contradiction can completely change your view.
+     stubbornness: 0.7
+     volatility: 0.4
+     influence: 0.5
+     initial_lean: "neutral"
+
+   - id: "juror_9"
+     name: "Pastor Williams"
+     archetype: "moralist"
+     emoji: "⚖️"
+     personality: |
+       You are a church leader. You see things in black and white - right and
+       wrong. You believe in justice but also redemption. Moral arguments
+       resonate with you more than technical ones.
+     stubbornness: 0.7
+     volatility: 0.3
+     influence: 0.6
+     initial_lean: "gut_feeling"
+
+   - id: "juror_10"
+     name: "Nancy Cooper"
+     archetype: "pragmatist"
+     emoji: "💼"
+     personality: |
+       You are a business consultant. You think about consequences - what
+       happens if we convict an innocent person? What if we free a guilty one?
+       You weigh costs and benefits. You're persuaded by outcome-focused arguments.
+     stubbornness: 0.5
+     volatility: 0.5
+     influence: 0.6
+     initial_lean: "calculated"
+
+   - id: "juror_11"
+     name: "Miguel Santos"
+     archetype: "storyteller"
+     emoji: "📖"
+     personality: |
+       You are a novelist. You think in narratives - does the prosecution's
+       story make sense? Does the defense's? You're swayed by coherent
+       narratives and suspicious of stories with plot holes.
+     stubbornness: 0.4
+     volatility: 0.6
+     influence: 0.7
+     initial_lean: "best_story"
+
+   - id: "juror_12"
+     name: "Robert Kim"
+     archetype: "wildcard"
+     emoji: "🎲"
+     personality: |
+       You are a retired jazz musician. Your logic is unpredictable - you
+       might fixate on something no one else noticed, or suddenly change
+       your mind for unclear reasons. You're creative but inconsistent.
+     stubbornness: 0.3
+     volatility: 0.9
+     influence: 0.4
+     initial_lean: "random"
+ ```
+
+ ---
+
+ ## Conviction Score Mechanics
+
+ ### How Conviction Changes
+
+ ```python
+ import random
+
+ def calculate_conviction_change(
+     juror: JurorConfig,
+     juror_memory: JurorMemory,
+     argument: DeliberationTurn,
+     game_state: GameState
+ ) -> float:
+     """
+     Calculate how much an argument shifts a juror's conviction.
+
+     Returns: delta to add to conviction score (-0.3 to +0.3 typically)
+     """
+
+     # Base impact from argument strength (determined by LLM)
+     base_impact = evaluate_argument_strength(argument)  # -1.0 to 1.0
+
+     # Personality modifiers
+     archetype_modifier = get_archetype_modifier(
+         juror.archetype,
+         argument.argument_type
+     )
+     # e.g., "rationalist" gets 1.5x from "logical" arguments, 0.5x from "emotional"
+
+     # Stubbornness reduces all changes
+     stubbornness_modifier = 1.0 - (juror.stubbornness * 0.7)
+
+     # Volatility adds randomness
+     volatility_noise = random.gauss(0, juror.volatility * 0.1)
+
+     # Relationship modifier - trust the speaker?
+     trust = juror_memory.opinions_of_others.get(argument.speaker_id, 0.0)
+     trust_modifier = 1.0 + (trust * 0.3)  # -30% to +30%
+
+     # Conviction resistance - harder to move extremes
+     current = juror_memory.current_conviction
+     extreme_resistance = 1.0 - (abs(current - 0.5) * 0.5)
+
+     # Calculate final delta
+     delta = (
+         base_impact
+         * archetype_modifier
+         * stubbornness_modifier
+         * trust_modifier
+         * extreme_resistance
+         + volatility_noise
+     )
+
+     # Clamp to reasonable range
+     return max(-0.3, min(0.3, delta))
+
+
+ def check_vote_flip(juror_memory: JurorMemory) -> bool:
+     """Check if conviction score warrants a vote change."""
+
+     current_vote_is_guilty = juror_memory.conviction_history[-1] > 0.5
+     new_conviction = juror_memory.current_conviction
+
+     # Hysteresis - need to cross threshold by margin to flip
+     if current_vote_is_guilty and new_conviction < 0.4:
+         return True  # Flip to not guilty
+     elif not current_vote_is_guilty and new_conviction > 0.6:
+         return True  # Flip to guilty
+
+     return False
+ ```
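Restated standalone, the hysteresis rule keeps a juror whose score drifts inside the 0.4-0.6 band from flip-flopping. `should_flip` below is an illustrative rename that takes the recorded vote explicitly rather than inferring it from conviction history:

```python
def should_flip(current_vote, conviction):
    """Hysteresis band: a juror voting guilty only flips below 0.4, one
    voting not guilty only flips above 0.6, so scores wandering inside
    0.4-0.6 never trigger a vote change."""
    if current_vote == "guilty" and conviction < 0.4:
        return True
    if current_vote == "not_guilty" and conviction > 0.6:
        return True
    return False
```

For example, a guilty voter drifting to 0.45 stays put, but at 0.35 they flip.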
+
+ ### Archetype Argument Modifiers
+
+ ```python
+ ARCHETYPE_MODIFIERS = {
+     "rationalist": {
+         "logical": 1.5,
+         "evidence": 1.3,
+         "emotional": 0.4,
+         "moral": 0.6,
+         "narrative": 0.7,
+         "question": 1.2,
+     },
+     "empath": {
+         "logical": 0.6,
+         "evidence": 0.8,
+         "emotional": 1.5,
+         "moral": 1.3,
+         "narrative": 1.2,
+         "question": 0.9,
+     },
+     "cynic": {
+         "logical": 0.8,
+         "evidence": 1.4,  # Trusts evidence
+         "emotional": 0.3,
+         "moral": 0.5,
+         "narrative": 0.6,
+         "question": 0.7,
+     },
+     # ... etc for all archetypes
+ }
+ ```
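`get_archetype_modifier`, used in `calculate_conviction_change`, is never defined in this document; one plausible sketch (an assumption, not the PRD's implementation) is a nested lookup that falls back to a neutral 1.0 multiplier for unknown archetypes or argument types:

```python
# Trimmed copy of the table above, for illustration only.
ARCHETYPE_MODIFIERS = {
    "rationalist": {"logical": 1.5, "emotional": 0.4},
    "empath": {"logical": 0.6, "emotional": 1.5},
}

def get_archetype_modifier(archetype, argument_type, default=1.0):
    """Neutral 1.0 fallback means an unlisted pairing neither amplifies
    nor dampens the argument's base impact."""
    return ARCHETYPE_MODIFIERS.get(archetype, {}).get(argument_type, default)
```

The fallback matters for archetypes like "wildcard" whose rows are elided in the table.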
+
+ ---
+
+ ## Agent Memory Architecture
+
+ ### Memory Layers
+
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │                    JUROR MEMORY SYSTEM                      │
+ ├─────────────────────────────────────────────────────────────┤
+ │                                                             │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 1: CASE KNOWLEDGE (LlamaIndex)                │    │
+ │  │ - Full case file indexed                            │    │
+ │  │ - Evidence details retrievable                      │    │
+ │  │ - Witness statements searchable                     │    │
+ │  │ - Persistent across session                         │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                           │                                 │
+ │                           ▼                                 │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 2: DELIBERATION MEMORY (Sliding Window)       │    │
+ │  │ - Last N turns in full detail                       │    │
+ │  │ - Summarized history beyond window                  │    │
+ │  │ - Key moments flagged for long-term                 │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                           │                                 │
+ │                           ▼                                 │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 3: REASONING STATE (Agent Internal)           │    │
+ │  │ - Current conviction + reasoning chain              │    │
+ │  │ - Key doubts and certainties                        │    │
+ │  │ - Opinions of other jurors                          │    │
+ │  │ - Arguments to make / avoid                         │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                           │                                 │
+ │                           ▼                                 │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 4: PERSONA (Static)                           │    │
+ │  │ - Archetype definition                              │    │
+ │  │ - Personality prompt                                │    │
+ │  │ - Behavior modifiers                                │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                                                             │
+ └─────────────────────────────────────────────────────────────┘
+ ```
+
+ ### Memory Injection into Agent Prompt
+
+ ```python
+ def build_juror_prompt(
+     juror: JurorConfig,
+     memory: JurorMemory,
+     game_state: GameState,
+     case: CriminalCase,
+     task: str  # "speak" | "react" | "vote"
+ ) -> str:
+     """Build the full prompt for a juror agent."""
+
+     prompt = f"""
+ # JUROR IDENTITY
+ You are {juror.name}, Juror #{juror.seat_number}.
+ {juror.personality_prompt}
+
+ # THE CASE: {case.title}
+ {case.summary}
+
+ # KEY EVIDENCE YOU REMEMBER
+ {format_evidence_memory(memory.key_evidence, memory.evidence_interpretations)}
+
+ # YOUR CURRENT POSITION
+ - Conviction: {conviction_to_text(memory.current_conviction)}
+ - Your reasoning: {' '.join(memory.reasoning_chain[-3:])}
+ - Your doubts: {', '.join(memory.doubts[:3]) if memory.doubts else 'None currently'}
+
+ # RECENT DELIBERATION (Last {len(memory.arguments_heard[-juror.memory_window:])} turns)
+ {format_recent_turns(memory.arguments_heard[-juror.memory_window:])}
+
+ # YOUR OPINIONS OF OTHER JURORS
+ {format_juror_opinions(memory.opinions_of_others)}
+
+ # CURRENT VOTE TALLY
+ Guilty: {list(game_state.votes.values()).count('guilty')}
+ Not Guilty: {list(game_state.votes.values()).count('not_guilty')}
+
+ # YOUR TASK
+ {get_task_prompt(task, juror.archetype)}
+ """
+     return prompt
+ ```
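`conviction_to_text` is used in the prompt above but not specified anywhere in this document. A possible mapping of the 0.0-1.0 score into prompt-friendly phrasing (the band boundaries and wording are assumptions, not from this PRD):

```python
def conviction_to_text(score):
    """Hypothetical banding of the conviction score; 0.4-0.6 matches the
    'undecided' hysteresis band used in check_vote_flip."""
    if score >= 0.8:
        return "firmly convinced of guilt"
    if score >= 0.6:
        return "leaning guilty"
    if score > 0.4:
        return "genuinely undecided"
    if score > 0.2:
        return "leaning not guilty"
    return "firmly convinced of innocence"
```

Injecting a verbal band rather than the raw float keeps the agent from anchoring on spurious precision.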
+
+ ---
+
+ ## Orchestration Flow
+
+ ### Smolagents Integration
+
+ ```python
+ import random
+ from typing import List
+ from uuid import uuid4
+
+ from smolagents import CodeAgent, Tool, LiteLLMModel
+
+ class JurorAgent:
+     """Wrapper around smolagents CodeAgent for a juror."""
+
+     def __init__(self, config: JurorConfig, tools: List[Tool] | None = None):
+         self.config = config
+         self.memory = JurorMemory(juror_id=config.juror_id)
+
+         # Model via LiteLLM for flexibility
+         self.model = LiteLLMModel(
+             model_id=f"{config.model_provider}/{config.model_id}",
+             temperature=config.temperature
+         )
+
+         # Default tools (expandable)
+         default_tools = [
+             self.create_evidence_lookup_tool(),
+             self.create_case_query_tool(),
+         ]
+
+         self.agent = CodeAgent(
+             tools=default_tools + (tools or []),
+             model=self.model,
+             max_steps=3,  # Limit reasoning steps
+         )
+
+     def create_evidence_lookup_tool(self) -> Tool:
+         """Tool to look up specific evidence."""
+         # LlamaIndex query under the hood
+         pass
+
+     def create_case_query_tool(self) -> Tool:
+         """Tool to query case details."""
+         # LlamaIndex query under the hood
+         pass
+
+     async def generate_argument(
+         self,
+         game_state: GameState,
+         case: CriminalCase
+     ) -> DeliberationTurn:
+         """Generate this juror's argument for their turn."""
+
+         prompt = build_juror_prompt(
+             self.config,
+             self.memory,
+             game_state,
+             case,
+             task="speak"
+         )
+
+         response = await self.agent.run(prompt)
+
+         return parse_argument_response(response, self.config, game_state)
+
+     async def react_to_argument(
+         self,
+         argument: DeliberationTurn,
+         game_state: GameState,
+         case: CriminalCase
+     ) -> float:
+         """React to another juror's argument, update conviction."""
+
+         # Update memory with new argument
+         self.memory.arguments_heard.append(
+             ArgumentMemory(
+                 speaker_id=argument.speaker_id,
+                 content_summary=summarize_argument(argument.content),
+                 argument_type=argument.argument_type,
+                 persuasiveness=0.0,  # Will be calculated
+                 counter_points=[],
+                 round_heard=game_state.round_number
+             )
+         )
+
+         # Calculate conviction change
+         delta = calculate_conviction_change(
+             self.config,
+             self.memory,
+             argument,
+             game_state
+         )
+
+         self.memory.current_conviction += delta
+         self.memory.current_conviction = max(0.0, min(1.0, self.memory.current_conviction))
+         self.memory.conviction_history.append(self.memory.current_conviction)
+
+         return delta
+
+
+ class OrchestratorAgent:
+     """Master agent that coordinates the deliberation."""
+
+     def __init__(
+         self,
+         jurors: List[JurorAgent],
+         judge: JudgeAgent,
+         case: CriminalCase
+     ):
+         self.jurors = {j.config.juror_id: j for j in jurors}
+         self.judge = judge
+         self.case = case
+         self.state = GameState(
+             session_id=str(uuid4()),
+             case_id=case.case_id
+         )
+
+     async def run_deliberation_round(self) -> List[DeliberationTurn]:
+         """Run a single round of deliberation."""
+
+         self.state.round_number += 1
+         turns = []
+
+         # Select 1-4 random speakers (not player unless it's their turn)
+         num_speakers = random.randint(1, 4)
+         available = [j for j in self.jurors.keys() if j != "juror_7"]  # Exclude player
+         speakers = random.sample(available, min(num_speakers, len(available)))
+
+         # Each speaker makes argument
+         for speaker_id in speakers:
+             juror = self.jurors[speaker_id]
+             turn = await juror.generate_argument(self.state, self.case)
+             turns.append(turn)
+
+             # All other jurors react
+             for other_id, other_juror in self.jurors.items():
+                 if other_id != speaker_id and other_id != "juror_7":
+                     delta = await other_juror.react_to_argument(
+                         turn, self.state, self.case
+                     )
+                     turn.impact[other_id] = delta
+
+             # Log turn
+             self.state.deliberation_log.append(turn)
+
+         # Check for vote changes
+         self._process_vote_changes()
+
+         # Check stability
+         if self._votes_changed_this_round(turns):
+             self.state.rounds_without_change = 0
+         else:
+             self.state.rounds_without_change += 1
+
+         return turns
+
+     def _process_vote_changes(self):
+         """Check all jurors for vote flips."""
+         for juror_id, juror in self.jurors.items():
+             if juror_id == "juror_7":  # Player votes manually
+                 continue
+
+             if check_vote_flip(juror.memory):
+                 old_vote = self.state.votes[juror_id]
+                 new_vote = "guilty" if juror.memory.current_conviction > 0.5 else "not_guilty"
+                 self.state.votes[juror_id] = new_vote
+                 # Could trigger announcement
+
+     def check_should_end(self) -> bool:
+         """Check if deliberation should end."""
+
+         # Unanimous verdict
+         votes = list(self.state.votes.values())
+         if len(set(votes)) == 1:
+             return True
+
+         # Votes stabilized
+         if self.state.rounds_without_change >= self.state.stability_threshold:
+             return True
+
+         # Max rounds reached
+         if self.state.round_number >= self.state.max_rounds:
+             return True
+
+         return False
+ ```

---

## ElevenLabs Integration

### Judge Narrator

```python
# NOTE: uses the legacy module-level ElevenLabs SDK functions
from elevenlabs import Voice, generate, stream

class JudgeAgent:
    """The judge/narrator - uses ElevenLabs for voice."""

    def __init__(self, voice_id: str = None):
        self.voice_id = voice_id or "judge_voice_id"  # Configure
        self.voice_settings = {
            "stability": 0.7,
            "similarity_boost": 0.8,
            "style": 0.5,  # Authoritative
        }

    async def narrate(self, text: str, stream_output: bool = True) -> bytes:
        """Generate narration audio."""

        audio = generate(
            text=text,
            voice=Voice(voice_id=self.voice_id),
            model="eleven_multilingual_v2",
            stream=stream_output
        )

        if stream_output:
            return stream(audio)
        return audio

    def get_case_presentation(self, case: CriminalCase) -> str:
        """Script for presenting the case."""
        return f"""
        Members of the jury. You are here today to determine the fate of
        {case.defendant.name}, who stands accused of {', '.join(case.charges)}.

        {case.summary}

        You will hear the evidence. You will deliberate. And you will reach
        a verdict. The burden of proof lies with the prosecution, who must
        prove guilt beyond a reasonable doubt.

        Let us begin.
        """

    def get_vote_announcement(self, votes: Dict[str, str]) -> str:
        """Script for announcing the vote."""
        guilty = sum(1 for v in votes.values() if v == "guilty")
        not_guilty = 12 - guilty

        return f"""
        The current vote stands at {guilty} for guilty,
        {not_guilty} for not guilty.

        {"A unanimous verdict has been reached." if guilty in [0, 12] else "The jury remains divided."}
        """
```
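The tally arithmetic behind the announcement is worth seeing run in isolation — a toy example with 12 votes, assuming the fixed jury size of 12 from the PRD:

```python
# 12 jurors; seats 1-7 vote guilty in this toy example
votes = {f"juror_{i}": ("guilty" if i <= 7 else "not_guilty") for i in range(1, 13)}

guilty = sum(1 for v in votes.values() if v == "guilty")
not_guilty = 12 - guilty
unanimous = guilty in (0, 12)

print(guilty, not_guilty, unanimous)  # 7 5 False
```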

---

## UI Components

### Kinetic Text Animation

```javascript
// For animated text display (like After Effects kinetic typography)
// Will sync with ElevenLabs audio or simulate typing

class KineticText {
    constructor(container, options = {}) {
        this.container = container;
        this.speed = options.speed || 50;       // ms per character
        this.variance = options.variance || 20; // randomness
    }

    async display(text, audioUrl = null) {
        // If audio provided, sync with it
        if (audioUrl) {
            return this.displayWithAudio(text, audioUrl);
        }

        // Otherwise, simulate speaking
        return this.displaySimulated(text);
    }

    // displayWithAudio(text, audioUrl) would reveal characters in step with
    // the audio's playback position; sketch omitted here.

    async displaySimulated(text) {
        this.container.innerHTML = '';

        for (let i = 0; i < text.length; i++) {
            const char = text[i];
            const span = document.createElement('span');
            span.textContent = char;
            span.style.opacity = '0';
            span.style.animation = 'fadeInChar 0.1s forwards';
            this.container.appendChild(span);

            // Variable delay for natural feel
            const delay = this.speed + (Math.random() - 0.5) * this.variance;
            await this.sleep(delay);
        }
    }

    sleep(ms) {
        return new Promise(resolve => setTimeout(resolve, ms));
    }
}
```

### Gradio UI Structure

```python
import gradio as gr

def create_ui():
    with gr.Blocks(css=CUSTOM_CSS, theme=gr.themes.Base()) as demo:

        # State
        game_state = gr.State(None)

        # Header
        gr.HTML("<h1>12 ANGRY AGENTS</h1>")

        with gr.Row():
            # Left: Jury Box
            with gr.Column(scale=1):
                gr.Markdown("### The Jury")
                jury_box = gr.HTML(render_jury_box)  # 12 seats with emojis/votes
                vote_tally = gr.HTML()  # "7-5 GUILTY"

            # Center: Deliberation
            with gr.Column(scale=2):
                gr.Markdown("### Deliberation Room")
                deliberation_chat = gr.Chatbot(
                    label="Deliberation",
                    height=400,
                    show_label=False
                )

                # Player input
                with gr.Row():
                    strategy_select = gr.Dropdown(
                        choices=[
                            "Challenge Evidence",
                            "Question Witness Credibility",
                            "Appeal to Reasonable Doubt",
                            "Present Alternative Theory",
                            "Address Specific Juror",
                            "Call for Vote"
                        ],
                        label="Your Strategy"
                    )
                    speak_btn = gr.Button("Speak", variant="primary")

                with gr.Row():
                    pass_btn = gr.Button("Pass Turn")
                    call_vote_btn = gr.Button("Call Final Vote")

            # Right: Case File
            with gr.Column(scale=1):
                gr.Markdown("### Case File")
                case_summary = gr.Markdown()

                with gr.Accordion("Evidence", open=False):
                    evidence_list = gr.HTML()

                with gr.Accordion("Witnesses", open=False):
                    witness_list = gr.HTML()

        # Audio player for Judge
        audio_output = gr.Audio(label="Judge", autoplay=True, visible=False)

    return demo


# MCP Server enabled
demo = create_ui()
demo.launch(mcp_server=True)
```

---

## LlamaIndex Case Database

### Index Structure

```python
import random

from llama_index.core import VectorStoreIndex, Document
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.vector_stores import MetadataFilters, ExactMatchFilter

class CaseDatabase:
    """LlamaIndex-powered case database."""

    def __init__(self, cases_dir: str):
        self.cases = self._load_cases(cases_dir)
        self.index = self._build_index()

    def _build_index(self) -> VectorStoreIndex:
        """Build a searchable index of all cases."""

        documents = []
        for case in self.cases:
            # Index case summary
            documents.append(Document(
                text=case.summary,
                metadata={"case_id": case.case_id, "type": "summary"}
            ))

            # Index each piece of evidence
            for evidence in case.evidence:
                documents.append(Document(
                    text=f"{evidence.type}: {evidence.description}",
                    metadata={
                        "case_id": case.case_id,
                        "type": "evidence",
                        "evidence_id": evidence.evidence_id
                    }
                ))

            # Index witness testimonies
            for witness in case.witnesses:
                documents.append(Document(
                    text=f"{witness.name} ({witness.role}): {witness.testimony_summary}",
                    metadata={
                        "case_id": case.case_id,
                        "type": "witness",
                        "witness_id": witness.witness_id
                    }
                ))

        parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
        nodes = parser.get_nodes_from_documents(documents)

        return VectorStoreIndex(nodes)

    def query_evidence(self, case_id: str, query: str) -> List[NodeWithScore]:
        """Query evidence for a specific case."""

        query_engine = self.index.as_query_engine(
            filters=MetadataFilters(
                filters=[ExactMatchFilter(key="case_id", value=case_id)]
            )
        )
        response = query_engine.query(query)
        return response.source_nodes

    def get_random_case(self, difficulty: str = None) -> CriminalCase:
        """Get a random case, optionally filtered by difficulty."""

        if difficulty:
            filtered = [c for c in self.cases if c.difficulty == difficulty]
            return random.choice(filtered)
        return random.choice(self.cases)
```

---

## Real Case Data Sources

### Primary: Old Bailey Online (Historical)

**Dataset**: 197,745 criminal trials from London's Central Criminal Court (1674-1913)

**Access**:
- Full XML download: https://orda.shef.ac.uk/articles/dataset/Old_Bailey_Online_XML_Data/4775434
- API: https://www.oldbaileyonline.org/static/API.jsp
- 2,163 trial XML files + 475 Ordinary's Accounts

**Data Fields**:
- Trial ID, date, defendant name/gender
- Offence category: theft, kill, deception, violent theft, sexual, etc.
- Verdict, punishment
- Full trial transcript text

**Why This Works**:
- Historical cases avoid sensitivity around modern defendants
- Rich narrative transcripts perfect for agent reasoning
- 18th-century language adds unique flavor
- Verdicts are known (ground truth for comparison)

**Integration Example**:

```python
import xml.etree.ElementTree as ET

def load_old_bailey_case(xml_path: str) -> CriminalCase:
    """Parse Old Bailey XML into the CriminalCase model."""
    tree = ET.parse(xml_path)
    root = tree.getroot()

    return CriminalCase(
        case_id=root.find(".//trialAccount").get("id"),
        title=f"The Crown v. {root.find('.//persName').text}",
        summary=extract_trial_text(root),
        charges=[root.find(".//offence").get("category")],
        evidence=extract_evidence_from_transcript(root),
        difficulty=infer_difficulty_from_verdict(root),
        year=int(root.find(".//date").get("year")),
        jurisdiction="London, England"
    )
```
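The `ElementTree` lookups above can be checked against a toy fragment. The structure below is a deliberately simplified assumption — real Old Bailey files nest `trialAccount` deeper inside session documents, which is why the loader uses `.//` descendant searches:

```python
import xml.etree.ElementTree as ET

# Toy Old Bailey-style fragment (real files are much richer)
xml_text = """
<trialAccount id="t17850112-1">
  <persName>John Doe</persName>
  <offence category="theft"/>
  <date year="1785"/>
</trialAccount>
"""

root = ET.fromstring(xml_text)
case_id = root.get("id")
defendant = root.find(".//persName").text
charge = root.find(".//offence").get("category")
year = int(root.find(".//date").get("year"))

print(case_id, defendant, charge, year)  # t17850112-1 John Doe theft 1785
```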

### Secondary: National Registry of Exonerations (Modern)

**Dataset**: All U.S. exonerations since 1989 (3,000+ cases)

**Access**: https://www.law.umich.edu/special/exoneration/Pages/about.aspx

**Data Fields**:
- Crime type, state, year of conviction/exoneration
- Contributing factors (eyewitness misID, false confession, etc.)
- DNA involvement, sentence served

**Why This Works**:
- Dramatic "wrongful conviction" cases
- Clear evidence of reasonable doubt
- Tests agents' ability to weigh conflicting evidence types

### Fallback: Curated YAML Cases

For demo stability, include 3-5 handcrafted cases in `cases/predefined/`:
- `case_001_robbery.yaml` - Clear guilty (baseline test)
- `case_002_murder.yaml` - Ambiguous (compelling demo)
- `case_003_exoneration.yaml` - DNA reversal scenario

This ensures the demo works even if external data sources are unavailable.

---

## File Structure

```
12_angry_agents/
├── app.py                    # Gradio entry point
├── PRD.md                    # This document
├── requirements.txt
├── .env.example

├── core/
│   ├── __init__.py
│   ├── game_state.py         # GameState, DeliberationTurn models
│   ├── orchestrator.py       # OrchestratorAgent
│   ├── conviction.py         # Conviction score mechanics
│   └── turn_manager.py       # Turn selection, stability check

├── agents/
│   ├── __init__.py
│   ├── base_juror.py         # JurorAgent base class
│   ├── judge.py              # JudgeAgent (ElevenLabs)
│   ├── player.py             # PlayerAgent (human interface)
│   └── configs/
│       └── jurors.yaml       # 11 juror configurations

├── case_db/
│   ├── __init__.py
│   ├── database.py           # CaseDatabase (LlamaIndex)
│   ├── models.py             # CriminalCase, Evidence, Witness
│   └── cases/
│       ├── case_001.yaml
│       ├── case_002.yaml
│       └── ...

├── memory/
│   ├── __init__.py
│   ├── juror_memory.py       # JurorMemory management
│   └── summarizer.py         # Memory compression

├── ui/
│   ├── __init__.py
│   ├── components.py         # Gradio components
│   ├── jury_box.py           # Jury box renderer
│   ├── chat.py               # Deliberation chat
│   └── static/
│       ├── styles.css
│       └── kinetic.js        # Text animations

├── mcp/
│   ├── __init__.py
│   └── tools.py              # MCP tool definitions

└── tests/
    ├── test_conviction.py
    ├── test_orchestrator.py
    └── test_memory.py
```

---

## Development Phases

### Phase 1: Foundation (4-6 hours)
- [ ] Project setup, dependencies
- [ ] Data models (GameState, Case, Juror)
- [ ] Basic Gradio UI skeleton
- [ ] Single juror agent working

### Phase 2: Multi-Agent (4-6 hours)
- [ ] All 11 juror configs
- [ ] Orchestrator with turn management
- [ ] Conviction score system
- [ ] Memory system (basic)

### Phase 3: Integration (3-4 hours)
- [ ] LlamaIndex case database
- [ ] ElevenLabs judge narration
- [ ] Player interaction flow
- [ ] Vote tracking and stability

### Phase 4: Polish (2-3 hours)
- [ ] UI animations (kinetic text)
- [ ] Jury box visualization
- [ ] MCP server tools
- [ ] Demo video recording
1279
+
1280
+ ---
1281
+
1282
+ ## Success Metrics
1283
+
1284
+ 1. **11 agents deliberating autonomously** - TRUE agent behavior
1285
+ 2. **Judge narrating with ElevenLabs** - Audio wow factor
1286
+ 3. **Conviction scores shifting** - Visible persuasion
1287
+ 4. **Player can influence outcome** - Agency
1288
+ 5. **MCP tools functional** - External AI can play
1289
+ 6. **Runs without crashes** - Stability
1290
+
1291
+ ---
1292
+
1293
+ ---
1294
+
1295
+ ## CRITICAL: Performance Optimizations

### The Latency Trap - SOLVED

**Problem**: If 1 speaker speaks and 11 agents react individually = 12 LLM calls per turn = SLOW

**Solution**: Batch Jury State Update

```python
class JuryStateManager:
    """
    Single LLM call to update ALL silent jurors' conviction scores.
    Replaces 11 individual react_to_argument() calls.
    """

    async def batch_update_convictions(
        self,
        argument: DeliberationTurn,
        silent_jurors: List[JurorConfig],
        juror_memories: Dict[str, JurorMemory],
        game_state: GameState
    ) -> Dict[str, ConvictionUpdate]:
        """ONE LLM call updates all 11 jurors' reactions."""

        prompt = f"""
        You are simulating how 11 different jurors would react to this argument.

        ARGUMENT BY {argument.speaker_name}:
        "{argument.content}"

        For each juror below, determine:
        1. conviction_delta: float (-0.3 to +0.3) - how much their guilt conviction changes
        2. reaction: str - brief internal thought (10 words max)
        3. persuaded: bool - did this significantly move them?

        JURORS:
        {self._format_juror_profiles_compact(silent_jurors, juror_memories)}

        Respond in JSON:
        {{
            "juror_1": {{"delta": 0.1, "reaction": "Good point about the timeline", "persuaded": false}},
            "juror_2": {{"delta": -0.2, "reaction": "Too emotional, but touching", "persuaded": true}},
            ...
        }}
        """

        response = await self.model.generate(prompt)
        return parse_batch_response(response)
```
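`parse_batch_response` is referenced above but not defined in the PRD; a minimal sketch of what it could look like, assuming the model returns the JSON shape from the prompt, with deltas clamped to the requested [-0.3, +0.3] range:

```python
import json

def parse_batch_response(raw: str) -> dict:
    """Parse the per-juror JSON reaction map, clamping deltas to the prompted range."""
    data = json.loads(raw)
    parsed = {}
    for juror_id, r in data.items():
        parsed[juror_id] = {
            "delta": max(-0.3, min(0.3, float(r["delta"]))),
            "reaction": str(r.get("reaction", "")),
            "persuaded": bool(r.get("persuaded", False)),
        }
    return parsed

raw = '{"juror_1": {"delta": 0.5, "reaction": "ok", "persuaded": true}}'
print(parse_batch_response(raw)["juror_1"]["delta"])  # 0.3 (clamped)
```

Clamping matters because a single LLM call covering 11 jurors will occasionally overshoot the requested range, and an out-of-range delta would distort the conviction scores downstream.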

**Result**: 1 speaker + 1 batch reaction = **2 LLM calls per turn** (not 12)

### Active vs Passive Jurors

```python
# Each turn, only 2-3 jurors are "active listeners" (full memory update)
# Others get simplified heuristic updates

def select_active_listeners(
    game_state: GameState,
    juror_memories: Dict[str, JurorMemory],
    recently_flipped: List[str],
    num: int = 3
) -> List[str]:
    """Select jurors who will fully process this turn."""

    # Prioritize: jurors on the fence, jurors addressed directly, random
    candidates = []

    # On the fence (conviction 0.35-0.65)
    for jid, memory in juror_memories.items():
        if 0.35 < memory.current_conviction < 0.65:
            candidates.append((jid, 2))  # Priority 2

    # Recently changed vote
    for jid in recently_flipped:
        candidates.append((jid, 3))  # Priority 3

    # Everyone else at base priority
    for jid in juror_memories:
        candidates.append((jid, 1))

    # Weight and select
    return weighted_sample(candidates, num)
```
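`weighted_sample` is also left undefined above. One possible sketch, assuming duplicate entries for a juror should stack (a fence-sitter who also just flipped gets the sum of both priorities), sampling without replacement:

```python
import random

def weighted_sample(candidates, num):
    """candidates: list of (juror_id, priority); duplicate ids sum their priority."""
    weights = {}
    for jid, priority in candidates:
        weights[jid] = weights.get(jid, 0) + priority

    chosen = []
    while weights and len(chosen) < num:
        ids = list(weights)
        pick = random.choices(ids, weights=[weights[j] for j in ids], k=1)[0]
        chosen.append(pick)
        del weights[pick]  # sample without replacement
    return chosen

random.seed(0)
# "a" appears twice (priorities 1 + 2 = 3); only two unique ids for num=2
print(sorted(weighted_sample([("a", 1), ("b", 3), ("a", 2)], 2)))  # ['a', 'b']
```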

### Context Window Bloat - SOLVED

**Problem**: `deliberation_log` grows unbounded

**Solution**: Aggressive Rolling Summarization

```python
class MemorySummarizer:
    """Compresses old deliberation history."""

    SUMMARY_INTERVAL = 5  # Summarize every 5 rounds
    KEEP_RECENT = 3       # Keep last 3 turns in full detail

    async def maybe_summarize(self, memory: JurorMemory, round_num: int):
        """Compress old turns if needed."""

        if round_num % self.SUMMARY_INTERVAL != 0:
            return

        # Split: recent (keep full) vs old (summarize)
        old_turns = memory.arguments_heard[:-self.KEEP_RECENT]
        recent_turns = memory.arguments_heard[-self.KEEP_RECENT:]

        if not old_turns:
            return

        # Summarize old turns into compact form
        summary = await self._compress_turns(old_turns)

        # Replace old turns with summary object
        memory.deliberation_summary = summary
        memory.arguments_heard = recent_turns

    async def _compress_turns(self, turns: List[ArgumentMemory]) -> str:
        """LLM call to compress multiple turns into a summary."""

        prompt = f"""
        Summarize these {len(turns)} deliberation turns into 3-5 bullet points.
        Focus on: key arguments made, who was persuasive, major position shifts.

        TURNS:
        {self._format_turns(turns)}

        Respond with bullet points only.
        """
        return await self.model.generate(prompt)


# Memory structure with summary
@dataclass
class JurorMemory:
    # ... existing fields ...

    # Compressed history (replaces old arguments_heard entries)
    deliberation_summary: str = ""  # "• Juror 3 argued about timeline..."

    # Only recent turns in full detail
    arguments_heard: List[ArgumentMemory] = field(default_factory=list)  # Max ~10 entries
```
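The split in `maybe_summarize` relies on Python's negative slicing; a quick standalone check of both the normal case and the edge case the `if not old_turns` guard handles:

```python
KEEP_RECENT = 3
turns = ["t1", "t2", "t3", "t4", "t5"]

old, recent = turns[:-KEEP_RECENT], turns[-KEEP_RECENT:]
print(old)     # ['t1', 't2']
print(recent)  # ['t3', 't4', 't5']

# With KEEP_RECENT or fewer turns, "old" is empty, so the early return
# in maybe_summarize skips the compression call entirely
short = ["t1", "t2"]
print(short[:-KEEP_RECENT])  # []
print(short[-KEEP_RECENT:])  # ['t1', 't2']
```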

### LLM Call Budget Per Round

| Action | Calls | Notes |
|--------|-------|-------|
| 1-4 speakers generate arguments | 1-4 | Parallelizable |
| Batch conviction update | 1 | All 11 reactions |
| Memory summarization | 0-1 | Every 5 rounds |
| Judge narration (ElevenLabs) | 1 | Audio only |
| **TOTAL** | **3-7** | Down from 12-48 |

---

1450
+ ## External Participant System (MCP + Human)
1451
+
1452
+ ### Architecture: Swappable Juror Seats
1453
+
1454
+ Any of the 11 AI juror seats can be replaced by:
1455
+ 1. **External AI Agent** (via MCP) - Another AI system joins as juror
1456
+ 2. **Human Player** (via UI) - Additional human joins
1457
+ 3. **Default AI** (Gemini) - Predefined personality
1458
+
1459
```python
@dataclass
class JurorSeat:
    """A seat in the jury that can be filled by different participant types."""

    seat_number: int
    participant_type: Literal["ai_default", "ai_external", "human"]
    participant_id: str | None = None

    # For AI default
    config: JurorConfig | None = None
    agent: JurorAgent | None = None

    # For external (MCP or human)
    external_connection: ExternalConnection | None = None


class JuryManager:
    """Manages the 12 jury seats with mixed participant types."""

    def __init__(self):
        self.seats: Dict[int, JurorSeat] = {}
        self._init_default_seats()

    def _init_default_seats(self):
        """Initialize all 12 seats with default AI jurors."""
        for i in range(1, 13):
            if i == 7:  # Reserved for primary player
                self.seats[i] = JurorSeat(
                    seat_number=i,
                    participant_type="human",
                    participant_id="player_1"
                )
            else:
                config = load_juror_config(i)
                self.seats[i] = JurorSeat(
                    seat_number=i,
                    participant_type="ai_default",
                    config=config,
                    agent=JurorAgent(config)
                )

    def replace_with_external(
        self,
        seat_number: int,
        participant_type: Literal["ai_external", "human"],
        participant_id: str
    ) -> bool:
        """Replace a default AI with an external participant."""

        if seat_number == 7:
            return False  # Primary player seat protected

        if seat_number not in self.seats:
            return False

        self.seats[seat_number] = JurorSeat(
            seat_number=seat_number,
            participant_type=participant_type,
            participant_id=participant_id,
            external_connection=ExternalConnection(participant_id)
        )
        return True

    def get_participant_for_turn(self, seat_number: int) -> TurnHandler:
        """Get the appropriate handler for a seat's turn."""

        seat = self.seats[seat_number]

        if seat.participant_type == "ai_default":
            return AITurnHandler(seat.agent)
        elif seat.participant_type == "ai_external":
            return MCPTurnHandler(seat.external_connection)
        else:  # human
            return HumanTurnHandler(seat.participant_id)
```
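The seat-swap rule above (seat 7 protected, everything else swappable) boils down to a few lines; a self-contained miniature using plain dicts instead of the PRD's dataclasses:

```python
# Seats keyed 1-12; seat 7 is the primary human player and cannot be swapped
seats = {i: {"type": "human" if i == 7 else "ai_default", "pid": None}
         for i in range(1, 13)}

def replace_with_external(seat: int, pid: str) -> bool:
    if seat == 7 or seat not in seats:
        return False
    seats[seat] = {"type": "ai_external", "pid": pid}
    return True

print(replace_with_external(7, "mcp-agent-1"))  # False (protected seat)
print(replace_with_external(3, "mcp-agent-1"))  # True
print(seats[3]["type"])  # ai_external
```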

### MCP Tools for External Participants

```python
# MCP Server exposes these tools for external AI agents

def mcp_join_as_juror(
    case_id: str,
    preferred_seat: int | None = None
) -> Dict:
    """
    Join an active case as a juror.

    An external AI agent can take over any non-player seat.
    Returns seat assignment and case briefing.

    Args:
        case_id: The case to join
        preferred_seat: Preferred seat number (2-6, 8-12), or None for auto-assign

    Returns:
        seat_number: Your assigned seat
        case_briefing: Summary of the case
        your_persona: Suggested personality (can ignore)
        current_state: Vote tally, round number
    """
    pass


def mcp_get_deliberation_state(case_id: str, seat_number: int) -> Dict:
    """
    Get current state of deliberation.

    Returns:
        recent_arguments: Last 5 arguments made
        vote_tally: Current guilty/not-guilty count
        your_conviction: Your current conviction score
        pending_speakers: Who speaks next
        is_your_turn: Whether you should speak now
    """
    pass


def mcp_make_argument(
    case_id: str,
    seat_number: int,
    argument_type: str,  # "evidence", "emotional", "logical", "question"
    content: str,
    target_juror: int | None = None
) -> Dict:
    """
    Make an argument during your turn.

    Returns:
        accepted: Whether the argument was processed
        reactions: Brief summary of jury reactions
        vote_changes: Any votes that flipped
    """
    pass


def mcp_cast_vote(
    case_id: str,
    seat_number: int,
    vote: Literal["guilty", "not_guilty"]
) -> Dict:
    """
    Cast or change your vote.

    Returns:
        recorded: Confirmation
        new_tally: Updated vote count
    """
    pass


def mcp_pass_turn(case_id: str, seat_number: int) -> Dict:
    """Pass your turn without speaking."""
    pass
```

### Human Join Flow (Additional Players)

```
1. Primary player starts game (seat 7)
2. Game generates shareable room code
3. Additional humans can join via:
   - URL with room code
   - Gradio UI "Join as Juror" button
4. They get assigned available seat (2-6, 8-12)
5. When it's their turn, UI prompts for input
6. They see same case file, deliberation history
```

---

## Model Configuration

### Default: Gemini Flash 2.5

```yaml
# config/models.yaml

default_model:
  provider: "gemini"
  model_id: "gemini-2.5-flash"
  temperature: 0.7
  max_tokens: 1024

# Easily swappable per-agent or globally
model_overrides:
  judge:
    provider: "gemini"
    model_id: "gemini-2.5-flash"  # Fast for narration scripts

  batch_updater:
    provider: "gemini"
    model_id: "gemini-2.5-flash"  # Handles all conviction updates

  # Individual juror overrides (optional)
  juror_5:  # The contrarian philosopher
    provider: "anthropic"
    model_id: "claude-sonnet-4-20250514"
    temperature: 0.9
```

### LiteLLM Integration

```python
from litellm import acompletion

class ModelRouter:
    """Route to any model via LiteLLM."""

    def __init__(self, config_path: str = "config/models.yaml"):
        self.config = load_yaml(config_path)
        self.default = self.config["default_model"]

    def get_model_for(self, agent_id: str) -> Dict:
        """Get the model config for a specific agent."""
        overrides = self.config.get("model_overrides", {})
        return overrides.get(agent_id, self.default)

    async def generate(
        self,
        agent_id: str,
        prompt: str,
        **kwargs
    ) -> str:
        """Generate a completion using the appropriate model."""

        config = self.get_model_for(agent_id)

        # acompletion is LiteLLM's async entry point (completion is sync)
        response = await acompletion(
            model=f"{config['provider']}/{config['model_id']}",
            messages=[{"role": "user", "content": prompt}],
            temperature=config.get("temperature", 0.7),
            max_tokens=config.get("max_tokens", 1024),
            **kwargs
        )

        return response.choices[0].message.content
```
1698
+
1699
+ ---
1700
+
1701
+ ## Case Data Architecture
1702
+
1703
+ ### Dual Source: Real + Fallback
1704
+
1705
```python
class CaseLoader:
    """Load cases from real data or fall back to predefined ones."""

    def __init__(
        self,
        real_data_path: str | None = None,
        fallback_path: str = "cases/predefined/"
    ):
        self.real_data_path = real_data_path
        self.fallback_path = fallback_path

        # Try to load real data
        self.real_cases = self._load_real_cases() if real_data_path else []
        self.fallback_cases = self._load_fallback_cases()

    def get_case(self, case_id: str = None, use_real: bool = True) -> CriminalCase:
        """Get a case, preferring real data if available."""

        if case_id:
            # Specific case requested
            return self._find_case(case_id)

        # Random case
        if use_real and self.real_cases:
            return random.choice(self.real_cases)
        return random.choice(self.fallback_cases)

    def _load_real_cases(self) -> List[CriminalCase]:
        """Load from real case database (future: LlamaIndex over court records)."""
        # TODO: Integrate with real case API/database
        # For now, returns empty - falls back to predefined
        return []

    def _load_fallback_cases(self) -> List[CriminalCase]:
        """Load predefined cases from YAML files."""
        cases = []
        for file in Path(self.fallback_path).glob("*.yaml"):
            case_data = yaml.safe_load(file.read_text())
            cases.append(CriminalCase(**case_data))
        return cases


# Future: Real case integration
class RealCaseConnector:
    """
    Connect to real case databases.
    Designed for easy integration later.
    """

    def __init__(self):
        self.sources = {
            "court_listener": CourtListenerAPI(),  # Future
            "justia": JustiaAPI(),                 # Future
            "local_files": LocalCaseFiles(),       # CSV/JSON dumps
        }

    async def search_cases(
        self,
        query: str,
        filters: Dict = None
    ) -> List[CriminalCase]:
        """Search across all connected sources."""
        pass

    async def get_case_details(
        self,
        source: str,
        case_id: str
    ) -> CriminalCase:
        """Get a full case from a specific source."""
        pass
```
1778
+
1779
+ ---
1780
+
1781
+ ## Execution Environment
1782
+
1783
+ ### Local First, Blaxel Ready
1784
+
1785
```yaml
# config/execution.yaml

execution:
  mode: "local"  # "local" | "blaxel" | "docker"

  local:
    # No sandbox, runs in process
    timeout_seconds: 30

  blaxel:
    api_key: "${BLAXEL_API_KEY}"
    sandbox_id: "12-angry-agents"
    persistent: true  # Keep sandbox warm

  docker:
    image: "12-angry-agents:latest"
    memory_limit: "2g"
```

```python
# Usage in code
class ExecutionManager:
    """Swappable execution environment."""

    def __init__(self, config_path: str = "config/execution.yaml"):
        self.config = load_yaml(config_path)
        self.mode = self.config["execution"]["mode"]

    def get_executor(self) -> Executor:
        if self.mode == "local":
            return LocalExecutor()
        elif self.mode == "blaxel":
            return BlaxelExecutor(self.config["execution"]["blaxel"])
        elif self.mode == "docker":
            return DockerExecutor(self.config["execution"]["docker"])

    async def run_agent_code(self, code: str, context: Dict) -> str:
        """Execute agent-generated code safely."""
        executor = self.get_executor()
        return await executor.run(code, context)
```
1826
+
1827
+ ---
1828
+
1829
+ ## Player Input: Strategy + Optional Free Text
1830
+
1831
```python
# Hybrid input: low-friction strategy selection + optional elaboration

ARGUMENT_STRATEGIES = [
    {
        "id": "challenge_evidence",
        "label": "Challenge Evidence",
        "prompt_hint": "Point out weaknesses in a specific piece of evidence",
        "allows_free_text": True,
    },
    {
        "id": "question_witness",
        "label": "Question Witness Credibility",
        "prompt_hint": "Raise doubts about a witness's reliability",
        "allows_free_text": True,
    },
    {
        "id": "reasonable_doubt",
        "label": "Appeal to Reasonable Doubt",
        "prompt_hint": "Emphasize the burden of proof",
        "allows_free_text": False,  # AI handles this
    },
    {
        "id": "alternative_theory",
        "label": "Present Alternative Theory",
        "prompt_hint": "Suggest what might have really happened",
        "allows_free_text": True,
    },
    {
        "id": "address_juror",
        "label": "Address Specific Juror",
        "prompt_hint": "Respond to or persuade a specific juror",
        "requires_target": True,
        "allows_free_text": True,
    },
    {
        "id": "free_argument",
        "label": "Make Custom Argument",
        "prompt_hint": "Say whatever you want",
        "allows_free_text": True,
        "requires_free_text": True,
    },
]


# UI Component
def player_input_ui():
    with gr.Row():
        strategy = gr.Dropdown(
            choices=[s["label"] for s in ARGUMENT_STRATEGIES],
            label="Your Strategy",
            value="Challenge Evidence"
        )

        target_juror = gr.Dropdown(
            choices=["None"] + [f"Juror {i}" for i in range(1, 13) if i != 7],
            label="Target (optional)",
            visible=False  # Shown only for "address_juror"
        )

        free_text = gr.Textbox(
            label="Add details (optional)",
            placeholder="e.g., 'Focus on the timeline inconsistency'",
            max_lines=2,
            visible=True
        )

    return strategy, target_juror, free_text
```
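How the selected strategy and optional free text combine into the player's turn is left implicit above. One possible sketch — `build_player_argument` is a hypothetical helper, not part of the PRD's API, but the dict keys follow the `ARGUMENT_STRATEGIES` entries:

```python
def build_player_argument(strategy, free_text="", target=None):
    """Merge a strategy choice with optional elaboration into one prompt string."""
    parts = [f"Strategy: {strategy['label']} ({strategy['prompt_hint']})"]
    if strategy.get("requires_target") and target:
        parts.append(f"Directed at: {target}")
    if strategy.get("allows_free_text") and free_text:
        parts.append(f"Player notes: {free_text}")
    return "\n".join(parts)

challenge = {
    "label": "Challenge Evidence",
    "prompt_hint": "Point out weaknesses in a specific piece of evidence",
    "allows_free_text": True,
}
out = build_player_argument(challenge, "Focus on the timeline inconsistency")
print(out)
```

Strategies with `allows_free_text: False` simply ignore any typed notes, which keeps the "Appeal to Reasonable Doubt" option fully AI-driven as described above.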
1900
+
1901
+ ---
1902
+
1903
+ ## Open Questions
1904
+
1905
+ 1. Exact ElevenLabs voice ID for judge?
1906
+ 2. Should external AI participants see other AI jurors' internal conviction scores? yes configuablein code.
1907
+ 3. Max simultaneous external participants (performance)? 12
1908
+ 4. Case difficulty selector in UI? no/ random
requirements.txt ADDED
@@ -0,0 +1,19 @@
# Core
gradio==6.0.1
pydantic>=2.0.0
pydantic-settings>=2.0.0

# LLM Providers
google-genai>=1.0.0
openai>=1.0.0

# TTS
elevenlabs>=1.0.0

# Agents
smolagents>=1.0.0

# Utilities
httpx>=0.27.0
tenacity>=8.0.0
python-dotenv>=1.0.0