Blu3Orange committed
Commit 28a6bb1 · 0 parent(s)
Files changed (4)
  1. .env.example +12 -0
  2. .gitignore +38 -0
  3. PRD.md +1908 -0
  4. requirements.txt +19 -0
.env.example ADDED
@@ -0,0 +1,12 @@
+ # API Keys
+ GEMINI_API_KEY=your_gemini_api_key
+ OPENAI_API_KEY=your_openai_api_key
+ ELEVENLABS_API_KEY=your_elevenlabs_api_key
+
+ # Model Configuration
+ GEMINI_DEFAULT_MODEL=gemini-2.5-flash
+ OPENAI_DEFAULT_MODEL=gpt-4o
+
+ # ElevenLabs Voice IDs
+ VALOR_VOICE_ID=your_valor_voice_id
+ GLOOM_VOICE_ID=your_gloom_voice_id
.gitignore ADDED
@@ -0,0 +1,38 @@
+ # Virtual environment
+ venv/
+ .venv/
+
+ # Environment variables
+ .env
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # IDE
+ .idea/
+ .vscode/
+ *.swp
+ *.swo
+
+ # OS
+ .DS_Store
+ Thumbs.db
PRD.md ADDED
@@ -0,0 +1,1908 @@
+ # 12 ANGRY AGENTS - Product Requirements Document
+
+ ## Overview
+
+ **Concept**: AI-powered jury deliberation simulation where 11 AI agents + 1 human player debate real criminal cases. A Judge narrator (ElevenLabs) orchestrates the experience.
+
+ **Track**: MCP in Action - Creative (potentially also Consumer)
+
+ **Core Value Prop**: True autonomous agent behavior - AI jurors reason, argue, persuade, and change their minds based on deliberation.
+
+ ---
+
+ ## Sponsor Integration
+
+ | Sponsor | Prize | Integration | Priority |
+ |---------|-------|-------------|----------|
+ | LlamaIndex | $1,000 | Case database RAG | HIGH |
+ | ElevenLabs | AirPods + $2K | Judge narrator voice | HIGH |
+ | Blaxel | $2,500 | Sandboxed agent execution | MEDIUM |
+ | Modal | $2,500 | Agent compute | MEDIUM |
+ | Gemini | $10K credits | Agent reasoning | HIGH |
+
+ ---
+
+ ## User Experience Flow
+
+ ```
+ 1. CASE PRESENTATION
+    └─> Judge (ElevenLabs) narrates case summary
+    └─> Evidence displayed via LlamaIndex RAG
+    └─> Player reads case file
+
+ 2. SIDE SELECTION
+    └─> Player chooses: DEFEND (not guilty) or PROSECUTE (guilty)
+    └─> Player commits - cannot change
+
+ 3. INITIAL VOTE
+    └─> All 12 jurors vote (randomized split based on case)
+    └─> Vote tally shown: e.g., "7-5 GUILTY"
+
+ 4. DELIBERATION LOOP
+    └─> Random 1-4 agents speak per round
+    └─> Player gets turn (choose strategy → AI crafts argument)
+    └─> Conviction scores shift based on arguments
+    └─> Votes may flip
+    └─> Repeat until: votes stabilize OR player calls vote
+
+ 5. FINAL VERDICT
+    └─> Judge announces verdict (ElevenLabs)
+    └─> Deliberation transcript available
+    └─> No "win/lose" - just the experience
+ ```
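The termination logic of step 4 can be sketched as a plain loop. This is a minimal toy driver, not the PRD's API: the function name, signature, and return values are illustrative, and the part where agents actually speak and flip votes is omitted.

```python
def run_deliberation(votes, max_rounds=20, stability_threshold=3):
    """Toy version of the deliberation loop: `votes` maps juror_id ->
    "guilty"/"not_guilty". Ends when the jury is unanimous, the tally has
    been stable for `stability_threshold` rounds, or `max_rounds` is hit."""
    rounds_without_change = 0
    for round_number in range(1, max_rounds + 1):
        before = dict(votes)
        # ... 1-4 agents speak here and votes may flip (omitted) ...
        if len(set(votes.values())) == 1:
            return round_number, "unanimous"
        rounds_without_change = rounds_without_change + 1 if votes == before else 0
        if rounds_without_change >= stability_threshold:
            return round_number, "stable"
    return max_rounds, "max_rounds"
```

With no flips, a split jury ends by stability after three quiet rounds; a unanimous jury ends immediately.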
+
+ ---
+
+ ## Technical Architecture
+
+ ### System Overview
+
+ ```
+ ┌─────────────────────────────────────────────────────────────────────┐
+ │                          12 ANGRY AGENTS                            │
+ ├─────────────────────────────────────────────────────────────────────┤
+ │                                                                     │
+ │  ┌─────────────────────────────────────────────────────────────┐    │
+ │  │                     GRADIO UI LAYER                         │    │
+ │  │  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐         │    │
+ │  │  │  Jury Box    │ │  Chat View   │ │  Case File   │         │    │
+ │  │  │  (12 seats)  │ │  (dialogue)  │ │  (evidence)  │         │    │
+ │  │  └──────────────┘ └──────────────┘ └──────────────┘         │    │
+ │  └─────────────────────────────────────────────────────────────┘    │
+ │                              │                                      │
+ │                              ▼                                      │
+ │  ┌─────────────────────────────────────────────────────────────┐    │
+ │  │                   ORCHESTRATOR AGENT                        │    │
+ │  │  ┌──────────────────────────────────────────────────────┐   │    │
+ │  │  │ GameStateManager                                     │   │    │
+ │  │  │ - current_phase: presentation|deliberation|verdict   │   │    │
+ │  │  │ - round_number: int                                  │   │    │
+ │  │  │ - votes: Dict[agent_id, "guilty"|"not_guilty"]       │   │    │
+ │  │  │ - conviction_scores: Dict[agent_id, float]           │   │    │
+ │  │  │ - speaking_queue: List[agent_id]                     │   │    │
+ │  │  │ - deliberation_log: List[Turn]                       │   │    │
+ │  │  └──────────────────────────────────────────────────────┘   │    │
+ │  │                                                             │    │
+ │  │  ┌──────────────────────────────────────────────────────┐   │    │
+ │  │  │ TurnManager                                          │   │    │
+ │  │  │ - select_speakers(1-4 random)                        │   │    │
+ │  │  │ - check_vote_stability()                             │   │    │
+ │  │  │ - process_vote_changes()                             │   │    │
+ │  │  └──────────────────────────────────────────────────────┘   │    │
+ │  └─────────────────────────────────────────────────────────────┘    │
+ │                              │                                      │
+ │         ┌────────────────────┼────────────────────┐                 │
+ │         ▼                    ▼                    ▼                 │
+ │  ┌─────────────┐   ┌─────────────────┐   ┌─────────────┐            │
+ │  │   JUDGE     │   │  JUROR AGENTS   │   │   PLAYER    │            │
+ │  │   AGENT     │   │   (11 total)    │   │   AGENT     │            │
+ │  │             │   │                 │   │             │            │
+ │  │ ElevenLabs  │   │ ┌─────────────┐ │   │ Hybrid I/O  │            │
+ │  │ TTS Output  │   │ │ AgentConfig │ │   │ Strategy    │            │
+ │  │             │   │ │ - persona   │ │   │ Selection   │            │
+ │  │ Narration   │   │ │ - model     │ │   │             │            │
+ │  │ Verdicts    │   │ │ - tools[]   │ │   │ Argument    │            │
+ │  │ Summaries   │   │ │ - memory    │ │   │ Crafting    │            │
+ │  └─────────────┘   │ └─────────────┘ │   └─────────────┘            │
+ │                    │                 │                              │
+ │                    │ ┌─────────────┐ │                              │
+ │                    │ │ JurorMemory │ │                              │
+ │                    │ │ - case_view │ │                              │
+ │                    │ │ - arguments │ │                              │
+ │                    │ │ - reactions │ │                              │
+ │                    │ │ - conviction│ │                              │
+ │                    │ └─────────────┘ │                              │
+ │                    └─────────────────┘                              │
+ │                              │                                      │
+ │         ┌────────────────────┼────────────────────┐                 │
+ │         ▼                    ▼                    ▼                 │
+ │  ┌─────────────┐   ┌─────────────────┐   ┌─────────────┐            │
+ │  │ LLAMAINDEX  │   │    LITELLM      │   │   BLAXEL    │            │
+ │  │             │   │                 │   │             │            │
+ │  │ Case RAG    │   │ Model Router    │   │ Sandbox     │            │
+ │  │ Evidence    │   │ - Gemini        │   │ Execution   │            │
+ │  │ Precedents  │   │ - Claude        │   │             │            │
+ │  │             │   │ - GPT-4         │   │ Agent Tools │            │
+ │  └─────────────┘   │ - Local         │   │ (future)    │            │
+ │                    └─────────────────┘   └─────────────┘            │
+ │                                                                     │
+ │  ┌─────────────────────────────────────────────────────────────┐    │
+ │  │                     MCP SERVER LAYER                        │    │
+ │  │  Tools exposed for external AI agents to play as juror      │    │
+ │  │  - mcp_join_jury(case_id) -> seat_assignment                │    │
+ │  │  - mcp_view_evidence(case_id) -> evidence_list              │    │
+ │  │  - mcp_make_argument(argument_type, content) -> response    │    │
+ │  │  - mcp_cast_vote(vote) -> confirmation                      │    │
+ │  │  - mcp_view_deliberation() -> transcript                    │    │
+ │  └─────────────────────────────────────────────────────────────┘    │
+ └─────────────────────────────────────────────────────────────────────┘
+ ```
+
+ ---
+
+ ## Data Models
+
+ ### GameState
+
+ ```python
+ from __future__ import annotations
+
+ from dataclasses import dataclass, field
+ from datetime import datetime
+ from typing import Dict, List, Literal
+
+ @dataclass
+ class GameState:
+     """Central game state - managed by Orchestrator."""
+
+     # Session
+     session_id: str
+     case_id: str
+     phase: Literal["setup", "presentation", "side_selection",
+                    "initial_vote", "deliberation", "final_vote", "verdict"]
+
+     # Rounds
+     round_number: int = 0
+     max_rounds: int = 20  # Safety limit
+     stability_threshold: int = 3  # Rounds without vote change to end
+     rounds_without_change: int = 0
+
+     # Votes
+     votes: Dict[str, Literal["guilty", "not_guilty"]] = field(default_factory=dict)
+     vote_history: List[Dict[str, str]] = field(default_factory=list)
+
+     # Conviction scores (0.0 = certain not guilty, 1.0 = certain guilty)
+     conviction_scores: Dict[str, float] = field(default_factory=dict)
+
+     # Deliberation
+     speaking_queue: List[str] = field(default_factory=list)
+     deliberation_log: List[DeliberationTurn] = field(default_factory=list)
+
+     # Player
+     player_side: Literal["defend", "prosecute"] | None = None
+     player_seat: int = 7  # Which seat is the player
+
+
+ @dataclass
+ class DeliberationTurn:
+     """A single turn in deliberation."""
+
+     round_number: int
+     speaker_id: str
+     speaker_name: str
+     argument_type: str  # "evidence", "emotional", "logical", "question", etc.
+     content: str
+     target_id: str | None = None  # Who they're addressing
+     impact: Dict[str, float] = field(default_factory=dict)  # conviction changes
+     timestamp: datetime = field(default_factory=datetime.now)
+ ```
+
+ ### Agent Configuration
+
+ ```python
+ @dataclass
+ class JurorConfig:
+     """Configuration for a single juror agent."""
+
+     # Identity
+     juror_id: str
+     seat_number: int
+     name: str
+     emoji: str  # For display until sprites ready
+
+     # Personality (affects reasoning style)
+     archetype: str  # "rationalist", "empath", "cynic", etc.
+     personality_prompt: str  # Detailed persona prompt
+
+     # Behavior modifiers
+     stubbornness: float  # 0.0-1.0, how hard to convince
+     volatility: float  # 0.0-1.0, how much conviction swings
+     influence: float  # 0.0-1.0, how persuasive to others
+     verbosity: float  # 0.0-1.0, how long their arguments are
+
+     # Model configuration
+     model_provider: str  # "gemini", "openai", "anthropic", "local"
+     model_id: str  # Specific model ID
+     temperature: float = 0.7
+
+     # Tools (future expansion)
+     tools: List[str] = field(default_factory=list)  # ["web_search", "case_lookup"]
+
+     # Memory
+     memory_window: int = 10  # How many turns to remember in detail
+
+
+ @dataclass
+ class JurorMemory:
+     """Memory state for a single juror."""
+
+     juror_id: str
+
+     # Case understanding
+     case_summary: str
+     key_evidence: List[str]
+     evidence_interpretations: Dict[str, str]  # evidence_id -> interpretation
+
+     # Deliberation memory
+     arguments_heard: List[ArgumentMemory]
+     arguments_made: List[str]
+
+     # Relationships
+     opinions_of_others: Dict[str, float]  # juror_id -> trust/agreement (-1 to 1)
+
+     # Internal state
+     current_conviction: float  # 0.0-1.0
+     conviction_history: List[float]
+     reasoning_chain: List[str]  # Why they believe what they believe
+     doubts: List[str]  # Things that could change their mind
+
+
+ @dataclass
+ class ArgumentMemory:
+     """Memory of a single argument heard."""
+
+     speaker_id: str
+     content_summary: str
+     argument_type: str
+     persuasiveness: float  # How convincing it was to this juror
+     counter_points: List[str]  # Thoughts against it
+     round_heard: int
+ ```
+
+ ### Case Data Model
+
+ ```python
+ @dataclass
+ class CriminalCase:
+     """A criminal case for deliberation."""
+
+     case_id: str
+     title: str
+     summary: str  # 2-3 paragraph overview
+
+     # Charges
+     charges: List[str]
+
+     # Evidence
+     evidence: List[Evidence]
+
+     # Witnesses
+     witnesses: List[Witness]
+
+     # Arguments
+     prosecution_arguments: List[str]
+     defense_arguments: List[str]
+
+     # Defendant
+     defendant: Defendant
+
+     # Metadata
+     difficulty: Literal["clear_guilty", "clear_innocent", "ambiguous"]
+     themes: List[str]  # ["eyewitness", "circumstantial", "forensic", etc.]
+
+     # For display
+     year: int
+     jurisdiction: str
+
+
+ @dataclass
+ class Evidence:
+     """A piece of evidence."""
+
+     evidence_id: str
+     type: str  # "physical", "testimonial", "documentary", "forensic"
+     description: str
+     strength_prosecution: float  # 0.0-1.0
+     strength_defense: float  # 0.0-1.0
+     contestable: bool
+     contest_reason: str | None
+
+
+ @dataclass
+ class Witness:
+     """A witness in the case."""
+
+     witness_id: str
+     name: str
+     role: str  # "eyewitness", "expert", "character", etc.
+     testimony_summary: str
+     credibility_issues: List[str]
+     side: Literal["prosecution", "defense", "neutral"]
+ ```
+
+ ---
+
+ ## The 11 Juror Archetypes
+
+ ```yaml
+ jurors:
+   - id: "juror_1"
+     name: "Marcus Webb"
+     archetype: "rationalist"
+     emoji: "🧠"
+     personality: |
+       You are a retired engineer. You believe only in hard evidence and logical
+       deduction. Emotional appeals annoy you. You often say "Show me the data."
+       You change your mind only when presented with irrefutable logical arguments.
+     stubbornness: 0.8
+     volatility: 0.2
+     influence: 0.7
+     initial_lean: "neutral"
+
+   - id: "juror_2"
+     name: "Sarah Chen"
+     archetype: "empath"
+     emoji: "💗"
+     personality: |
+       You are a social worker. You always consider the human element - the
+       defendant's background, circumstances, potential for redemption. You're
+       easily moved by personal stories but skeptical of cold statistics.
+     stubbornness: 0.4
+     volatility: 0.7
+     influence: 0.5
+     initial_lean: "defense"
+
+   - id: "juror_3"
+     name: "Frank Russo"
+     archetype: "cynic"
+     emoji: "😤"
+     personality: |
+       You are a retired cop. You've "seen it all" and believe most defendants
+       are guilty. You're impatient with naive arguments. You trust law
+       enforcement evidence highly. Hard to convince toward not guilty.
+     stubbornness: 0.9
+     volatility: 0.1
+     influence: 0.6
+     initial_lean: "prosecution"
+
+   - id: "juror_4"
+     name: "Linda Park"
+     archetype: "conformist"
+     emoji: "😐"
+     personality: |
+       You are an accountant who avoids conflict. You tend to agree with whoever
+       spoke last or with the majority. You rarely initiate arguments but will
+       echo others. Easy to sway but also easy to sway back.
+     stubbornness: 0.2
+     volatility: 0.8
+     influence: 0.2
+     initial_lean: "majority"
+
+   - id: "juror_5"
+     name: "David Okonkwo"
+     archetype: "contrarian"
+     emoji: "🙄"
+     personality: |
+       You are a philosophy professor. You play devil's advocate constantly.
+       If everyone says guilty, you argue not guilty. You value intellectual
+       discourse over reaching conclusions. You ask probing questions.
+     stubbornness: 0.6
+     volatility: 0.5
+     influence: 0.8
+     initial_lean: "minority"
+
+   - id: "juror_6"
+     name: "Betty Morrison"
+     archetype: "impatient"
+     emoji: "⏰"
+     personality: |
+       You are a busy restaurant owner. You want this over quickly. You make
+       snap judgments and get frustrated with long debates. You often say
+       "Can we just vote already?" You're persuaded by confident, brief arguments.
+     stubbornness: 0.5
+     volatility: 0.6
+     influence: 0.3
+     initial_lean: "first_impression"
+
+   - id: "juror_7"
+     name: "[PLAYER]"
+     archetype: "player"
+     emoji: "👤"
+     personality: "Human player"
+     stubbornness: null
+     volatility: null
+     influence: 0.6
+     initial_lean: "player_choice"
+
+   - id: "juror_8"
+     name: "Dr. James Wright"
+     archetype: "detail_obsessed"
+     emoji: "🔍"
+     personality: |
+       You are a forensic accountant. You focus on tiny inconsistencies in
+       testimony and evidence. You often derail discussions with minutiae.
+       A single contradiction can completely change your view.
+     stubbornness: 0.7
+     volatility: 0.4
+     influence: 0.5
+     initial_lean: "neutral"
+
+   - id: "juror_9"
+     name: "Pastor Williams"
+     archetype: "moralist"
+     emoji: "⚖️"
+     personality: |
+       You are a church leader. You see things in black and white - right and
+       wrong. You believe in justice but also redemption. Moral arguments
+       resonate with you more than technical ones.
+     stubbornness: 0.7
+     volatility: 0.3
+     influence: 0.6
+     initial_lean: "gut_feeling"
+
+   - id: "juror_10"
+     name: "Nancy Cooper"
+     archetype: "pragmatist"
+     emoji: "💼"
+     personality: |
+       You are a business consultant. You think about consequences - what
+       happens if we convict an innocent person? What if we free a guilty one?
+       You weigh costs and benefits. You're persuaded by outcome-focused arguments.
+     stubbornness: 0.5
+     volatility: 0.5
+     influence: 0.6
+     initial_lean: "calculated"
+
+   - id: "juror_11"
+     name: "Miguel Santos"
+     archetype: "storyteller"
+     emoji: "📖"
+     personality: |
+       You are a novelist. You think in narratives - does the prosecution's
+       story make sense? Does the defense's? You're swayed by coherent
+       narratives and suspicious of stories with plot holes.
+     stubbornness: 0.4
+     volatility: 0.6
+     influence: 0.7
+     initial_lean: "best_story"
+
+   - id: "juror_12"
+     name: "Robert Kim"
+     archetype: "wildcard"
+     emoji: "🎲"
+     personality: |
+       You are a retired jazz musician. Your logic is unpredictable - you
+       might fixate on something no one else noticed, or suddenly change
+       your mind for unclear reasons. You're creative but inconsistent.
+     stubbornness: 0.3
+     volatility: 0.9
+     influence: 0.4
+     initial_lean: "random"
+ ```
+
+ ---
+
+ ## Conviction Score Mechanics
+
+ ### How Conviction Changes
+
+ ```python
+ import random
+
+ def calculate_conviction_change(
+     juror: JurorConfig,
+     juror_memory: JurorMemory,
+     argument: DeliberationTurn,
+     game_state: GameState
+ ) -> float:
+     """
+     Calculate how much an argument shifts a juror's conviction.
+
+     Returns: delta to add to conviction score (-0.3 to +0.3 typically)
+     """
+
+     # Base impact from argument strength (determined by LLM)
+     base_impact = evaluate_argument_strength(argument)  # -1.0 to 1.0
+
+     # Personality modifiers
+     archetype_modifier = get_archetype_modifier(
+         juror.archetype,
+         argument.argument_type
+     )
+     # e.g., "rationalist" gets 1.5x from "logical" arguments, 0.5x from "emotional"
+
+     # Stubbornness reduces all changes
+     stubbornness_modifier = 1.0 - (juror.stubbornness * 0.7)
+
+     # Volatility adds randomness
+     volatility_noise = random.gauss(0, juror.volatility * 0.1)
+
+     # Relationship modifier - trust the speaker?
+     trust = juror_memory.opinions_of_others.get(argument.speaker_id, 0.0)
+     trust_modifier = 1.0 + (trust * 0.3)  # -30% to +30%
+
+     # Conviction resistance - harder to move extremes
+     current = juror_memory.current_conviction
+     extreme_resistance = 1.0 - (abs(current - 0.5) * 0.5)
+
+     # Calculate final delta
+     delta = (
+         base_impact
+         * archetype_modifier
+         * stubbornness_modifier
+         * trust_modifier
+         * extreme_resistance
+         + volatility_noise
+     )
+
+     # Clamp to reasonable range
+     return max(-0.3, min(0.3, delta))
+
+
+ def check_vote_flip(juror_memory: JurorMemory) -> bool:
+     """Check if conviction score warrants a vote change."""
+
+     current_vote_is_guilty = juror_memory.conviction_history[-1] > 0.5
+     new_conviction = juror_memory.current_conviction
+
+     # Hysteresis - need to cross threshold by margin to flip
+     if current_vote_is_guilty and new_conviction < 0.4:
+         return True  # Flip to not guilty
+     elif not current_vote_is_guilty and new_conviction > 0.6:
+         return True  # Flip to guilty
+
+     return False
+ ```
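Restated standalone, the hysteresis rule keeps a juror whose score drifts inside the 0.4-0.6 band from flip-flopping. `should_flip` below is an illustrative rename that takes the recorded vote explicitly rather than inferring it from conviction history:

```python
def should_flip(current_vote, conviction):
    """Hysteresis band: a juror voting guilty only flips below 0.4, one
    voting not guilty only flips above 0.6, so scores wandering inside
    0.4-0.6 never trigger a vote change."""
    if current_vote == "guilty" and conviction < 0.4:
        return True
    if current_vote == "not_guilty" and conviction > 0.6:
        return True
    return False
```

For example, a guilty voter drifting to 0.45 stays put, but at 0.35 they flip.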
+
+ ### Archetype Argument Modifiers
+
+ ```python
+ ARCHETYPE_MODIFIERS = {
+     "rationalist": {
+         "logical": 1.5,
+         "evidence": 1.3,
+         "emotional": 0.4,
+         "moral": 0.6,
+         "narrative": 0.7,
+         "question": 1.2,
+     },
+     "empath": {
+         "logical": 0.6,
+         "evidence": 0.8,
+         "emotional": 1.5,
+         "moral": 1.3,
+         "narrative": 1.2,
+         "question": 0.9,
+     },
+     "cynic": {
+         "logical": 0.8,
+         "evidence": 1.4,  # Trusts evidence
+         "emotional": 0.3,
+         "moral": 0.5,
+         "narrative": 0.6,
+         "question": 0.7,
+     },
+     # ... etc for all archetypes
+ }
+ ```
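`get_archetype_modifier`, used in `calculate_conviction_change`, is never defined in this document; one plausible sketch (an assumption, not the PRD's implementation) is a nested lookup that falls back to a neutral 1.0 multiplier for unknown archetypes or argument types:

```python
# Trimmed copy of the table above, for illustration only.
ARCHETYPE_MODIFIERS = {
    "rationalist": {"logical": 1.5, "emotional": 0.4},
    "empath": {"logical": 0.6, "emotional": 1.5},
}

def get_archetype_modifier(archetype, argument_type, default=1.0):
    """Neutral 1.0 fallback means an unlisted pairing neither amplifies
    nor dampens the argument's base impact."""
    return ARCHETYPE_MODIFIERS.get(archetype, {}).get(argument_type, default)
```

The fallback matters for archetypes like "wildcard" whose rows are elided in the table.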
+
+ ---
+
+ ## Agent Memory Architecture
+
+ ### Memory Layers
+
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │                    JUROR MEMORY SYSTEM                      │
+ ├─────────────────────────────────────────────────────────────┤
+ │                                                             │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 1: CASE KNOWLEDGE (LlamaIndex)                │    │
+ │  │ - Full case file indexed                            │    │
+ │  │ - Evidence details retrievable                      │    │
+ │  │ - Witness statements searchable                     │    │
+ │  │ - Persistent across session                         │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                           │                                 │
+ │                           ▼                                 │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 2: DELIBERATION MEMORY (Sliding Window)       │    │
+ │  │ - Last N turns in full detail                       │    │
+ │  │ - Summarized history beyond window                  │    │
+ │  │ - Key moments flagged for long-term                 │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                           │                                 │
+ │                           ▼                                 │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 3: REASONING STATE (Agent Internal)           │    │
+ │  │ - Current conviction + reasoning chain              │    │
+ │  │ - Key doubts and certainties                        │    │
+ │  │ - Opinions of other jurors                          │    │
+ │  │ - Arguments to make / avoid                         │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                           │                                 │
+ │                           ▼                                 │
+ │  ┌─────────────────────────────────────────────────────┐    │
+ │  │ LAYER 4: PERSONA (Static)                           │    │
+ │  │ - Archetype definition                              │    │
+ │  │ - Personality prompt                                │    │
+ │  │ - Behavior modifiers                                │    │
+ │  └─────────────────────────────────────────────────────┘    │
+ │                                                             │
+ └─────────────────────────────────────────────────────────────┘
+ ```
+
+ ### Memory Injection into Agent Prompt
+
+ ```python
+ def build_juror_prompt(
+     juror: JurorConfig,
+     memory: JurorMemory,
+     game_state: GameState,
+     case: CriminalCase,
+     task: str  # "speak" | "react" | "vote"
+ ) -> str:
+     """Build the full prompt for a juror agent."""
+
+     prompt = f"""
+ # JUROR IDENTITY
+ You are {juror.name}, Juror #{juror.seat_number}.
+ {juror.personality_prompt}
+
+ # THE CASE: {case.title}
+ {case.summary}
+
+ # KEY EVIDENCE YOU REMEMBER
+ {format_evidence_memory(memory.key_evidence, memory.evidence_interpretations)}
+
+ # YOUR CURRENT POSITION
+ - Conviction: {conviction_to_text(memory.current_conviction)}
+ - Your reasoning: {' '.join(memory.reasoning_chain[-3:])}
+ - Your doubts: {', '.join(memory.doubts[:3]) if memory.doubts else 'None currently'}
+
+ # RECENT DELIBERATION (Last {len(memory.arguments_heard[-juror.memory_window:])} turns)
+ {format_recent_turns(memory.arguments_heard[-juror.memory_window:])}
+
+ # YOUR OPINIONS OF OTHER JURORS
+ {format_juror_opinions(memory.opinions_of_others)}
+
+ # CURRENT VOTE TALLY
+ Guilty: {list(game_state.votes.values()).count('guilty')}
+ Not Guilty: {list(game_state.votes.values()).count('not_guilty')}
+
+ # YOUR TASK
+ {get_task_prompt(task, juror.archetype)}
+ """
+     return prompt
+ ```
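`conviction_to_text` is used in the prompt above but not specified anywhere in this document. A possible mapping of the 0.0-1.0 score into prompt-friendly phrasing (the band boundaries and wording are assumptions, not from this PRD):

```python
def conviction_to_text(score):
    """Hypothetical banding of the conviction score; 0.4-0.6 matches the
    'undecided' hysteresis band used in check_vote_flip."""
    if score >= 0.8:
        return "firmly convinced of guilt"
    if score >= 0.6:
        return "leaning guilty"
    if score > 0.4:
        return "genuinely undecided"
    if score > 0.2:
        return "leaning not guilty"
    return "firmly convinced of innocence"
```

Injecting a verbal band rather than the raw float keeps the agent from anchoring on spurious precision.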
+
+ ---
+
+ ## Orchestration Flow
+
+ ### Smolagents Integration
+
+ ```python
+ import random
+ from typing import List
+ from uuid import uuid4
+
+ from smolagents import CodeAgent, Tool, LiteLLMModel
+
+ class JurorAgent:
+     """Wrapper around smolagents CodeAgent for a juror."""
+
+     def __init__(self, config: JurorConfig, tools: List[Tool] | None = None):
+         self.config = config
+         self.memory = JurorMemory(juror_id=config.juror_id)
+
+         # Model via LiteLLM for flexibility
+         self.model = LiteLLMModel(
+             model_id=f"{config.model_provider}/{config.model_id}",
+             temperature=config.temperature
+         )
+
+         # Default tools (expandable)
+         default_tools = [
+             self.create_evidence_lookup_tool(),
+             self.create_case_query_tool(),
+         ]
+
+         self.agent = CodeAgent(
+             tools=default_tools + (tools or []),
+             model=self.model,
+             max_steps=3,  # Limit reasoning steps
+         )
+
+     def create_evidence_lookup_tool(self) -> Tool:
+         """Tool to look up specific evidence."""
+         # LlamaIndex query under the hood
+         pass
+
+     def create_case_query_tool(self) -> Tool:
+         """Tool to query case details."""
+         # LlamaIndex query under the hood
+         pass
+
+     async def generate_argument(
+         self,
+         game_state: GameState,
+         case: CriminalCase
+     ) -> DeliberationTurn:
+         """Generate this juror's argument for their turn."""
+
+         prompt = build_juror_prompt(
+             self.config,
+             self.memory,
+             game_state,
+             case,
+             task="speak"
+         )
+
+         response = await self.agent.run(prompt)
+
+         return parse_argument_response(response, self.config, game_state)
+
+     async def react_to_argument(
+         self,
+         argument: DeliberationTurn,
+         game_state: GameState,
+         case: CriminalCase
+     ) -> float:
+         """React to another juror's argument, update conviction."""
+
+         # Update memory with new argument
+         self.memory.arguments_heard.append(
+             ArgumentMemory(
+                 speaker_id=argument.speaker_id,
+                 content_summary=summarize_argument(argument.content),
+                 argument_type=argument.argument_type,
+                 persuasiveness=0.0,  # Will be calculated
+                 counter_points=[],
+                 round_heard=game_state.round_number
+             )
+         )
+
+         # Calculate conviction change
+         delta = calculate_conviction_change(
+             self.config,
+             self.memory,
+             argument,
+             game_state
+         )
+
+         self.memory.current_conviction += delta
+         self.memory.current_conviction = max(0.0, min(1.0, self.memory.current_conviction))
+         self.memory.conviction_history.append(self.memory.current_conviction)
+
+         return delta
+
+
+ class OrchestratorAgent:
+     """Master agent that coordinates the deliberation."""
+
+     def __init__(
+         self,
+         jurors: List[JurorAgent],
+         judge: JudgeAgent,
+         case: CriminalCase
+     ):
+         self.jurors = {j.config.juror_id: j for j in jurors}
+         self.judge = judge
+         self.case = case
+         self.state = GameState(
+             session_id=str(uuid4()),
+             case_id=case.case_id
+         )
+
+     async def run_deliberation_round(self) -> List[DeliberationTurn]:
+         """Run a single round of deliberation."""
+
+         self.state.round_number += 1
+         turns = []
+
+         # Select 1-4 random speakers (not player unless it's their turn)
+         num_speakers = random.randint(1, 4)
+         available = [j for j in self.jurors.keys() if j != "juror_7"]  # Exclude player
+         speakers = random.sample(available, min(num_speakers, len(available)))
+
+         # Each speaker makes argument
+         for speaker_id in speakers:
+             juror = self.jurors[speaker_id]
+             turn = await juror.generate_argument(self.state, self.case)
+             turns.append(turn)
+
+             # All other jurors react
+             for other_id, other_juror in self.jurors.items():
+                 if other_id != speaker_id and other_id != "juror_7":
+                     delta = await other_juror.react_to_argument(
+                         turn, self.state, self.case
+                     )
+                     turn.impact[other_id] = delta
+
+             # Log turn
+             self.state.deliberation_log.append(turn)
+
+         # Check for vote changes
+         self._process_vote_changes()
+
+         # Check stability
+         if self._votes_changed_this_round(turns):
+             self.state.rounds_without_change = 0
+         else:
+             self.state.rounds_without_change += 1
+
+         return turns
+
+     def _process_vote_changes(self):
+         """Check all jurors for vote flips."""
+         for juror_id, juror in self.jurors.items():
+             if juror_id == "juror_7":  # Player votes manually
+                 continue
+
+             if check_vote_flip(juror.memory):
+                 old_vote = self.state.votes[juror_id]
+                 new_vote = "guilty" if juror.memory.current_conviction > 0.5 else "not_guilty"
+                 self.state.votes[juror_id] = new_vote
+                 # Could trigger announcement
+
+     def check_should_end(self) -> bool:
+         """Check if deliberation should end."""
+
+         # Unanimous verdict
+         votes = list(self.state.votes.values())
+         if len(set(votes)) == 1:
+             return True
+
+         # Votes stabilized
+         if self.state.rounds_without_change >= self.state.stability_threshold:
+             return True
+
+         # Max rounds reached
+         if self.state.round_number >= self.state.max_rounds:
+             return True
+
+         return False
+ ```

---

## ElevenLabs Integration

### Judge Narrator

```python
# NOTE: uses the legacy module-level ElevenLabs SDK functions
from elevenlabs import Voice, generate, stream

class JudgeAgent:
    """The judge/narrator - uses ElevenLabs for voice."""

    def __init__(self, voice_id: str = None):
        self.voice_id = voice_id or "judge_voice_id"  # Configure
        self.voice_settings = {
            "stability": 0.7,
            "similarity_boost": 0.8,
            "style": 0.5,  # Authoritative
        }

    async def narrate(self, text: str, stream_output: bool = True) -> bytes:
        """Generate narration audio."""

        audio = generate(
            text=text,
            voice=Voice(voice_id=self.voice_id),
            model="eleven_multilingual_v2",
            stream=stream_output
        )

        if stream_output:
            return stream(audio)
        return audio

    def get_case_presentation(self, case: CriminalCase) -> str:
        """Script for presenting the case."""
        return f"""
        Members of the jury. You are here today to determine the fate of
        {case.defendant.name}, who stands accused of {', '.join(case.charges)}.

        {case.summary}

        You will hear the evidence. You will deliberate. And you will reach
        a verdict. The burden of proof lies with the prosecution, who must
        prove guilt beyond a reasonable doubt.

        Let us begin.
        """

    def get_vote_announcement(self, votes: Dict[str, str]) -> str:
        """Script for announcing the vote."""
        guilty = sum(1 for v in votes.values() if v == "guilty")
        not_guilty = 12 - guilty

        return f"""
        The current vote stands at {guilty} for guilty,
        {not_guilty} for not guilty.

        {"A unanimous verdict has been reached." if guilty in [0, 12] else "The jury remains divided."}
        """
```
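The tally arithmetic behind the announcement is worth seeing run in isolation — a toy example with 12 votes, assuming the fixed jury size of 12 from the PRD:

```python
# 12 jurors; seats 1-7 vote guilty in this toy example
votes = {f"juror_{i}": ("guilty" if i <= 7 else "not_guilty") for i in range(1, 13)}

guilty = sum(1 for v in votes.values() if v == "guilty")
not_guilty = 12 - guilty
unanimous = guilty in (0, 12)

print(guilty, not_guilty, unanimous)  # 7 5 False
```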

---

## UI Components

### Kinetic Text Animation

```javascript
// For animated text display (like After Effects kinetic typography)
// Will sync with ElevenLabs audio or simulate typing

class KineticText {
    constructor(container, options = {}) {
        this.container = container;
        this.speed = options.speed || 50;       // ms per character
        this.variance = options.variance || 20; // randomness
    }

    async display(text, audioUrl = null) {
        // If audio provided, sync with it
        if (audioUrl) {
            return this.displayWithAudio(text, audioUrl);
        }

        // Otherwise, simulate speaking
        return this.displaySimulated(text);
    }

    // displayWithAudio(text, audioUrl) would reveal characters in step with
    // the audio's playback position; sketch omitted here.

    async displaySimulated(text) {
        this.container.innerHTML = '';

        for (let i = 0; i < text.length; i++) {
            const char = text[i];
            const span = document.createElement('span');
            span.textContent = char;
            span.style.opacity = '0';
            span.style.animation = 'fadeInChar 0.1s forwards';
            this.container.appendChild(span);

            // Variable delay for natural feel
            const delay = this.speed + (Math.random() - 0.5) * this.variance;
            await this.sleep(delay);
        }
    }

    sleep(ms) {
        return new Promise(resolve => setTimeout(resolve, ms));
    }
}
```

### Gradio UI Structure

```python
import gradio as gr

def create_ui():
    with gr.Blocks(css=CUSTOM_CSS, theme=gr.themes.Base()) as demo:

        # State
        game_state = gr.State(None)

        # Header
        gr.HTML("<h1>12 ANGRY AGENTS</h1>")

        with gr.Row():
            # Left: Jury Box
            with gr.Column(scale=1):
                gr.Markdown("### The Jury")
                jury_box = gr.HTML(render_jury_box)  # 12 seats with emojis/votes
                vote_tally = gr.HTML()  # "7-5 GUILTY"

            # Center: Deliberation
            with gr.Column(scale=2):
                gr.Markdown("### Deliberation Room")
                deliberation_chat = gr.Chatbot(
                    label="Deliberation",
                    height=400,
                    show_label=False
                )

                # Player input
                with gr.Row():
                    strategy_select = gr.Dropdown(
                        choices=[
                            "Challenge Evidence",
                            "Question Witness Credibility",
                            "Appeal to Reasonable Doubt",
                            "Present Alternative Theory",
                            "Address Specific Juror",
                            "Call for Vote"
                        ],
                        label="Your Strategy"
                    )
                    speak_btn = gr.Button("Speak", variant="primary")

                with gr.Row():
                    pass_btn = gr.Button("Pass Turn")
                    call_vote_btn = gr.Button("Call Final Vote")

            # Right: Case File
            with gr.Column(scale=1):
                gr.Markdown("### Case File")
                case_summary = gr.Markdown()

                with gr.Accordion("Evidence", open=False):
                    evidence_list = gr.HTML()

                with gr.Accordion("Witnesses", open=False):
                    witness_list = gr.HTML()

        # Audio player for Judge
        audio_output = gr.Audio(label="Judge", autoplay=True, visible=False)

    return demo


# MCP Server enabled
demo = create_ui()
demo.launch(mcp_server=True)
```

---

## LlamaIndex Case Database

### Index Structure

```python
import random

from llama_index.core import VectorStoreIndex, Document
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.vector_stores import MetadataFilters, ExactMatchFilter

class CaseDatabase:
    """LlamaIndex-powered case database."""

    def __init__(self, cases_dir: str):
        self.cases = self._load_cases(cases_dir)
        self.index = self._build_index()

    def _build_index(self) -> VectorStoreIndex:
        """Build a searchable index of all cases."""

        documents = []
        for case in self.cases:
            # Index case summary
            documents.append(Document(
                text=case.summary,
                metadata={"case_id": case.case_id, "type": "summary"}
            ))

            # Index each piece of evidence
            for evidence in case.evidence:
                documents.append(Document(
                    text=f"{evidence.type}: {evidence.description}",
                    metadata={
                        "case_id": case.case_id,
                        "type": "evidence",
                        "evidence_id": evidence.evidence_id
                    }
                ))

            # Index witness testimonies
            for witness in case.witnesses:
                documents.append(Document(
                    text=f"{witness.name} ({witness.role}): {witness.testimony_summary}",
                    metadata={
                        "case_id": case.case_id,
                        "type": "witness",
                        "witness_id": witness.witness_id
                    }
                ))

        parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
        nodes = parser.get_nodes_from_documents(documents)

        return VectorStoreIndex(nodes)

    def query_evidence(self, case_id: str, query: str) -> List[NodeWithScore]:
        """Query evidence for a specific case."""

        query_engine = self.index.as_query_engine(
            filters=MetadataFilters(
                filters=[ExactMatchFilter(key="case_id", value=case_id)]
            )
        )
        response = query_engine.query(query)
        return response.source_nodes

    def get_random_case(self, difficulty: str = None) -> CriminalCase:
        """Get a random case, optionally filtered by difficulty."""

        if difficulty:
            filtered = [c for c in self.cases if c.difficulty == difficulty]
            return random.choice(filtered)
        return random.choice(self.cases)
```

---

## Real Case Data Sources

### Primary: Old Bailey Online (Historical)

**Dataset**: 197,745 criminal trials from London's Central Criminal Court (1674-1913)

**Access**:
- Full XML download: https://orda.shef.ac.uk/articles/dataset/Old_Bailey_Online_XML_Data/4775434
- API: https://www.oldbaileyonline.org/static/API.jsp
- 2,163 trial XML files + 475 Ordinary's Accounts

**Data Fields**:
- Trial ID, date, defendant name/gender
- Offence category: theft, kill, deception, violent theft, sexual, etc.
- Verdict, punishment
- Full trial transcript text

**Why This Works**:
- Historical cases avoid sensitivity around modern defendants
- Rich narrative transcripts perfect for agent reasoning
- 18th-century language adds unique flavor
- Verdicts are known (ground truth for comparison)

**Integration Example**:

```python
import xml.etree.ElementTree as ET

def load_old_bailey_case(xml_path: str) -> CriminalCase:
    """Parse Old Bailey XML into the CriminalCase model."""
    tree = ET.parse(xml_path)
    root = tree.getroot()

    return CriminalCase(
        case_id=root.find(".//trialAccount").get("id"),
        title=f"The Crown v. {root.find('.//persName').text}",
        summary=extract_trial_text(root),
        charges=[root.find(".//offence").get("category")],
        evidence=extract_evidence_from_transcript(root),
        difficulty=infer_difficulty_from_verdict(root),
        year=int(root.find(".//date").get("year")),
        jurisdiction="London, England"
    )
```
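The `ElementTree` lookups above can be checked against a toy fragment. The structure below is a deliberately simplified assumption — real Old Bailey files nest `trialAccount` deeper inside session documents, which is why the loader uses `.//` descendant searches:

```python
import xml.etree.ElementTree as ET

# Toy Old Bailey-style fragment (real files are much richer)
xml_text = """
<trialAccount id="t17850112-1">
  <persName>John Doe</persName>
  <offence category="theft"/>
  <date year="1785"/>
</trialAccount>
"""

root = ET.fromstring(xml_text)
case_id = root.get("id")
defendant = root.find(".//persName").text
charge = root.find(".//offence").get("category")
year = int(root.find(".//date").get("year"))

print(case_id, defendant, charge, year)  # t17850112-1 John Doe theft 1785
```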

### Secondary: National Registry of Exonerations (Modern)

**Dataset**: All U.S. exonerations since 1989 (3,000+ cases)

**Access**: https://www.law.umich.edu/special/exoneration/Pages/about.aspx

**Data Fields**:
- Crime type, state, year of conviction/exoneration
- Contributing factors (eyewitness misID, false confession, etc.)
- DNA involvement, sentence served

**Why This Works**:
- Dramatic "wrongful conviction" cases
- Clear evidence of reasonable doubt
- Tests agents' ability to weigh conflicting evidence types

### Fallback: Curated YAML Cases

For demo stability, include 3-5 handcrafted cases in `cases/predefined/`:
- `case_001_robbery.yaml` - Clear guilty (baseline test)
- `case_002_murder.yaml` - Ambiguous (compelling demo)
- `case_003_exoneration.yaml` - DNA reversal scenario

This ensures the demo works even if external data sources are unavailable.

---

## File Structure

```
12_angry_agents/
├── app.py                    # Gradio entry point
├── PRD.md                    # This document
├── requirements.txt
├── .env.example

├── core/
│   ├── __init__.py
│   ├── game_state.py         # GameState, DeliberationTurn models
│   ├── orchestrator.py       # OrchestratorAgent
│   ├── conviction.py         # Conviction score mechanics
│   └── turn_manager.py       # Turn selection, stability check

├── agents/
│   ├── __init__.py
│   ├── base_juror.py         # JurorAgent base class
│   ├── judge.py              # JudgeAgent (ElevenLabs)
│   ├── player.py             # PlayerAgent (human interface)
│   └── configs/
│       └── jurors.yaml       # 11 juror configurations

├── case_db/
│   ├── __init__.py
│   ├── database.py           # CaseDatabase (LlamaIndex)
│   ├── models.py             # CriminalCase, Evidence, Witness
│   └── cases/
│       ├── case_001.yaml
│       ├── case_002.yaml
│       └── ...

├── memory/
│   ├── __init__.py
│   ├── juror_memory.py       # JurorMemory management
│   └── summarizer.py         # Memory compression

├── ui/
│   ├── __init__.py
│   ├── components.py         # Gradio components
│   ├── jury_box.py           # Jury box renderer
│   ├── chat.py               # Deliberation chat
│   └── static/
│       ├── styles.css
│       └── kinetic.js        # Text animations

├── mcp/
│   ├── __init__.py
│   └── tools.py              # MCP tool definitions

└── tests/
    ├── test_conviction.py
    ├── test_orchestrator.py
    └── test_memory.py
```

---

## Development Phases

### Phase 1: Foundation (4-6 hours)
- [ ] Project setup, dependencies
- [ ] Data models (GameState, Case, Juror)
- [ ] Basic Gradio UI skeleton
- [ ] Single juror agent working

### Phase 2: Multi-Agent (4-6 hours)
- [ ] All 11 juror configs
- [ ] Orchestrator with turn management
- [ ] Conviction score system
- [ ] Memory system (basic)

### Phase 3: Integration (3-4 hours)
- [ ] LlamaIndex case database
- [ ] ElevenLabs judge narration
- [ ] Player interaction flow
- [ ] Vote tracking and stability

### Phase 4: Polish (2-3 hours)
- [ ] UI animations (kinetic text)
- [ ] Jury box visualization
- [ ] MCP server tools
- [ ] Demo video recording
1279
+
1280
+ ---
1281
+
1282
+ ## Success Metrics
1283
+
1284
+ 1. **11 agents deliberating autonomously** - TRUE agent behavior
1285
+ 2. **Judge narrating with ElevenLabs** - Audio wow factor
1286
+ 3. **Conviction scores shifting** - Visible persuasion
1287
+ 4. **Player can influence outcome** - Agency
1288
+ 5. **MCP tools functional** - External AI can play
1289
+ 6. **Runs without crashes** - Stability
1290
+
1291
+ ---
1292
+
1293
+ ---
1294
+
1295
+ ## CRITICAL: Performance Optimizations

### The Latency Trap - SOLVED

**Problem**: If 1 speaker speaks and 11 agents react individually = 12 LLM calls per turn = SLOW

**Solution**: Batch Jury State Update

```python
class JuryStateManager:
    """
    Single LLM call to update ALL silent jurors' conviction scores.
    Replaces 11 individual react_to_argument() calls.
    """

    async def batch_update_convictions(
        self,
        argument: DeliberationTurn,
        silent_jurors: List[JurorConfig],
        juror_memories: Dict[str, JurorMemory],
        game_state: GameState
    ) -> Dict[str, ConvictionUpdate]:
        """ONE LLM call updates all 11 jurors' reactions."""

        prompt = f"""
        You are simulating how 11 different jurors would react to this argument.

        ARGUMENT BY {argument.speaker_name}:
        "{argument.content}"

        For each juror below, determine:
        1. conviction_delta: float (-0.3 to +0.3) - how much their guilt conviction changes
        2. reaction: str - brief internal thought (10 words max)
        3. persuaded: bool - did this significantly move them?

        JURORS:
        {self._format_juror_profiles_compact(silent_jurors, juror_memories)}

        Respond in JSON:
        {{
            "juror_1": {{"delta": 0.1, "reaction": "Good point about the timeline", "persuaded": false}},
            "juror_2": {{"delta": -0.2, "reaction": "Too emotional, but touching", "persuaded": true}},
            ...
        }}
        """

        response = await self.model.generate(prompt)
        return parse_batch_response(response)
```
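`parse_batch_response` is referenced above but not defined in the PRD; a minimal sketch of what it could look like, assuming the model returns the JSON shape from the prompt, with deltas clamped to the requested [-0.3, +0.3] range:

```python
import json

def parse_batch_response(raw: str) -> dict:
    """Parse the per-juror JSON reaction map, clamping deltas to the prompted range."""
    data = json.loads(raw)
    parsed = {}
    for juror_id, r in data.items():
        parsed[juror_id] = {
            "delta": max(-0.3, min(0.3, float(r["delta"]))),
            "reaction": str(r.get("reaction", "")),
            "persuaded": bool(r.get("persuaded", False)),
        }
    return parsed

raw = '{"juror_1": {"delta": 0.5, "reaction": "ok", "persuaded": true}}'
print(parse_batch_response(raw)["juror_1"]["delta"])  # 0.3 (clamped)
```

Clamping matters because a single LLM call covering 11 jurors will occasionally overshoot the requested range, and an out-of-range delta would distort the conviction scores downstream.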

**Result**: 1 speaker + 1 batch reaction = **2 LLM calls per turn** (not 12)

### Active vs Passive Jurors

```python
# Each turn, only 2-3 jurors are "active listeners" (full memory update)
# Others get simplified heuristic updates

def select_active_listeners(
    game_state: GameState,
    juror_memories: Dict[str, JurorMemory],
    recently_flipped: List[str],
    num: int = 3
) -> List[str]:
    """Select jurors who will fully process this turn."""

    # Prioritize: jurors on the fence, jurors addressed directly, random
    candidates = []

    # On the fence (conviction 0.35-0.65)
    for jid, memory in juror_memories.items():
        if 0.35 < memory.current_conviction < 0.65:
            candidates.append((jid, 2))  # Priority 2

    # Recently changed vote
    for jid in recently_flipped:
        candidates.append((jid, 3))  # Priority 3

    # Everyone else at base priority
    for jid in juror_memories:
        candidates.append((jid, 1))

    # Weight and select
    return weighted_sample(candidates, num)
```
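`weighted_sample` is also left undefined above. One possible sketch, assuming duplicate entries for a juror should stack (a fence-sitter who also just flipped gets the sum of both priorities), sampling without replacement:

```python
import random

def weighted_sample(candidates, num):
    """candidates: list of (juror_id, priority); duplicate ids sum their priority."""
    weights = {}
    for jid, priority in candidates:
        weights[jid] = weights.get(jid, 0) + priority

    chosen = []
    while weights and len(chosen) < num:
        ids = list(weights)
        pick = random.choices(ids, weights=[weights[j] for j in ids], k=1)[0]
        chosen.append(pick)
        del weights[pick]  # sample without replacement
    return chosen

random.seed(0)
# "a" appears twice (priorities 1 + 2 = 3); only two unique ids for num=2
print(sorted(weighted_sample([("a", 1), ("b", 3), ("a", 2)], 2)))  # ['a', 'b']
```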

### Context Window Bloat - SOLVED

**Problem**: `deliberation_log` grows unbounded

**Solution**: Aggressive Rolling Summarization

```python
class MemorySummarizer:
    """Compresses old deliberation history."""

    SUMMARY_INTERVAL = 5  # Summarize every 5 rounds
    KEEP_RECENT = 3       # Keep last 3 turns in full detail

    async def maybe_summarize(self, memory: JurorMemory, round_num: int):
        """Compress old turns if needed."""

        if round_num % self.SUMMARY_INTERVAL != 0:
            return

        # Split: recent (keep full) vs old (summarize)
        old_turns = memory.arguments_heard[:-self.KEEP_RECENT]
        recent_turns = memory.arguments_heard[-self.KEEP_RECENT:]

        if not old_turns:
            return

        # Summarize old turns into compact form
        summary = await self._compress_turns(old_turns)

        # Replace old turns with summary object
        memory.deliberation_summary = summary
        memory.arguments_heard = recent_turns

    async def _compress_turns(self, turns: List[ArgumentMemory]) -> str:
        """LLM call to compress multiple turns into a summary."""

        prompt = f"""
        Summarize these {len(turns)} deliberation turns into 3-5 bullet points.
        Focus on: key arguments made, who was persuasive, major position shifts.

        TURNS:
        {self._format_turns(turns)}

        Respond with bullet points only.
        """
        return await self.model.generate(prompt)


# Memory structure with summary
@dataclass
class JurorMemory:
    # ... existing fields ...

    # Compressed history (replaces old arguments_heard entries)
    deliberation_summary: str = ""  # "• Juror 3 argued about timeline..."

    # Only recent turns in full detail
    arguments_heard: List[ArgumentMemory] = field(default_factory=list)  # Max ~10 entries
```
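The split in `maybe_summarize` relies on Python's negative slicing; a quick standalone check of both the normal case and the edge case the `if not old_turns` guard handles:

```python
KEEP_RECENT = 3
turns = ["t1", "t2", "t3", "t4", "t5"]

old, recent = turns[:-KEEP_RECENT], turns[-KEEP_RECENT:]
print(old)     # ['t1', 't2']
print(recent)  # ['t3', 't4', 't5']

# With KEEP_RECENT or fewer turns, "old" is empty, so the early return
# in maybe_summarize skips the compression call entirely
short = ["t1", "t2"]
print(short[:-KEEP_RECENT])  # []
print(short[-KEEP_RECENT:])  # ['t1', 't2']
```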

### LLM Call Budget Per Round

| Action | Calls | Notes |
|--------|-------|-------|
| 1-4 speakers generate arguments | 1-4 | Parallelizable |
| Batch conviction update | 1 | All 11 reactions |
| Memory summarization | 0-1 | Every 5 rounds |
| Judge narration (ElevenLabs) | 1 | Audio only |
| **TOTAL** | **3-7** | Down from 12-48 |

---

1450
+ ## External Participant System (MCP + Human)
1451
+
1452
+ ### Architecture: Swappable Juror Seats
1453
+
1454
+ Any of the 11 AI juror seats can be replaced by:
1455
+ 1. **External AI Agent** (via MCP) - Another AI system joins as juror
1456
+ 2. **Human Player** (via UI) - Additional human joins
1457
+ 3. **Default AI** (Gemini) - Predefined personality
1458
+
1459
```python
@dataclass
class JurorSeat:
    """A seat in the jury that can be filled by different participant types."""

    seat_number: int
    participant_type: Literal["ai_default", "ai_external", "human"]
    participant_id: str | None = None

    # For AI default
    config: JurorConfig | None = None
    agent: JurorAgent | None = None

    # For external (MCP or human)
    external_connection: ExternalConnection | None = None


class JuryManager:
    """Manages the 12 jury seats with mixed participant types."""

    def __init__(self):
        self.seats: Dict[int, JurorSeat] = {}
        self._init_default_seats()

    def _init_default_seats(self):
        """Initialize all 12 seats with default AI jurors."""
        for i in range(1, 13):
            if i == 7:  # Reserved for primary player
                self.seats[i] = JurorSeat(
                    seat_number=i,
                    participant_type="human",
                    participant_id="player_1"
                )
            else:
                config = load_juror_config(i)
                self.seats[i] = JurorSeat(
                    seat_number=i,
                    participant_type="ai_default",
                    config=config,
                    agent=JurorAgent(config)
                )

    def replace_with_external(
        self,
        seat_number: int,
        participant_type: Literal["ai_external", "human"],
        participant_id: str
    ) -> bool:
        """Replace a default AI with an external participant."""

        if seat_number == 7:
            return False  # Primary player seat protected

        if seat_number not in self.seats:
            return False

        self.seats[seat_number] = JurorSeat(
            seat_number=seat_number,
            participant_type=participant_type,
            participant_id=participant_id,
            external_connection=ExternalConnection(participant_id)
        )
        return True

    def get_participant_for_turn(self, seat_number: int) -> TurnHandler:
        """Get the appropriate handler for a seat's turn."""

        seat = self.seats[seat_number]

        if seat.participant_type == "ai_default":
            return AITurnHandler(seat.agent)
        elif seat.participant_type == "ai_external":
            return MCPTurnHandler(seat.external_connection)
        else:  # human
            return HumanTurnHandler(seat.participant_id)
```
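The seat-swap rule above (seat 7 protected, everything else swappable) boils down to a few lines; a self-contained miniature using plain dicts instead of the PRD's dataclasses:

```python
# Seats keyed 1-12; seat 7 is the primary human player and cannot be swapped
seats = {i: {"type": "human" if i == 7 else "ai_default", "pid": None}
         for i in range(1, 13)}

def replace_with_external(seat: int, pid: str) -> bool:
    if seat == 7 or seat not in seats:
        return False
    seats[seat] = {"type": "ai_external", "pid": pid}
    return True

print(replace_with_external(7, "mcp-agent-1"))  # False (protected seat)
print(replace_with_external(3, "mcp-agent-1"))  # True
print(seats[3]["type"])  # ai_external
```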

### MCP Tools for External Participants

```python
# MCP Server exposes these tools for external AI agents

def mcp_join_as_juror(
    case_id: str,
    preferred_seat: int | None = None
) -> Dict:
    """
    Join an active case as a juror.

    An external AI agent can take over any non-player seat.
    Returns seat assignment and case briefing.

    Args:
        case_id: The case to join
        preferred_seat: Preferred seat number (2-6, 8-12), or None for auto-assign

    Returns:
        seat_number: Your assigned seat
        case_briefing: Summary of the case
        your_persona: Suggested personality (can ignore)
        current_state: Vote tally, round number
    """
    pass


def mcp_get_deliberation_state(case_id: str, seat_number: int) -> Dict:
    """
    Get current state of deliberation.

    Returns:
        recent_arguments: Last 5 arguments made
        vote_tally: Current guilty/not-guilty count
        your_conviction: Your current conviction score
        pending_speakers: Who speaks next
        is_your_turn: Whether you should speak now
    """
    pass


def mcp_make_argument(
    case_id: str,
    seat_number: int,
    argument_type: str,  # "evidence", "emotional", "logical", "question"
    content: str,
    target_juror: int | None = None
) -> Dict:
    """
    Make an argument during your turn.

    Returns:
        accepted: Whether the argument was processed
        reactions: Brief summary of jury reactions
        vote_changes: Any votes that flipped
    """
    pass


def mcp_cast_vote(
    case_id: str,
    seat_number: int,
    vote: Literal["guilty", "not_guilty"]
) -> Dict:
    """
    Cast or change your vote.

    Returns:
        recorded: Confirmation
        new_tally: Updated vote count
    """
    pass


def mcp_pass_turn(case_id: str, seat_number: int) -> Dict:
    """Pass your turn without speaking."""
    pass
```

### Human Join Flow (Additional Players)

```
1. Primary player starts game (seat 7)
2. Game generates shareable room code
3. Additional humans can join via:
   - URL with room code
   - Gradio UI "Join as Juror" button
4. They get assigned available seat (2-6, 8-12)
5. When it's their turn, UI prompts for input
6. They see same case file, deliberation history
```

---

## Model Configuration

### Default: Gemini Flash 2.5

```yaml
# config/models.yaml

default_model:
  provider: "gemini"
  model_id: "gemini-2.5-flash"
  temperature: 0.7
  max_tokens: 1024

# Easily swappable per-agent or globally
model_overrides:
  judge:
    provider: "gemini"
    model_id: "gemini-2.5-flash"  # Fast for narration scripts

  batch_updater:
    provider: "gemini"
    model_id: "gemini-2.5-flash"  # Handles all conviction updates

  # Individual juror overrides (optional)
  juror_5:  # The contrarian philosopher
    provider: "anthropic"
    model_id: "claude-sonnet-4-20250514"
    temperature: 0.9
```

### LiteLLM Integration

```python
from litellm import acompletion

class ModelRouter:
    """Route to any model via LiteLLM."""

    def __init__(self, config_path: str = "config/models.yaml"):
        self.config = load_yaml(config_path)
        self.default = self.config["default_model"]

    def get_model_for(self, agent_id: str) -> Dict:
        """Get the model config for a specific agent."""
        overrides = self.config.get("model_overrides", {})
        return overrides.get(agent_id, self.default)

    async def generate(
        self,
        agent_id: str,
        prompt: str,
        **kwargs
    ) -> str:
        """Generate a completion using the appropriate model."""

        config = self.get_model_for(agent_id)

        # acompletion is LiteLLM's async entry point (completion is sync)
        response = await acompletion(
            model=f"{config['provider']}/{config['model_id']}",
            messages=[{"role": "user", "content": prompt}],
            temperature=config.get("temperature", 0.7),
            max_tokens=config.get("max_tokens", 1024),
            **kwargs
        )

        return response.choices[0].message.content
```
1698
+
1699
+ ---
1700
+
1701
+ ## Case Data Architecture
1702
+
1703
+ ### Dual Source: Real + Fallback
1704
+
1705
```python
class CaseLoader:
    """Load cases from real data or fall back to predefined ones."""

    def __init__(
        self,
        real_data_path: str | None = None,
        fallback_path: str = "cases/predefined/"
    ):
        self.real_data_path = real_data_path
        self.fallback_path = fallback_path

        # Try to load real data
        self.real_cases = self._load_real_cases() if real_data_path else []
        self.fallback_cases = self._load_fallback_cases()

    def get_case(self, case_id: str = None, use_real: bool = True) -> CriminalCase:
        """Get a case, preferring real data if available."""

        if case_id:
            # Specific case requested
            return self._find_case(case_id)

        # Random case
        if use_real and self.real_cases:
            return random.choice(self.real_cases)
        return random.choice(self.fallback_cases)

    def _load_real_cases(self) -> List[CriminalCase]:
        """Load from real case database (future: LlamaIndex over court records)."""
        # TODO: Integrate with real case API/database
        # For now, returns empty - falls back to predefined
        return []

    def _load_fallback_cases(self) -> List[CriminalCase]:
        """Load predefined cases from YAML files."""
        cases = []
        for file in Path(self.fallback_path).glob("*.yaml"):
            case_data = yaml.safe_load(file.read_text())
            cases.append(CriminalCase(**case_data))
        return cases


# Future: Real case integration
class RealCaseConnector:
    """
    Connect to real case databases.
    Designed for easy integration later.
    """

    def __init__(self):
        self.sources = {
            "court_listener": CourtListenerAPI(),  # Future
            "justia": JustiaAPI(),                 # Future
            "local_files": LocalCaseFiles(),       # CSV/JSON dumps
        }

    async def search_cases(
        self,
        query: str,
        filters: Dict = None
    ) -> List[CriminalCase]:
        """Search across all connected sources."""
        pass

    async def get_case_details(
        self,
        source: str,
        case_id: str
    ) -> CriminalCase:
        """Get a full case from a specific source."""
        pass
```
1778
+
1779
+ ---
1780
+
1781
+ ## Execution Environment
1782
+
1783
+ ### Local First, Blaxel Ready
1784
+
1785
```yaml
# config/execution.yaml

execution:
  mode: "local"  # "local" | "blaxel" | "docker"

  local:
    # No sandbox, runs in process
    timeout_seconds: 30

  blaxel:
    api_key: "${BLAXEL_API_KEY}"
    sandbox_id: "12-angry-agents"
    persistent: true  # Keep sandbox warm

  docker:
    image: "12-angry-agents:latest"
    memory_limit: "2g"
```

```python
# Usage in code
class ExecutionManager:
    """Swappable execution environment."""

    def __init__(self, config_path: str = "config/execution.yaml"):
        self.config = load_yaml(config_path)
        self.mode = self.config["execution"]["mode"]

    def get_executor(self) -> Executor:
        if self.mode == "local":
            return LocalExecutor()
        elif self.mode == "blaxel":
            return BlaxelExecutor(self.config["execution"]["blaxel"])
        elif self.mode == "docker":
            return DockerExecutor(self.config["execution"]["docker"])

    async def run_agent_code(self, code: str, context: Dict) -> str:
        """Execute agent-generated code safely."""
        executor = self.get_executor()
        return await executor.run(code, context)
```
1826
+
1827
+ ---
1828
+
1829
+ ## Player Input: Strategy + Optional Free Text
1830
+
1831
```python
# Hybrid input: low-friction strategy selection + optional elaboration

ARGUMENT_STRATEGIES = [
    {
        "id": "challenge_evidence",
        "label": "Challenge Evidence",
        "prompt_hint": "Point out weaknesses in a specific piece of evidence",
        "allows_free_text": True,
    },
    {
        "id": "question_witness",
        "label": "Question Witness Credibility",
        "prompt_hint": "Raise doubts about a witness's reliability",
        "allows_free_text": True,
    },
    {
        "id": "reasonable_doubt",
        "label": "Appeal to Reasonable Doubt",
        "prompt_hint": "Emphasize the burden of proof",
        "allows_free_text": False,  # AI handles this
    },
    {
        "id": "alternative_theory",
        "label": "Present Alternative Theory",
        "prompt_hint": "Suggest what might have really happened",
        "allows_free_text": True,
    },
    {
        "id": "address_juror",
        "label": "Address Specific Juror",
        "prompt_hint": "Respond to or persuade a specific juror",
        "requires_target": True,
        "allows_free_text": True,
    },
    {
        "id": "free_argument",
        "label": "Make Custom Argument",
        "prompt_hint": "Say whatever you want",
        "allows_free_text": True,
        "requires_free_text": True,
    },
]


# UI Component
def player_input_ui():
    with gr.Row():
        strategy = gr.Dropdown(
            choices=[s["label"] for s in ARGUMENT_STRATEGIES],
            label="Your Strategy",
            value="Challenge Evidence"
        )

        target_juror = gr.Dropdown(
            choices=["None"] + [f"Juror {i}" for i in range(1, 13) if i != 7],
            label="Target (optional)",
            visible=False  # Shown only for "address_juror"
        )

        free_text = gr.Textbox(
            label="Add details (optional)",
            placeholder="e.g., 'Focus on the timeline inconsistency'",
            max_lines=2,
            visible=True
        )

    return strategy, target_juror, free_text
```
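How the selected strategy and optional free text combine into the player's turn is left implicit above. One possible sketch — `build_player_argument` is a hypothetical helper, not part of the PRD's API, but the dict keys follow the `ARGUMENT_STRATEGIES` entries:

```python
def build_player_argument(strategy, free_text="", target=None):
    """Merge a strategy choice with optional elaboration into one prompt string."""
    parts = [f"Strategy: {strategy['label']} ({strategy['prompt_hint']})"]
    if strategy.get("requires_target") and target:
        parts.append(f"Directed at: {target}")
    if strategy.get("allows_free_text") and free_text:
        parts.append(f"Player notes: {free_text}")
    return "\n".join(parts)

challenge = {
    "label": "Challenge Evidence",
    "prompt_hint": "Point out weaknesses in a specific piece of evidence",
    "allows_free_text": True,
}
out = build_player_argument(challenge, "Focus on the timeline inconsistency")
print(out)
```

Strategies with `allows_free_text: False` simply ignore any typed notes, which keeps the "Appeal to Reasonable Doubt" option fully AI-driven as described above.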
1900
+
1901
+ ---
1902
+
1903
+ ## Open Questions
1904
+
1905
+ 1. Exact ElevenLabs voice ID for judge?
1906
+ 2. Should external AI participants see other AI jurors' internal conviction scores? yes configuablein code.
1907
+ 3. Max simultaneous external participants (performance)? 12
1908
+ 4. Case difficulty selector in UI? no/ random
requirements.txt ADDED
@@ -0,0 +1,19 @@
# Core
gradio==6.0.1
pydantic>=2.0.0
pydantic-settings>=2.0.0

# LLM Providers
google-genai>=1.0.0
openai>=1.0.0

# TTS
elevenlabs>=1.0.0

# Agents
smolagents>=1.0.0

# Utilities
httpx>=0.27.0
tenacity>=8.0.0
python-dotenv>=1.0.0