sxandie committed
Commit 794fb9b · verified
1 Parent(s): 8dd7e23

Update mcp_server.py

Files changed (1)
  1. mcp_server.py +1823 -24
mcp_server.py CHANGED
@@ -1,45 +1,1844 @@
- """MCP server exposing repo analysis and Q&A tools.

- This allows any MCP-capable client (e.g., Claude Desktop, Cursor, Windsurf)
- to reuse the same backend logic that powers the Gradio UI.
  """

- from mcp.server.fastmcp import FastMCP

- from agent import analyze_github_repo, qa_on_repo, fetch_youtube_transcript

- server = FastMCP("github-doc-generator")


- @server.tool()
- def analyze_repository(repo_url: str) -> dict:
-     """Clone and inspect a GitHub repository.

-     Returns the same payload as the Gradio analyzer, including documentation
-     status, structure, and extracted doc content. Errors are surfaced as a
-     dictionary with an ``error`` key.
      """

-     return analyze_github_repo(repo_url)


- @server.tool()
- def ask_repository_question(repo_url: str, question: str) -> str:
-     """Answer a natural-language question about the given repository."""

-     return qa_on_repo(repo_url, question)


- @server.tool()
- def get_youtube_transcript(video_url: str, lang: str = "en") -> dict:
-     """Fetch a YouTube video's transcript via the RapidAPI backend.

-     This delegates to the shared `fetch_youtube_transcript` helper used by the
-     Gradio app. The response includes a raw transcript string and metadata.
      """

-     return fetch_youtube_transcript(video_url, lang=lang)


  if __name__ == "__main__":
-     server.run()
+ from datetime import datetime, timezone
+ from pathlib import Path
+ import os
+ import re
+ import shutil

+ import gradio as gr
+
+ from agent import (
+     analyze_github_repo,
+     analyze_local_repo,
+     build_experiment_from_report,
+     build_repo_vector_store,
+     fetch_youtube_transcript,
+     generate_youtube_study_notes,
+     rag_answer_from_store,
+     summarize_youtube_chapters,
+ )
+ from bookmarks import (
+     bookmark_repo_from_analysis,
+     find_metadata_by_label,
+     get_cache_dirs,
+     get_dropdown_options,
+ )
+
+
+ # File extensions and folders to ignore for local uploads
+ IGNORE_PATTERNS = {
+     'folders': {
+         '__pycache__', '.git', '.svn', '.hg', 'node_modules',
+         'venv', 'env', '.venv', '.env', 'dist', 'build',
+         '.idea', '.vscode', '.pytest_cache', '.mypy_cache',
+         'coverage', '.coverage', 'htmlcov', '.tox', 'eggs',
+         '.eggs', '*.egg-info', '.DS_Store'
+     },
+     'extensions': {
+         '.pyc', '.pyo', '.pyd', '.so', '.dll', '.dylib',
+         '.class', '.o', '.obj', '.exe', '.bin', '.lock',
+         '.log', '.tmp', '.temp', '.cache', '.bak', '.swp',
+         '.swo', '.DS_Store', '.gitignore'
+     }
+ }
+
+
+ def _should_ignore_path(path: Path) -> bool:
+     """Check if a path should be ignored during local folder processing."""
+     for part in path.parts:
+         if part in IGNORE_PATTERNS['folders']:
+             return True
+         if part.startswith('.') and part not in {'.', '..'}:
+             return True
+     if path.suffix.lower() in IGNORE_PATTERNS['extensions']:
+         return True
+     return False
+
+
+ KNOWLEDGE_TRANSFER_ROOT = Path("Knowledge Transfer")
+
+
+ def _is_gradio_v6_or_newer() -> bool:
+     """Return True if the installed Gradio major version is >= 6."""
+     version_str = getattr(gr, "__version__", "0")
+     try:
+         major = int(version_str.split(".")[0])
+         return major >= 6
+     except (ValueError, IndexError):
+         return False
+
+
+ IS_GRADIO_V6 = _is_gradio_v6_or_newer()
+
+
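The ignore filter above matches on directory names, dot-prefixed path parts, and file suffixes. A minimal sketch of the expected behaviour on a few hypothetical upload paths (run alongside the module above, which provides `_should_ignore_path`):

```python
from pathlib import Path

# Hypothetical paths from a directory upload; only real source files should survive.
print(_should_ignore_path(Path("proj/__pycache__/mod.cpython-311.pyc")))  # True: ignored folder and .pyc suffix
print(_should_ignore_path(Path("proj/.venv/lib/site.py")))                # True: dot-prefixed folder
print(_should_ignore_path(Path("proj/src/app.py")))                       # False: regular source file is kept
```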
+ def run_github_ingestion(repo_url: str):
74
+ """Analyze GitHub repository without indexing."""
75
+ repo_url = (repo_url or "").strip()
76
+ if not repo_url:
77
+ warning = "⚠️ Please paste a public GitHub repository URL to begin."
78
+ source_info = "**Source:** Not selected\n**Status:** ⏳ Pending\n**Chunks:** 0 vectors"
79
+ return warning, source_info, {
80
+ "analysis": None,
81
+ "vector_dir": "",
82
+ "vector_chunks": 0,
83
+ "summary_base": [],
84
+ }
85
+
86
+ result = analyze_github_repo(repo_url)
87
+ if "error" in result:
88
+ error_msg = f"❌ {result['error']}"
89
+ source_info = f"**Source:** {repo_url}\n**Status:** ❌ Error\n**Chunks:** 0 vectors"
90
+ return error_msg, source_info, {
91
+ "analysis": None,
92
+ "vector_dir": "",
93
+ "vector_chunks": 0,
94
+ "summary_base": [],
95
+ }
96
+
97
+ docs = result.get("documentation", [])
98
+ repo_name = result.get("repo_name", repo_url)
99
+ timestamp = datetime.now(timezone.utc).strftime("%d %b %Y, %H:%M UTC")
100
+
101
+ # Source info for preview panel
102
+ source_info = f"""**Source:** GitHub Repository
103
+ **Repository:** {repo_name}
104
+ **Status:** βœ… Analyzed
105
+ **Documents:** {len(docs)} files
106
+ **Analyzed:** {timestamp}"""
107
+
108
+ # Document preview content
109
+ preview_sections = []
110
+ for doc in docs[:5]:
111
+ content = (doc.get("content") or "").strip()
112
+ if not content:
113
+ continue
114
+ snippet = content[:600]
115
+ if len(content) > 600:
116
+ snippet = snippet.rstrip() + " ..."
117
+ preview_sections.append(
118
+ f"### πŸ“„ {doc.get('path', 'document')}\n\n{snippet}"
119
+ )
120
+
121
+ preview_content = (
122
+ "\n\n---\n\n".join(preview_sections)
123
+ if preview_sections
124
+ else "*No textual documentation snippets were found.*"
125
+ )
126
+
127
+ state_payload = {
128
+ "analysis": result,
129
+ "vector_dir": "",
130
+ "vector_chunks": 0,
131
+ "summary_base": [repo_name, str(len(docs))],
132
+ "processed_timestamp": timestamp,
133
+ "indexed": False,
134
+ }
135
+
136
+ return preview_content, source_info, state_payload
137
+
138
+
139
+ def index_github_repo(state_payload: dict | None):
140
+ """Index the analyzed GitHub repository for RAG."""
141
+ analysis_data = (state_payload or {}).get("analysis") if state_payload else None
142
+ if not analysis_data:
143
+ return "⚠️ Run analysis before indexing.", state_payload
144
+
145
+ if state_payload.get("indexed"):
146
+ return "βœ… Repository already indexed and ready for RAG queries.", state_payload
147
+
148
+ docs = analysis_data.get("documentation", [])
149
+ repo_name = analysis_data.get("repo_name", "repo")
150
+ repo_url = analysis_data.get("repo_url", "")
151
+
152
+ slug, cache_dir, cache_vector_dir = get_cache_dirs(repo_url, repo_name)
153
+ if cache_dir.exists():
154
+ shutil.rmtree(cache_dir)
155
+ cache_dir.mkdir(parents=True, exist_ok=True)
156
+
157
+ vector_chunks = 0
158
+ if docs:
159
+ _, chunk_count = build_repo_vector_store(docs, persist_path=cache_vector_dir)
160
+ vector_chunks = chunk_count
161
+ else:
162
+ return "⚠️ No documentation found to index.", state_payload
163
+
164
+ new_state = {
165
+ **state_payload,
166
+ "vector_dir": str(cache_vector_dir),
167
+ "vector_chunks": vector_chunks,
168
+ "indexed": True,
169
+ }
170
+
171
+ return f"βœ… Indexed {vector_chunks} vector chunks. Ready for RAG queries!", new_state
172
+
173
+
174
+ def bookmark_github_repo(state_payload: dict | None):
175
+ """Bookmark and index the GitHub repository permanently."""
176
+ analysis_data = (state_payload or {}).get("analysis") if state_payload else None
177
+ if not analysis_data:
178
+ return "⚠️ Run analysis before bookmarking.", state_payload, gr.Dropdown()
179
+
180
+ docs = analysis_data.get("documentation", [])
181
+ if not docs:
182
+ return "⚠️ No documentation to bookmark.", state_payload, gr.Dropdown()
183
+
184
+ repo_url = analysis_data.get("repo_url") or analysis_data.get("repo_name")
185
+
186
+ # Build vector store if not already done
187
+ vector_dir = state_payload.get("vector_dir") if state_payload else ""
188
+ if not vector_dir:
189
+ repo_name = analysis_data.get("repo_name", "repo")
190
+ slug, cache_dir, cache_vector_dir = get_cache_dirs(repo_url, repo_name)
191
+ if cache_dir.exists():
192
+ shutil.rmtree(cache_dir)
193
+ cache_dir.mkdir(parents=True, exist_ok=True)
194
+
195
+ if docs:
196
+ _, chunk_count = build_repo_vector_store(docs, persist_path=cache_vector_dir)
197
+ vector_dir = str(cache_vector_dir)
198
+ else:
199
+ chunk_count = 0
200
+ else:
201
+ chunk_count = state_payload.get("vector_chunks", 0)
202
+
203
+ metadata = bookmark_repo_from_analysis(
204
+ repo_url,
205
+ analysis_data,
206
+ prebuilt_vector_dir=Path(vector_dir) if vector_dir else None,
207
+ prebuilt_chunks=chunk_count,
208
+ )
209
+
210
+ choices, metadata_list = get_dropdown_options()
211
+ dropdown_update = gr.Dropdown(
212
+ choices=choices,
213
+ value=metadata.dropdown_label,
214
+ interactive=True,
215
+ )
216
+
217
+ new_state = {
218
+ **state_payload,
219
+ "vector_dir": vector_dir,
220
+ "vector_chunks": chunk_count,
221
+ "indexed": True,
222
+ }
223
+
224
+ return f"πŸ’Ύ Repository bookmarked on {metadata.last_pulled_display}. Access it in the Chat tab!", new_state, dropdown_update
225
+
226
+
227
+ def run_youtube_ingestion(youtube_url: str):
228
+ """Analyze YouTube video without indexing."""
229
+ youtube_url = (youtube_url or "").strip()
230
+ if not youtube_url:
231
+ warning = "⚠️ Paste a YouTube video URL to begin."
232
+ source_info = "**Source:** Not selected\n**Status:** ⏳ Pending\n**Chunks:** 0 vectors"
233
+ return warning, source_info, {
234
+ "analysis": None,
235
+ "vector_dir": "",
236
+ "vector_chunks": 0,
237
+ }
238
+
239
+ result = fetch_youtube_transcript(youtube_url)
240
+ if "error" in result:
241
+ error_msg = f"❌ {result['error']}"
242
+ source_info = f"**Source:** YouTube\n**Status:** ❌ Error\n**Chunks:** 0 vectors"
243
+ return error_msg, source_info, {
244
+ "analysis": None,
245
+ "vector_dir": "",
246
+ "vector_chunks": 0,
247
+ }
248
+
249
+ transcript = (result.get("raw_transcript") or "").strip()
250
+ if not transcript:
251
+ source_info = "**Source:** YouTube\n**Status:** ⚠️ No transcript\n**Chunks:** 0 vectors"
252
+ return "⚠️ No transcript text was returned.", source_info, {"analysis": None}
253
+
254
+ timestamp = datetime.now(timezone.utc).strftime("%d %b %Y, %H:%M UTC")
255
+ video_url = result.get("url", youtube_url)
256
+ lang = result.get("lang", "en")
257
+
258
+ # Source info for preview panel
259
+ source_info = f"""**Source:** YouTube Video
260
+ **URL:** {video_url}
261
+ **Language:** {lang}
262
+ **Status:** βœ… Analyzed
263
+ **Analyzed:** {timestamp}"""
264
+
265
+ # Generate chapter summaries
266
+ chapters = summarize_youtube_chapters(transcript, url=video_url)
267
+
268
+ # Preview content with chapters
269
+ preview_content = f"""### πŸ“Ί Video Transcript Analysis
270
+
271
+ {chapters}
272
+
273
+ ---
274
+
275
+ ### πŸ“ Transcript Preview
276
+
277
+ {transcript[:2000]}{"..." if len(transcript) > 2000 else ""}
278
  """
279
 
280
+ state_payload = {
281
+ "analysis": {
282
+ "transcript": transcript,
283
+ "url": video_url,
284
+ "lang": lang,
285
+ "chapters": chapters,
286
+ },
287
+ "vector_dir": "",
288
+ "vector_chunks": 0,
289
+ "indexed": False,
290
+ }
291
+
292
+ return preview_content, source_info, state_payload
293
+
294
+
295
+ def index_youtube_video(state_payload: dict | None):
296
+ """Index YouTube transcript for RAG."""
297
+ analysis_data = (state_payload or {}).get("analysis") if state_payload else None
298
+ if not analysis_data:
299
+ return "⚠️ Run analysis before indexing.", state_payload
300
+
301
+ if state_payload.get("indexed"):
302
+ return "βœ… Video already indexed and ready for RAG queries.", state_payload
303
+
304
+ transcript = analysis_data.get("transcript", "")
305
+ if not transcript:
306
+ return "⚠️ No transcript found to index.", state_payload
307
+
308
+ # Create pseudo-documents from transcript
309
+ docs = [{
310
+ "path": "transcript.txt",
311
+ "content": transcript,
312
+ "type": "transcript",
313
+ }]
314
+
315
+ url = analysis_data.get("url", "youtube")
316
+ slug, cache_dir, cache_vector_dir = get_cache_dirs(url, "youtube")
317
+ if cache_dir.exists():
318
+ shutil.rmtree(cache_dir)
319
+ cache_dir.mkdir(parents=True, exist_ok=True)
320
+
321
+ _, chunk_count = build_repo_vector_store(docs, persist_path=cache_vector_dir)
322
+
323
+ new_state = {
324
+ **state_payload,
325
+ "vector_dir": str(cache_vector_dir),
326
+ "vector_chunks": chunk_count,
327
+ "indexed": True,
328
+ }
329
+
330
+ return f"βœ… Indexed {chunk_count} transcript chunks. Ready for RAG queries!", new_state
331
+
332
 
333
+ def bookmark_youtube_video(state_payload: dict | None):
334
+ """Bookmark YouTube video - persists transcript to bookmarks system."""
335
+ analysis_data = (state_payload or {}).get("analysis") if state_payload else None
336
+ if not analysis_data:
337
+ return "⚠️ Run analysis before bookmarking.", state_payload, gr.Dropdown()
338
 
339
+ transcript = analysis_data.get("transcript", "")
340
+ if not transcript:
341
+ return "⚠️ No transcript found to bookmark.", state_payload, gr.Dropdown()
342
 
343
+ video_url = analysis_data.get("url", "youtube-video")
344
+ chapters = analysis_data.get("chapters", "")
345
+
346
+ # Create pseudo-analysis structure compatible with bookmark_repo_from_analysis
347
+ pseudo_analysis = {
348
+ "repo_name": f"YouTube: {video_url[:50]}",
349
+ "repo_url": video_url,
350
+ "documentation": [
351
+ {
352
+ "path": "transcript.txt",
353
+ "content": transcript,
354
+ },
355
+ {
356
+ "path": "chapters.md",
357
+ "content": chapters,
358
+ }
359
+ ],
360
+ }
361
+
362
+ # Use prebuilt vector store if already indexed
363
+ prebuilt_dir = None
364
+ prebuilt_chunks = None
365
+ if state_payload.get("indexed") and state_payload.get("vector_dir"):
366
+ prebuilt_dir = Path(state_payload["vector_dir"])
367
+ prebuilt_chunks = state_payload.get("vector_chunks", 0)
368
+
369
+ metadata = bookmark_repo_from_analysis(
370
+ video_url,
371
+ pseudo_analysis,
372
+ prebuilt_vector_dir=prebuilt_dir,
373
+ prebuilt_chunks=prebuilt_chunks,
374
+ )
375
+
376
+ choices, _ = get_dropdown_options()
377
+ dropdown_update = gr.Dropdown(choices=choices, value=metadata.dropdown_label)
378
+
379
+ new_state = {
380
+ **state_payload,
381
+ "vector_dir": metadata.vector_dir,
382
+ "vector_chunks": metadata.vector_chunks,
383
+ "indexed": True,
384
+ "bookmarked": True,
385
+ }
386
+
387
+ return f"πŸ”– YouTube video bookmarked! {metadata.vector_chunks} chunks indexed.", new_state, dropdown_update
388
 
 
 
 
389
 
390
+ def generate_youtube_transfer_report(youtube_url: str):
391
+ youtube_url = (youtube_url or "").strip()
392
+ if not youtube_url:
393
+ return "⚠️ Paste a YouTube video URL before generating a report."
394
+
395
+ result = fetch_youtube_transcript(youtube_url)
396
+ if "error" in result:
397
+ return f"❌ {result['error']}"
398
+
399
+ transcript = (result.get("raw_transcript") or "").strip()
400
+ if not transcript:
401
+ return "No transcript text was returned by the youtube-transcript MCP server; report generation was skipped."
402
+
403
+ chapters = summarize_youtube_chapters(transcript, url=result.get("url", youtube_url))
404
+ study_notes = generate_youtube_study_notes(chapters, url=result.get("url", youtube_url))
405
+
406
+ generated_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S %Z")
407
+ header_lines = [
408
+ "# YouTube Knowledge Transfer Report",
409
+ "",
410
+ f"- Source: {result.get('url', youtube_url)}",
411
+ f"- Language: {result.get('lang', 'en')}",
412
+ f"- Generated at: {generated_at}",
413
+ ]
414
+
415
+ lines: list[str] = []
416
+ lines.extend(header_lines)
417
+ lines.append("")
418
+ lines.append("## 1. Topic & Chapter Outline")
419
+ lines.append("")
420
+ lines.append(chapters)
421
+ lines.append("")
422
+ lines.append("## 2. Study & Interview Guidance")
423
+ lines.append("")
424
+ lines.append(study_notes)
425
+
426
+ report_markdown = "\n".join(lines)
427
+
428
+ root = _ensure_knowledge_root()
429
+ youtube_root = root / "Youtube Video"
430
+ youtube_root.mkdir(parents=True, exist_ok=True)
431
+ slug = _slugify_name(result.get("url", youtube_url))
432
+ ts = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
433
+ report_path = youtube_root / f"{slug}-youtube-knowledge-{ts}.md"
434
+ report_path.write_text(report_markdown, encoding="utf-8")
435
+
436
+ rel_path = report_path.relative_to(Path("."))
437
+ return f"πŸ“„ YouTube knowledge transfer report written to `{rel_path}`."
438
+
439
+
440
+ def _ensure_knowledge_root() -> Path:
441
+ KNOWLEDGE_TRANSFER_ROOT.mkdir(parents=True, exist_ok=True)
442
+ return KNOWLEDGE_TRANSFER_ROOT
443
+
444
+
445
+ def _slugify_name(name: str) -> str:
446
+ base = (name or "project").lower()
447
+ safe = re.sub(r"[^a-z0-9-]+", "-", base).strip("-")
448
+ return safe or "project"
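Report filenames further down are derived from this slug. A small sketch of what the regex produces (example inputs are made up, and `_slugify_name` from above is assumed to be in scope):

```python
# Non-alphanumeric runs collapse to single dashes; empty input falls back to "project".
print(_slugify_name("My Repo!"))                      # -> "my-repo"
print(_slugify_name("https://github.com/user/repo"))  # -> "https-github-com-user-repo"
print(_slugify_name(""))                              # -> "project"
```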
449
+
450
+
451
+ def generate_knowledge_transfer_report(state_payload: dict | None):
452
+ analysis_data = (state_payload or {}).get("analysis") if state_payload else None
453
+ if not analysis_data:
454
+ return "⚠️ Run an analysis before generating a report."
455
+
456
+ repo_name = analysis_data.get("repo_name") or "Project"
457
+ repo_url = analysis_data.get("repo_url") or "local upload"
458
+ docs = analysis_data.get("documentation") or []
459
+ doc_count = len(docs)
460
+ structure = analysis_data.get("structure") or []
461
+
462
+ vector_dir = state_payload.get("vector_dir") if state_payload else ""
463
+ vector_chunks = state_payload.get("vector_chunks", 0) if state_payload else 0
464
+ summary_base = state_payload.get("summary_base", []) if state_payload else []
465
+ processed_timestamp = state_payload.get("processed_timestamp") if state_payload else None
466
+
467
+ generated_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S %Z")
468
+ header_lines = [
469
+ f"# Knowledge Transfer Report – {repo_name}",
470
+ "",
471
+ f"- Source: {repo_url}",
472
+ f"- Generated at: {generated_at}",
473
+ f"- Documentation files: {doc_count}",
474
+ f"- Vector chunks: {vector_chunks}",
475
+ ]
476
+ if processed_timestamp:
477
+ header_lines.append(f"- Last analysis run: {processed_timestamp}")
478
+
479
+ overview_section = "\n".join(summary_base) if summary_base else "No high-level summary was captured during analysis."
480
+
481
+ llm_summary_section = ""
482
+ if vector_dir and vector_chunks:
483
+ try:
484
+ question = (
485
+ "Provide a detailed knowledge transfer summary of this repository. "
486
+ "Explain its purpose, main components, architecture, key dependencies, "
487
+ "and patterns that would be reusable in other projects. "
488
+ "Focus on actionable insights and how to extend or adapt this codebase."
489
+ )
490
+ llm_summary_section = rag_answer_from_store(Path(vector_dir), question, repo_summary=overview_section)
491
+ except Exception as err:
492
+ llm_summary_section = f"LLM summary unavailable due to error: {err}"
493
+ else:
494
+ llm_summary_section = "Vector store not available; running RAG-based summary was skipped."
495
+
496
+ max_items = 80
497
+ structure_snippet = "\n".join(structure[:max_items]) if structure else "No repository structure information was captured."
498
+ doc_paths = [d.get("path", "") for d in docs][:max_items]
499
+ docs_list_section = "\n".join(f"- {p}" for p in doc_paths) if doc_paths else "No documentation files were detected."
500
+
501
+ lines: list[str] = []
502
+ lines.extend(header_lines)
503
+ lines.append("")
504
+ lines.append("## 1. High-level Overview")
505
+ lines.append("")
506
+ lines.append(overview_section)
507
+ lines.append("")
508
+ lines.append("## 2. Repository Layout (snapshot)")
509
+ lines.append("")
510
+ lines.append("```")
511
+ lines.append(structure_snippet)
512
+ lines.append("```")
513
+ lines.append("")
514
+ lines.append("## 3. Documentation Files")
515
+ lines.append("")
516
+ lines.append(docs_list_section)
517
+ lines.append("")
518
+ lines.append("## 4. LLM Knowledge Summary")
519
+ lines.append("")
520
+ lines.append(llm_summary_section)
521
+ lines.append("")
522
+ lines.append("## 5. Notes for Future Reuse")
523
+ lines.append("")
524
+ lines.append(
525
+ "Use this report as a starting point when designing new projects. "
526
+ "Focus on reusing architecture patterns, utility modules, and any "
527
+ "documented best practices or workflows."
528
+ )
529
+
530
+ report_markdown = "\n".join(lines)
531
+
532
+ root = _ensure_knowledge_root()
533
+ slug = _slugify_name(repo_name)
534
+ ts = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
535
+ report_path = root / f"{slug}-knowledge-transfer-{ts}.md"
536
+ report_path.write_text(report_markdown, encoding="utf-8")
537
+
538
+ rel_path = report_path.relative_to(Path("."))
539
+ return f"πŸ“„ Knowledge transfer report written to `{rel_path}`."
540
+
541
+
542
+ def _list_knowledge_report_choices() -> list[str]:
543
+ root = _ensure_knowledge_root()
544
+ reports = sorted(root.glob("*.md"))
545
+ return [report.name for report in reports]
546
+
547
+
548
+ def _refresh_lab_reports_dropdown():
549
+ return gr.Dropdown(
550
+ label="Knowledge Transfer report (optional)",
551
+ choices=_list_knowledge_report_choices(),
552
+ value=None,
553
+ interactive=True,
554
+ )
555
+
556
+
557
+ def load_study_deck_from_report(report_name: str | None) -> str:
558
+ if not report_name:
559
+ return "Select a Knowledge Transfer report above to view its study deck-style summary."
560
+ root = _ensure_knowledge_root()
561
+ report_path = root / report_name
562
+ if not report_path.exists():
563
+ return f"Report `{report_name}` was not found in the Knowledge Transfer folder."
564
+ try:
565
+ raw = report_path.read_text(encoding="utf-8", errors="ignore")
566
+ except OSError as err:
567
+ return f"Unable to read report `{report_name}`: {err}"
568
+
569
+ max_chars = 6000
570
+ snippet = raw[:max_chars]
571
+ if len(raw) > max_chars:
572
+ snippet = snippet.rstrip() + "\n\n... (truncated)"
573
+
574
+ return (
575
+ f"### Study Deck Β· {report_name}\n\n"
576
+ "Scroll through this condensed report to refresh yourself on the key concepts, "
577
+ "architecture, and reusable patterns for this project.\n\n"
578
+ f"```markdown\n{snippet}\n```"
579
+ )
580
+
581
+
582
+ def _derive_local_repo_root(uploaded: str | list[str] | None) -> Path | None:
583
+ """Given a directory-style file upload, infer the repository root directory.
584
+
585
+ Gradio's File component with ``file_count="directory"`` returns a list of
586
+ filepaths under the uploaded folder (or a single filepath). We compute the
587
+ common parent directory and treat that as the repo root.
588
  """
589
+ if not uploaded:
590
+ return None
591
+ if isinstance(uploaded, str):
592
+ paths = [uploaded]
593
+ else:
594
+ paths = [p for p in uploaded if p]
595
+ if not paths:
596
+ return None
597
+ try:
598
+ common = os.path.commonpath(paths)
599
+ except ValueError:
600
+ return None
601
+ root = Path(common)
602
+ return root if root.exists() and root.is_dir() else None
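The common-parent inference described in the docstring boils down to `os.path.commonpath`; a hedged sketch with made-up upload paths:

```python
import os
from pathlib import Path

# Two files returned by a hypothetical directory upload; their common parent becomes the repo root.
uploaded = [
    "/tmp/gradio/upload/myproj/README.md",
    "/tmp/gradio/upload/myproj/src/app.py",
]
root = Path(os.path.commonpath(uploaded))
print(root)  # -> /tmp/gradio/upload/myproj (used only if the directory actually exists)
```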
603
+
604
+
605
+ def run_local_repo_ingestion(uploaded_folder):
606
+ """Analyze local repository folder, filtering irrelevant files."""
607
+ repo_root = _derive_local_repo_root(uploaded_folder)
608
+ if not repo_root:
609
+ warning = "⚠️ Upload a project folder before running analysis."
610
+ source_info = "**Source:** Not selected\n**Status:** ⏳ Pending\n**Chunks:** 0 vectors"
611
+ return warning, source_info, {
612
+ "analysis": None,
613
+ "vector_dir": "",
614
+ "vector_chunks": 0,
615
+ "summary_base": [],
616
+ }
617
+
618
+ # Filter out irrelevant files before analysis
619
+ if isinstance(uploaded_folder, list):
620
+ filtered_files = [f for f in uploaded_folder if not _should_ignore_path(Path(f))]
621
+ if not filtered_files:
622
+ warning = "⚠️ No relevant files found after filtering."
623
+ source_info = "**Source:** Local\n**Status:** ⚠️ No files\n**Chunks:** 0 vectors"
624
+ return warning, source_info, {
625
+ "analysis": None,
626
+ "vector_dir": "",
627
+ "vector_chunks": 0,
628
+ "summary_base": [],
629
+ }
630
+
631
+ result = analyze_local_repo(str(repo_root))
632
+ if "error" in result:
633
+ error_msg = f"❌ {result['error']}"
634
+ source_info = f"**Source:** Local\n**Status:** ❌ Error\n**Chunks:** 0 vectors"
635
+ return error_msg, source_info, {
636
+ "analysis": None,
637
+ "vector_dir": "",
638
+ "vector_chunks": 0,
639
+ "summary_base": [],
640
+ }
641
+
642
+ docs = result.get("documentation", [])
643
+ repo_name = result.get("repo_name", repo_root.name)
644
+ timestamp = datetime.now(timezone.utc).strftime("%d %b %Y, %H:%M UTC")
645
+
646
+ # Source info for preview panel
647
+ source_info = f"""**Source:** Local Project
648
+ **Folder:** {repo_name}
649
+ **Status:** βœ… Analyzed
650
+ **Documents:** {len(docs)} files
651
+ **Analyzed:** {timestamp}"""
652
+
653
+ # Document preview content
654
+ preview_sections = []
655
+ for doc in docs[:5]:
656
+ content = (doc.get("content") or "").strip()
657
+ if not content:
658
+ continue
659
+ snippet = content[:600]
660
+ if len(content) > 600:
661
+ snippet = snippet.rstrip() + " ..."
662
+ preview_sections.append(
663
+ f"### πŸ“„ {doc.get('path', 'document')}\n\n{snippet}"
664
+ )
665
+
666
+ preview_content = (
667
+ "\n\n---\n\n".join(preview_sections)
668
+ if preview_sections
669
+ else "*No textual documentation snippets were found.*"
670
+ )
671
+
672
+ state_payload = {
673
+ "analysis": result,
674
+ "vector_dir": "",
675
+ "vector_chunks": 0,
676
+ "summary_base": [repo_name, str(len(docs))],
677
+ "processed_timestamp": timestamp,
678
+ "indexed": False,
679
+ }
680
+
681
+ return preview_content, source_info, state_payload
682
+
683
+
684
+ def index_local_repo(state_payload: dict | None):
685
+ """Index the analyzed local repository for RAG."""
686
+ return index_github_repo(state_payload) # Same logic
687
+
688
+
689
+ def bookmark_local_repo(state_payload: dict | None):
690
+ """Bookmark and index the local repository permanently."""
691
+ return bookmark_github_repo(state_payload) # Same logic
692
+
693
+
694
+ def _format_bookmark_info(metadata: dict | None) -> str:
695
+ if not metadata:
696
+ return (
697
+ "No bookmarks yet. Process a repository in the *Process New Repository* tab, then bookmark it to enable RAG chat."
698
+ )
699
+ preview = (metadata.get("summary_preview") or "").strip()
700
+ if preview:
701
+ max_len = 600
702
+ if len(preview) > max_len:
703
+ preview_display = preview[:max_len].rstrip() + " ..."
704
+ else:
705
+ preview_display = preview
706
+ return (
707
+ f"### {metadata.get('repo_name', 'Saved Repository')}\n"
708
+ f"- URL: {metadata.get('repo_url', 'N/A')}\n"
709
+ f"- Last pulled: {metadata.get('last_pulled_display', '--/--/----')}\n"
710
+ f"- Documentation files: {metadata.get('docs_count', 0)}\n"
711
+ f"- Vector chunks: {metadata.get('vector_chunks', 0)}\n\n"
712
+ f"**Preview:**\n\n{preview_display}"
713
+ )
714
+ return (
715
+ f"### {metadata.get('repo_name', 'Saved Repository')}\n"
716
+ f"- URL: {metadata.get('repo_url', 'N/A')}\n"
717
+ f"- Last pulled: {metadata.get('last_pulled_display', '--/--/----')}\n"
718
+ f"- Documentation files: {metadata.get('docs_count', 0)}\n"
719
+ f"- Vector chunks: {metadata.get('vector_chunks', 0)}"
720
+ )
721
+
722
+
723
+ def _refresh_bookmarks(preselect: str | None = None):
724
+ choices, metadata_list = get_dropdown_options()
725
+ value = preselect if preselect and preselect in choices else (choices[0] if choices else None)
726
+ dropdown_update = gr.Dropdown(
727
+ choices=choices,
728
+ value=value,
729
+ interactive=bool(choices),
730
+ label="Bookmarked repositories",
731
+ allow_custom_value=True,
732
+ )
733
+ info = _format_bookmark_info(
734
+ find_metadata_by_label(value, metadata_list) if value else None
735
+ )
736
+ return dropdown_update, metadata_list, info
737
+
738
+
739
+ def load_bookmarks_on_start():
740
+ dropdown_update, metadata_list, info = _refresh_bookmarks()
741
+ status = "Bookmarks loaded." if metadata_list else "No bookmarks saved yet."
742
+ return dropdown_update, metadata_list, info, status
743
+
744
+
745
+ def _build_summary_from_base(base_lines: list[str], final_message: str) -> str:
746
+ if not base_lines:
747
+ return final_message
748
+ return "\n".join(base_lines + ["", final_message])
749
+
750
+
751
+ def bookmark_current_repo(state_payload: dict | None):
752
+ analysis_data = (state_payload or {}).get("analysis") if state_payload else None
753
+ if not analysis_data or not analysis_data.get("documentation"):
754
+ return (
755
+ "⚠️ Run an analysis before bookmarking a repository.",
756
+ gr.Dropdown(choices=[], value=None, interactive=False, label="Bookmarked repositories"),
757
+ [],
758
+ _format_bookmark_info(None),
759
+ _build_summary_from_base(
760
+ (state_payload or {}).get("summary_base", []),
761
+ "⚠️ Bookmark failed because no analysis data is available.",
762
+ ),
763
+ state_payload,
764
+ )
765
+
766
+ repo_url = analysis_data.get("repo_url") or analysis_data.get("repo_name")
767
+ vector_dir = state_payload.get("vector_dir") if state_payload else ""
768
+ metadata = bookmark_repo_from_analysis(
769
+ repo_url,
770
+ analysis_data,
771
+ prebuilt_vector_dir=Path(vector_dir) if vector_dir else None,
772
+ prebuilt_chunks=state_payload.get("vector_chunks") if state_payload else None,
773
+ )
774
+ dropdown_update, metadata_list, info = _refresh_bookmarks(preselect=metadata.dropdown_label)
775
+ saved_msg = (
776
+ f"πŸ’Ύ Repo saved on {metadata.last_pulled_display}. Access it via the Bookmarked tab for RAG chat."
777
+ )
778
+ updated_summary = _build_summary_from_base(
779
+ state_payload.get("summary_base", []),
780
+ saved_msg,
781
+ )
782
+ new_state = {
783
+ **(state_payload or {}),
784
+ "summary_base": state_payload.get("summary_base", []),
785
+ "saved": True,
786
+ }
787
+ return saved_msg, dropdown_update, metadata_list, info, updated_summary, new_state
788
+
789
 
790
+ def update_selected_bookmark(label: str, metadata_list: list[dict]):
791
+ metadata = find_metadata_by_label(label, metadata_list or []) if label else None
792
+ return _format_bookmark_info(metadata)
793
 
794
 
795
+ def answer_bookmark_question(label: str, question: str, metadata_list: list[dict]):
796
+ if not label:
797
+ return "Select a bookmarked repository before asking a question."
798
+ if not question.strip():
799
+ return "Enter a question to query your bookmarked repository."
800
 
801
+ metadata = find_metadata_by_label(label, metadata_list or [])
802
+ if not metadata:
803
+ return "Bookmark metadata not found. Try refreshing bookmarks."
804
+ if metadata.get("vector_chunks", 0) == 0:
805
+ return "This bookmark has no vector store yet. Re-bookmark the repo to rebuild embeddings."
806
 
807
+ summary = (
808
+ f"Repository: {metadata.get('repo_name', label)}\n"
809
+ f"Docs: {metadata.get('docs_count', 0)} | Last pulled: {metadata.get('last_pulled_display', '--/--/----')}"
810
+ )
811
+ answer = rag_answer_from_store(Path(metadata["vector_dir"]), question, repo_summary=summary)
812
+ return answer
813
 
 
 
 
814
 
815
+ def placeholder_action_message(label: str):
816
+ if not label:
817
+ return "Select a bookmarked repository to use this action."
818
+ return f"Additional bookmark actions for **{label}** are coming soon."
819
+
820
+
821
+ def run_experimental_lab(intention: str, report_name: str | None):
822
+ text = (intention or "").strip()
823
+ if not text:
824
+ return "Describe what you want to build in the Experimental Lab to get started."
825
+
826
+ root = _ensure_knowledge_root()
827
+ report_markdown = ""
828
+ context_note = ""
829
+ if report_name:
830
+ report_path = root / report_name
831
+ if report_path.exists():
832
+ try:
833
+ raw = report_path.read_text(encoding="utf-8", errors="ignore")
834
+ report_markdown = raw
835
+ snippet = raw[:3000]
836
+ context_note = (
837
+ f"Using Knowledge Transfer report: `{report_name}` as reference.\n\n"
838
+ f"Snippet from report (truncated):\n\n```markdown\n{snippet}\n```\n\n"
839
+ )
840
+ except OSError:
841
+ context_note = f"Unable to read Knowledge Transfer report `{report_name}`. Proceeding without embedded context.\n\n"
842
+
843
+ build_result = build_experiment_from_report(text, report_markdown)
844
+ code = build_result.get("code", "")
845
+ stdout = build_result.get("stdout", "")
846
+ error = build_result.get("error", "")
847
+
848
+ base_intro = (
849
+ "Experimental Lab is a sandbox where future versions of MonkeyMind will use "
850
+ "Knowledge Transfer reports as context to plan and build small Gradio apps.\n\n"
851
+ )
852
+
853
+ code_section = "### Generated experiment code\n\n"
854
+ if code:
855
+ code_section += f"```python\n{code}\n```\n\n"
856
+ else:
857
+ code_section += "No code was generated.\n\n"
858
+
859
+ results_section = "### Sandbox output\n\n"
860
+ if stdout:
861
+ results_section += f"**Stdout / logs:**\n\n```text\n{stdout}\n```\n\n"
862
+ if error:
863
+ results_section += f"**Error:**\n\n```text\n{error}\n```\n\n"
864
+ if not stdout and not error:
865
+ results_section += "No errors encountered during sandbox test.\n\n"
866
+
867
+ return (
868
+ base_intro
869
+ + context_note
870
+ + f"You wrote:\n\n> {text}\n\n"
871
+ + code_section
872
+ + results_section
873
+ )
874
+
875
+
876
+ def lab_fix_bugs(intention: str, report_name: str | None):
877
+ base = run_experimental_lab(intention, report_name)
878
+ return (
879
+ base
880
+ + "\n\n---\n\n_This Fix bugs action will eventually trigger another build iteration to resolve errors in the generated app. "
881
+ "For now, it simply records another planning pass based on your intention and chosen report._"
882
+ )
883
+
884
+
885
+ def lab_mark_happy(intention: str, report_name: str | None):
886
+ text = (intention or "").strip()
887
+ return (
888
+ "Marking this experiment as complete.\n\n"
889
+ f"Final intention:\n\n> {text or 'N/A'}\n\n"
890
+ "You can now export or reuse this idea elsewhere. Future versions will attach concrete code artifacts here."
891
+ )
892
+
893
+
894
+ def lab_export_project(intention: str, report_name: str | None):
895
+ text = (intention or "").strip()
896
+ return (
897
+ "Export placeholder: a future version will bundle generated code, configuration, and a short README "
898
+ "into a downloadable package.\n\n"
899
+ f"Current experiment description:\n\n> {text or 'N/A'}\n\n"
900
+ f"Reference report: `{report_name or 'none selected'}`."
901
+ )
902
+
903
+
904
+ def answer_chat_question(question: str, github_state, local_state, youtube_state, selected_bookmark, metadata_list):
905
+ """Answer questions using RAG from any indexed source or bookmark."""
906
+ if not question.strip():
907
+ return "Please enter a question."
908
+
909
+ # Check if using a bookmark
910
+ if selected_bookmark:
911
+ metadata = find_metadata_by_label(selected_bookmark, metadata_list or [])
912
+ if metadata and metadata.get("vector_chunks", 0) > 0:
913
+ summary = (
914
+ f"Repository: {metadata.get('repo_name', selected_bookmark)}\n"
915
+ f"Last pulled: {metadata.get('last_pulled_display', '--/--/----')}"
916
+ )
917
+ answer = rag_answer_from_store(Path(metadata["vector_dir"]), question, repo_summary=summary)
918
+ return f"**[Bookmark: {selected_bookmark}]**\n\n{answer}"
919
+
920
+ # Check current session sources
921
+ for state, label in [
922
+ (github_state, "GitHub"),
923
+ (local_state, "Local"),
924
+ (youtube_state, "YouTube"),
925
+ ]:
926
+ if state and state.get("indexed") and state.get("vector_dir"):
927
+ vector_dir = Path(state["vector_dir"])
928
+ if vector_dir.exists():
929
+ answer = rag_answer_from_store(vector_dir, question)
930
+ return f"**[{label} Source]**\n\n{answer}"
931
+
932
+ return "⚠️ No indexed sources available. Please index a repository or select a bookmark first."
933
+
934
+
935
+ def generate_and_download_report(state_payload: dict | None, source_type: str):
936
+ """Generate markdown report and return file path for download."""
937
+ analysis_data = (state_payload or {}).get("analysis") if state_payload else None
938
+ if not analysis_data:
939
+ return None
940
+
941
+ if source_type == "youtube":
942
+ transcript = analysis_data.get("transcript", "")
943
+ chapters = analysis_data.get("chapters", "")
944
+ url = analysis_data.get("url", "youtube")
945
+ lang = analysis_data.get("lang", "en")
946
+
947
+ study_notes = generate_youtube_study_notes(chapters, url=url)
948
+
949
+ generated_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S %Z")
950
+ lines = [
951
+ "# YouTube Knowledge Transfer Report",
952
+ "",
953
+ f"- Source: {url}",
954
+ f"- Language: {lang}",
955
+ f"- Generated at: {generated_at}",
956
+ "",
957
+ "## 1. Chapter Outline",
958
+ "",
959
+ chapters,
960
+ "",
961
+ "## 2. Study Notes",
962
+ "",
963
+ study_notes,
964
+ ]
965
+ else:
966
+ # GitHub or Local repo
967
+ repo_name = analysis_data.get("repo_name", "Project")
968
+ repo_url = analysis_data.get("repo_url", "local")
969
+ docs = analysis_data.get("documentation", [])
970
+
971
+ generated_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S %Z")
972
+ lines = [
973
+ f"# Knowledge Transfer Report – {repo_name}",
974
+ "",
975
+ f"- Source: {repo_url}",
976
+ f"- Generated at: {generated_at}",
977
+ f"- Documentation files: {len(docs)}",
978
+ "",
979
+ "## Documentation Files",
980
+ "",
981
+ ]
982
+ for doc in docs[:50]:
983
+ lines.append(f"- {doc.get('path', 'unknown')}")
984
+
985
+ report_markdown = "\n".join(lines)
986
+
987
+ root = _ensure_knowledge_root()
988
+ slug = _slugify_name(analysis_data.get("repo_name", "project"))
989
+ ts = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
990
+ report_path = root / f"{slug}-report-{ts}.md"
991
+ report_path.write_text(report_markdown, encoding="utf-8")
992
+
993
+ return str(report_path)
994
+
995
+
996
+ def refresh_bookmarks_dropdown():
997
+ """Refresh the bookmarks dropdown."""
998
+ choices, metadata_list = get_dropdown_options()
999
+ return gr.Dropdown(choices=choices, value=None, interactive=True), metadata_list
1000
+
1001
+
1002
+ def build_interface() -> tuple[gr.Blocks, gr.Theme | None, str | None]:
1003
+ """Build the Gradio interface with improved UI/UX inspired by modern dashboard design."""
1004
+
1005
+ custom_css = """
1006
+ @import url('https://fonts.googleapis.com/css2?family=Outfit:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap');
1007
+
1008
+ :root {
1009
+ --primary: #34d399; /* Mint Green */
1010
+ --primary-glow: rgba(52, 211, 153, 0.4);
1011
+ --glass-bg: rgba(15, 23, 42, 0.6);
1012
+ --glass-border: rgba(255, 255, 255, 0.08);
1013
+ --text-main: #f8fafc;
1014
+ --text-muted: #94a3b8;
1015
+ }
1016
+
1017
+ body {
1018
+ background-color: #0f172a;
1019
+ color: var(--text-main);
1020
+ font-family: 'Outfit', sans-serif !important;
1021
+ }
1022
+
1023
+ /* Global container override */
1024
+ .gradio-container {
1025
+ max-width: 1400px !important;
1026
+ background: #0f172a !important;
1027
+ background-image:
1028
+ radial-gradient(circle at 0% 0%, rgba(52, 211, 153, 0.15) 0%, transparent 50%),
1029
+ radial-gradient(circle at 100% 100%, rgba(16, 185, 129, 0.1) 0%, transparent 50%) !important;
1030
+ border: none !important;
1031
+ }
1032
+
1033
+ /* Header styling */
1034
+ .header-container {
1035
+ background: rgba(15, 23, 42, 0.8);
1036
+ backdrop-filter: blur(12px);
1037
+ border-bottom: 1px solid var(--glass-border);
1038
+ padding: 20px 24px;
1039
+ margin: -16px -16px 24px -16px;
1040
+ border-radius: 0;
1041
+ }
1042
+
1043
+ /* Card/Panel styling */
1044
+ .source-card, .gradio-group, .tabs, .tabitem, .box-container {
1045
+ background: var(--glass-bg) !important;
1046
+ backdrop-filter: blur(12px);
1047
+ border: 1px solid var(--glass-border) !important;
1048
+ border-radius: 16px !important;
1049
+ box-shadow: 0 4px 20px rgba(0, 0, 0, 0.2);
1050
+ padding: 20px !important;
1051
+ margin-bottom: 20px !important;
1052
+ }
1053
+
1054
+ /* Inputs and Textareas */
1055
+ input, textarea, .gr-input, .gr-box, .dropdown-wrap {
1056
+ background-color: rgba(30, 41, 59, 0.6) !important;
1057
+ border: 1px solid rgba(255, 255, 255, 0.1) !important;
1058
+ color: var(--text-main) !important;
1059
+ border-radius: 10px !important;
1060
+ }
1061
+
1062
+ input:focus, textarea:focus {
1063
+ border-color: var(--primary) !important;
1064
+ box-shadow: 0 0 0 2px var(--primary-glow) !important;
1065
+ }
1066
+
1067
+ /* Buttons */
1068
+ button.primary {
1069
+ background: linear-gradient(135deg, #34d399 0%, #10b981 100%) !important;
1070
+ color: #0f172a !important;
1071
+ font-weight: 600 !important;
1072
+ border: none !important;
1073
+ box-shadow: 0 4px 15px rgba(52, 211, 153, 0.3) !important;
1074
+ }
1075
+ button.secondary {
1076
+ background: rgba(30, 41, 59, 0.8) !important;
1077
+ border: 1px solid rgba(255, 255, 255, 0.1) !important;
1078
+ color: var(--text-muted) !important;
1079
+ }
1080
+ button.secondary:hover {
1081
+ color: var(--text-main) !important;
1082
+ border-color: var(--primary) !important;
1083
+ }
1084
+ button.stop {
1085
+ background: linear-gradient(135deg, #f87171 0%, #ef4444 100%) !important;
1086
+ color: white !important;
1087
+ }
1088
+
1089
+ /* Status indicators */
1090
+ .status-ready {
1091
+ background: rgba(16, 185, 129, 0.1);
1092
+ color: #34d399;
1093
+ padding: 6px 12px;
1094
+ border-radius: 20px;
1095
+ font-size: 12px;
1096
+ border: 1px solid rgba(16, 185, 129, 0.2);
1097
+ }
1098
+
1099
+ .status-pending {
1100
+ background: rgba(251, 191, 36, 0.1);
1101
+ color: #fbbf24;
1102
+ padding: 6px 12px;
1103
+ border-radius: 20px;
1104
+ font-size: 12px;
1105
+ border: 1px solid rgba(251, 191, 36, 0.2);
1106
+ }
1107
+
1108
+ /* Keep copy/like buttons always visible and styled */
1109
+ .message-buttons {
1110
+ opacity: 1 !important;
1111
+ display: flex !important;
1112
+ gap: 4px !important;
1113
+ }
1114
+ .message-buttons button {
1115
+ color: #94a3b8 !important; /* text-muted */
1116
+ background: transparent !important;
1117
+ border: none !important;
1118
+ box-shadow: none !important;
1119
+ }
1120
+ .message-buttons button:hover {
1121
+ color: #34d399 !important; /* primary */
1122
+ background: rgba(52, 211, 153, 0.1) !important;
1123
+ }
1124
+
1125
+ /* Chat bubbles */
1126
+ .chat-assistant {
1127
+ background: rgba(30, 41, 59, 0.8) !important;
1128
+ border: 1px solid var(--glass-border);
1129
+ border-radius: 18px 18px 18px 4px !important;
1130
+ color: var(--text-main) !important;
1131
+ }
1132
+ .chat-user {
1133
+ background: linear-gradient(135deg, #34d399 0%, #10b981 100%) !important;
1134
+ color: #022c22 !important;
1135
+ border-radius: 18px 18px 4px 18px !important;
1136
+ font-weight: 500;
1137
+ }
1138
+
1139
+ /* Chat message text size and font */
1140
+ .message-wrap .message {
1141
+ font-size: 0.9rem !important;
1142
+ font-family: 'JetBrains Mono', monospace !important;
1143
+ line-height: 1.5 !important;
1144
+ }
1145
+
1146
+ /* Typography overrides */
1147
+ .prose, .prose h1, .prose h2, .prose h3, .prose p, .prose strong {
1148
+ color: var(--text-main) !important;
1149
+ }
1150
+
1151
+ /* Scrollbar */
1152
+ ::-webkit-scrollbar { width: 6px; height: 6px; }
1153
+ ::-webkit-scrollbar-track { background: transparent; }
1154
+ ::-webkit-scrollbar-thumb { background: #334155; border-radius: 10px; }
1155
+ ::-webkit-scrollbar-thumb:hover { background: #475569; }
1156
  """
1157
 
1158
+ app_theme = gr.themes.Soft(
1159
+ primary_hue="emerald",
1160
+ secondary_hue="slate",
1161
+ neutral_hue="slate",
1162
+ ).set(
1163
+ body_background_fill="#0f172a",
1164
+ block_background_fill="#1e293b",
1165
+ block_border_color="rgba(255,255,255,0.1)",
1166
+ input_background_fill="#0f172a",
1167
+ button_primary_background_fill="#34d399",
1168
+ button_primary_text_color="#0f172a",
1169
+ )
1170
+
1171
+ blocks_kwargs = {"title": "πŸ’ MonkeyMind - Knowledge Transfer Agent"}
1172
+ if not IS_GRADIO_V6:
1173
+ blocks_kwargs.update(theme=app_theme, css=custom_css)
1174
+
1175
+ with gr.Blocks(**blocks_kwargs) as demo:
1176
+
1177
+ # State variables
1178
+ github_state = gr.State({})
1179
+ local_state = gr.State({})
1180
+ youtube_state = gr.State({})
1181
+ bookmarks_metadata = gr.State([])
1182
+ chat_history = gr.State([])
1183
+ notepad_content = gr.State("") # For notepad feature
1184
+ copied_texts = gr.State(set()) # Track copied texts to prevent duplicates
1185
+
1186
+ # ===== HEADER =====
1187
+ with gr.Row(elem_classes=["header-container"]):
1188
+ gr.HTML("""
1189
+ <div style="display: flex; align-items: center; gap: 16px;">
1190
+ <div style="width: 48px; height: 48px; border-radius: 12px; background: linear-gradient(135deg, #34d399 0%, #10b981 100%); display: flex; align-items: center; justify-content: center; box-shadow: 0 0 20px rgba(52, 211, 153, 0.3);">
1191
+ <span style="font-size: 28px;">πŸ’</span>
1192
+ </div>
1193
+ <div>
1194
+ <h1 style="margin: 0; font-size: 1.5rem; font-weight: 700; color: #f8fafc; letter-spacing: -0.5px;">
1195
+ MonkeyMind
1196
+ </h1>
1197
+ <div style="display: flex; align-items: center; gap: 8px;">
1198
+ <span style="width: 8px; height: 8px; border-radius: 50%; background: #34d399; box-shadow: 0 0 10px #34d399;"></span>
1199
+ <p style="margin: 0; font-size: 0.8rem; color: #94a3b8; font-family: 'JetBrains Mono', monospace;">AGENT ACTIVE</p>
1200
+ </div>
1201
+ </div>
1202
+ </div>
1203
+ """)
1204
+
1205
+ # ===== APP DESCRIPTION =====
1206
+ gr.Markdown("""
1207
+ > **πŸ’ MonkeyMind** is your Knowledge Transfer & RAG Agent. Analyze GitHub repos, local projects, or YouTube videos
1208
+ > to build a personal knowledge base. Chat with your sources using AI-powered retrieval.
1209
+ >
1210
+ > **Quick Start:** Paste a GitHub URL β†’ Click **Analyze** β†’ Click **Index** β†’ Start chatting!
1211
+ > Bookmark important sources to access them anytime.
1212
+ """)
1213
+
1214
+ # ===== MAIN LAYOUT: 1/3 Left Panel + 2/3 Right Panel =====
1215
+ with gr.Row():
1216
+ # ===== LEFT PANEL: Data Sources + Bookmarks =====
1217
+ with gr.Column(scale=1, min_width=350):
1218
+
1219
+ # Data Sources Section
1220
+ gr.Markdown("### πŸ“‘ Data Ingestion")
1221
+
1222
+ # Source Type Selector (pill-style tabs)
1223
+ source_type = gr.Radio(
1224
+ choices=["πŸ”— GitHub", "πŸ“ Local", "πŸŽ₯ YouTube"],
1225
+ value="πŸ”— GitHub",
1226
+ label="",
1227
+ container=False,
1228
+ interactive=True
1229
+ )
1230
+
1231
+ # Dynamic source input based on selection
1232
+ with gr.Group(visible=True) as github_group:
1233
+ gr.Markdown("*Paste a GitHub repo URL and click **Analyze** to extract docs. Then **Index** to enable RAG chat. You can see retrieved contents in the **Preview** tab on the right.*")
1234
+ github_url = gr.Textbox(
1235
+ label="Repository URL",
1236
+ placeholder="https://github.com/username/repo",
1237
+ show_label=True,
1238
+ )
1239
+ github_analyze_btn = gr.Button("πŸ” Analyze Repository", variant="secondary")
1240
+ github_index_btn = gr.Button("⚑ Index for RAG", variant="primary")
1241
+ github_status = gr.Markdown("Ready to analyze.", elem_classes=["status-pending"])
1242
+ github_bookmark_btn = gr.Button("πŸ”– Bookmark & Save to Knowledge Base", variant="secondary")
1243
+ gr.Markdown("---")
1244
+ gr.Markdown("**πŸ“ Notepad** *(copy useful info here)*")
1245
+ github_notepad = gr.Textbox(
1246
+ label="",
1247
+ placeholder="Paste or write notes here...",
1248
+ lines=3,
1249
+ container=False,
1250
+ show_copy_button=True,
1251
+ )
1252
+ with gr.Row():
1253
+ github_notepad_download = gr.DownloadButton("πŸ“₯ Download as .md", variant="secondary", size="sm")
1254
+
1255
+ with gr.Group(visible=False) as local_group:
1256
+ gr.Markdown("*Upload a project folder. Irrelevant files are auto-filtered. You can see retrieved contents in the **Preview** tab on the right.*")
1257
+ local_folder = gr.File(
1258
+ label="Upload Project Folder",
1259
+ file_count="directory",
1260
+ type="filepath",
1261
+ )
1262
+ local_analyze_btn = gr.Button("πŸ” Analyze Project", variant="secondary")
1263
+ local_index_btn = gr.Button("⚑ Index for RAG", variant="primary")
1264
+ local_status = gr.Markdown("Ready to analyze.", elem_classes=["status-pending"])
1265
+ local_bookmark_btn = gr.Button("πŸ”– Bookmark & Save to Knowledge Base", variant="secondary")
1266
+ gr.Markdown("---")
1267
+ gr.Markdown("**πŸ“ Notepad** *(copy useful info here)*")
1268
+ local_notepad = gr.Textbox(
1269
+ label="",
1270
+ placeholder="Paste or write notes here...",
1271
+ lines=3,
1272
+ container=False,
1273
+ show_copy_button=True,
1274
+ )
1275
+ with gr.Row():
1276
+ local_notepad_download = gr.DownloadButton("πŸ“₯ Download as .md", variant="secondary", size="sm")
1277
+
1278
+ with gr.Group(visible=False) as youtube_group:
1279
+ gr.Markdown("*Paste a YouTube video URL to extract and analyze the transcript. You can see retrieved contents in the **Preview** tab on the right.*")
1280
+ youtube_url = gr.Textbox(
1281
+ label="Video URL",
1282
+ placeholder="https://www.youtube.com/watch?v=...",
1283
+ )
1284
+ youtube_analyze_btn = gr.Button("πŸ” Analyze Video", variant="secondary")
1285
+ youtube_index_btn = gr.Button("⚑ Index for RAG", variant="primary")
1286
+ youtube_status = gr.Markdown("Ready to analyze.", elem_classes=["status-pending"])
1287
+ youtube_bookmark_btn = gr.Button("πŸ”– Bookmark & Save to Knowledge Base", variant="secondary")
1288
+ gr.Markdown("---")
1289
+ gr.Markdown("**πŸ“ Notepad** *(copy useful info here)*")
1290
+ youtube_notepad = gr.Textbox(
1291
+ label="",
1292
+ placeholder="Paste or write notes here...",
1293
+ lines=3,
1294
+ container=False,
1295
+ show_copy_button=True,
1296
+ )
1297
+ with gr.Row():
1298
+ youtube_notepad_download = gr.DownloadButton("πŸ“₯ Download as .md", variant="secondary", size="sm")
1299
+
1300
+ # Source type switching logic
1301
+ def switch_source(choice):
1302
+ return (
1303
+ gr.Group(visible=("GitHub" in choice)),
1304
+ gr.Group(visible=("Local" in choice)),
1305
+ gr.Group(visible=("YouTube" in choice)),
1306
+ )
1307
+
1308
+ source_type.change(
1309
+ fn=switch_source,
1310
+ inputs=[source_type],
1311
+ outputs=[github_group, local_group, youtube_group],
1312
+ )
1313
+
1314
+ gr.Markdown("---")
1315
+
1316
+ # Bookmarks Quick Access Section
1317
+ gr.Markdown("### 🧠 Knowledge Base")
1318
+ gr.Markdown("*Select sources to use for chat. Multiple selections allowed.*")
1319
+
1320
+ # Use CheckboxGroup for multi-select
1321
+ bookmarks_checkboxes = gr.CheckboxGroup(
1322
+ label="Active Sources",
1323
+ choices=[],
1324
+ value=[],
1325
+ interactive=True,
1326
+ info="Check sources to include in chat context"
1327
+ )
1328
+
1329
+ # Keep dropdown for compatibility (hidden, used internally)
1330
+ bookmarks_dropdown = gr.Dropdown(
1331
+ label="",
1332
+ choices=[],
1333
+ value=None,
1334
+ interactive=True,
1335
+ visible=False,
1336
+ )
1337
+
1338
+ with gr.Row():
1339
+ refresh_bookmarks_btn = gr.Button("πŸ”„ Refresh", variant="secondary", size="sm", scale=1)
1340
+ view_all_btn = gr.Button("πŸ“‹ View All", variant="secondary", size="sm", scale=1)
1341
+
1342
+ # Bookmark info display
1343
+ bookmark_info = gr.Markdown(
1344
+ value="*No sources selected. Bookmark repos to add them here.*",
1345
+ elem_classes=["info-box"]
1346
+ )
1347
+
1348
+ # ===== RIGHT PANEL: Chat & Preview =====
1349
+ with gr.Column(scale=2, min_width=500):
1350
+
1351
+ # Chat vs Preview Toggle
1352
+ right_panel_mode = gr.Radio(
1353
+ choices=["πŸ’¬ Chat & RAG", "πŸ“„ Preview"],
1354
+ value="πŸ’¬ Chat & RAG",
1355
+ label="",
1356
+ container=False,
1357
+ )
1358
+
1359
+ # Chat Interface
1360
+ with gr.Group(visible=True) as chat_panel:
1361
+ with gr.Group(elem_classes=["box-container"]):
1362
+ gr.Markdown("### πŸ’ Knowledge Assistant")
1363
+ gr.Markdown("*Hover over messages to see copy icon. Use toggles below to augment with web/wiki search.*")
1364
+
1365
+ chatbot_kwargs = dict(
1366
+ value=[],
1367
+ height=450,
1368
+ show_label=False,
1369
+ avatar_images=["images/user.png", "images/monkey.png"],
1370
+ elem_classes=["chat-container"],
1371
+ type="messages",
1372
+ )
1373
+
1374
+ chatbot = gr.Chatbot(**chatbot_kwargs)
1375
+
1376
+ # Toolbar row: Web Search, Wiki Search (Clear Chat removed)
1377
+ with gr.Row():
1378
+ # clear_chat_btn removed as requested
1379
+ web_search_toggle = gr.Checkbox(
1380
+ label="🌐 Web Search",
1381
+ value=False,
1382
+ interactive=True,
1383
+ )
1384
+ wiki_search_toggle = gr.Checkbox(
1385
+ label="πŸ“š Wikipedia",
1386
+ value=False,
1387
+ interactive=True,
1388
+ )
1389
+
1390
+ with gr.Row():
1391
+ question_box = gr.Textbox(
1392
+ label="",
1393
+ placeholder="Ask anything about your indexed sources...",
1394
+ lines=1,
1395
+ scale=5,
1396
+ container=False,
1397
+ )
1398
+ send_btn = gr.Button("Send", variant="primary", scale=1)
1399
+
1400
+ gr.Examples(
1401
+ examples=[
1402
+ "Summarize the main architecture patterns",
1403
+ "What are the key dependencies?",
1404
+ "Explain the core functionality",
1405
+ "What patterns can I reuse?",
1406
+ ],
1407
+ inputs=question_box,
1408
+ label="Quick Questions"
1409
+ )
1410
+
1411
+ # ===== EXPERIMENTAL LAB (Collapsible) - Moved here =====
+ with gr.Group(elem_classes=["box-container"]):
+ gr.Markdown("### 🧪 Experimental Lab")
+ gr.Markdown("Use this lab to prototype small apps based on your knowledge base.")
+ with gr.Accordion("Open Lab", open=False):
+ gr.Markdown("*The agent generates Gradio code and tests it in a sandbox.*")
+ with gr.Row():
+ with gr.Column(scale=1):
+ gr.Markdown("### 🎯 Build an Experiment")
+ lab_intention = gr.Textbox(
+ label="What do you want to build?",
+ placeholder="e.g., A Gradio app that visualizes patterns from a Knowledge Transfer report.",
+ lines=3,
+ )
+ lab_report_dropdown = gr.Dropdown(
+ label="Reference Material (bookmarked report)",
+ choices=_list_knowledge_report_choices(),
+ value=None,
+ interactive=True,
+ info="Select a Knowledge Transfer report as context"
+ )
+ with gr.Row():
+ lab_refresh_reports_btn = gr.Button("🔄 Refresh", variant="secondary", size="sm")
+ lab_start_btn = gr.Button("▶️ Start Build", variant="primary", size="sm")
+
+ gr.Markdown("---")
+ gr.Markdown("**🔧 Fix Issues**")
+ lab_fix_instruction = gr.Textbox(
+ label="",
+ placeholder="Describe what needs to be fixed (e.g., 'The button click handler is not working')...",
+ lines=2,
+ container=False,
+ )
+ with gr.Row():
+ lab_fix_btn = gr.Button("🔧 Apply Fix", variant="secondary", size="sm")
+ lab_happy_btn = gr.Button("✅ Done", variant="secondary", size="sm")
+
+ lab_export_btn = gr.Button("📥 Export Code", variant="primary")
+ lab_download = gr.File(label="Download", visible=False)
+
+ with gr.Column(scale=2):
+ gr.Markdown("### 🔬 Experiment Output")
+ lab_output = gr.Markdown(
+ "Describe what you want to build, select a reference report from your bookmarks, then click **Start Build**.\n\n"
+ "If there are errors, describe the issue in the fix box and click **Apply Fix**."
+ )
+ lab_code_display = gr.Code(
+ label="Generated Code",
+ language="python",
+ visible=False,
+ )
+
+ # Preview Interface
+ with gr.Group(visible=False) as preview_panel:
+ gr.Markdown("### 📄 Document Preview")
+
+ with gr.Row():
+ with gr.Column(scale=1):
+ preview_source_info = gr.Markdown("""
+ **Source:** Not selected
+ **Status:** ⏳ Pending
+ **Chunks:** 0 vectors
+ """, elem_classes=["info-box"])
+ with gr.Column(scale=1, min_width=100):
+ with gr.Row():
+ preview_copy_btn = gr.Button("📋 Copy", variant="secondary", size="sm", scale=1)
+ preview_download_btn = gr.DownloadButton("📥 Download", variant="secondary", size="sm", scale=1)
+ preview_download_file = gr.File(visible=False)
+
+ preview_content = gr.Markdown(
+ value="Select a source and analyze it to see the preview here.",
+ elem_classes=["preview-card"]
+ )
+
+ # Panel switching logic
+ def switch_panel(choice):
+ return (
+ gr.Group(visible=("Chat" in choice)),
+ gr.Group(visible=("Preview" in choice)),
+ )
+
+ right_panel_mode.change(
+ fn=switch_panel,
+ inputs=[right_panel_mode],
+ outputs=[chat_panel, preview_panel],
+ )
+
+ gr.Markdown("---")
+
+
+
+ # ===== HIDDEN STATE COMPONENTS FOR DOWNLOADS =====
+ github_download = gr.File(label="Download", visible=False)
+ local_download = gr.File(label="Download", visible=False)
+ youtube_download = gr.File(label="Download", visible=False)
+
+ # ===== EVENT HANDLERS =====
+
+ # GitHub handlers
+ github_analyze_btn.click(
+ fn=lambda: "⏳ **Analyzing repository...** Please wait.",
+ outputs=[github_status],
+ ).then(
+ fn=run_github_ingestion,
+ inputs=[github_url],
+ outputs=[preview_content, preview_source_info, github_state],
+ ).then(
+ fn=lambda: "✅ **Analysis complete!** Click **Index for RAG** to enable chat.",
+ outputs=[github_status],
+ )
+
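+ # Status wrappers: normalize the result of the index_* helpers so the status banner always reports the chunk count.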
+ def index_github_with_status(state):
+ status, new_state = index_github_repo(state)
+ chunks = new_state.get("vector_chunks", 0) if new_state else 0
+ if "✅" in status or chunks > 0:
+ return f"✅ **Indexed {chunks} vector chunks.** Ready for RAG queries!", new_state
+ return status, new_state
+
+ def index_local_with_status(state):
+ status, new_state = index_local_repo(state)
+ chunks = new_state.get("vector_chunks", 0) if new_state else 0
+ if "✅" in status or chunks > 0:
+ return f"✅ **Indexed {chunks} vector chunks.** Ready for RAG queries!", new_state
+ return status, new_state
+
+ def index_youtube_with_status(state):
+ status, new_state = index_youtube_video(state)
+ chunks = new_state.get("vector_chunks", 0) if new_state else 0
+ if "✅" in status or chunks > 0:
+ return f"✅ **Indexed {chunks} transcript chunks.** Ready for RAG queries!", new_state
+ return status, new_state
+
+ github_index_btn.click(
+ fn=lambda: "⏳ **Indexing...** Building vector embeddings.",
+ outputs=[github_status],
+ ).then(
+ fn=index_github_with_status,
+ inputs=[github_state],
+ outputs=[github_status, github_state],
+ )
+
+ def bookmark_with_refresh(state):
+ status, new_state, dropdown = bookmark_github_repo(state)
+ choices, meta = get_dropdown_options()
+ return (
+ "🔖 **Bookmarked!** Added to Knowledge Base for future chat sessions.",
+ new_state,
+ gr.Dropdown(choices=choices, value=None),
+ gr.CheckboxGroup(choices=choices, value=[]),
+ meta
+ )
+
+ github_bookmark_btn.click(
+ fn=bookmark_with_refresh,
+ inputs=[github_state],
+ outputs=[github_status, github_state, bookmarks_dropdown, bookmarks_checkboxes, bookmarks_metadata],
+ )
+
+ # Local handlers
+ local_analyze_btn.click(
+ fn=lambda: "⏳ **Analyzing project...** Please wait.",
+ outputs=[local_status],
+ ).then(
+ fn=run_local_repo_ingestion,
+ inputs=[local_folder],
+ outputs=[preview_content, preview_source_info, local_state],
+ ).then(
+ fn=lambda: "✅ **Analysis complete!** Click **Index for RAG** to enable chat.",
+ outputs=[local_status],
+ )
+
+ local_index_btn.click(
+ fn=lambda: "⏳ **Indexing...** Building vector embeddings.",
+ outputs=[local_status],
+ ).then(
+ fn=index_local_with_status,
+ inputs=[local_state],
+ outputs=[local_status, local_state],
+ )
+
+ def bookmark_local_with_refresh(state):
+ status, new_state, dropdown = bookmark_local_repo(state)
+ choices, meta = get_dropdown_options()
+ return (
+ "🔖 **Bookmarked!** Added to Knowledge Base for future chat sessions.",
+ new_state,
+ gr.Dropdown(choices=choices, value=None),
+ gr.CheckboxGroup(choices=choices, value=[]),
+ meta
+ )
+
+ local_bookmark_btn.click(
+ fn=bookmark_local_with_refresh,
+ inputs=[local_state],
+ outputs=[local_status, local_state, bookmarks_dropdown, bookmarks_checkboxes, bookmarks_metadata],
+ )
+
+ # YouTube handlers
+ youtube_analyze_btn.click(
+ fn=lambda: "⏳ **Fetching transcript...** Please wait.",
+ outputs=[youtube_status],
+ ).then(
+ fn=run_youtube_ingestion,
+ inputs=[youtube_url],
+ outputs=[preview_content, preview_source_info, youtube_state],
+ ).then(
+ fn=lambda: "✅ **Analysis complete!** Click **Index for RAG** to enable chat.",
+ outputs=[youtube_status],
+ )
+
+ youtube_index_btn.click(
+ fn=lambda: "⏳ **Indexing...** Building vector embeddings.",
+ outputs=[youtube_status],
+ ).then(
+ fn=index_youtube_with_status,
+ inputs=[youtube_state],
+ outputs=[youtube_status, youtube_state],
+ )
+
+ def bookmark_youtube_with_refresh(state):
+ status, new_state, dropdown = bookmark_youtube_video(state)
+ choices, meta = get_dropdown_options()
+ return (
+ "🔖 **Bookmarked!** Added to Knowledge Base for future chat sessions.",
+ new_state,
+ gr.Dropdown(choices=choices, value=None),
+ gr.CheckboxGroup(choices=choices, value=[]),
+ meta
+ )
+
+ youtube_bookmark_btn.click(
+ fn=bookmark_youtube_with_refresh,
+ inputs=[youtube_state],
+ outputs=[youtube_status, youtube_state, bookmarks_dropdown, bookmarks_checkboxes, bookmarks_metadata],
+ )
+
+ # Chat handlers
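+ # Note: the web/wiki search toggles defined above are not included in this handler's inputs.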
+ def handle_chat_with_history(question, history, github_s, local_s, youtube_s, selected_sources, meta):
+ # Use first selected source from checkboxes, or fall back to indexed sources
+ bookmark = selected_sources[0] if selected_sources else None
+ answer = answer_chat_question(question, github_s, local_s, youtube_s, bookmark, meta)
+ history = history or []
+ history.append({"role": "user", "content": question})
+ history.append({"role": "assistant", "content": answer})
+ return history, ""
+
+ send_btn.click(
+ fn=handle_chat_with_history,
+ inputs=[question_box, chatbot, github_state, local_state, youtube_state, bookmarks_checkboxes, bookmarks_metadata],
+ outputs=[chatbot, question_box],
+ )
+
+ question_box.submit(
+ fn=handle_chat_with_history,
+ inputs=[question_box, chatbot, github_state, local_state, youtube_state, bookmarks_checkboxes, bookmarks_metadata],
+ outputs=[chatbot, question_box],
+ )
+
+
+
+ # Chatbot Like/Dislike Handler
+ def handle_chatbot_like(data: gr.LikeData):
+ # Placeholder for future feedback logging
+ print(f"User feedback: {'Liked' if data.liked else 'Disliked'} message index {data.index}")
+ return None
+
+ chatbot.like(
+ fn=handle_chatbot_like,
+ outputs=None
+ )
+
+ # Bookmark handlers - refresh both dropdown and checkboxes
+ def refresh_all_bookmarks():
+ choices, meta = get_dropdown_options()
+ return (
+ gr.Dropdown(choices=choices, value=None),
+ gr.CheckboxGroup(choices=choices, value=[]),
+ meta
+ )
+
+ refresh_bookmarks_btn.click(
+ fn=refresh_all_bookmarks,
+ outputs=[bookmarks_dropdown, bookmarks_checkboxes, bookmarks_metadata],
+ )
+
+ # View All button - show bookmark details
+ def view_all_bookmarks(meta):
+ if not meta:
+ return "*No bookmarks found. Analyze and bookmark repositories to see them here.*"
+ lines = ["**📚 All Bookmarked Sources:**\n"]
+ for m in meta:
+ name = m.get("repo_name", "Unknown")
+ date = m.get("last_pulled_display", "--")
+ chunks = m.get("vector_chunks", 0)
+ lines.append(f"- **{name}** — {date} — {chunks} chunks")
+ return "\n".join(lines)
+
+ view_all_btn.click(
+ fn=view_all_bookmarks,
+ inputs=[bookmarks_metadata],
+ outputs=[bookmark_info],
+ )
+
+ # Update bookmark info when checkboxes change
+ def update_checkbox_info(selected, meta):
+ if not selected:
+ return "*No sources selected. Check sources above to include in chat.*"
+ lines = [f"**{len(selected)} source(s) selected:**\n"]
+ for label in selected:
+ m = find_metadata_by_label(label, meta or [])
+ if m:
+ lines.append(f"- {m.get('repo_name', label)} ({m.get('vector_chunks', 0)} chunks)")
+ return "\n".join(lines)
+
+ bookmarks_checkboxes.change(
+ fn=update_checkbox_info,
+ inputs=[bookmarks_checkboxes, bookmarks_metadata],
+ outputs=[bookmark_info],
+ )
+
+ # Notepad download handlers
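+ # Each DownloadButton doubles as its own output: the temp-file path returned by the handler becomes the file it serves.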
+ def download_notepad_as_md(content):
+ if not content or not content.strip():
+ return None
+ import tempfile
+ with tempfile.NamedTemporaryFile(mode='w', suffix='.md', delete=False, encoding='utf-8') as f:
+ f.write(content)
+ return f.name
+
+ github_notepad_download.click(
+ fn=download_notepad_as_md,
+ inputs=[github_notepad],
+ outputs=[github_notepad_download],
+ )
+
+ local_notepad_download.click(
+ fn=download_notepad_as_md,
+ inputs=[local_notepad],
+ outputs=[local_notepad_download],
+ )
+
+ youtube_notepad_download.click(
+ fn=download_notepad_as_md,
+ inputs=[youtube_notepad],
+ outputs=[youtube_notepad_download],
+ )
+
+ # Preview Copy Handler (JS)
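+ # With fn=None, the js snippet runs entirely client-side and copies the Markdown source string to the clipboard.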
+ preview_copy_btn.click(
+ fn=None,
+ inputs=[preview_content],
+ js="(content) => { navigator.clipboard.writeText(content); return 'Copied!'; }",
+ )
+
+ # Preview Download Handler
+ preview_download_btn.click(
+ fn=download_notepad_as_md, # Reusing the md download function
+ inputs=[preview_content],
+ outputs=[preview_download_btn],
+ )
+
+ # Lab handlers with improved output
+ def run_lab_with_code(intention, report):
+ result = run_experimental_lab(intention, report)
+ # Extract code if present in result
+ if "```python" in result:
+ code_start = result.find("```python") + 9
+ code_end = result.find("```", code_start)
+ code = result[code_start:code_end].strip() if code_end > code_start else ""
+ return result, gr.Code(value=code, visible=True)
+ return result, gr.Code(visible=False)
+
+ lab_start_btn.click(
+ fn=run_lab_with_code,
+ inputs=[lab_intention, lab_report_dropdown],
+ outputs=[lab_output, lab_code_display],
+ )
+
+ def fix_lab_with_instruction(intention, report, fix_instruction):
+ # Pass the fix instruction to the fix function
+ combined = f"{intention}\n\nFIX REQUEST: {fix_instruction}" if fix_instruction else intention
+ return lab_fix_bugs(combined, report)
+
+ lab_fix_btn.click(
+ fn=fix_lab_with_instruction,
+ inputs=[lab_intention, lab_report_dropdown, lab_fix_instruction],
+ outputs=[lab_output],
+ )
+
+ lab_happy_btn.click(
+ fn=lab_mark_happy,
+ inputs=[lab_intention, lab_report_dropdown],
+ outputs=[lab_output],
+ )
+
+ # Export lab code to file
+ def export_lab_code(code_content):
+ if not code_content:
+ return None
+ import tempfile
+ with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False, encoding='utf-8') as f:
+ f.write(code_content)
+ return f.name
+
+ lab_export_btn.click(
+ fn=export_lab_code,
+ inputs=[lab_code_display],
+ outputs=[lab_download],
+ ).then(
+ fn=lambda: gr.File(visible=True),
+ outputs=[lab_download],
+ )
+
+ lab_refresh_reports_btn.click(
+ fn=_refresh_lab_reports_dropdown,
+ outputs=[lab_report_dropdown],
+ )
+
+ # Load bookmarks on startup (refresh both dropdown and checkboxes)
+ demo.load(
+ fn=refresh_all_bookmarks,
+ outputs=[bookmarks_dropdown, bookmarks_checkboxes, bookmarks_metadata],
+ )
+
+ return demo, (app_theme if IS_GRADIO_V6 else None), (custom_css if IS_GRADIO_V6 else None)


 if __name__ == "__main__":
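+ # build_interface() returns the theme/CSS only when IS_GRADIO_V6; otherwise both are None and launch() gets just share=False.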
+ demo, app_theme, custom_css = build_interface()
+ launch_kwargs = {"share": False}
+ if IS_GRADIO_V6:
+ launch_kwargs.update(theme=app_theme, css=custom_css)
+ demo.launch(**launch_kwargs)