# BhashyamAI Project Skills & Development Guidelines

This document outlines the operational patterns and developer skills for the BhashyamAI Research Portal.

## 🏗️ Architecture & Stack

- **Backend**: FastAPI (Python). Serves static production frontend and handles Graph/Vector search.
- **Frontend**: React (TypeScript/Vite/Tailwind).
- **Database**: ArcadeDB (Graph/FTS) and ChromaDB (Vector).
- **Schema**: See [DB_DESIGN.md](./DB_DESIGN.md) for the authoritative graph structure and canonical metadata schema.

## 🔍 Search & Research Logic

### 1. Hybrid Search Paradigm

- **Precision FTS**: Combined with vector search for optimal relevance.
- **Node-Based Graph Search**: Prioritize specialized traversal tools for high-efficiency entity discovery:
  - `search_by_topic`, `search_by_character`, `search_by_author`, `search_by_location`.
- **Smart Disambiguation**: Use `check_entity_type(name)` to verify entity types (Character vs. Topic) before executing a search.
- **Robust Randomization**: Set `is_random=True` in search tools to offload sampling to the database.

### 2. Scholarly Comparative Analysis

- **System Prompt**: Dynamically generated to guide LLM handling.
- **Entity Linking**: Use `entity://` protocol for citations.
- **Cross-Scripture Search**: Utilize `POST /api/search/entity` for paginated discovery across the entire graph.

### 3. Data-Driven Hierarchical Search

- **Dynamic Discovery**: Hierarchy prefixes and scripture metadata discovered via `SanatanConfig`.
- **Regex-based Filtering**: `hierarchical_path` filters are permissive regex.
- **Consistent Ordering**: All retrieval methods strictly ordered by `v._global_index ASC`.
- **Graph-based TOC**: Uses hierarchical graph traversal ([DB_DESIGN.md](./DB_DESIGN.md)).

## 🛠️ Development Practices

- **Cypher Centralization**: Always use `modules/db/cypher_templates.py` for queries.
- **Observability**: Decorate all tool functions with `@log_tool_entry` (in `modules/db/logger_utils.py`) to capture invocations and tracebacks.
- **Standardization First**: Always canonicalize documents using `SanatanConfig().canonicalize_document` before returning API responses.
- **API Speed**: Use `include_transliteration=False` in `canonicalize_document` for high-volume or lean API endpoints.

## 📋 Standard Workflow

1. **Frontend Changes**: Update code in `frontend/src`.
2. **Build**: Run `npm run build` in the `frontend/` directory.
3. **Backend Changes**: Add/modify endpoints in `server.py` or modules.
4. **Validation**:
   - `uv run python tests/test_api_endpoints.py`
   - `uv run python tests/debug_...py` (Graph consistency checks).
5. **Serving**: Ensure production build is served by FastAPI.
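
The Observability guideline under Development Practices asks that every tool function be wrapped with `@log_tool_entry` so that invocations and tracebacks are captured. The real decorator lives in `modules/db/logger_utils.py` and is not reproduced here; the following is only a minimal sketch of how a decorator of that kind might behave. The logger name, the log message format, and the `demo_search` function are assumptions made for illustration and may differ from the project's actual code.

```python
import functools
import logging
from typing import Any, Callable

# Assumed logger name; the project's logger_utils.py may configure this differently.
logger = logging.getLogger("tools")


def log_tool_entry(func: Callable[..., Any]) -> Callable[..., Any]:
    """Illustrative stand-in: log each call and any traceback, then re-raise."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        logger.info("tool=%s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception records the message at ERROR level with the traceback.
            logger.exception("tool=%s failed", func.__name__)
            raise

    return wrapper


@log_tool_entry
def demo_search(topic: str) -> list[str]:
    """Hypothetical tool function used only to exercise the decorator."""
    if not topic:
        raise ValueError("topic must be non-empty")
    return [f"result for {topic}"]
```

Re-raising after logging keeps the decorator transparent to callers: the API layer still sees the original exception and can map it to an error response, while the log retains the full traceback.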