
Textilindo AI Assistant - Hugging Face Spaces Deployment Guide

🚀 Quick Setup for Hugging Face Spaces

1. Environment Variables Required

Set these environment variables in your Hugging Face Space settings:

# Required: Hugging Face API Key (use your secret variable name)
HUGGINGFAC_API_KEY_2=your_huggingface_api_key_here

# Optional: Default model (defaults to Llama 3.1 8B Instruct)
DEFAULT_MODEL=meta-llama/Llama-3.1-8B-Instruct

# Optional: Alternative models
# DEFAULT_MODEL=meta-llama/Llama-3.2-1B-Instruct
# DEFAULT_MODEL=meta-llama/Llama-3.2-3B-Instruct
# DEFAULT_MODEL=gpt2
# DEFAULT_MODEL=distilgpt2
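The variables above might be read in app.py along these lines (a minimal sketch; the function name and fallback behavior are assumptions, not the actual implementation):

```python
import os

def resolve_config():
    """Read deployment settings from environment variables.

    HUGGINGFAC_API_KEY_2 is the secret name used by this space; when it
    is missing, the app is expected to fall back to mock responses.
    """
    api_key = os.environ.get("HUGGINGFAC_API_KEY_2")  # None -> mock mode
    model = os.environ.get("DEFAULT_MODEL", "meta-llama/Llama-3.1-8B-Instruct")
    return api_key, model
```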

2. File Structure

Your Hugging Face Space should contain:

├── app.py                    # Main FastAPI application
├── Dockerfile                # Docker configuration
├── requirements.txt          # Python dependencies
├── README.md                 # Space description
├── configs/
│   ├── system_prompt.md      # AI system prompt
│   └── training_config.yaml  # Training configuration
├── data/
│   └── lora_dataset_*.jsonl  # Training datasets
└── templates/
    └── chat.html             # Chat interface (optional)
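Before pushing, you can sanity-check that the required files are present. This helper is illustrative, not part of the project:

```python
from pathlib import Path

REQUIRED = ["app.py", "Dockerfile", "requirements.txt", "README.md",
            "configs/system_prompt.md"]

def missing_files(root="."):
    """Return the required paths that are absent under `root`."""
    base = Path(root)
    return [p for p in REQUIRED if not (base / p).exists()]

# Example: list what still needs to be added before committing.
if __name__ == "__main__":
    for p in missing_files():
        print(f"missing: {p}")
```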

3. Deployment Steps

  1. Create a new Hugging Face Space:

    • Go to https://huggingface.co/new-space
    • Select Docker as the Space SDK (this project includes a Dockerfile)

  2. Upload files:

    • Clone your space repository
    • Copy all files from this project
    • Commit and push to your space
  3. Set environment variables:

    • Go to your space settings
    • Add the required environment variables
    • Make sure to set HUGGINGFAC_API_KEY_2
  4. Deploy:

    • Your space will automatically build and deploy
    • Check the logs for any issues

4. API Endpoints

Once deployed, your space will have these endpoints:

  • GET / - Main chat interface
  • POST /chat - Chat API endpoint
  • GET /health - Health check
  • GET /info - Application information

5. Usage Examples

Chat API

curl -X POST "https://your-space-name.hf.space/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "dimana lokasi textilindo?"}'

Health Check

curl "https://your-space-name.hf.space/health"
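The same calls can be made from Python using only the standard library (the space URL below is a placeholder; substitute your own):

```python
import json
import urllib.request

def chat_payload(message: str) -> bytes:
    """Encode a /chat request body as UTF-8 JSON."""
    return json.dumps({"message": message}).encode("utf-8")

def post_chat(base_url: str, message: str) -> dict:
    """POST a message to the /chat endpoint and return the parsed reply."""
    req = urllib.request.Request(
        f"{base_url}/chat",
        data=chat_payload(message),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (replace with your actual space URL):
# reply = post_chat("https://your-space-name.hf.space",
#                   "dimana lokasi textilindo?")
```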

6. Troubleshooting

Common Issues:

  1. API key not found

    • Make sure you've set HUGGINGFAC_API_KEY_2 in your space settings
    • The app falls back to mock responses if no API key is provided
  2. Model loading errors

    • Check if the model name is correct
    • Try using a lighter model like meta-llama/Llama-3.2-1B-Instruct
  3. Memory issues

    • Hugging Face Spaces have limited memory
    • Use smaller models or reduce batch sizes
  4. Build failures

    • Check the build logs in your space
    • Ensure all dependencies are in requirements.txt
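The mock-response fallback mentioned in issue 1 could be structured like this (an illustrative sketch; the real app.py may differ, and `call_model` stands in for the actual inference call):

```python
import os

MOCK_PREFIX = "Mock response (no API key configured): "

def generate_reply(message: str, call_model=None) -> str:
    """Route to the real model when an API key is set; otherwise mock.

    `call_model` is a placeholder for the actual inference function and
    is an assumption made for this sketch.
    """
    if os.environ.get("HUGGINGFAC_API_KEY_2") and call_model is not None:
        return call_model(message)
    return MOCK_PREFIX + message
```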

7. Customization

Change the Model:

Update the DEFAULT_MODEL environment variable:

DEFAULT_MODEL=meta-llama/Llama-3.2-1B-Instruct

Update System Prompt:

Edit configs/system_prompt.md and redeploy.

Add More Training Data:

Add more JSONL files to the data/ directory.
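The exact schema of the lora_dataset_*.jsonl files is not documented here; assuming a common instruction/response format, a new record could be appended like this (match the field names used by the existing files):

```python
import json

def append_example(path: str, instruction: str, response: str) -> None:
    """Append one training example as a single JSON line.

    The instruction/response field names are an assumed schema, not
    confirmed from the project's datasets.
    """
    record = {"instruction": instruction, "response": response}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```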

8. Performance Optimization

For better performance on Hugging Face Spaces:

  1. Use smaller models:

    • meta-llama/Llama-3.2-1B-Instruct (1B parameters)
    • microsoft/DialoGPT-medium (355M parameters)
  2. Optimize system prompt:

    • Keep it concise
    • Remove unnecessary instructions
  3. Monitor resource usage:

    • Check the space logs
    • Use the /health endpoint

9. Security Notes

  • Never commit API keys to your repository
  • Use environment variables for sensitive data
  • The app includes CORS middleware for web access
  • All user inputs are logged (check logs for debugging)

10. Support

If you encounter issues:

  1. Check the space logs
  2. Verify environment variables
  3. Test with the /health endpoint
  4. Try the mock responses (without API key)

For more help, check the Hugging Face Spaces documentation: https://huggingface.co/docs/hub/spaces