Upload folder using huggingface_hub

- .gitignore +4 -4
- Readme.md +198 -0
- docker-compose.yml +33 -0
.gitignore
CHANGED
@@ -1,5 +1,5 @@
 myvenv/
-
-
-
-*.
+data/
+__pycache__/
+*.gguf
+*.ipynb
Readme.md
ADDED
@@ -0,0 +1,198 @@

# AI Chatbot System Technical Documentation

---

## 1. Executive Summary

This document specifies the architecture, operational components, and deployment workflow for the AI-driven chatbot system. It is intended for engineering teams responsible for system integration, maintenance, and scalability.

---

## 2. System Capabilities

- **Natural Language Understanding**: Implements advanced parsing to interpret user intents and entities.
- **Policy Enforcement**: Uses Colang-defined guardrails to ensure compliance with domain-specific and safety requirements.
- **Low-Latency Responses**: Achieves sub-second turnaround via event-based orchestration.
- **Modular Extensibility**: Supports pluggable integrations with external APIs, databases, and analytics pipelines.

---

## 3. Architectural Components

### 3.1 Custom Language Model

- **Model Architecture**: Fine-tuned Mistral 7B large language model, optimized for dialogue tasks.
- **Configuration File**: Defined using Ollama's Modelfile format (`model.yaml`), specifying the base checkpoint, sampling parameters, and role-based prompt templates.
- **Artifact Packaging**: Converted to `.gguf` (GPT-Generated Unified Format) for efficient loading and inference.

```bash
git clone https://github.com/mattjamo/OllamaToGGUF.git
cd OllamaToGGUF
python OllamaToGGUF.py
```
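For reference, a minimal Modelfile covering the three items above (base checkpoint, sampling parameters, role-based system prompt) might look like the sketch below. The directives (`FROM`, `PARAMETER`, `SYSTEM`) are standard Ollama Modelfile syntax; the specific values are illustrative, not taken from the project's `model.yaml`.

```
FROM mistral
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
SYSTEM """You are a helpful, policy-compliant support assistant."""
```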
- **Repository Deployment**: Published to the Hugging Face Model Hub via automated CLI processes, with commit metadata linked to JIRA issue tracking.

```bash
huggingface-cli upload <your-username>/<your-model-name> . .
```

### 3.2 NVIDIA NeMo Guardrails

- **Function**: Applies programmable constraints to user-system interactions to enforce safe and contextually appropriate dialogues.
- **Colang Files**: All `.co` artifacts are written in the Colang modeling language, whose syntax comprises blocks, statements, expressions, keywords, and variables. The primary block types are:
  - **User Message Block** (`define user ...`)
  - **Flow Block** (`define flow ...`)
  - **Bot Message Block** (`define bot ...`)
- **Directory Layout**:

```plaintext
config/
├── rails/       # Colang flow definitions (.co)
├── prompts.yml  # Prompt templates and trigger mappings
├── config.yml   # Guardrails engine settings and routing rules
└── actions.py   # Custom callbacks for external services
```
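The three block types combine into a rail as in the following minimal sketch (standard Colang 1.0 syntax; the greeting flow is illustrative, not taken from the project's `rails/` files):

```colang
define user express greeting
  "hello"
  "hi there"

define bot express greeting
  "Hello! How can I help you today?"

define flow greeting
  user express greeting
  bot express greeting
```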

### 3.3 Orchestration with n8n

* **Webhook Listener**: Exposes an HTTP POST endpoint that receives JSON-formatted user queries.
* **Policy Validation Node**: Routes incoming content to the Guardrails engine; invalid or unsafe inputs are replaced with safe completions.
* **Inference Node**: Forwards validated prompts to the Mistral 7B inference API and awaits the generated output.
* **Response Dispatcher**: Consolidates model outputs and returns them to clients in standardized JSON responses.
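The data flow through the four nodes above can be sketched as plain functions. This is a hedged illustration, not the actual n8n workflow: the blocked-term check stands in for the Guardrails engine, and `run_inference` stands in for the Mistral 7B API call.

```python
def validate_input(message: str) -> str:
    """Policy Validation Node: replace unsafe input with a safe completion."""
    blocked_terms = {"password", "ssn"}  # illustrative policy, not the real rails
    if any(term in message.lower() for term in blocked_terms):
        return "I can't help with that request."
    return message

def run_inference(prompt: str) -> str:
    """Inference Node stub: in production this calls the Mistral 7B API."""
    return f"Echo: {prompt}"

def handle_webhook(payload: dict) -> dict:
    """Webhook Listener in, Response Dispatcher out: one end-to-end turn."""
    message = payload.get("message", "")
    safe_message = validate_input(message)
    if safe_message != message:
        reply = safe_message          # policy replaced the input entirely
    else:
        reply = run_inference(safe_message)
    return {"response": reply}        # standardized JSON response

print(handle_webhook({"message": "hello"}))        # {'response': 'Echo: hello'}
print(handle_webhook({"message": "my password"}))  # safe completion instead
```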

### 3.4 Open WebUI Front-End

* **UI Framework**: Based on the Open WebUI library, providing a reactive chat interface.
* **Features**:

  * Real-time streaming of text and multimedia.
  * Quick-reply button generation.
  * Resilient error handling for network or validation interruptions.

---

## 4. Deployment Workflow

<!-- ### 4.1 Prerequisites

* Docker Engine & Docker Compose
* Node.js (v16+) and npm
* Python 3.10+ with `nemo-guardrails`
* Ollama CLI for model export

### 4.2 Model Preparation

1. **ModelFile Definition**: Create `model.yaml` with the base model reference (`mistral-7b`), sampling hyperparameters, and role-based prompts.
2. **Model Conversion**:

   ```bash
   ollama export mistral-7b --output model.gguf
   ```

3. **Artifact Publication**:

   ```bash
   git clone https://huggingface.co/<org>/mistral-7b-gguf
   cp model.gguf mistral-7b-gguf/
   cd mistral-7b-gguf
   git add model.gguf
   git commit -m "JIRA-###: Add Mistral 7B gguf model"
   git push
   ```

### 4.3 Guardrails Initialization

1. Construct the `config/` directory structure as outlined in Section 3.2.
2. Populate `rails/` with Colang `.co` definitions.
3. Install dependencies:

   ```bash
   pip install nemo-guardrails
   ```

4. Launch the Guardrails engine:

   ```bash
   guardrails run --config config/config.yml
   ```

### 4.4 n8n Orchestration Deployment

1. Place the `chatbot.json` workflow definition in `n8n/workflows/`.
2. Start n8n via Docker Compose:

   ```bash
   docker-compose up -d n8n
   ```

### 4.5 Front-End Deployment

```bash
cd open-webui
npm install
# Update API endpoint in config
npm run dev
``` -->

### 4.6 FastAPI Integration

Integrate the model and guardrails engine behind a FastAPI service:

```python
from pydantic import BaseModel
from nemoguardrails import RailsConfig, LLMRails
from fastapi import FastAPI

# FastAPI app
app = FastAPI(title="modelkai")

# Guardrails configuration
config = RailsConfig.from_path("./config")
rails = LLMRails(config, verbose=True)

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
async def chat_endpoint(request: ChatRequest):
    response = await rails.generate_async(
        messages=[{"role": "user", "content": request.message}]
    )
    return {"response": response["content"]}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=5000)
```
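The `ChatRequest` model gives the endpoint automatic request validation: FastAPI parses the JSON body through pydantic and rejects malformed payloads with an HTTP 422 before the handler runs. A standalone illustration of that behavior (no guardrails engine required):

```python
from pydantic import BaseModel, ValidationError

class ChatRequest(BaseModel):
    message: str

# A well-formed payload parses into a typed object.
req = ChatRequest(message="What are your support hours?")
print(req.message)

# A payload missing the required field raises ValidationError,
# which FastAPI translates into an HTTP 422 response.
try:
    ChatRequest()
except ValidationError as exc:
    print(f"rejected: {len(exc.errors())} error(s)")
```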

<!-- ---

## 5. Operational Procedures

1. **Receive User Input**: The front-end transmits the message to n8n.
2. **Enforce Policies**: The Guardrails engine evaluates content; unsafe inputs invoke fallback dialogues.
3. **Generate Response**: Sanitized prompts are processed by the LLM inference endpoint.
4. **Deliver Output**: n8n returns the structured response to the client.

---

## 6. Maintenance and Diagnostics

* **Model Updates**: Re-export `.gguf` artifacts and update the repository as per Section 4.2.
* **Guardrail Tuning**: Modify Colang `.co` definitions, test via the CLI, and redeploy the engine.
* **Workflow Monitoring**: Use n8n's built-in analytics dashboard for node-level logs.
* **UI Troubleshooting**: Inspect the browser developer console for errors and verify API endpoint configurations.

---

*Document generated based on source materials.*
-->
docker-compose.yml
CHANGED
@@ -11,3 +11,36 @@ services:
       - "5678:5678"
     depends_on:
       - api
+    environment:
+      - N8N_SECURE_COOKIE=false
+      - N8N_PROTOCOL=http
+      - N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=false
+      - DB_POSTGRESDB_PASSWORD=dbpass
+      - N8N_OWNER_EMAIL=[email protected]
+      - N8N_OWNER_PASSWORD=yourStrongPassword
+      - N8N_ENCRYPTION_KEY=yourEncryptionKey
+
+  openweb:
+    image: ghcr.io/open-webui/open-webui:main
+    container_name: open-webui
+    ports:
+      - "3000:8080"
+    volumes:
+      - openwebui_data:/app/backend/data
+    environment:
+      # Disable multi-user login (optional)
+      - WEBUI_AUTH=False
+      # If you want Open WebUI to hit your FastAPI or n8n endpoints,
+      # you can point it here, e.g.:
+      # - API_BASE_URL=http://fastapi:8000
+    depends_on:
+      - api
+      - n8n
+
+volumes:
+  openwebui_data:
+
+networks:
+  default:
+    driver: bridge