# AI Chatbot System Technical Documentation

---

## 1. Executive Summary

This document specifies the architecture, operational components, and deployment workflow for the AI-driven chatbot system. It is intended for engineering teams responsible for system integration, maintenance, and scalability.

---

## 2. System Capabilities

- **Natural Language Understanding**: Parses user utterances to identify intents and extract entities.
- **Policy Enforcement**: Utilizes Colang-defined guardrails to ensure compliance with domain-specific and safety requirements.
- **Low-Latency Responses**: Achieves sub-second turnaround via event-based orchestration.
- **Modular Extensibility**: Supports pluggable integrations with external APIs, databases, and analytics pipelines.

---

## 3. Architectural Components

### 3.1 Custom Language Model

- **Model Architecture**: Fine-tuned Mistral 7B large language model, optimized for dialogue tasks.
- **Configuration File**: Defined using Ollama’s Modelfile format (`model.yaml`), specifying the base checkpoint, sampling parameters, and role-based prompt templates; a sketch appears at the end of this subsection.
- **Artifact Packaging**: Converted to `.gguf` (GPT-Generated Unified Format) to facilitate efficient loading and inference.

   ```bash
   git clone https://github.com/mattjamo/OllamaToGGUF.git
   cd OllamaToGGUF
   python OllamaToGGUF.py
   ```

- **Repository Deployment**: Published to Hugging Face Model Hub via automated CLI processes, with commit metadata linked to JIRA issue tracking.

   ```bash
   huggingface-cli upload <your-username>/<your-model-name> . .
   ```
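
For reference, the sketch below shows what such a Modelfile definition might look like. It is a minimal, illustrative example; the base tag, sampling values, and system prompt are placeholders, not the project's actual configuration.

```plaintext
# Base checkpoint
FROM mistral:7b

# Sampling parameters
PARAMETER temperature 0.7
PARAMETER top_p 0.9

# Role-based system prompt
SYSTEM """You are a helpful, policy-compliant assistant."""
```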

### 3.2 NVIDIA NeMo Guardrails

- **Function**: Applies programmable constraints to user-system interactions to enforce safe and contextually appropriate dialogues.
- **Colang Files**: Dialogue policies are written in Colang (`.co` files), a modeling language built from blocks, statements, expressions, keywords, and variables; a minimal example follows the directory layout below. The primary block types are:
  - **User Message Block** (`define user ...`)
  - **Flow Block** (`define flow ...`)
  - **Bot Message Block** (`define bot ...`)
- **Directory Layout**:

  ```plaintext
  config/
  ├── rails/          # Colang flow definitions (.co)
  ├── prompts.yml     # Prompt templates and trigger mappings
  ├── config.yml      # Guardrails engine settings and routing rules
  └── actions.py      # Custom callbacks for external services
  ```
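
As referenced above, here is a minimal Colang 1.0 sketch showing the three block types working together. The greeting intent and canned responses are illustrative placeholders, not the project's actual rails:

```colang
define user express greeting
  "hello"
  "hi there"

define bot express greeting
  "Hello! How can I help you today?"

define flow greeting
  user express greeting
  bot express greeting
```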


### 3.3 Orchestration with n8n

* **Webhook Listener**: Exposes an HTTP POST endpoint that receives JSON-formatted user queries.
* **Policy Validation Node**: Routes incoming content to the Guardrails engine; invalid or unsafe inputs are replaced with safe completions.
* **Inference Node**: Forwards validated prompts to the Mistral 7B inference API and awaits generated output.
* **Response Dispatcher**: Consolidates model outputs and returns them to clients in standardized JSON responses.
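
To make the contract concrete, a request/response exchange through this workflow might look as follows; the endpoint path and field names are illustrative assumptions, not the deployed schema:

```plaintext
POST /webhook/chat
{"message": "What are your support hours?"}

200 OK
{"response": "Our support team is available 9am-5pm, Monday through Friday."}
```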

### 3.4 Open WebUI Front-End

* **UI Framework**: Based on the Open WebUI library, providing a reactive chat interface.
* **Features**:

  * Real-time streaming of text and multimedia.
  * Quick-reply button generation.
  * Resilient error handling for network or validation interruptions.

---

## 4. Deployment Workflow

### 4.1 Prerequisites

* Docker Engine & Docker Compose
* Node.js (v16+) and npm
* Python 3.10+ with `nemo-guardrails`
* Ollama CLI and the OllamaToGGUF utility for model conversion

### 4.2 Model Preparation

1. **Modelfile Definition**: Create `model.yaml` with the base model reference (`mistral-7b`), sampling hyperparameters, and role-based prompts.
2. **Model Conversion**: Convert the local Ollama model to `.gguf` using the OllamaToGGUF utility from Section 3.1 (the Ollama CLI does not provide a direct export command):

   ```bash
   cd OllamaToGGUF
   python OllamaToGGUF.py
   ```
3. **Artifact Publication**:

   ```bash
   git clone https://huggingface.co/<org>/mistral-7b-gguf
   cp model.gguf mistral-7b-gguf/
   cd mistral-7b-gguf
   git add model.gguf
   git commit -m "JIRA-###: Add Mistral 7B gguf model"
   git push
   ```
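
   Note that Hugging Face model repositories store large binaries via Git LFS; if `*.gguf` is not already tracked in the repo's `.gitattributes`, enable tracking before committing (assuming Git LFS is installed):

   ```bash
   git lfs track "*.gguf"
   git add .gitattributes
   ```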

### 4.3 Guardrails Initialization

1. Construct the `config/` directory structure as outlined in Section 3.2.
2. Populate `rails/` with Colang `.co` definitions.
3. Install dependencies:

   ```bash
   pip install nemo-guardrails
   ```
4. Launch the Guardrails engine (the package installs the `nemoguardrails` CLI):

   ```bash
   nemoguardrails server --config config/
   ```

### 4.4 n8n Orchestration Deployment

1. Place `chatbot.json` workflow definition in `n8n/workflows/`.
2. Start n8n via Docker Compose:

   ```bash
   docker-compose up -d n8n
   ```
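
   For reference, a minimal `docker-compose.yml` service entry might look like the following sketch; the image tag, port, and volume mapping are assumptions to adapt to your environment:

   ```yaml
   services:
     n8n:
       image: n8nio/n8n            # official n8n image
       ports:
         - "5678:5678"             # n8n's default UI/API port
       volumes:
         - ./n8n:/home/node/.n8n   # persist workflows and credentials
   ```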

### 4.5 Front-End Deployment

```bash
cd open-webui
npm install
# Update API endpoint in config
npm run dev
```

### 4.6 FastAPI Integration

Integrate the model and guardrails engine behind a FastAPI service:

```python
from pydantic import BaseModel
from nemoguardrails import RailsConfig, LLMRails
from fastapi import FastAPI

# FastAPI application
app = FastAPI(title="modelkai")

# Load the guardrails configuration (rails, prompts, settings) from ./config
config = RailsConfig.from_path("./config")
rails = LLMRails(config, verbose=True)

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
async def chat_endpoint(request: ChatRequest):
    # Run the user message through the guardrails-wrapped LLM
    response = await rails.generate_async(
        messages=[{"role": "user", "content": request.message}]
    )
    return {"response": response["content"]}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=5000)

```
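
With the service running, a quick smoke test can be issued against the endpoint defined above (localhost and port 5000 follow the defaults in the snippet):

```bash
curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello, what can you help me with?"}'
```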

---

## 5. Operational Procedures

1. **Receive User Input**: Front-end transmits message to n8n.
2. **Enforce Policies**: Guardrails engine evaluates content; unsafe inputs invoke fallback dialogues.
3. **Generate Response**: Sanitized prompts are processed by the LLM inference endpoint.
4. **Deliver Output**: n8n returns the structured response to the client.

---

## 6. Maintenance and Diagnostics

* **Model Updates**: Re-export `.gguf` artifacts and update repository as per Section 4.2.
* **Guardrail Tuning**: Modify Colang `.co` definitions, test via CLI, and redeploy engine.
* **Workflow Monitoring**: Utilize n8n’s built-in analytics dashboard for node-level logs.
* **UI Troubleshooting**: Inspect browser developer console for errors and verify API endpoint configurations.
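
For guardrail tuning in particular, updated `.co` flows can be exercised interactively before redeployment, assuming the `nemoguardrails` CLI from Section 4.3:

```bash
nemoguardrails chat --config config/
```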
