---
license: llama3
language:
- en
library_name: gguf
tags:
- cybersecurity
- security-scanning
- workflow-automation
- p2p
- edge-ai
- llama-3
- lora
- gguf
base_model: meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
model-index:
- name: CoFlux AI
  results: []
---
CoFlux AI — Edge Security & Workflow Model
CoFlux AI is a lightweight, privacy-first AI model designed for the P2P AI Bridge System — a serverless collaboration platform where the host PC acts as the central hub. This model runs entirely on the host PC with zero cloud dependency, providing real-time security scanning and workflow automation within P2P tunnels.
Model Details
| Property | Value |
|---|---|
| Base Model | Meta Llama 3 8B Instruct |
| Fine-tuning | LoRA (Low-Rank Adaptation), 3-stage training |
| Format | GGUF Q8_0 (8-bit quantization) |
| Size | ~8.54 GB |
| Parameters | 8B |
| Runtime | Host PC only (mobile devices send requests via P2P tunnel) |
| License | Llama 3 Community License |
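The listed file size is close to what the Q8_0 layout predicts. The following back-of-the-envelope check is only a sketch: it assumes roughly 8.03 billion parameters for Llama 3 8B and the Q8_0 block layout used by llama.cpp (32 int8 weights plus one 16-bit scale per block), and it ignores file metadata and any tensors stored at higher precision.

```python
# Rough size estimate for a Q8_0 GGUF file (sketch; ignores metadata and
# tensors that may be kept at higher precision, such as norm weights).
PARAMS = 8.03e9           # assumed parameter count for Llama 3 8B
BLOCK_WEIGHTS = 32        # weights per Q8_0 block
BLOCK_BYTES = 32 + 2      # 32 int8 values + one fp16 scale per block

bits_per_weight = BLOCK_BYTES * 8 / BLOCK_WEIGHTS
size_gb = PARAMS * BLOCK_BYTES / BLOCK_WEIGHTS / 1e9

print(f"{bits_per_weight:.1f} bits/weight, ~{size_gb:.2f} GB")
```

Under these assumptions the estimate lands within about 0.01 GB of the size in the table; the remainder is plausibly metadata and unquantized tensors.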
Intended Use
CoFlux AI is purpose-built for two core tasks within a P2P collaboration environment:
🔒 Security Scanning (Edge Security)
- Detect malicious payloads (code, text, binary) from connected peers
- Identify vulnerability patterns based on CVE data
- Score risk levels and enforce allow/deny decisions
- All scanning happens locally — no data leaves the host PC
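As an illustration of the "score risk levels and enforce allow/deny decisions" step, here is a minimal sketch. The `ScanResult` type, the 0.0 to 1.0 score scale, and the 0.7 threshold are all hypothetical; this card does not specify how the host application turns model output into a decision.

```python
from dataclasses import dataclass

@dataclass
class ScanResult:
    risk_score: float   # hypothetical scale: 0.0 (benign) to 1.0 (malicious)
    reason: str

def decide(result: ScanResult, deny_threshold: float = 0.7) -> str:
    """Map a risk score to "allow" or "deny" (threshold is illustrative)."""
    if not 0.0 <= result.risk_score <= 1.0:
        # Fail closed: an out-of-range or NaN score is treated as untrusted.
        return "deny"
    return "deny" if result.risk_score >= deny_threshold else "allow"

print(decide(ScanResult(0.92, "shell injection via subprocess")))
print(decide(ScanResult(0.10, "no issues found")))
```

Failing closed on malformed scores is a design choice made for this sketch, so that a parsing error never results in an implicit allow.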
⚡ Workflow Automation
- Summarize collaboration messages and code reviews
- Auto-tag and classify incoming content
- Context-aware routing for task management
- Template-based document generation
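Template-based document generation can be pictured as filling a fixed layout with model output. The sketch below uses Python's standard `string.Template`; the template text and field names (`title`, `tags`, `summary`) are hypothetical and not taken from the CoFlux codebase.

```python
from string import Template

# Hypothetical template; field names are illustrative only.
REVIEW_TEMPLATE = Template(
    "Review summary for $title\n"
    "Tags: $tags\n"
    "Summary: $summary\n"
)

def render_review(title: str, tags: list[str], summary: str) -> str:
    """Fill the template with a title, model-assigned tags, and a summary."""
    return REVIEW_TEMPLATE.substitute(
        title=title, tags=", ".join(tags), summary=summary
    )

print(render_review("Q3 roadmap", ["backend", "frontend", "security"],
                    "Security audit scheduled for next week."))
```

In this arrangement the model only supplies the tags and summary text, while the document layout stays fixed in the template.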
Training Pipeline
CoFlux AI was trained in 3 stages using LoRA fine-tuning on a single GPU with 16 GB of VRAM:
Stage 1: Security Domain Pretraining
- Dataset: Primus-Seed (Trend Micro AI Lab)
- Content: Curated cybersecurity text from MITRE, Wikipedia, and security company websites
- Purpose: Inject cybersecurity domain knowledge into the base model
Stage 2: Instruction Tuning
- Datasets:
  - Primus-Instruct: Expert-curated cybersecurity QA tasks
  - DetectVul/CVEFixes: Python vulnerability detection at statement level (21,571 functions, 7 vulnerability types)
- Purpose: Teach the model to follow security scanning and workflow automation instructions
Stage 3: Reasoning Distillation
- Dataset: Primus-Reasoning (distilled from o1-preview and DeepSeek-R1)
- Purpose: Enhance multi-step security reasoning for complex threat analysis
Training Configuration
LoRA rank: 16
LoRA alpha: 32
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Quantization: 4-bit (NF4) during training
Optimizer: paged_adamw_8bit
Gradient checkpointing: enabled
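To give a sense of scale for this configuration, the sketch below counts the LoRA parameters that rank 16 adds across the seven target modules. The layer shapes and the layer count of 32 are assumptions taken from the published Llama 3 8B architecture (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128); they are not stated in this card.

```python
# Layer shapes (in_features, out_features), assumed from the published
# Llama 3 8B config: hidden 4096, MLP 14336, 8 KV heads of dim 128.
SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}
RANK, ALPHA, NUM_LAYERS = 16, 32, 32

# Each adapted layer adds two matrices: A (rank x in) and B (out x rank).
per_layer = sum(RANK * (d_in + d_out) for d_in, d_out in SHAPES.values())
trainable = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"scaling (alpha / rank): {scaling}")
```

Under these assumptions the adapter holds about 42 million parameters, roughly 0.5% of the base model, which helps explain how training fits on a 16 GB GPU alongside 4-bit base weights.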
How to Use
With llama.cpp
# Download the model
huggingface-cli download kimdonghwanAIengineer/coflux-ai-gguf --local-dir ./coflux-ai
# Run with llama.cpp
./llama-cli -m ./coflux-ai/coflux-ai-q8_0.gguf -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a security scanning module for a P2P collaboration system.<|eot_id|><|start_header_id|>user<|end_header_id|>
Scan this code for vulnerabilities:
\`\`\`python
import subprocess
user_input = input()
subprocess.call(user_input, shell=True)
\`\`\`<|eot_id|><|start_header_id|>assistant<|end_header_id|>" -n 256
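The prompt string in the command above follows the Llama 3 Instruct chat layout. A small helper, sketched below, can assemble it programmatically. The function name `build_llama3_prompt` is hypothetical, and the helper inserts a blank line after each header as in Meta's published template, whereas the command above shows a single newline.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3 Instruct prompt ending at the assistant header."""
    def turn(role: str, content: str) -> str:
        # One complete turn: header, blank line, content, end-of-turn token.
        return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

    return (
        "<|begin_of_text|>"
        + turn("system", system)
        + turn("user", user)
        + "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a security scanning module for a P2P collaboration system.",
    "Scan this code for vulnerabilities:\nsubprocess.call(user_input, shell=True)",
)
print(prompt)
```

When using `create_chat_completion` from llama-cpp-python, as in the next section, this manual formatting is usually unnecessary because the chat template is applied for you.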
With llama-cpp-python
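from llama_cpp import Llama

llm = Llama(
    # Path matches the download directory used in the llama.cpp example;
    # adjust it if the .gguf file in the repo has a different name.
    model_path="./coflux-ai/coflux-ai-q8_0.gguf",
    n_ctx=2048,        # context window in tokens
    n_gpu_layers=-1,   # offload all layers to the GPU when one is available
)

# Security scanning
output = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a security scanning module for a P2P collaboration system. Analyze code for vulnerabilities and respond with a risk assessment."
        },
        {
            "role": "user",
            "content": "Scan this code:\n```python\nimport os\nos.system(input('cmd: '))\n```"
        }
    ],
    temperature=0.1,   # low temperature for more consistent assessments
    max_tokens=512,
)
print(output["choices"][0]["message"]["content"])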

# Workflow - Summarization
output = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a workflow automation module. Summarize the given content concisely."
        },
        {
            "role": "user",
            "content": "Summarize: The team discussed Q3 roadmap. Backend focuses on API optimization. Frontend is redesigning the dashboard. Security audit next week."
        }
    ],
    temperature=0.3,
    max_tokens=256,
)
print(output["choices"][0]["message"]["content"])
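The two calls above share the same shape, so they can be wrapped in a small helper. The sketch below is one possible arrangement, not part of the CoFlux codebase: `scan_code`, `ChatModel`, and `FakeModel` are hypothetical names, and `FakeModel` only stands in for `llama_cpp.Llama` so the helper can be exercised without loading the weights.

```python
from typing import Any, Protocol

class ChatModel(Protocol):
    """Anything exposing create_chat_completion, such as llama_cpp.Llama."""
    def create_chat_completion(self, messages: list[dict[str, str]], **kwargs: Any) -> dict: ...

SCAN_SYSTEM_PROMPT = (
    "You are a security scanning module for a P2P collaboration system. "
    "Analyze code for vulnerabilities and respond with a risk assessment."
)

def scan_code(llm: ChatModel, code: str, max_tokens: int = 512) -> str:
    """Send one code snippet to the model and return the assistant's reply text."""
    output = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": SCAN_SYSTEM_PROMPT},
            {"role": "user", "content": f"Scan this code:\n{code}"},
        ],
        temperature=0.1,
        max_tokens=max_tokens,
    )
    return output["choices"][0]["message"]["content"]

class FakeModel:
    """Stand-in model that returns a fixed reply in the same response shape."""
    def create_chat_completion(self, messages, **kwargs):
        return {"choices": [{"message": {"role": "assistant", "content": "stub reply"}}]}

print(scan_code(FakeModel(), "import os\nos.system(input('cmd: '))"))
```

Passing a real `Llama` instance instead of `FakeModel()` would send the same request to the local model.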
With Transformers + PEFT (for further fine-tuning)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
# Load LoRA adapter (if available separately)
# model = PeftModel.from_pretrained(base_model, "path/to/adapter")
System Architecture
CoFlux AI operates within the P2P AI Bridge System:
┌─────────────────────────────────────────────────┐
│                  Host PC (Hub)                  │
│                                                 │
│  ┌──────────┐  ┌───────────┐  ┌──────────────┐  │
│  │ Rust     │  │ CoFlux AI │  │ TypeScript   │  │
│  │ Core     │→ │ (This     │→ │ AI Router    │  │
│  │ Security │  │ Model)    │  │ + Workflow   │  │
│  │ Scan     │  │           │  │              │  │
│  └──────────┘  └───────────┘  └──────────────┘  │
│       ↑              ↑              ↑           │
│       └──────── Tauri IPC ──────────┘           │
│                      ↑                          │
│              WebRTC DataChannel                 │
│           (P2P, no central server)              │
└──────────┬───────────┬──────────────────────────┘
           │           │
    ┌──────┴──┐  ┌─────┴───┐
    │ Mobile  │  │ Guest   │
    │ Client  │  │ Device  │
    │ (React  │  │         │
    │ Native) │  │         │
    └─────────┘  └─────────┘
- Mobile & Guest devices send requests via P2P tunnel
- CoFlux AI processes all inference locally on the host PC
- Zero data leakage — nothing leaves the P2P network
Limitations
- Model size: With 8B parameters, the model is limited on complex generation tasks (e.g., full page creation). For advanced generation, the system routes to external APIs via BYOK (Bring Your Own Key).
- Language: Primarily trained on English cybersecurity data. Performance on other languages may vary.
- Scope: Optimized for security scanning and workflow automation. Not intended as a general-purpose chatbot.
- Code coverage: Vulnerability detection is strongest for Python (CVEFixes training data). Other languages rely on pattern-based detection from Primus datasets.
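The split between local inference and BYOK routing described in the first limitation can be sketched as a simple dispatch rule. Everything here is hypothetical: the task labels, the `choose_backend` name, and the fallback to local inference when no key is configured are illustrative assumptions, since the actual routing logic is not documented in this card.

```python
from typing import Optional

# Hypothetical task labels handled by the local model.
LOCAL_TASKS = {"security_scan", "summarize", "tag", "route"}

def choose_backend(task: str, byok_key: Optional[str]) -> str:
    """Pick where a task runs: the local model, or an external API via BYOK."""
    if task in LOCAL_TASKS:
        return "local"
    if byok_key:
        # Heavier generation goes out only when the user has supplied a key.
        return "external"
    # Assumption: with no key configured, fall back to the local model.
    return "local"

print(choose_backend("security_scan", None))
print(choose_backend("full_page_generation", "user-supplied-key"))
```

Keeping security scanning in the local set regardless of key availability matches the stated goal that scanned content stays on the host PC.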
Privacy & Security
This model is designed with a privacy-first philosophy:
- Runs 100% locally on the host PC — no cloud API calls for inference
- All data stays within the P2P tunnel (WebRTC DataChannel)
- Part of a Defense in Depth security architecture with 5 protection layers
- Supports user-controlled privacy toggles (opt-out of conversation storage)
Citation
If you use CoFlux AI in your research or project, please cite:
@misc{coflux-ai-2025,
title={CoFlux AI: Edge Security and Workflow Model for P2P Collaboration},
author={Kim Donghwan},
year={2025},
url={https://huggingface.co/kimdonghwanAIengineer/coflux-ai-gguf}
}
Acknowledgments