---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-8B-Base
tags:
- llama-factory
- full
- generated_from_trainer
- plantuml
- troubleshooting
- conversational-ai
model-index:
- name: pumlChat
  results: []
---

# pumlChat

This model is a fine-tuned version of [Qwen/Qwen3-8B-Base](https://huggingface.co/Qwen/Qwen3-8B-Base) on a PlantUML-based troubleshooting dataset. It specializes in providing conversational fault-finding assistance based on PlantUML system diagrams.

## Model description

pumlChat is a conversational AI model that ingests PlantUML diagrams describing complex systems and provides interactive troubleshooting support. The model understands system components, their relationships, failure modes, and dependencies from PlantUML specifications, then engages in natural dialogue to help diagnose and resolve issues.

Key features:

- Interprets PlantUML system diagrams with component specifications
- Provides step-by-step troubleshooting guidance
- Understands component interdependencies and cascading failures
- Offers risk-aware diagnostic suggestions
- Maintains conversational context throughout the troubleshooting process
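
As an illustration (not taken from the training data), the fragment below shows the kind of PlantUML component specification the model consumes, plus a trivial sanity check a caller might run before sending a diagram to the model. The component names, aliases, and note are hypothetical.

```python
import re

# Hypothetical PlantUML fragment of the kind the model ingests:
# components with aliases, connections, and an annotated failure mode.
plantuml_diagram = """@startuml
component "Battery Pack" as BAT
component "Motor Controller" as MC
component "Drive Motor" as DM
BAT --> MC : 48V supply
MC --> DM : phase current
note right of BAT : failure mode - cell imbalance
@enduml"""

# Simple pre-flight check: extract the declared component aliases so the
# caller can verify the diagram actually declares components before querying.
aliases = re.findall(r'component\s+"[^"]+"\s+as\s+(\w+)', plantuml_diagram)
print(aliases)  # ['BAT', 'MC', 'DM']
```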

## Intended uses & limitations

### Intended uses

- **Technical support automation**: Interactive troubleshooting for complex systems
- **Maintenance assistance**: Guided fault diagnosis based on system specifications
- **Training and education**: Teaching systematic troubleshooting approaches
- **Documentation companion**: Converting static system diagrams into interactive help

### Limitations

- Requires properly formatted PlantUML diagrams with detailed component specifications
- Best suited for systems with well-defined components and failure modes
- Troubleshooting quality depends on the completeness of the PlantUML specification
- Should not replace critical safety systems or professional maintenance where required

## Training and evaluation data

The model was trained on conversational datasets where:

- **System context**: Detailed PlantUML diagrams describing system components, their functions, specifications, failure modes, and interdependencies
- **Conversations**: Multi-turn dialogues demonstrating systematic troubleshooting approaches
- **Domain**: Focus on electromechanical systems (e.g., Segway personal transporters)
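
Concretely, a single training example pairs the diagram context with a multi-turn dialogue. The sketch below shows what one record might look like in the ShareGPT-style format commonly used with llama-factory; the exact schema and turn contents are assumptions, not a dump of the actual dataset.

```python
# Hypothetical training record (ShareGPT-style, as used by llama-factory).
# The diagram is the first human turn; subsequent turns alternate human/gpt.
record = {
    "conversations": [
        {"from": "human", "value": "@startuml\n... system diagram ...\n@enduml"},
        {"from": "gpt", "value": "system understood"},
        {"from": "human", "value": "The unit beeps three times on startup."},
        {"from": "gpt", "value": "Three beeps usually point to a power fault. Let's check the battery connector first."},
    ]
}

# Dialogues must start with the diagram context and strictly alternate roles.
roles = [turn["from"] for turn in record["conversations"]]
assert roles[0] == "human"
assert all(a != b for a, b in zip(roles, roles[1:]))
print(roles)  # ['human', 'gpt', 'human', 'gpt']
```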

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- total_eval_batch_size: 64
- optimizer: 8-bit AdamW with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- num_epochs: 3.0
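
The total_train_batch_size above follows directly from the per-device batch size, the device count, and gradient accumulation:

```python
train_batch_size = 1            # per-device micro-batch
num_devices = 8                 # multi-GPU data parallelism
gradient_accumulation_steps = 16

# Effective samples per optimizer step
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 128
```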

### Training results

The model demonstrates strong capabilities in:

- Understanding complex system relationships from PlantUML
- Systematic troubleshooting progression
- Risk-aware diagnostic recommendations
- Natural conversational flow while maintaining technical accuracy

### Framework versions

- Transformers 4.52.3
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1

## Usage Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("your-username/pumlChat")
tokenizer = AutoTokenizer.from_pretrained("your-username/pumlChat")

# Prepare the PlantUML diagram and user query
plantuml_diagram = """@startuml
... your system PlantUML diagram ...
@enduml"""

# The tokenizer's chat template expects role/content keys; the diagram is the
# first user turn, acknowledged by the assistant, followed by the actual query.
messages = [
    {"role": "user", "content": plantuml_diagram},
    {"role": "assistant", "content": "system understood"},
    {"role": "user", "content": "My device is making a clicking noise and won't start. Can you help?"},
]

# Format with the chat template and generate a response
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```

Example output:

![Example pumlChat troubleshooting exchange](https://cdn-uploads.huggingface.co/production/uploads/64e6f9546b1a4bcd538ddb75/8YAHkhcjNOH9a8g_Fpi3t.png)