metadata

license: apache-2.0
base_model:
  - openai/gpt-oss-20b

GPT-OSS PlantUML Generation Model V1

Model Description

GPT-OSS PlantUML Generation Model V1 is a language model fine-tuned to generate PlantUML diagrams from natural language descriptions. It is aimed at complex conceptual diagrams that map philosophical, mathematical, and scientific concepts across domains and historical periods.

Model Details

Base Model: GPT-OSS architecture
Model Type: Causal Language Model
Language(s): English
License: Apache 2.0
Fine-tuned from: openai/gpt-oss-20b (abliterated by huihui.ai)

Training Details

Training Data

The model was fine-tuned on the PumlGenV1 dataset of natural language descriptions paired with corresponding PlantUML diagram code.

Training Configuration

Optimiser: AdamW 8-bit
Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 1000
- LoRA Alpha: 2000
Training Epochs: 3
Batch Size: 1
Gradient Accumulation Steps: 16
Effective Batch Size: 16
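
The relationship between these values can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `TrainingConfig` class and its field names are hypothetical and are not taken from the actual training script. It assumes a single training device and the standard LoRA convention of scaling the low-rank update by alpha divided by rank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed above."""

    lora_rank: int = 1000
    lora_alpha: int = 2000
    epochs: int = 3
    per_device_batch_size: int = 1
    gradient_accumulation_steps: int = 16

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimiser step (single device assumed).
        return self.per_device_batch_size * self.gradient_accumulation_steps

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank


config = TrainingConfig()
print(config.effective_batch_size)  # 16
print(config.lora_scaling)          # 2.0
```

With a batch size of 1, gradient accumulation is what brings the effective batch size to 16.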

Training Infrastructure

Fine-tuning approach: Parameter-efficient fine-tuning with LoRA
Memory optimisation: 8-bit AdamW optimiser

Intended Use

Primary Use Cases

Academic Research: Visualising complex philosophical and scientific concepts
Educational Content: Creating diagrams for teaching abstract ideas
Documentation: Generating visual representations of conceptual frameworks
Knowledge Mapping: Illustrating relationships between ideas across disciplines

Example Usage

Input Prompt:

Map the evolution of the concept of 'nothing' from Parmenides through Buddhist śūnyatā to quantum vacuum fluctuations, showing philosophical, mathematical, and physical interpretations

Expected Output: The model generates PlantUML source code which, when rendered with PlantUML, produces a diagram of the requested concepts and the relationships between them.

Usage Examples

Basic Usage

Python

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("chrisrutherford/gpt-oss-pumlGenV1")
model = AutoModelForCausalLM.from_pretrained("chrisrutherford/gpt-oss-pumlGenV1")

prompt = "Map the evolution of the concept of 'nothing' from Parmenides through Buddhist śūnyatā to quantum vacuum fluctuations, showing philosophical, mathematical, and physical interpretations"
# Tokenise the raw prompt (no chat template applied)
inputs = tokenizer(prompt, return_tensors="pt")
# max_length caps the total sequence length: prompt tokens plus generated tokens
outputs = model.generate(**inputs, max_length=1000)
# The decoded text contains the prompt followed by the model's continuation
puml_code = tokenizer.decode(outputs[0], skip_special_tokens=True)

Python with Chat Template

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("chrisrutherford/gpt-oss-pumlGenV1")
model = AutoModelForCausalLM.from_pretrained("chrisrutherford/gpt-oss-pumlGenV1")

# Format with chat template
messages = [
    {"role": "user", "content": "Map the evolution of the concept of 'nothing' from Parmenides through Buddhist śūnyatā to quantum vacuum fluctuations, showing philosophical, mathematical, and physical interpretations"}
]

# Apply chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

# Generate with stop tokens
outputs = model.generate(
    **inputs, 
    max_length=1000,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id
)

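# Decode only the newly generated tokens. Slicing the decoded string by
# len(prompt) is unreliable: skip_special_tokens=True drops the chat-template
# tokens that are still counted in the prompt string.
prompt_length = inputs["input_ids"].shape[1]
generated_text = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)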
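
The generated text may contain more than the diagram itself, so a small post-processing step can help before rendering. The sketch below assumes the model wraps its diagram in the standard PlantUML `@startuml` / `@enduml` delimiters, which this model card does not state explicitly. The `extract_plantuml` helper and the sample string are hypothetical and hand-written for illustration; the sample is not real model output.

```python
import re
from typing import Optional

# Non-greedy match from the first @startuml to the nearest @enduml.
_PUML_BLOCK = re.compile(r"@startuml.*?@enduml", re.DOTALL)


def extract_plantuml(text: str) -> Optional[str]:
    """Return the first @startuml...@enduml block in text, or None if absent."""
    match = _PUML_BLOCK.search(text)
    return match.group(0) if match else None


# Hand-written stand-in for generated_text; not real model output.
sample = "Here is the diagram:\n@startuml\nclass Nothing\n@enduml\nDone."
diagram = extract_plantuml(sample)
print(diagram)
```

If a block is found, the resulting string can be written to a `.puml` file and rendered with PlantUML. A `None` result signals that the output should be inspected or regenerated.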

Citation

@misc{rutherford2024gptossplantumv1,
  title={GPT-OSS PlantUML Generation Model V1},
  author={Chris Rutherford},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/chrisrutherford/gpt-oss-pumlGenV1}
}

Contact

For questions, issues, or collaboration opportunities, please contact [contact information].

Changelog

Version 1.0

Initial release
Fine-tuned on philosophical and scientific concept mapping dataset
Optimised for cross-disciplinary diagram generation