chrisrutherford committed on
Commit c343ec8 · verified · 1 Parent(s): af8eb39

Update README.md

Files changed (1)
  1. README.md +124 -4
README.md CHANGED
@@ -1,12 +1,132 @@

- Eval
- Q1
- Can artificial intelligence ever achieve true understanding, or is it limited to sophisticated pattern recognition? Break this down by examining the nature of consciousness, the semantics of 'understanding,' the boundaries of computational logic, and the role of embodiment in cognition—then map these components into a coherent framework


  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65be16980a0c57943fbe8b00/npfHv8F3MzHHWlvmwRge0.png)

- Q2
  **Question:**
  *"Generate a PlantUML diagram that visualizes a microservices-based e-commerce architecture with the following components and relationships:
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: Qwen/Qwen3-8B-Base
+ tags:
+ - llama-factory
+ - full
+ - generated_from_trainer
+ - text2diagram
+ - plantuml
+ - code-generation
+ model-index:
+ - name: pumlGenV2-1
+   results: []
+ ---

+ # pumlGenV1-1
+
+ This model is a fine-tuned version of [Qwen/Qwen3-8B-Base](https://huggingface.co/Qwen/Qwen3-8B-Base) on the pumlGen dataset. It specializes in generating PlantUML diagrams from natural language questions.
+
+ ## Model description
+
+ pumlGenV1-1 is a specialized language model that converts complex questions into structured PlantUML diagrams. The model takes philosophical, historical, legal, or analytical questions as input and generates comprehensive PlantUML code that visualizes the relationships, hierarchies, and connections between concepts mentioned in the question.
+
+ Key features:
+ - Generates syntactically correct PlantUML diagrams
+ - Creates structured visualizations with packages, entities, and relationships
+ - Adds contextual notes and annotations
+ - Handles complex domain-specific topics across various fields
+
+ ## Intended uses & limitations
+
+ ### Intended uses
+ - **Educational purposes**: Creating visual diagrams to explain complex concepts
+ - **Research visualization**: Mapping relationships between ideas, theories, or historical events
+ - **Documentation**: Generating diagrams for technical or conceptual documentation
+ - **Analysis tools**: Visualizing interconnections in philosophical, legal, or social topics
+
+ ### Limitations
+ - The model is specifically trained for PlantUML output format
+ - Best performance on analytical, philosophical, historical, and conceptual questions
+ - May require post-processing for specific PlantUML styling preferences
+ - Generated diagrams should be reviewed for accuracy and completeness
+
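Because generated diagrams should be reviewed before use, a lightweight structural check can catch obviously truncated output before it reaches a renderer. The following is a minimal sketch and not part of the model or of PlantUML itself: the helper name `looks_like_plantuml` and the specific checks (start/end markers, balanced braces) are assumptions chosen for illustration, and the sample diagram is hypothetical. It does not replace rendering the diagram with PlantUML or reading it for accuracy.

```python
def looks_like_plantuml(text: str) -> bool:
    """Cheap structural check (hypothetical helper, not a real PlantUML parser)."""
    stripped = text.strip()
    # A complete diagram is delimited by @startuml ... @enduml.
    if not stripped.startswith("@startuml") or not stripped.endswith("@enduml"):
        return False
    # Braces open packages and other blocks; an imbalance suggests truncation.
    depth = 0
    for ch in stripped:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a block was closed before it was opened
                return False
    return depth == 0


# Hypothetical diagram used only to exercise the check.
sample = """@startuml
package "Cognition" {
  class Understanding
  class Embodiment
}
Understanding --> Embodiment : depends on
note right of Understanding : hypothetical example
@enduml"""

print(looks_like_plantuml(sample))
print(looks_like_plantuml("@startuml\npackage X {\n"))
```

A check like this only flags structural problems; braces inside notes or labels could still trip it, so treat a failure as a prompt for manual review.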
+ ## Training and evaluation data
+
+ The model was trained on the pumlGen dataset, which consists of question-answer pairs where:
+ - **Input**: Complex analytical questions about various topics (philosophy, history, law, social sciences)
+ - **Output**: Corresponding PlantUML diagram code that visualizes the concepts and relationships
+
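The dataset files are not shown in this commit, so the exact record layout is unknown. As a sketch only: LLaMA-Factory (listed in the tags) can consume ShareGPT-style conversations with `from`/`value` keys, while Hugging Face chat templates generally expect `role`/`content` keys. Assuming the pumlGen pairs follow the ShareGPT layout, a hypothetical record and its conversion might look like this. The field names, the `ROLE_MAP` table, the `sharegpt_to_messages` helper, and the placeholder question and diagram are all illustrative assumptions.

```python
# Hypothetical ShareGPT-style pair; the real pumlGen schema is not shown in this commit.
record = {
    "conversations": [
        {"from": "human", "value": "Placeholder analytical question"},
        {"from": "gpt", "value": "@startuml\n' placeholder diagram body\n@enduml"},
    ]
}

# Assumed mapping from ShareGPT speaker tags to chat-template roles.
ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}


def sharegpt_to_messages(conversations):
    """Convert ShareGPT-style turns to role/content messages."""
    return [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in conversations
    ]


messages = sharegpt_to_messages(record["conversations"])
print([m["role"] for m in messages])
```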
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 1
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - gradient_accumulation_steps: 16
+ - total_train_batch_size: 128
+ - total_eval_batch_size: 64
+ - optimizer: 8-bit AdamW (OptimizerNames.ADAMW_8BIT) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+ - lr_scheduler_type: cosine
+ - num_epochs: 3.0
+
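The total batch sizes listed above follow from the per-device values: 1 × 8 devices × 16 accumulation steps = 128 for training, and 8 × 8 devices = 64 for evaluation. The sketch below reproduces that arithmetic and evaluates a plain cosine decay from the listed learning rate. It assumes no warmup (none is listed) and uses a hypothetical `total_steps` of 1000, since the actual step count is not given. It is a standalone illustration and does not call the Trainer's own scheduler.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 5e-05
train_batch_size = 1
eval_batch_size = 8
num_devices = 8
gradient_accumulation_steps = 16

# Effective batch sizes across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def cosine_lr(step: int, total_steps: int, peak_lr: float = learning_rate) -> float:
    """Cosine decay from peak_lr to 0 with no warmup (an assumption)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; the real number of optimizer steps is not stated

print(total_train_batch_size, total_eval_batch_size)
for step in (0, 500, 1000):
    print(step, cosine_lr(step, total_steps))
```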
+ ### Training results
+
+ No quantitative evaluation metrics are reported. Qualitatively, the model is reported to be strong at:
+ - Generating valid PlantUML syntax
+ - Creating meaningful entity relationships
+ - Adding appropriate annotations and notes
+ - Structuring complex information hierarchically
+
+ ### Framework versions
+
+ - Transformers 4.52.3
+ - PyTorch 2.6.0+cu124
+ - Datasets 3.6.0
+ - Tokenizers 0.21.1
+
+ ## Usage Example
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load model and tokenizer (replace the placeholder repo ID with the actual one)
+ model = AutoModelForCausalLM.from_pretrained("your-username/pumlGenV1-1")
+ tokenizer = AutoTokenizer.from_pretrained("your-username/pumlGenV1-1")
+
+ # Prepare the input in conversation format
+ question = "What role does the annual flooding of the Nile play in the overall agricultural success and survival of the kingdoms along its banks?"
+
+ # Chat templates expect "role"/"content" keys rather than ShareGPT-style "from"/"value"
+ messages = [
+     {"role": "user", "content": question},
+ ]
+
+ # Format the input (adjust based on your specific tokenizer's chat template)
+ input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(input_text, return_tensors="pt")
+
+ # Generate PlantUML diagram
+ outputs = model.generate(
+     **inputs,
+     max_length=2048,
+     temperature=0.7,
+     do_sample=True,
+     pad_token_id=tokenizer.eos_token_id,
+ )
+
+ # Decode and extract the PlantUML code
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ # Extract the PlantUML code from the response (between @startuml and @enduml)
+ plantuml_code = response.split("@startuml")[-1].split("@enduml")[0]
+ plantuml_code = "@startuml" + plantuml_code + "@enduml"
+
+ print(plantuml_code)
+ ```
+
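The `split`-based extraction in the usage example wraps the whole response in `@startuml`/`@enduml` even when the model never emitted those markers. A slightly more defensive variant is sketched here. The helper name `extract_plantuml` is hypothetical, and the sample response is made up for illustration. It returns `None` when no complete block is present, so the caller can decide whether to retry or report an error.

```python
import re
from typing import Optional

# Non-greedy match so each @startuml ... @enduml pair is captured separately.
_PLANTUML_BLOCK = re.compile(r"@startuml.*?@enduml", re.DOTALL)


def extract_plantuml(response: str) -> Optional[str]:
    """Return the last complete PlantUML block in the response, or None."""
    matches = _PLANTUML_BLOCK.findall(response)
    return matches[-1] if matches else None


# Made-up response text used only to exercise the helper.
response = "Here is the diagram:\n@startuml\nAlice -> Bob : hello\n@enduml\nDone."

print(extract_plantuml(response))
print(extract_plantuml("no diagram here"))
```

Taking the last match mirrors the `[-1]` index in the original snippet, which also prefers the final `@startuml` occurrence.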
+
+ ## Eval Q1
+ Can artificial intelligence ever achieve true understanding, or is it limited to sophisticated pattern recognition? Break this down by examining the nature of consciousness, the semantics of 'understanding,' the boundaries of computational logic, and the role of embodiment in cognition—then map these components into a coherent framework


  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65be16980a0c57943fbe8b00/npfHv8F3MzHHWlvmwRge0.png)

+ ## Eval Q2
  **Question:**
  *"Generate a PlantUML diagram that visualizes a microservices-based e-commerce architecture with the following components and relationships: