pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- math
- code
---

# Magpie-Qwen-CortexDual-0.6B

> **Magpie-Qwen-CortexDual-0.6B** is a compact general-purpose model specialized for **math**, **code**, and **structured reasoning**. Built with **CortexDual thinking mode**, it adapts to the complexity of a problem, automatically shifting into stepwise reasoning for intricate logic or math tasks. This 0.6B-parameter model is trained on **80% of the Magpie Pro 330k dataset** together with a modular blend of datasets for general-purpose proficiency and domain versatility.

> [!note]
> GGUF: [https://huggingface.co/prithivMLmods/Magpie-Qwen-CortexDual-0.6B-GGUF](https://huggingface.co/prithivMLmods/Magpie-Qwen-CortexDual-0.6B-GGUF)

---

## Key Features

1. **Adaptive Reasoning via CortexDual**
   Automatically switches into a deeper thinking mode for complex problems, simulating trace-style deduction for higher-order tasks in math and code.

2. **Efficient and Compact**
   At 0.6B parameters, it is optimized for deployment in constrained environments while retaining high fidelity in logic, computation, and structured formatting.

3. **Magpie-Driven Data Synthesis**
   Trained on 80% of **Magpie Pro 330k**, a high-quality alignment and reasoning dataset, complemented with curated modular datasets for enhanced general-purpose capability.

4. **Mathematical Precision**
   Fine-tuned for arithmetic, algebra, calculus, and symbolic logic; well suited to STEM learning platforms, math solvers, and step-by-step tutoring.

5. **Lightweight Code Assistance**
   Understands and generates code in Python, JavaScript, and other common languages, with contextual accuracy and explanation support.

6. **Structured Output Generation**
   Specializes in Markdown, JSON, and table outputs, suitable for technical documentation, instruction generation, and structured reasoning (a structured-output example follows the Quickstart snippet below).

7. **Multilingual Competence**
   Supports over 20 languages with reasoning and translation support, extending its reach for global educational and development use.

---

## Quickstart with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Magpie-Qwen-CortexDual-0.6B"

# Load the model and tokenizer; device_map="auto" places weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python function to check if a number is prime. Explain each step."

messages = [
    {"role": "system", "content": "You are an AI tutor skilled in both math and code."},
    {"role": "user", "content": prompt}
]

# Build the chat-formatted prompt and tokenize it
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens from the output before decoding
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
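
To exercise the structured-output capability described under Key Features, the same pipeline can be prompted for machine-readable JSON. The snippet below is a minimal illustrative sketch rather than part of the original card: the JSON-only system instruction, the example text, and the `json.loads` check are assumptions, and a 0.6B model will not always return strictly valid JSON.

```python
import json

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Magpie-Qwen-CortexDual-0.6B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Ask for machine-readable output; the schema here is purely illustrative.
messages = [
    {"role": "system", "content": "You are a precise assistant. Respond with valid JSON only."},
    {"role": "user", "content": "Extract order_id, quantity, and unit_price as JSON: 'Order #4512: 3 notebooks at $2.50 each.'"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
reply = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# Best-effort parse: the model may still wrap JSON in prose or code fences.
try:
    print(json.loads(reply))
except json.JSONDecodeError:
    print("Reply was not strict JSON:\n", reply)
```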

---

## Intended Use

* General-purpose problem solving in math, logic, and code
* Interactive STEM tutoring and reasoning explanation
* Compact assistant for technical documentation and structured-data tasks
* Multilingual applications with a focus on accurate technical reasoning
* Efficient offline deployment on low-resource devices (a GGUF-based sketch follows this list)
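
For the offline, low-resource case, the GGUF build linked in the note above can be run without Transformers. The sketch below assumes llama-cpp-python; the `*Q4_K_M.gguf` filename pattern, the context size, and the prompt are illustrative guesses rather than documented settings, so match them to the files actually published in the GGUF repository.

```python
from llama_cpp import Llama

# Pull a quantized build from the GGUF repo; the Q4_K_M pattern is a guess --
# list the repository files and pick a quantization that fits your hardware.
llm = Llama.from_pretrained(
    repo_id="prithivMLmods/Magpie-Qwen-CortexDual-0.6B-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an AI tutor skilled in both math and code."},
        {"role": "user", "content": "Solve 3x + 7 = 22 and show each step."},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```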

---

## Limitations

* Lower creativity and open-domain generation due to reasoning-focused tuning
* Limited context window, a trade-off of the compact model size
* May produce simplified logic paths in highly abstract domains
* Trade-offs in diversity and expressiveness compared with larger instruction-tuned models

---

## References

1. [Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing](https://arxiv.org/pdf/2406.08464)
2. [Qwen2.5 Technical Report](https://arxiv.org/pdf/2412.15115)
3. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)