klei1
/

bleta-logjike-27b-gguf

@@ -1,21 +1,109 @@
 ---
-base_model: unsloth/gemma-3-27b-it-unsloth-bnb-4bit
 tags:
 - text-generation-inference
-- transformers
-- unsloth
 - gemma3
 license: apache-2.0
 language:
-- sq
 ---
-# Uploaded finetuned  model
-- **Developed by:** klei1
 - **License:** apache-2.0
-- **Finetuned from model :** unsloth/gemma-3-27b-it-unsloth-bnb-4bit
-This gemma3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+base_model: bleta-logjike-27b
 tags:
 - text-generation-inference
+- llama.cpp
+- gguf
+- albanian
 - gemma3
+- reasoning
+- logical-reasoning
+- grpo
+- gsm8k
+- mathematics
+- llm
 license: apache-2.0
 language:
+- al
+inference:
+  parameters:
+    temperature: 0.7
+    top_p: 0.95
+    top_k: 64
+    max_new_tokens: 512
 ---
+# Bleta-Logjike 27B Albanian Logical Reasoning Model (GGUF)
+## Model Description
+- **Developed by:** klei aliaj
+- **Model type:** Bleta-Logjike 27B optimized for Albanian logical reasoning
 - **License:** apache-2.0
+- **Format:** GGUF 8-bit quantized for llama.cpp
+- **Language:** Albanian
+- **Base architecture:** Based on Gemma 3 27B
+This model is a GGUF quantized version of the Bleta-Logjike 27B model, specifically optimized for logical reasoning tasks in the Albanian language. Bleta is an Albanian adaptation based on Google's Gemma 3 architecture, with this version focused on enhancing logical reasoning and problem-solving capabilities.
+## Capabilities & Features
+### Logical Reasoning Focus
+This Albanian language model excels at:
+1. Logical analysis and deduction in Albanian
+2. Step-by-step problem solving
+3. Structured reasoning for complex problems
+4. Understanding logical relationships and dependencies
+5. Mathematical reasoning for grade-school level problems
+### GGUF Quantization Benefits
+- **Efficient inference:** Optimized for use with llama.cpp and similar frameworks
+- **Reduced memory usage:** 8-bit quantization substantially reduces RAM requirements
+- **Faster inference:** More efficient processing for consumer hardware
+- **Compatible with:** llama.cpp, Jan AI, LM Studio, and other GGUF-compatible applications
+### Albanian Language Optimization
+- Native support for Albanian grammar and vocabulary
+- Understanding of Albanian cultural context
+- Handling of Albanian-specific logical expressions and constructs
+## Training Methodology
+### GRPO Approach
+This model was fine-tuned using Generative Rejection Policy Optimization (GRPO), a reinforcement learning technique that trains models to optimize for specific reward functions. GRPO allows the model to learn from feedback on its generated responses, improving reasoning quality over time by:
+1. Generating multiple candidate responses
+2. Evaluating responses against specific reward criteria
+3. Learning to prefer high-quality reasoning patterns
+4. Optimizing for step-by-step problem solving
+### GSM8K Dataset
+The training utilized the GSM8K (Grade School Math 8K) dataset, which contains over 8,000 high-quality grade school math problems, requiring step-by-step reasoning to solve. The dataset provides:
+- Diverse mathematical problem types
+- Multi-step reasoning challenges
+- Clear step-by-step solutions
+- Grade-school level complexity
+This dataset was adapted for Albanian language training to ensure the model can handle mathematical reasoning tasks in Albanian.
+## Technical Specifications
+### Model Architecture
+- 27B parameters
+- Based on Gemma 3 architecture with Albanian adaptations
+- 128K context window
+- QK normalization
+- 5 sliding + 1 global attention pattern
+- 1024 sliding window attention
+### Usage Requirements
+- Recommended minimum 16GB RAM for inference
+- Compatible with CPU inference but GPU recommended
+- Works with llama.cpp and compatible UIs
+## Limitations
+The current model is an 8-bit quantized version of the 27B parameter model. This quantization offers advantages in terms of size and speed, but comes with some limitations:
+- Reduced precision compared to the original 16-bit or 32-bit model
+- May exhibit occasional numerical instabilities in complex reasoning chains
+- While optimized for logical reasoning in Albanian, complex or ambiguous problems may produce inconsistent results
+- As with all language models, it may occasionally hallucinate or provide incorrect information
+- Performance may vary depending on the complexity and clarity of the input prompts
+## Acknowledgments
+- Google for developing the Gemma 3 architecture
+- llama.cpp team for the GGUF format and inference engine
+- OpenAI for the GSM8K dataset
+- Hugging Face for their TRL library and GRPO implementation