Update README.md
README.md CHANGED
@@ -2,7 +2,7 @@
tags:
- long-cot-reasoning
- transformers
-- mamba2
+- mamba2 # Consider updating if this isn't the architecture
- llms
- chain-of-thought
license: apache-2.0
@@ -19,46 +19,54 @@ library_name: transformers

![image](./image.webp)

-# **Sphinx:
-
-- **Developed by:** Daemontatox
-- **License:** Apache-2.0
-- **Base Model:** Fine-tuned from `unsloth/qwen2.5-14b-instruct-bnb-4bit`
-- **Accelerated by:** [Unsloth Framework](https://github.com/unslothai/unsloth)
-- **TRL-Optimized:** Integrated with Huggingface's TRL library for enhanced performance.
-
-## **Overview**
-Sphinx is a state-of-the-art Long Chain-of-Thought (CoT) reasoning model designed to address complex, multi-step reasoning tasks with precision and clarity. Built on the Qwen2.5 architecture, Sphinx excels in generating coherent, logical thought processes while maintaining high levels of interpretability and explainability.
-
-> _"Decoding complexity into clarity."_
-
-### **Key Features**
-- **Enhanced CoT Reasoning:** Fine-tuned for generating multi-step solutions with deep logical consistency.
-- **Efficient Performance:** Powered by Unsloth, achieving 2x faster training without compromising accuracy.
-- **4-bit Quantization:** Optimized for resource-constrained environments while maintaining robust performance.
-- **Multi-Task Versatility:** Excels in diverse domains, including mathematical proofs, legal reasoning, and advanced scientific problem-solving.
-- **TRL Integration:** Employs reinforcement learning to improve generation quality through continuous feedback loops.
-
-## **Model Details**
-### **Architecture**
-- **Base Model:** Qwen2.5-14B
-- **Parameters:** 14 billion
-- **Quantization:** 4-bit precision using BitsAndBytes (bnb).
-- **Token Window:** Supports long-form inputs with a context window of up to 16k tokens, ideal for extensive reasoning tasks.
-
-### **Training Details**
-- **Frameworks:** Huggingface Transformers + TRL + Unsloth.
-- **Data Sources:** Curated datasets emphasizing reasoning tasks, including academic, legal, and logical contexts.
-- **Optimization:** LoRA for parameter-efficient fine-tuning; RLHF for enhanced response alignment.
-
-### **Capabilities**
-1. **Long-CoT Generation:** Capable of breaking down and solving complex, multi-layered problems.
-2. **Explainable AI (XAI):** Provides clear, step-by-step reasoning for outputs.
-3. **Customizability:** Easily adaptable to niche reasoning tasks via lightweight fine-tuning.
-
-## **Applications**
-- **Academic Research:** Generating detailed, structured analyses for scientific problems.
-- **Legal Assistance:** Drafting and explaining multi-step legal arguments.
-- **STEM Education:** Guiding students through intricate mathematical and logical problems.
-- **Cognitive AI Systems:** Seamless integration into systems requiring transparent decision-making.
+# **Sphinx: The Apex of Logical Deduction and Chain-of-Thought Reasoning**

+- **Developed by:** Daemontatox
+- **License:** Apache-2.0
+- **Base Model:** Fine-tuned from `unsloth/qwen2.5-14b-instruct-bnb-4bit`
+- **Accelerated by:** [Unsloth Framework](https://github.com/unslothai/unsloth)
+- **TRL-Optimized:** Integrated with Hugging Face's TRL library for enhanced performance in logical reasoning.

+## **Unveiling Sphinx: Master of Reasoned Thought**

+Sphinx is a cutting-edge Long Chain-of-Thought (CoT) reasoning model crafted to unravel complex challenges requiring rigorous logical analysis. Built on the Qwen2.5 architecture, Sphinx excels at constructing coherent, step-by-step thought processes, providing clear insight into its reasoning and ensuring clarity in its conclusions.

+> _"Where complexity yields to logical clarity."_

+### **Core Strengths: Reasoning, Logic, and CoT**

+- **Chain-of-Thought (CoT) Mastery:** Engineered for dissecting intricate problems, Sphinx constructs each step of its reasoning explicitly, offering a transparent and verifiable pathway to the solution.
+- **Deep Logical Reasoning:** Sphinx navigates complex logical structures, drawing valid inferences and forming sound conclusions through multi-layered analysis.
+- **High Reasoning Fidelity:** Fine-tuned to maintain strict logical consistency, Sphinx delivers outputs that are not only correct but also demonstrably well-reasoned.
+- **Efficient Long-Context Reasoning:** Leveraging Unsloth, Sphinx processes extensive information efficiently, maintaining logical coherence across extended reasoning chains.
+- **Explainable AI through Transparent Logic:** Sphinx's inherent CoT approach provides explicit, understandable reasoning, making its decision-making process transparent and trustworthy.

+## **Model Architecture and Fine-tuning for Logical Prowess**

+### **Architectural Foundation**

+- **Base Model:** Qwen2.5-14B, whose strong general language understanding forms a solid basis for specialized reasoning.
+- **Parameters:** 14 billion, providing the capacity to model intricate reasoning patterns.
+- **Quantization:** 4-bit precision using BitsAndBytes (bnb), optimizing for accessibility without sacrificing reasoning accuracy (see the loading sketch below).
+- **Extended Reasoning Window:** Supports inputs of up to 16k tokens, accommodating the detailed context required for complex logical deductions.

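The two bullets above largely determine how the checkpoint is loaded. Below is a minimal loading sketch (an editor's illustration, not a line of this diff), assuming the fine-tuned weights are published as a standard Transformers checkpoint; `Daemontatox/Sphinx` is a placeholder repository id, not a confirmed one:

```python
# Minimal 4-bit loading sketch (illustrative; repo id is a placeholder).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Daemontatox/Sphinx"  # placeholder; replace with the published checkpoint

# NF4 4-bit quantization, matching the BitsAndBytes setup described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available devices
)
```

In 4-bit, the 14B weights occupy roughly 8-9 GB, so a single 24 GB GPU should typically accommodate them alongside a long-context KV cache.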
+### **Training Methodology: Honing Logical Acumen**

+- **Frameworks:** Hugging Face Transformers + TRL + Unsloth, combined for efficient training and reinforcement learning (a fine-tuning sketch follows this list).
+- **Data Sources:** A curated collection of datasets designed to challenge and refine logical reasoning skills, spanning academic, legal, and formal-logic domains.
+- **Optimization Strategies:**
+  - **LoRA (Low-Rank Adaptation):** Parameter-efficient fine-tuning focused on adapting the model for superior logical inference.
+  - **Reinforcement Learning from Human Feedback (RLHF):** Guides the model toward generating more logically sound and human-aligned reasoning steps.

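The list above describes the recipe at a high level; the sketch below shows how the SFT/LoRA portion is typically wired together with Unsloth and TRL. It is an illustration under assumptions, not the actual training script: the dataset id and hyperparameters are placeholders, the RLHF stage is omitted, and the exact `SFTTrainer` argument names vary between TRL versions.

```python
# Illustrative LoRA fine-tuning sketch with Unsloth + TRL (placeholders throughout).
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model through Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2.5-14b-instruct-bnb-4bit",
    max_seq_length=16384,  # matches the 16k reasoning window described above
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Placeholder dataset id; assumes a "text" column with formatted CoT examples
dataset = load_dataset("your/reasoning-dataset", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=16384,
    args=TrainingArguments(output_dir="sphinx-lora", per_device_train_batch_size=1),
)
trainer.train()
```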
+## **Sphinx's Reasoning Toolkit: Capabilities in Action**

+1. **Long-CoT Generation:** Deconstructs and solves multi-layered problems by constructing detailed, logically interconnected reasoning sequences (see the generation sketch below).
+2. **Explanatory Power through Logic:** Provides clear, step-by-step logical derivations for its outputs, enhancing trust and understanding.
+3. **Adaptable Logical Framework:** Easily tailored to specialized reasoning tasks through targeted fine-tuning, enabling application in diverse logical domains.

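As an illustration of what long-CoT generation looks like in practice, the sketch below reuses the `model` and `tokenizer` from the loading example above and asks for explicit step-by-step reasoning; the prompt and sampling settings are arbitrary examples, not recommended defaults:

```python
# Illustrative chain-of-thought generation sketch (reuses model/tokenizer from above).
prompt = (
    "A train leaves city A at 60 km/h and another leaves city B, 300 km away, "
    "heading toward it at 90 km/h. When and where do they meet? Reason step by step."
)

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# A large max_new_tokens budget leaves room for the extended reasoning chain
outputs = model.generate(inputs, max_new_tokens=2048, temperature=0.6, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```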
+## **Unlocking Potential: Applications Driven by Logic**

+- **Advanced Academic Research:** Generating in-depth, logically structured analyses for complex scientific and philosophical inquiries.
+- **Robust Legal Reasoning Assistance:** Constructing and articulating multi-step legal arguments with precision and logical rigor.
+- **Transformative STEM Education:** Guiding learners through intricate mathematical and logical problems with clear, step-by-step explanations.
+- **Transparent Cognitive AI Systems:** Powering AI systems where explainability and logical justification are paramount for decision-making.