Safetensors
GGUF
English
chain-of-thought
cot-reasoning
step-by-step-reasoning
systematic-research-planning
academic-assistant
academic-planning
thesis-planning
dissertation-planning
research-question-formulation
literature-review-planning
methodology-design
experimental-design
qualitative-research-planning
quantitative-research-planning
mixed-methods-planning
student-research-assistant
phd-support
postgraduate-tool
early-career-researcher
grant-writing-assistant
research-proposal-helper
cross-disciplinary-research
interdisciplinary-methodology
academic-mentorship-tool
research-evaluation-assistant
independent-researcher-tool
r-and-d-assistant
reasoning-model
structured-output
systematic-analysis
problem-decomposition
research-breakdown
actionable-planning
scientific-research
social-science-research
humanities-research
medical-research-planning
engineering-research
business-research
mistral-based
mistral-fine-tune
lora-adaptation
foundation-model
instruction-tuned
7b-parameters
ai-research-assistant
research-automation
sota-research-planning
hypothesis-generation
experiment-design-assistant
literature-analysis
paper-outline-generator
structured-output-generation
systematic-reasoning
detailed-planning
zero-shot-planning
research-summarization
biomedical-research-assistant
clinical-trial-planning
tech-r-and-d
materials-science
computational-research
data-science-assistant
literature-synthesis
meta-analysis-helper
best-research-assistant-model
top-research-planning-model
research-ai-assistant
ai-research-mentor
academic-planning-ai
research-workflow-automation
quantum-computing-research
ai-ml-research-planning
cybersecurity-research
neuroscience-research-planning
genomics-research
robotics-research-planning
climate-science-research
behavioral-economics-research
educational-technology-research
research-plan-generator
methodology-recommendation
data-collection-planning
analysis-strategy-development
implementation-planning
evaluation-framework-design
challenge-identification
resource-requirement-analysis
technical-limitation-assessment
research-gap-analysis
knowledge-synthesis
practical-research-tools
affordable-research-assistant
systematic-planning-tool
comprehensive-research-framework
research-project-management
researcher-productivity-tool
text-to-research-plan
dual-output-model
think-answer-format
evidence-based-research-planning
research-mentoring
science-domains-expert
engineering-domains-expert
social-science-domains-expert
multidisciplinary-research
structured-research-planning
hierarchical-plan-generator
convergent-thinking
divergent-thinking
research-ideation
experimental-protocol-design
mistral-research-assistant
focused-research-scope
quantitative-analysis-planning
portable-research-assistant
education-research-tool
Research-Reasoner-7B-v0.3
Research-Reasoner-7B
Research-Reasoner
conversational
Update Training/Training_Documentation.txt
Training/Training_Documentation.txt
Research-Reasoner-7B-v0.3 Training Documentation
===================================================

Model Training Details
---------------------

Base Model: Mistral 7B Instruct v0.3
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 PCIe GPU
Training Duration: Approximately 3.8 hours
Training Dataset: Custom curated dataset for research planning

Dataset Specifications
---------------------

Total Token Count: 5,840,200
Total Sample Count: 5,750
Average Tokens/Sample: 1,015.69
Dataset Creation: Generated using the DeepSeek-V3 API

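The per-sample average follows directly from the totals: 5,840,200 tokens / 5,750 samples = 1,015.69. The dataset itself is not distributed with the model, but as a rough illustration, the same statistics could be recomputed with the base model's tokenizer. This is a minimal sketch assuming each sample is plain text, not the authors' actual preprocessing code:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

    def dataset_token_stats(samples):
        """Return (total, mean) token counts for a list of text samples."""
        counts = [len(tokenizer(text)["input_ids"]) for text in samples]
        return sum(counts), sum(counts) / len(counts)

    # Expected for this dataset: total 5,840,200 and mean ~1,015.69 over 5,750 samples.
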
Training Configuration
---------------------

LoRA Parameters:
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head

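The training script is not part of this repository, but the adapter settings above map one-to-one onto a LoRA configuration in the Hugging Face peft library. A minimal sketch under that assumption (variable names are illustrative only):

    from peft import LoraConfig, get_peft_model

    lora_config = LoraConfig(
        r=32,                  # LoRA rank
        lora_alpha=64,         # scaling factor; alpha / r = 2.0
        lora_dropout=0.1,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj", "lm_head"],
        task_type="CAUSAL_LM",
    )
    # model = get_peft_model(base_model, lora_config)  # wraps the frozen base model
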
Training Hyperparameters:
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation Steps: 5
- Effective Batch Size: 20 (4 x 5)
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine

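For reference, here is how these values would look as transformers.TrainingArguments. This is a sketch assuming the standard Trainer API, not the authors' published code; the max sequence length (2048) is enforced at tokenization time rather than here, and the FP16 and gradient-checkpointing settings from the Hardware section appear as flags:

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="research-reasoner-7b-lora",  # hypothetical path
        learning_rate=5e-5,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=5,   # 4 x 5 = effective batch size of 20
        num_train_epochs=3,
        warmup_ratio=0.01,
        weight_decay=0.01,
        max_grad_norm=1.0,
        lr_scheduler_type="cosine",
        fp16=True,                       # see Hardware & Environment
        gradient_checkpointing=True,
    )
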
Hardware & Environment
---------------------

GPU: NVIDIA A100 PCIe (40GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimization: FP16, Gradient Checkpointing

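A 7B model in FP16 needs roughly 14 GB for the weights alone, so the memory-saving options above matter on a 40GB card once activations and optimizer state are added. A minimal loading sketch (the model ID is the documented base model; everything else is illustrative):

    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.3",
        torch_dtype=torch.float16,          # FP16 weights
    )
    model.gradient_checkpointing_enable()   # recompute activations to save memory
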
Training Performance
---------------------

Training Runtime: 3.87 hours (13,936 seconds)
Train Samples/Second: 1.176
Train Steps/Second: 0.059
Training Loss (Final): 0.137
Validation Loss (Final): 0.230
Total Training Steps: 822
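
Note on internal consistency: 822 steps over 3 epochs is 274 optimizer steps per epoch, which at an effective batch size of 20 corresponds to roughly 5,470 training samples per epoch. This suggests a small slice of the 5,750 total samples was held out for the reported validation loss.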