I once announced I crossed 4K on X and celebrated it with the community, only to come back three days later and see it down to 3,600! 😅 I mean, bots are bad and don't make the platform look good, so they should be removed. But it took me a long time to get back to 4K! 😊
Maziyar Panahi (MaziyarPanahi)
AI & ML interests
Post-Training, RLHF, RL, model merging, quantization, synthetic datasets, AI in Health
Recent Activity
replied to their post about 21 hours ago
🚨 Day 8/8: OpenMed Medical Reasoning Dataset Release - THE GRAND FINALE
Today I complete my 8-day release series with Medical-Reasoning-SFT-Mega.
It is the largest open medical reasoning dataset, combining outputs from 7 state-of-the-art AI models with fair-distribution deduplication.
THE 7 SOURCE MODELS (Original Sample Counts):
1. Trinity-Mini: 810,284 samples
2. Qwen3-Next-80B: 604,249 samples
3. GPT-OSS-120B: 506,150 samples
4. Nemotron-Nano-30B: 444,544 samples
5. GLM-4.5-Air: 225,179 samples
6. MiniMax-M2.1: 204,773 samples
7. Baichuan-M3-235B: 124,520 samples
TOTAL BEFORE DEDUPLICATION: 2,919,699 samples
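The "fair-distribution deduplication" mentioned above can be sketched as exact-duplicate removal with round-robin interleaving across source models, so that no single source claims all the "first occurrence" credit. This is a hypothetical illustration of the idea, not OpenMed's actual pipeline:

```python
# Hedged sketch (NOT the actual OpenMed pipeline): combine samples from
# several source models, removing exact duplicates while interleaving
# sources round-robin so first occurrences are distributed fairly.
import hashlib
from itertools import zip_longest

def fair_dedup(sources):
    """sources: dict of model name -> list of text samples."""
    seen, merged = set(), []
    # zip_longest interleaves the per-model lists round-robin
    for batch in zip_longest(*sources.values()):
        for text in batch:
            if text is None:  # shorter lists are padded with None
                continue
            h = hashlib.sha256(text.encode()).hexdigest()
            if h not in seen:
                seen.add(h)
                merged.append(text)
    return merged

samples = {
    "model_a": ["q1: answer", "q2: answer"],
    "model_b": ["q2: answer", "q3: answer"],  # "q2: answer" duplicates model_a
}
print(len(fair_dedup(samples)))  # 3 unique samples survive
```

In practice, production pipelines often use fuzzy matching (e.g. MinHash) rather than exact hashes, but the fairness idea is the same.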
TOKEN COUNTS:
- Content tokens: 2.22 billion
- Reasoning tokens: 1.56 billion
- Total tokens: 3.78 billion
- Samples with chain-of-thought: 100%
Quick Start:
```python
from datasets import load_dataset

ds = load_dataset("OpenMed/Medical-Reasoning-SFT-Mega")
```
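Once loaded, records with chain-of-thought are typically folded into chat-format SFT text before training. The field names below ("question", "reasoning", "answer") and the `<think>` tag convention are assumptions for illustration, not the dataset's documented schema; check the dataset card for the actual columns:

```python
# Hypothetical sketch: fold a chain-of-thought record into chat-format
# messages for SFT. Field names are assumed, not taken from the dataset card.
def to_chat(record):
    return [
        {"role": "user", "content": record["question"]},
        {"role": "assistant",
         "content": f"<think>{record['reasoning']}</think>{record['answer']}"},
    ]

example = {
    "question": "What causes anemia?",
    "reasoning": "Consider iron deficiency first, then chronic disease...",
    "answer": "Common causes include iron deficiency and chronic disease.",
}
msgs = to_chat(example)
print(msgs[1]["role"])  # assistant
```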
All datasets Apache 2.0 licensed. Free for research and commercial use.
Thank you for following OpenMed's release series. I can't wait to see what you build. 🔥
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Mega
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B-V2
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Trinity-Mini
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-GLM_4.5_Air
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-MiniMax-M2.1
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Qwen3-Next-80B
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Nemotron-Nano-30B
https://huggingface.co/datasets/OpenMed/Medical-Reasonin
https://huggingface.co/collections/OpenMed/medical-datasets
posted an update about 21 hours ago