Burnt-Toast
/

ms3.2-24b-longform

Model card Files Files and versions

ToastyPigeon commited on Jun 22

Commit

96666da

·

verified ·

1 Parent(s): 0d6b87f

Update README.md

Files changed (1) hide show

README.md +4 -5

README.md CHANGED Viewed

@@ -3,9 +3,8 @@ base_model:
 - mistralai/Mistral-Small-3.2-24B-Instruct-2506
 ---
-**Modified Small 3.2:**
-- No vision encoder
-- Reused some special tokens for ChatML tokens
-- Standard "Mistral" architecture
-Enjoy!

 - mistralai/Mistral-Small-3.2-24B-Instruct-2506
 ---
+Trained from [anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-ChatML](https://huggingface.co/anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-ChatML) for convenience (no Pixtral compatibility needed). No vision adapter currently.
+Test model trained at 16k context on 50M tokens of long-form human writing (mostly books).
+Haven't tested yet but regular Tekken v7 instruct will work and samplers are probably the same as you'd use for 3.2 Instruct.