GPT OS3 Beta 8B A3B

  • Developed by: qingy2024
  • Base model: AmanPriyanshu/gpt-oss-8.4b-specialized-all-pruned-moe-only-11-experts

GPT OSS Small (OS3) is a project to build usable, capable language models from the pruned gpt-oss variants published by @AmanPriyanshu. The pruned models are post-trained with LoRA on the qingy2024/GPT-OS3-Dataset-v1 dataset to recover some of the capability lost to expert pruning (the "brain damage").
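For readers unfamiliar with LoRA, the idea is to leave the base weight matrix W frozen and learn a low-rank correction, so the effective weight becomes W + (alpha / r) * B @ A. The sketch below illustrates that formula on a tiny matrix in plain Python. The function names, shapes, and values are illustrative assumptions, not the rank, alpha, or target modules used to train this model, which the card does not state.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * B @ A, the standard LoRA merged weight.

    w      : frozen base weight, shape (out, in)
    lora_b : trainable factor, shape (out, rank)
    lora_a : trainable factor, shape (rank, in)
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, shape (out, in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example (hypothetical values): 2x2 identity base weight, rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
b_factor = [[1.0], [0.0]]   # shape (2, 1)
a_factor = [[0.0, 2.0]]     # shape (1, 2)
merged = lora_effective_weight(base, a_factor, b_factor, alpha=2.0, rank=1)
print(merged)  # [[1.0, 4.0], [0.0, 1.0]]
```

Only the two small factors are trained, which is why LoRA is a practical way to post-train a pruned model without updating all of its parameters.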

(This is the Beta release, taken from the step 2172 checkpoint. Please don't use it unless you know what you're doing.)
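If you do want to experiment with the checkpoint, a minimal loading sketch might look like the following. It assumes a recent transformers release with gpt-oss support, plus torch and accelerate (needed for device_map="auto"). The helper name load_os3_beta is hypothetical, and none of this is a configuration confirmed by the author.

```python
MODEL_ID = "qingy2024/GPT-OS3-Beta-8B-A3B"


def load_os3_beta(model_id=MODEL_ID):
    """Load the tokenizer and model in BF16 (assumed setup, not an official recipe)."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 tensor type of the weights
        device_map="auto",           # assumption: accelerate is installed
    )
    return tokenizer, model


if __name__ == "__main__":
    # Downloads the full checkpoint on first run.
    tokenizer, model = load_os3_beta()
    inputs = tokenizer("Hello, world.", return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```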

Built with Axolotl

  • Format: Safetensors
  • Model size: 8.37B params
  • Tensor type: BF16
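As a rough guide to hardware requirements, the 8.37B parameter count and BF16 tensor type imply a weights-only footprint of about 2 bytes per parameter. The short calculation below is a back-of-the-envelope estimate. It ignores activations, the KV cache, and framework overhead, so actual memory use will be higher.

```python
NUM_PARAMS = 8.37e9        # model size reported on this card
BYTES_PER_PARAM_BF16 = 2   # bfloat16 stores each value in 16 bits

weights_bytes = NUM_PARAMS * BYTES_PER_PARAM_BF16
weights_gb = weights_bytes / 1e9      # decimal gigabytes
weights_gib = weights_bytes / 2**30   # binary gibibytes

print(f"Weights only: {weights_gb:.2f} GB ({weights_gib:.2f} GiB)")
```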