Please, someone distill this model!
#53 · opened by treehugg3
This model's writing style is far and away the best, and it is smart enough to blow away basically anything else out there. If someone could find a way to distill it into a 70B-100B "Kimi-K2-Mini" type model, that would really help those of us who want to run it at home at more than one token per second.
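For anyone thinking about attempting this, a minimal sketch of the standard logit-based knowledge distillation objective is below. This is purely illustrative and is not Moonshot's training recipe; the function and parameter names are my own, and in practice distilling from a teacher this large would require offline logit capture or a multi-node setup rather than a single-process loop.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL loss (teacher) with hard-target cross-entropy.

    student_logits, teacher_logits: (batch, seq, vocab)
    labels: (batch, seq) token ids for the hard cross-entropy term
    """
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard targets: ordinary next-token cross-entropy against the labels.
    hard_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
    )

    return alpha * soft_loss + (1 - alpha) * hard_loss
```

The usual trade-off is in `alpha` and `temperature`: a higher temperature exposes more of the teacher's ranking over unlikely tokens, which is where most of the "style" signal lives.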