Please, someone distill this model!

#53
by treehugg3

This model is far and away the best in writing style, and smart enough that it blows away basically anything else. If someone would please find a way to distill it into a 70B-100B "Kimi-K2-Mini" type model, it would really help those of us who want to run it at home at more than one token per second.
