Adjust number of reserved tokens to match the model
#15 · opened by dzhulgakov
>>> import transformers
>>> transformers.AutoTokenizer.from_pretrained("moonshotai/Kimi-K2-Instruct", trust_remote_code=True).vocab_size
163842
But the model itself has a vocab size of 163840.
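For comparison, the model side can be checked the same way (a quick sketch; it assumes the checkpoint's config exposes vocab_size, which standard configs do):

>>> transformers.AutoConfig.from_pretrained("moonshotai/Kimi-K2-Instruct", trust_remote_code=True).vocab_size
163840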
I think this +2 is not needed then. It doesn't affect tokenization, since those two extra IDs are regular reserved tokens.
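To verify, here's a sketch for inspecting what the two trailing IDs map to (assuming the tokenizer implements the standard convert_ids_to_tokens API, which PreTrainedTokenizer subclasses do):

>>> tok = transformers.AutoTokenizer.from_pretrained("moonshotai/Kimi-K2-Instruct", trust_remote_code=True)
>>> tok.convert_ids_to_tokens([163840, 163841])  # expected: two reserved placeholder tokens, not real vocabulary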