8-bit quantization error
model = AutoModelForCausalLM.from_pretrained(DEFAULT_CKPT_PATH, device_map="auto", load_in_8bit=True, max_memory=max_memory_mapping)
When the prompt is only test, it causes the error:
File "/home/devbrain/miniconda3/lib/python3.11/site-packages/transformers/generation/utils.py", line 2897, in sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: probability tensor contains either inf, nan or element < 0
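For anyone debugging this: the message means the probabilities passed to torch.multinomial were not a valid distribution. One possible cause (an assumption, not confirmed in this thread) is that the quantized model produces an inf logit, which softmax then turns into nan for every token. Below is a minimal pure-Python sketch of that mechanism. The helpers `softmax` and `is_valid_distribution` are hypothetical names for illustration, not part of torch or transformers.

```python
import math

def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_valid_distribution(probs):
    """Mirror the condition in the error message: no inf, no nan, no element < 0."""
    return all(math.isfinite(p) and p >= 0 for p in probs)

# Finite logits give a usable distribution.
healthy = softmax([1.0, 2.0, 3.0])

# A single inf logit poisons the result: inf - inf is nan, so the
# normalizing sum is nan and every probability becomes nan.
broken = softmax([1.0, float("inf"), 3.0])

print(is_valid_distribution(healthy))
print(is_valid_distribution(broken))
```

If this is what is happening, transformers' generate accepts remove_invalid_values=True, which is meant to strip nan and inf from the model outputs before sampling. It may avoid the crash, but it has not been confirmed as a fix for this report.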
Hi, I encountered the same issue. Did you manage to find a solution for this error?
Hi @lovelyfrog, you might find Impulse AI (https://www.impulselabs.ai/) useful. We make it easy to fine-tune and deploy open-source models. I know this isn't relevant to your problem above, but it might be easier to use us for fine-tuning and deployment.
Docs: https://docs.impulselabs.ai/introduction
Python SDK: https://pypi.org/project/impulse-api-sdk-python/