Update Documentation to remove A100 support

#10
by ndurkee - opened

Hi there, I'm trying to run the FP8/NVFP4 quants using vLLM on an A100 (compute capability 8.0), but I'm hitting an error stating that CC 8.9+ is required.

I noticed the documentation mentions A100 support, but it seems this might only apply to the FP16 version.

  • Request: Could you clarify whether FP8/NVFP4 is intended to be supported on Ampere (CC 8.0) via vLLM?
  • Suggestion: If not, it might be worth updating the README to specify the CC 8.9+ requirement for these specific quants, to save other users some troubleshooting time.
  • Question: Are there plans for an FP8(Q4) version compatible with Ampere, or should I stick with bitsandbytes/GPTQ on this hardware?
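For anyone hitting the same error, a quick pre-flight check can confirm whether a GPU meets the CC 8.9+ requirement before attempting to load these quants. This is a minimal sketch (the `supports_fp8` helper is hypothetical, not part of vLLM); the 8.9 threshold matches the error message reported above.

```python
def supports_fp8(major: int, minor: int) -> bool:
    """Return True if the GPU's compute capability meets the CC 8.9+ requirement
    reported by vLLM for these FP8/NVFP4 quants."""
    return (major, minor) >= (8, 9)


if __name__ == "__main__":
    # Probe the local GPU if torch is available; otherwise just report and exit.
    try:
        import torch
        if torch.cuda.is_available():
            major, minor = torch.cuda.get_device_capability()
            print(f"CC {major}.{minor} -> FP8/NVFP4 supported: {supports_fp8(major, minor)}")
        else:
            print("No CUDA device visible")
    except ImportError:
        print("torch not installed; skipping GPU probe")
```

An A100 reports CC 8.0, so the check comes back False there; Ada-generation cards (e.g. L40S) report 8.9 and Hopper (H100) reports 9.0, both of which pass.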

Thanks for the great work on these models!

Agreed - it's unclear why the README states A100 support if the model is not compatible with that hardware.
