Update Documentation to remove A100 support
#10
by ndurkee - opened
Hi there, I'm trying to run the FP8/NVFP4 quants using vLLM on an A100 (compute capability 8.0, i.e. CC 80), but I'm hitting an error stating that CC 89+ (compute capability 8.9 or higher) is required.
I noticed the documentation mentions A100 support, but it seems this might only apply to the FP16 version.
- Request: Could you clarify if FP8/NVFP4 is intended to be supported on Ampere (CC 80) via vLLM?
- Suggestion: If not, it might be worth updating the README to specify the CC 89+ requirement for these specific quants to save other users some troubleshooting time.
- Question: Are there plans for an FP8 (Q4) version compatible with Ampere, or should I stick to bitsandbytes/GPTQ on this hardware?
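
In case it saves someone else the troubleshooting, here is a minimal sketch of a pre-flight check to run before launching vLLM. It assumes the CC 89+ threshold from the error message above applies to these quants, and it uses GPTQ as the fallback only because that is what I am considering. The function names are hypothetical and not part of vLLM; the only real library call is PyTorch's `torch.cuda.get_device_capability`.

```python
# Sketch: pick a quantization format from the GPU's compute capability.
# Assumption: the 8.9 threshold is taken from the vLLM error quoted above.
# The helper names and the "gptq" fallback are hypothetical, not vLLM API.

MIN_FP8_CAPABILITY = (8, 9)  # "CC 89+" per the error message


def supports_fp8(major: int, minor: int) -> bool:
    """Return True if (major, minor) meets the assumed CC 89+ requirement."""
    return (major, minor) >= MIN_FP8_CAPABILITY


def suggest_quant(major: int, minor: int) -> str:
    """Suggest "fp8" when the check passes, otherwise the fallback "gptq"."""
    return "fp8" if supports_fp8(major, minor) else "gptq"


def detect_capability():
    """Return (major, minor) for GPU 0, or None if PyTorch/CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device detected")
    else:
        print(f"CC {cap[0]}{cap[1]} -> suggested format: {suggest_quant(*cap)}")
```

On an A100 the detected capability is `(8, 0)`, so the check fails and the sketch falls back to GPTQ.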
Thanks for the great work on these models!
Agreed, it's unclear why the README explicitly lists the A100 if the model is not compatible with it.