| tags: | |
| - fp8 | |
| - vllm | |
| Run with `vllm==0.6.2` on 4xH100: | |
| ``` | |
| vllm serve neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4 | |
| ``` |
| tags: | |
| - fp8 | |
| - vllm | |
| Run with `vllm==0.6.2` on 4xH100: | |
| ``` | |
| vllm serve neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4 | |
| ``` |