CPU / vLLM / llama.cpp and GGUF in a boat

#6
by AchrafB67

It seems GGUF is not compatible with CPU-only inference in vLLM...
Furthermore, llama.cpp can run the GGUF model, but it does not provide multimodal support for this model...

Any suggestions for running the multimodal model quantized? (I'm on an old, CPU-only machine...)
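To frame what I'm after: the only CPU-only fallback I can think of is skipping GGUF and vLLM entirely, loading the original checkpoint with Transformers, and applying PyTorch dynamic int8 quantization. A minimal sketch of that idea (untested on my side; the repo id and image path are placeholders, and the exact processor call depends on the model):

```python
# Hypothetical CPU-only fallback: quantize the original multimodal
# checkpoint with PyTorch dynamic int8 quantization instead of GGUF.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "org/multimodal-model"  # placeholder, not the actual repo id

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float32)
model.eval()

# Dynamic quantization stores Linear weights as int8 and quantizes
# activations on the fly; it runs on CPU with no GPU or GGUF involved.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Placeholder inference call; the processor arguments vary per model.
image = Image.open("example.jpg")
inputs = processor(images=image, text="Describe this image.", return_tensors="pt")
with torch.no_grad():
    out = qmodel.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```

Whether the speed and accuracy trade-off of something like this is bearable on old hardware, I don't know, so better suggestions are welcome.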
