How to make gguf inference videos?
#1
by
a463724055
- opened
llama.cpp only supports images, while vllm does not support gguf of this model. I would like to ask how to make it infer videos?
I did wonder how this might be done. I do not know how models encode video sequences so I am totally guessing an answer. If a video is a sequence of images then if you gave the model a sequence of images (before your text prompt) would it then infer that as a video. depends how it was trained and what special tokens it uses to indicate the difference between an image and video , if any. Worth a try as there is no video support from llama.cpp and vLlm is not supporting this model. You could also put in a feature https://github.com/ggml-org/llama.cpp/issues