Is this supported on ollama/llama.cpp?
#1
by larrywang233
Is this supported on Ollama?
Thanks for asking. This model is not supported in llama.cpp or Ollama because it targets NPUs; the runtime logic is very different from the GPU and CPU paths that llama.cpp and Ollama are built around.
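
For context on why the runtime differs: NPU-targeted models are typically dispatched through a vendor execution provider in a runtime such as ONNX Runtime, not through llama.cpp's GGUF/GPU path. Below is a minimal sketch assuming the model ships as an ONNX artifact and the platform exposes an NPU provider (Qualcomm's `QNNExecutionProvider` is used here purely as an illustration); the file name and provider choice are assumptions, so check this model's card for its actual runtime.

```python
import onnxruntime as ort

# List the execution providers this onnxruntime build supports;
# an NPU provider only appears if the matching SDK/driver is installed.
print(ort.get_available_providers())

# Create a session that prefers the NPU and falls back to CPU.
# "model.onnx" is a hypothetical path, not taken from this model's card.
session = ort.InferenceSession(
    "model.onnx",
    providers=["QNNExecutionProvider", "CPUExecutionProvider"],
)
```

If the NPU provider is missing from the list, the session silently runs on CPU, which is one reason generic GPU-oriented runtimes like Ollama cannot simply load this model.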