How can we run this model on macOS or any machine that does not have a GPU?

#7
by merhanjan - opened

I am trying to run this model on my Mac, but I cannot install flash_attn because the machine does not have a GPU.

Without flash attention you won't be able to run the model, since the custom model code requires it as an import.

I run it without problems on an RTX A6000; it's very cheap to rent on all sorts of cloud GPU providers.

AIDC-AI org

Hi, we’ve added SDPA support in the code, so it can now run without flash-attention.
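In case it helps others hitting the same issue: assuming a recent transformers release that accepts the `attn_implementation` argument, loading with SDPA instead of flash-attention might look like the sketch below. The repo id `"your/model-id"` and the helper function name are placeholders, not taken from this thread.

```python
# Hedged sketch: load the model with PyTorch's built-in scaled-dot-product
# attention (SDPA) instead of flash-attention, so the CUDA-only flash_attn
# package is not needed on CPU-only machines or Macs.

load_kwargs = {
    "attn_implementation": "sdpa",  # PyTorch SDPA; works on CPU and Apple MPS
    "torch_dtype": "float32",       # fp16/bf16 flash kernels assume CUDA; fp32 is safe on CPU
    "trust_remote_code": True,      # the repo ships custom model code
}

def load_without_flash_attn(repo_id: str):
    """Hypothetical helper: load a model via transformers using SDPA attention."""
    # requires a transformers version that supports attn_implementation (>= 4.36)
    from transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(repo_id, **load_kwargs)

# usage (downloads the weights, so not run here):
# model = load_without_flash_attn("your/model-id")  # placeholder repo id
```

Note this only changes how the model is loaded; existing inference code should not need other changes.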

Thanks for the update. Do I need to change anything in my code?

I think there is something wrong with this model: even the Google Colab example you provided does not work and throws an error, and loading the model directly without the transformers pipeline also throws an error.

Can someone recommend a way to run this cost-efficiently?
