docs: README updated for optimized usage with transformers library

#60
by sayed99 - opened

The Python code for transformers usage has been updated to use flash-attn as the attention implementation, boosting performance and reducing memory usage.

PaddlePaddle org

@sayed99 Great work on this, and thank you for your contribution!

To ensure a smooth out-of-the-box experience for all users, we think it’s better to make flash-attn optional instead of default. To save you some time, I’ll go ahead and push a small commit to this PR to make that change.
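One way to make flash-attn optional is to detect at runtime whether the package is installed and fall back to PyTorch's built-in SDPA backend otherwise. The sketch below is an assumption about how that commit might look, not the exact code pushed to the PR; the model path is omitted on purpose.

```python
# Minimal sketch (hypothetical, not the exact PR code): use flash-attn only
# when it is installed, otherwise fall back to PyTorch's scaled-dot-product
# attention ("sdpa"), which works out of the box.
import importlib.util

# "flash_attention_2" is the transformers identifier for the flash-attn
# backend; "sdpa" is the default PyTorch attention backend.
attn_impl = (
    "flash_attention_2"
    if importlib.util.find_spec("flash_attn") is not None
    else "sdpa"
)

# The chosen backend would then be passed when loading the model, e.g.:
# model = AutoModel.from_pretrained(..., attn_implementation=attn_impl)
print(attn_impl)
```

This keeps the out-of-the-box path dependency-free while still letting users with flash-attn installed benefit from it automatically.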

Also, I suggest removing these two steps. They don’t seem necessary for a code example and removing them would simplify it.

from google.colab import files
...
# 2- Upload image (drag & drop any PNG/JPG)
...
# 3. Resize max-2048 preserving aspect ratio
...

Thanks again for the excellent work!

@xiaohei66
Thank you for the suggestions and for helping improve the PR! I agree that making flash-attn optional and removing the extra Colab steps will simplify the example and make it more user-friendly. I appreciate your help in pushing the small commit; I'm looking forward to reviewing the changes.

@xiaohei66
Hello, thanks for your efforts.
I was wondering: will this change be merged into the main model card automatically soon?

ChengCui changed pull request status to merged
