---
license: bsd-3-clause
tags:
- kernel
---
**Warning:** do not use yet! We are still ironing out the last few issues.
# Flash Attention 3
Flash Attention is a fast and memory-efficient implementation of the
attention mechanism, designed to work with large models and long sequences.
This repository provides a Hugging Face-compliant kernel build of Flash Attention 3.
The original code is available at [https://github.com/Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention).
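Since this is a kernel build for the Hugging Face Hub, it can presumably be loaded with the `kernels` library. The sketch below is illustrative only: the repository ID (`kernels-community/flash-attn3`) and the exported function name (`flash_attn_func`) are assumptions and may not match the final build.

```python
# Minimal sketch of loading the kernel via the Hugging Face `kernels` library.
# The repo ID and the exported function name are assumptions; check the
# repository's exported symbols for the exact names.
import torch
from kernels import get_kernel

flash_attn3 = get_kernel("kernels-community/flash-attn3")  # assumed repo ID

# Dummy query/key/value tensors: (batch, seqlen, heads, head_dim), fp16 on GPU.
q = torch.randn(1, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Assumed entry point; depending on the build it may return (out, lse).
out = flash_attn3.flash_attn_func(q, k, v, causal=True)
```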