
Updated the Triton implementation of rotary to support platforms such as XPU.

YangKai0616 changed pull request status to open

Hi @danieldk, please help review, thank you!

Intel is gradually phasing out IPEX, and this kernel currently supports only CUDA. This PR makes the rotary kernel callable on XPU without changing the CUDA interface or the way it is called.
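To illustrate the idea of adding a backend while keeping the call site unchanged, here is a minimal sketch in plain Python. It is not the code from this PR: the names `apply_rotary`, `_rotate_reference`, and `_BACKENDS` are hypothetical, and both backend entries point at the same reference routine so the example runs without a GPU. In a real kernel the `"cuda"` entry would be the compiled op and the `"xpu"` entry a Triton kernel.

```python
import math


def _rotate_reference(x1, x2, cos, sin, conj=False):
    """Rotate each (x1, x2) pair by the angle given through cos/sin.

    Plain-Python stand-in for a real kernel (assumption for this sketch).
    With conj=True the rotation direction is reversed.
    """
    s = [-v for v in sin] if conj else list(sin)
    out1 = [a * c - b * t for a, b, c, t in zip(x1, x2, cos, s)]
    out2 = [a * t + b * c for a, b, c, t in zip(x1, x2, cos, s)]
    return out1, out2


# Hypothetical backend table. Both entries use the reference routine here;
# a real package would register a CUDA op and a Triton kernel instead.
_BACKENDS = {
    "cuda": _rotate_reference,
    "xpu": _rotate_reference,
}


def apply_rotary(x1, x2, cos, sin, device_type, conj=False):
    """Single entry point: callers never choose the implementation themselves."""
    try:
        backend = _BACKENDS[device_type]
    except KeyError:
        raise NotImplementedError(f"no rotary backend for {device_type!r}")
    return backend(x1, x2, cos, sin, conj)


# Rotate the pair (1, 0) by 90 degrees on the hypothetical "xpu" path.
theta = math.pi / 2
out1, out2 = apply_rotary([1.0], [0.0], [math.cos(theta)], [math.sin(theta)], "xpu")
```

Because the dispatch happens inside `apply_rotary`, existing CUDA callers would keep the same signature, and passing `conj=True` with the same `cos`/`sin` undoes the rotation.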

@danieldk, could you help review this? We would prefer to use the same kernel repo as CUDA, which is the one used in TGI. Since this kernel supports only CUDA, we added Triton support so that it can run on XPU as well.

Hi, I discussed this with Daniel internally, and the suggestion is to move this kernel to a separate repo. It would be cool to host it on the Intel org.
The idea is that we keep universal kernels in separate repos and do not mix them with CUDA/ROCm/SYCL/Metal-specific kernels (see the universal tag at https://huggingface.co/kernels-community/triton-layer-norm/blob/main/build.toml#L3 for an example).
The proposed PR is also missing the build/validation components and needs a proper torch-ext/build.toml. It's really important that the kernel goes through the build cycle to avoid any issues from non-unique ops identifiers.
