amd-shark/sdxl-quant-fp8 (AMD SHARK)
sdxl-quant-fp8
4 contributors | History: 19 commits
Latest commit: cfd94d7 by nickfraser, "Added models that are fully quantized with FP8." (about 1 year ago)
Name                               | Size    | Last commit message                                                                 | Updated
all_linear_sym_8_calib8/           |         | Fix names                                                                           | about 1 year ago
all_sym_8_calib10/                 |         | MI250 QKV fused and all layers sym, FP8 attention, guidance scale 8, calib steps 10 | about 1 year ago
brevitas/                          |         | updated quant_params with QKV fusion                                                | about 1 year ago
linear_conv_fp8_sdpa_fp16_eq_bl/   |         | Added models that are fully quantized with FP8.                                     | about 1 year ago
linear_conv_fp8_sdpa_fp16_no_eq_bl/|         | Added models that are fully quantized with FP8.                                     | about 1 year ago
linear_conv_fp8_sdpa_fp8_eq_bl/    |         | Added models that are fully quantized with FP8.                                     | about 1 year ago
linear_conv_fp8_sdpa_fp8_no_eq_bl/ |         | Added models that are fully quantized with FP8.                                     | about 1 year ago
.gitattributes                     | 2.08 kB | Added models that are fully quantized with FP8.                                     | about 1 year ago
attn.py                            | 6.26 kB | Added SDPA math model & test                                                        | about 1 year ago
sdxl.json                          | 2.19 MB | Upload sdxl.json with huggingface_hub                                               | over 1 year ago
sdxl.safetensors                   | 5.14 GB | Upload sdxl.safetensors with huggingface_hub                                        | over 1 year ago
test_attn.py                       | 1.29 kB | Added SDPA math model & test                                                        | about 1 year ago
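The commit log notes that sdxl.json and sdxl.safetensors were uploaded with huggingface_hub. As a minimal sketch of fetching individual files, assuming Hugging Face's standard resolve-URL scheme (https://huggingface.co/&lt;repo_id&gt;/resolve/&lt;revision&gt;/&lt;path&gt;) and pinning the revision to the latest commit shown above:

```python
# Sketch: build direct-download URLs for files in this repo.
# Assumes the standard Hugging Face Hub resolve-URL layout;
# for actual downloads (auth, caching, LFS/xet handling), prefer
# huggingface_hub.hf_hub_download.
REPO_ID = "amd-shark/sdxl-quant-fp8"
REVISION = "cfd94d7"  # latest commit listed above; "main" also works

def resolve_url(path: str, revision: str = REVISION) -> str:
    """Return the direct-download URL for a file at a pinned revision."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{path}"

print(resolve_url("sdxl.json"))
print(resolve_url("sdxl.safetensors"))
```

Pinning the commit hash rather than "main" keeps downloads reproducible even if the repo is updated later.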