jusjinuk/Llama-2-70b-hf-2bit-GuidedQuant-QTIP


jusjinuk committed on Jun 19

Commit 926de46 (verified) · 1 Parent(s): b2707fd

Create README.md

Files changed (1)

README.md +21 -0

README.md ADDED

@@ -0,0 +1,21 @@

+---
+base_model:
+- meta-llama/Llama-2-70b-hf
+base_model_relation: quantized
+license: llama2
+---
+# Model Card
+- Base model: `meta-llama/Llama-2-70b-hf`
+- Quantization method: BlockLDLQ with GuidedQuant Hessian
+- Target bit-width: 2
+- Backend kernel: QTIP kernel (HYB variant)
+- Calibration data: RedPajama (1024 sentences / 4096 tokens)
+- Calibration objective: Next-token prediction
+- num_groups (for GuidedQuant Hessian): 2
+# How to run
+- Follow the instructions in https://github.com/snu-mllab/GuidedQuant and https://github.com/Cornell-RelaxML/qtip.
+# References
+- [Model Paper](https://arxiv.org/abs/2505.07004)
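
To put the card's 2-bit target in context, here is a rough back-of-the-envelope sketch of nominal weight storage. It is not part of the commit. It assumes a round 70 billion parameters and that every weight is stored at the target bit-width; real checkpoints usually keep some tensors (such as embeddings) at higher precision and carry extra quantization metadata, so the actual file size will differ. The helper name `nominal_weight_bytes` is hypothetical.

```python
def nominal_weight_bytes(num_params: int, bits_per_weight: int) -> float:
    """Storage for num_params weights at bits_per_weight bits each, in bytes."""
    return num_params * bits_per_weight / 8


# Assumption: a round 70e9 parameters, all stored at the stated bit-width.
NUM_PARAMS = 70_000_000_000
GB = 1e9  # decimal gigabytes

fp16_gb = nominal_weight_bytes(NUM_PARAMS, 16) / GB  # 16-bit baseline
q2_gb = nominal_weight_bytes(NUM_PARAMS, 2) / GB     # 2-bit target from the card

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f" 2-bit weights: {q2_gb:.1f} GB")
print(f"nominal compression ratio: {fp16_gb / q2_gb:.0f}x")
```

Under these assumptions the nominal footprint drops from about 140 GB to about 17.5 GB, an 8x reduction before any overhead is counted.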