Intel/DeepSeek-V3.1-Base-int4-mixed-AutoRound

Model Details

This model is a mixed-precision int4 quantization of deepseek-ai/DeepSeek-V3.1-Base with group_size 128 and symmetric quantization, generated by intel/auto-round via RTN (round-to-nearest, with no algorithm tuning). Non-expert layers, including the shared experts, fall back to 8 bits. Please refer to the "Generate the model" section for more details. Please follow the license of the original model.
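To make the terms above concrete, here is a minimal pure-Python sketch of symmetric, group-wise round-to-nearest quantization. It is a textbook illustration only, not AutoRound's actual implementation; the function names (quantize_group, quantize_rtn, dequantize) and the max-abs scale rule are assumptions chosen for clarity.

```python
import random


def quantize_group(weights, bits=4):
    """Symmetric RTN for one group: one shared scale, no zero point."""
    qmax = 2 ** (bits - 1) - 1      # 7 for int4
    qmin = -(2 ** (bits - 1))       # -8 for int4
    max_abs = max(abs(w) for w in weights)
    # Assumed scale rule: map the largest magnitude onto qmax.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer code, then clamp to the int range.
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return codes, scale


def quantize_rtn(weights, bits=4, group_size=128):
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Reconstruct approximate weights from (codes, scale) pairs."""
    return [c * scale for codes, scale in groups for c in codes]


# Small demo on random data (illustrative values, not real model weights).
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(256)]
groups = quantize_rtn(weights, bits=4, group_size=128)
restored = dequantize(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error)
```

With this scale rule, the reconstruction error of each weight is at most half of its group's scale, which is why smaller groups generally track the original weights more closely at the cost of storing more scales.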

How To Use

INT4 Inference

from transformers import AutoModelForCausalLM, AutoTokenizer

import torch

quantized_model_dir = "Intel/DeepSeek-V3.1-Base-int4-mixed-AutoRound"

model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    torch_dtype="auto",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
prompt = "There is a girl who likes adventure,"


inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True).to(model.device)

outputs = model.generate(
    **inputs,
    max_length=512,  # change this to align with the official usage
    do_sample=False  # change this to align with the official usage
)
decoded_outputs = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(decoded_outputs)
"""
There is a girl who likes adventure, and she is me. I love to travel and explore new places. I have been to many different countries, and I have seen some amazing things. I have also had some great adventures, and I have made some wonderful friends. I am always looking for new places to explore, and I am always up for a new adventure.
 
## What is the girl who likes adventure?
 
The girl who likes adventure is someone who is always up for a new challenge. She loves to explore new places and try new things. She is always looking for new ways to push herself and test her limits. She is never afraid to take risks and is always up for a good time.
 
## Why does she like adventure?
 
There are many reasons why someone might enjoy adventure. For some, it may be the thrill of the unknown or the excitement of trying something new. For others, it may be the opportunity to test their limits and see what they are capable of. And for some, it may simply be the chance to get away from the everyday routine and explore the world around them.
 
Whatever the reason, there is no doubt that adventure can be a great way to add excitement and variety to your life. It can also be a great way to meet new people and learn new things. So if you’re looking for a way to add a little more excitement to your life, why not give adventure a try?
 
## What are some of her favorite adventures?
 
There are many different types of adventures that people can enjoy. Some people like to go on physical adventures, such as hiking or climbing. Others prefer more mental challenges, such as solving puzzles or exploring new places. And still others enjoy a mix of both physical and mental challenges.
 
Some of the most popular adventures include:
 
1. Hiking: Hiking is a great way to get some exercise while enjoying the beauty of nature. There are many different trails to choose from, so you can find one that is perfect for your fitness level.
 
2. Climbing: Climbing is another great way to get some exercise while enjoying the outdoors. There are many different types of climbing, so you can find one that is perfect for your skill level.
 
3. Exploring: Exploring new places is a great way to learn about different cultures and see new things. It can be done by yourself or with a group of friends.
 
4. Solving puzzles: Solving puzzles is a great way to challenge your mind and have some fun. There are many different types of puzzles, so you can find one
"""

Generate the model

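import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-V3.1-Base"

# Load the model so its module names can be enumerated.
# The loading arguments here are an assumption; adjust them for your hardware.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="cpu")

layer_config = {}
for n, m in model.named_modules():
    if isinstance(m, torch.nn.Linear):
        # Routed experts are quantized to 4 bits; shared experts are excluded here.
        if "expert" in n and "shared_experts" not in n:
            layer_config[n] = {"bits": 4}
            print(n, 4)
        # Every other linear layer except lm_head falls back to 8 bits.
        elif n != "lm_head":
            layer_config[n] = {"bits": 8}
            print(n, 8)

# iters=0 selects RTN (no tuning iterations).
autoround = AutoRound(model, tokenizer, iters=0, layer_config=layer_config)
autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")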
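The bit-width rule in the loop above can be checked without loading the model. The sketch below restates it as a standalone function and applies it to a few module names; the helper bits_for_layer and the example names are hypothetical and only illustrate the kind of names the loop would see.

```python
def bits_for_layer(name):
    """Return the bit width the generation script would assign, or None if the layer is skipped."""
    if "expert" in name and "shared_experts" not in name:
        return 4      # routed expert layers
    if name != "lm_head":
        return 8      # all other linear layers, including shared experts
    return None       # lm_head gets no entry in layer_config


# Hypothetical module names, for illustration only.
example_names = [
    "model.layers.3.mlp.experts.0.gate_proj",
    "model.layers.3.mlp.shared_experts.up_proj",
    "model.layers.0.self_attn.o_proj",
    "lm_head",
]

layer_config = {
    name: {"bits": bits_for_layer(name)}
    for name in example_names
    if bits_for_layer(name) is not None
}

for name, cfg in layer_config.items():
    print(name, cfg["bits"])
```

Note that "shared_experts" contains the substring "expert", so the explicit exclusion in the first condition is what routes shared experts to 8 bits.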

Ethical Considerations and Limitations

The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

Therefore, before deploying any applications of the model, developers should perform safety testing.

Caveats and Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Here are a couple of useful links to learn more about Intel's AI software:

Intel Neural Compressor: https://github.com/intel/neural-compressor

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

Cite

@article{cheng2023optimize,
  title={Optimize weight rounding via signed gradient descent for the quantization of llms},
  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
  journal={arXiv preprint arXiv:2309.05516},
  year={2023}
}

arXiv: https://arxiv.org/abs/2309.05516 | GitHub: https://github.com/intel/auto-round
