# Saikrishna2511/qwen-multitask
Multi-task fine-tuned Qwen2.5-Coder-0.5B-Instruct checkpoint for code generation and documentation.
## Demo
Try the model in the browser: https://huggingface.co/spaces/Saikrishna2511/qwen-multitask-demo
## Tasks
This single checkpoint handles three tasks via different prompt prefixes:

### NL → Python (nl2py)
````
### Instruction: Write Python for: {natural language description}
### Response:
````

### Java → Python (java2py)
````
### Translate Java to Python:
```java
{java code}
```
Python:
````

### Code → Documentation (code2doc)
````
### Generate documentation for this Python code:
```python
{python code}
```
Documentation:
````

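
The three task templates can be assembled with a small helper. The following is a minimal sketch, not the author's code: `build_prompt` and the task keys are hypothetical names, the closing code fence after the Java or Python snippet is an assumption, and the marker text is copied from this card as displayed (the `Python:` and `Documentation:` lines may have lost a leading `###` in rendering). Check the project repo for the exact strings before relying on it.

```python
# Literal triple backtick, built this way so it can live inside a fenced block.
FENCE = "`" * 3


def build_prompt(task: str, text: str) -> str:
    """Build a prompt for one of the three tasks listed on this card.

    Task keys ("nl2py", "java2py", "code2doc") are hypothetical labels.
    """
    text = text.strip()
    if task == "nl2py":
        return f"### Instruction: Write Python for: {text}\n### Response:\n"
    if task == "java2py":
        # Closing fence before "Python:" is an assumption.
        return (
            "### Translate Java to Python:\n"
            f"{FENCE}java\n{text}\n{FENCE}\n"
            "Python:\n"
        )
    if task == "code2doc":
        # Closing fence before "Documentation:" is an assumption.
        return (
            "### Generate documentation for this Python code:\n"
            f"{FENCE}python\n{text}\n{FENCE}\n"
            "Documentation:\n"
        )
    raise ValueError(f"unknown task: {task!r}")


nl_prompt = build_prompt("nl2py", "return the factorial of n")
java_prompt = build_prompt("java2py", "int add(int a, int b) { return a + b; }")
```

The `nl2py` branch reproduces the prompt string used in the Usage section; the other two branches follow the same pattern with the snippet wrapped in a language-tagged fence.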
## Training
- **Base model:** [Qwen/Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct)
- **Stage 1:** Java→Python LoRA fine-tune on AVATAR-TC
- **Stage 2:** Multi-task LoRA on NL2Py, Code2Doc, code comments, and Java2Py replay
- **Method:** LoRA (r=16, alpha=32), merged weights for inference
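
To make the "merged weights" step concrete: under the standard LoRA formulation, merging folds the low-rank update into each adapted weight matrix, `W' = W + (alpha / r) * B @ A`, so with r=16 and alpha=32 the update is scaled by 2. The sketch below shows that arithmetic on toy matrices in plain Python. It is not the training code for this checkpoint, and the helper names are hypothetical.

```python
def lora_scale(r: int, alpha: float) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / r


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the number of rows of A."""
    r = len(lora_a)
    scale = lora_scale(r, alpha)
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example: 2x2 weight, rank-1 adapter, alpha chosen so alpha / r = 2,
# the same ratio as r=16, alpha=32 on this card.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # r x in  = 1 x 2
lora_b = [[0.5], [0.0]]  # out x r = 2 x 1
merged = merge_lora(w, lora_a, lora_b, alpha=2.0)
print(merged)
```

After merging, inference needs only the single merged matrix per layer, which is why the checkpoint loads as an ordinary causal LM without adapter files.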
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Saikrishna2511/qwen-multitask"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # requires the accelerate package
)

# NL -> Python prompt, following the nl2py template
prompt = "### Instruction: Write Python for: return the factorial of n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True is set explicitly so that temperature and top_p take effect
outputs = model.generate(
    **inputs, max_new_tokens=512, do_sample=True, temperature=0.2, top_p=0.95
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
For post-processing and all three task templates, see the project repo or the linked Gradio Space.
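
The card does not show the post-processing itself. Because `tokenizer.decode(outputs[0])` returns the prompt followed by the completion, one plausible minimal step is to keep only the text after the response marker. The sketch below assumes this. The function name `extract_response` and the rule of stopping at the next `###` header are assumptions, not the project's actual code.

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    """Return the completion that follows the first `marker` in `decoded`.

    If the marker is absent, the whole string is treated as the completion.
    The result is cut at the next line starting with "###", which is assumed
    to begin a new prompt section. Note this also truncates generated code
    that contains a line starting with "###".
    """
    head, found, tail = decoded.partition(marker)
    if not found:
        tail = head
    tail = tail.split("\n###", 1)[0]
    return tail.strip()


decoded = (
    "### Instruction: Write Python for: return the factorial of n\n"
    "### Response:\n"
    "def factorial(n):\n"
    "    return 1 if n <= 1 else n * factorial(n - 1)\n"
    "### Instruction: something else"
)
code = extract_response(decoded)
print(code)
```

For the other two tasks, pass the matching marker (for example `marker="Python:"`). An alternative that avoids string matching is to decode only the newly generated tokens, `outputs[0][inputs["input_ids"].shape[-1]:]`.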
## Limitations
- Small 0.5B model; quality varies by task and input complexity
- Trained primarily on Python; Java translation quality depends on training coverage
- Not intended for production use without further evaluation