Nemo-Recwnt — Checkpoint 200
This repository contains an intermediate LoRA/PEFT adapter checkpoint for mistralai/Mistral-Nemo-Instruct-2407. It is not a fully merged standalone model; it is an adapter intended to be loaded on top of the base model.
Model Summary
- Base model: mistralai/Mistral-Nemo-Instruct-2407
- Adapter type: PEFT / LoRA
- Task type: Causal language modeling (CAUSAL_LM)
- Checkpoint step: 200 (global_step = 200)
- LoRA rank (r): 16
- LoRA alpha: 16
- LoRA dropout: 0.05
What’s in this Folder
This checkpoint directory includes:
- adapter_model.safetensors: the LoRA adapter weights
- adapter_config.json: PEFT adapter configuration
- tokenizer.json, tokenizer_config.json, special_tokens_map.json: tokenizer assets
- trainer_state.json: training state metadata
- training_args.bin: saved training arguments
- scheduler.pt, rng_state_*.pth: learning-rate scheduler and random-number-generator state files
- zero_to_fp32.py: utility script typically written by DeepSpeed alongside its checkpoints in distributed training runs
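As a quick sanity check before loading, a local copy of the checkpoint folder can be compared against the file listing above. The following is a minimal, stdlib-only sketch. The helper name `missing_files` and the split into "required" and "tokenizer" groups are assumptions made for illustration and are not stated by this repository.

```python
from pathlib import Path

# Files needed to load the adapter itself.
# Assumption: PEFT only needs the weights and the config to attach the adapter.
REQUIRED = ["adapter_model.safetensors", "adapter_config.json"]
# Tokenizer assets listed for this checkpoint folder.
TOKENIZER = ["tokenizer.json", "tokenizer_config.json", "special_tokens_map.json"]


def missing_files(checkpoint_dir, names=REQUIRED + TOKENIZER):
    """Return the entries of `names` that are not regular files in `checkpoint_dir`."""
    root = Path(checkpoint_dir)
    return [name for name in names if not (root / name).is_file()]
```

Calling `missing_files("checkpoint-200")` on a complete local download (the path is hypothetical) would return an empty list. The training-state files are left out of the check because they should only matter when resuming training, not for inference.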
LoRA Configuration
The adapter targets the following projection layers:
- q_proj
- k_proj
- v_proj
- o_proj
- up_proj
- down_proj
- gate_proj
This suggests the model was fine-tuned broadly across both attention and MLP projection modules rather than a minimal attention-only LoRA setup.
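To make the configuration concrete, the settings reported in this card can be collected into one dictionary. This is only a sketch: it includes just the fields named in this card, omits anything else that adapter_config.json may contain, and the variable names `lora_fields` and `scaling` are illustrative.

```python
import json

# Values taken from the Model Summary and the target-module list in this card.
# Other adapter_config.json fields are deliberately left out.
lora_fields = {
    "task_type": "CAUSAL_LM",
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "up_proj", "down_proj", "gate_proj",     # MLP projections
    ],
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
# Assumption: no rank-stabilized variant is enabled for this adapter.
scaling = lora_fields["lora_alpha"] / lora_fields["r"]

print(json.dumps(lora_fields, indent=2))
print("scaling:", scaling)
```

With rank and alpha both at 16, the scaling factor is 1.0, so the low-rank update is applied without extra amplification. The keys follow the argument names of `peft.LoraConfig`, so `LoraConfig(**lora_fields)` should build an equivalent configuration where PEFT is installed.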
Training Status
This appears to be an intermediate training checkpoint, not a final fully documented release:
- the folder is named checkpoint-200
- trainer_state.json shows global_step: 200
- max_steps is 5716
- best_model_checkpoint is null
So this checkpoint looks like an early saved snapshot from a longer run rather than the final selected model.
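The numbers above can be turned into a rough progress figure. The sketch below assumes trainer_state.json is a JSON object with the three keys mentioned in this card. The helper names `load_state` and `training_progress` are hypothetical.

```python
import json


def load_state(path):
    """Read a trainer_state.json file into a dict."""
    with open(path) as f:
        return json.load(f)


def training_progress(state):
    """Summarize how far a run had progressed when the checkpoint was saved."""
    return {
        "fraction_done": state["global_step"] / state["max_steps"],
        "has_best_checkpoint": state.get("best_model_checkpoint") is not None,
    }


# Values reported in this card for checkpoint-200.
state = {"global_step": 200, "max_steps": 5716, "best_model_checkpoint": None}
summary = training_progress(state)
print(f"{summary['fraction_done']:.1%} of planned steps")  # 200 / 5716, about 3.5%
```

At roughly 3.5% of the planned steps and with no best checkpoint recorded, the figures are consistent with reading this as an early snapshot. For a local copy, `training_progress(load_state("checkpoint-200/trainer_state.json"))` would give the same summary (the path is hypothetical).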
Usage
Load with Transformers + PEFT
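```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "mistralai/Mistral-Nemo-Instruct-2407"
# Hub repo IDs have the form "namespace/name"; a folder inside the repo is
# passed separately via `subfolder`. This assumes the adapter files live under
# checkpoint-200/; drop `subfolder` if they sit at the repository root.
adapter_repo = "zaddyzaddy/Nemo-Recwnt"
adapter_subfolder = "checkpoint-200"

tokenizer = AutoTokenizer.from_pretrained(adapter_repo, subfolder=adapter_subfolder)

# Load the base model first, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_repo, subfolder=adapter_subfolder)

prompt = "Explain what this model is and how it should be loaded."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```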