Nemo-Recwnt — Checkpoint 200

This repository contains an intermediate LoRA/PEFT adapter checkpoint for mistralai/Mistral-Nemo-Instruct-2407. It is not a fully merged standalone model; it is an adapter intended to be loaded on top of the base model.

Model Summary

Base model: mistralai/Mistral-Nemo-Instruct-2407
Adapter type: PEFT / LoRA
Task type: Causal language modeling (CAUSAL_LM)
Checkpoint step: 200 (global_step = 200)
LoRA rank (r): 16
LoRA alpha: 16
LoRA dropout: 0.05
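
With r = 16 and alpha = 16, the usual LoRA scaling factor alpha / r works out to 1.0, so the low-rank update is added to the base weights unscaled. The following is a minimal sketch of that arithmetic in plain Python. It assumes the standard scaling (not the rank-stabilized variant), and the helper names and toy matrices are hypothetical, not part of this repository.

```python
def lora_scaling(lora_alpha: float, r: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    return lora_alpha / r


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(B, A, lora_alpha, r):
    """Effective weight update: (alpha / r) * (B @ A)."""
    s = lora_scaling(lora_alpha, r)
    return [[s * v for v in row] for row in matmul(B, A)]


# Values listed in the model summary above.
scaling = lora_scaling(lora_alpha=16, r=16)
print(scaling)  # 1.0

# Toy rank-1 example (hypothetical numbers, only to show the shape of the update).
B = [[1.0], [2.0]]      # shape (out=2, r=1)
A = [[0.5, -1.0, 0.0]]  # shape (r=1, in=3)
print(lora_delta(B, A, lora_alpha=1, r=1))
```

A scaling of 1.0 means changing the rank in a later run without also changing alpha would alter the effective update magnitude.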

What’s in this Folder

This checkpoint directory includes:

adapter_model.safetensors — the LoRA adapter weights
adapter_config.json — PEFT adapter configuration
tokenizer.json, tokenizer_config.json, special_tokens_map.json — tokenizer assets
trainer_state.json — training state metadata
training_args.bin — saved training arguments
scheduler.pt, rng_state_*.pth — learning-rate scheduler and RNG state files for resuming training
zero_to_fp32.py — DeepSpeed utility script for consolidating ZeRO-sharded checkpoints into fp32 weights
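
To confirm that a local copy of the checkpoint is complete, the file list above can be checked programmatically. This is a small sketch using only the standard library; the function name `missing_files` and the local directory name in the usage note are hypothetical.

```python
from pathlib import Path

# File names taken from the list above; rng_state_*.pth is matched as a glob.
EXPECTED = [
    "adapter_model.safetensors",
    "adapter_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "trainer_state.json",
    "training_args.bin",
    "scheduler.pt",
    "rng_state_*.pth",
    "zero_to_fp32.py",
]


def missing_files(checkpoint_dir):
    """Return the expected patterns that match nothing in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [pattern for pattern in EXPECTED if not any(root.glob(pattern))]
```

For example, `missing_files("checkpoint-200")` returns an empty list when every listed file is present. Only the two adapter files are needed for inference; the remaining files matter for the tokenizer or for resuming training.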

LoRA Configuration

The adapter targets the following projection layers:

q_proj
k_proj
v_proj
o_proj
up_proj
down_proj
gate_proj

The adapter therefore covers both the attention and the MLP projection modules, rather than using a minimal attention-only LoRA setup.
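
To give a sense of how much this broad targeting adds, the sketch below estimates the number of LoRA weights. The layer dimensions are assumptions about the base model (hidden size 5120, 40 layers, 32 attention heads, 8 key/value heads, head dimension 128, MLP width 14336) and are not read from this repository; they should be checked against the base model's config.json.

```python
# Assumed dimensions of mistralai/Mistral-Nemo-Instruct-2407 (verify against its config.json).
HIDDEN = 5120
INTERMEDIATE = 14336
N_LAYERS = 40
N_HEADS = 32
N_KV_HEADS = 8
HEAD_DIM = 128
R = 16  # LoRA rank from this checkpoint

# (in_features, out_features) for each targeted projection in one decoder layer.
SHAPES = {
    "q_proj": (HIDDEN, N_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, N_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, N_KV_HEADS * HEAD_DIM),
    "o_proj": (N_HEADS * HEAD_DIM, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r):
    """A is (r x in) and B is (out x r), so a module adds r * (in + out) weights."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in SHAPES.values())
total = per_layer * N_LAYERS
print(f"{per_layer:,} per layer, {total:,} total")
```

Under these assumed dimensions the adapter holds roughly 57 million weights, most of them in the three MLP projections.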

Training Status

This appears to be an intermediate training checkpoint, not a final fully documented release:

the folder is named checkpoint-200
trainer_state.json shows global_step: 200
max_steps is 5716
best_model_checkpoint is null

So this checkpoint looks like an early saved snapshot from a longer run rather than the final selected model.
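
The values above put this snapshot very early in the planned run. A short sketch of the arithmetic, using the field names as they appear in trainer_state.json; the helper name `training_progress` is hypothetical, and the JSON string below only restates the three values quoted above.

```python
import json


def training_progress(state):
    """Fraction of the planned run completed, from trainer_state.json fields."""
    return state["global_step"] / state["max_steps"]


# Values reported above; in practice the dict would come from json.load()
# on the checkpoint's trainer_state.json file.
state = json.loads(
    '{"global_step": 200, "max_steps": 5716, "best_model_checkpoint": null}'
)
print(f"{training_progress(state):.1%} of the run")  # 3.5% of the run
```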

Usage

Load with Transformers + PEFT

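from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "mistralai/Mistral-Nemo-Instruct-2407"
# Hub repo IDs take the form "namespace/name", so the checkpoint folder is
# passed separately. Assumption: the adapter files sit in a checkpoint-200
# subfolder of the repo; drop subfolder= if they are at the repo root.
adapter_repo = "zaddyzaddy/Nemo-Recwnt"
adapter_subfolder = "checkpoint-200"

# The tokenizer assets ship alongside the adapter weights.
tokenizer = AutoTokenizer.from_pretrained(adapter_repo, subfolder=adapter_subfolder)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base_model, adapter_repo, subfolder=adapter_subfolder)

prompt = "Explain what this model is and how it should be loaded."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))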
