Nemo-Recwnt — Checkpoint 200

This repository contains an intermediate LoRA/PEFT adapter checkpoint for mistralai/Mistral-Nemo-Instruct-2407. It is not a fully merged standalone model; it is an adapter intended to be loaded on top of the base model.

Model Summary

  • Base model: mistralai/Mistral-Nemo-Instruct-2407
  • Adapter type: PEFT / LoRA
  • Task type: Causal language modeling (CAUSAL_LM)
  • Checkpoint step: 200 (global_step = 200)
  • LoRA rank (r): 16
  • LoRA alpha: 16
  • LoRA dropout: 0.05
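
The rank and alpha values above determine how strongly the adapter update is applied. In the standard LoRA formulation the low-rank update is multiplied by alpha / r. The sketch below assumes that standard scaling (the card does not say whether a variant such as rank-stabilized scaling was used), and `lora_scaling` is a hypothetical helper, not a PEFT function.

```python
def lora_scaling(r: int, alpha: float) -> float:
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    if r <= 0:
        raise ValueError("LoRA rank must be positive")
    return alpha / r


# Values taken from the model summary above.
scaling = lora_scaling(r=16, alpha=16)
print(scaling)
```

With r = 16 and alpha = 16 the factor is 1.0, so under this assumption the low-rank update is added to the base weights without extra scaling.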

What’s in this Folder

This checkpoint directory includes:

  • adapter_model.safetensors — the LoRA adapter weights
  • adapter_config.json — PEFT adapter configuration
  • tokenizer.json, tokenizer_config.json, special_tokens_map.json — tokenizer assets
  • trainer_state.json — training state metadata
  • training_args.bin — saved training arguments
  • scheduler.pt, rng_state_*.pth — learning-rate scheduler and random-number-generator state, used to resume training
  • zero_to_fp32.py — utility script that DeepSpeed writes alongside ZeRO checkpoints to consolidate sharded weights into a single FP32 state dict
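
As a quick sanity check after downloading, the file list above can be compared against a local copy of the checkpoint. This is a minimal sketch: `missing_checkpoint_files` is a hypothetical helper, and the file names are simply the ones listed in this card.

```python
from pathlib import Path

# File names as listed in this card; the glob pattern covers per-rank RNG files.
EXPECTED_PATTERNS = [
    "adapter_model.safetensors",
    "adapter_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "trainer_state.json",
    "training_args.bin",
    "scheduler.pt",
    "rng_state_*.pth",
    "zero_to_fp32.py",
]


def missing_checkpoint_files(checkpoint_dir: str) -> list[str]:
    """Return the expected patterns that match no file in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [pattern for pattern in EXPECTED_PATTERNS if not any(root.glob(pattern))]
```

Calling `missing_checkpoint_files("checkpoint-200")` on a local download (the path is an assumption) returns an empty list when every listed file is present. For inference, only the adapter and tokenizer files are needed; the remaining entries matter when resuming training.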

LoRA Configuration

The adapter targets the following projection layers:

  • q_proj
  • k_proj
  • v_proj
  • o_proj
  • up_proj
  • down_proj
  • gate_proj

This suggests the model was fine-tuned broadly across both attention and MLP projection modules rather than with a minimal attention-only LoRA setup.
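
To give a feel for what targeting all seven projections means in size, the sketch below counts the LoRA parameters added at rank 16. Each targeted linear layer gains two matrices, A (r × in_features) and B (out_features × r). The layer shapes and layer count are assumptions about the base model that are not stated in this card; verify them against the base model's config.json before relying on the total.

```python
# Assumed dimensions for mistralai/Mistral-Nemo-Instruct-2407.
# These are NOT stated in this card; check the base model's config.json.
HIDDEN = 5120          # hidden_size (assumed)
Q_OUT = 32 * 128       # num_attention_heads * head_dim (assumed)
KV_OUT = 8 * 128       # num_key_value_heads * head_dim (assumed)
MLP = 14336            # intermediate_size (assumed)
NUM_LAYERS = 40        # num_hidden_layers (assumed)

# (in_features, out_features) for each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params_per_layer(shapes: dict, r: int) -> int:
    """Sum r * (in + out) over targeted layers: A is r x in, B is out x r."""
    return sum(r * (d_in + d_out) for d_in, d_out in shapes.values())


total = NUM_LAYERS * lora_params_per_layer(TARGET_SHAPES, r=16)
print(f"{total:,} LoRA parameters under the assumed shapes")
```

Under these assumed shapes, the three MLP projections account for roughly two thirds of the adapter parameters, which is why an attention-only setup would be considerably smaller.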

Training Status

This appears to be an intermediate training checkpoint, not a final fully documented release:

  • the folder is named checkpoint-200
  • trainer_state.json shows global_step: 200
  • max_steps is 5716
  • best_model_checkpoint is null

So this checkpoint looks like an early saved snapshot from a longer run rather than the final selected model.
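
The same reading of trainer_state.json can be reproduced programmatically. A minimal sketch, using only the three fields quoted above; `summarize_trainer_state` is a hypothetical helper, and in practice the dictionary would come from loading trainer_state.json from a local copy of the checkpoint.

```python
import json


def summarize_trainer_state(state: dict) -> dict:
    """Report how far through the planned run a checkpoint was saved."""
    step = state["global_step"]
    max_steps = state["max_steps"]
    return {
        "global_step": step,
        "max_steps": max_steps,
        "progress": step / max_steps,
        "has_best_checkpoint": state.get("best_model_checkpoint") is not None,
    }


# Values reported in this card, inlined here instead of reading the file.
state = json.loads(
    '{"global_step": 200, "max_steps": 5716, "best_model_checkpoint": null}'
)
summary = summarize_trainer_state(state)
print(f"{summary['progress']:.1%} of planned steps")
```

With the values from this card, the checkpoint sits at about 3.5% of the planned steps and no best checkpoint has been recorded.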

Usage

Load with Transformers + PEFT

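from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "mistralai/Mistral-Nemo-Instruct-2407"

# A Hub repo ID has the form "namespace/name", so the checkpoint directory is
# passed separately. This assumes the adapter files live in a "checkpoint-200"
# subfolder of the repo; if they sit at the repo root, drop the subfolder
# arguments below.
adapter_repo = "zaddyzaddy/Nemo-Recwnt"
adapter_subfolder = "checkpoint-200"

# The tokenizer assets ship with the adapter checkpoint.
tokenizer = AutoTokenizer.from_pretrained(adapter_repo, subfolder=adapter_subfolder)

# Load the base model first; the adapter only holds the LoRA weights.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base_model, adapter_repo, subfolder=adapter_subfolder)

prompt = "Explain what this model is and how it should be loaded."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))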