---
base_model: HuggingFaceTB/SmolLM2-360M
library_name: transformers
model_name: SmolLM2-360M-tldr-sft-2025-02-12_15-13
tags:
- generated_from_trainer
- trl
- sft
license: mit
---

# Smol-Hub-tldr

<div style="float: right; margin-left: 1em;">
<img src="https://cdn-uploads.huggingface.co/production/uploads/60107b385ac3e86b3ea4fc34/dD9vx3VOPB0Tf6C_ZjJT2.png" alt="Model visualization" width="200"/>
</div>

This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-360M](https://huggingface.co/HuggingFaceTB/SmolLM2-360M), trained to generate concise, one-sentence summaries of model and dataset cards from the Hugging Face Hub. These summaries are intended to be used for:

- creating useful tl;dr descriptions that give you a quick sense of what a dataset or model is for
- providing input text for creating embeddings for semantic search (see the sketch after this list); you can see a demo of this in [librarian-bots/huggingface-datasets-semantic-search](https://huggingface.co/spaces/librarian-bots/huggingface-datasets-semantic-search)
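
For the second use case, the generated summaries can be embedded with an off-the-shelf sentence-embedding model. Below is a minimal sketch; the `sentence-transformers` model used here is an assumption for illustration, not necessarily what the linked demo Space uses.

```python
from sentence_transformers import SentenceTransformer

# Assumed embedding model, chosen for illustration only.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

summaries = [
    "A 360M-parameter model fine-tuned to produce one-sentence summaries of Hub cards.",
    "A dataset of model card texts paired with short synthetic summaries.",
]
embeddings = embedder.encode(summaries)  # one vector per summary
print(embeddings.shape)  # (2, 384) for this embedding model
```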

The model was trained using supervised fine-tuning (SFT) with [TRL](https://github.com/huggingface/trl).
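
For reference, the sketch below shows a minimal TRL SFT setup of the kind described. The dataset id, data format, and hyperparameters are placeholders, not the actual training configuration.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical dataset of card texts paired with one-sentence summaries.
dataset = load_dataset("your-username/card-summaries", split="train")

training_args = SFTConfig(
    output_dir="SmolLM2-360M-tldr-sft",
    per_device_train_batch_size=4,  # placeholder hyperparameters
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-360M",  # TRL accepts a Hub model id directly
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```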

## Intended Use

The model is designed to generate brief, informative summaries of:

- Model cards: focusing on key capabilities and characteristics
- Dataset cards: capturing essential dataset characteristics and purposes

## Training Data

The model was trained on:

- Model card summaries generated by Llama 3.3 70B
- Dataset card summaries generated by Llama 3.3 70B

## Usage

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="davanstrien/SmolLM2-360M-tldr-sft-2025-02-12_15-13",
    device="cuda",  # remove or set to "cpu" if no GPU is available
)

# input_text should hold the full model or dataset card text to summarize.
input_text = "..."
output = generator(input_text, max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
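
To summarize an existing card from the Hub, you can load it with `huggingface_hub` and pass its body as the input. Continuing from the snippet above (the repo id here is only an example):

```python
from huggingface_hub import ModelCard

# Fetch any model card from the Hub; `.text` is the card body without YAML metadata.
card = ModelCard.load("HuggingFaceTB/SmolLM2-360M")

output = generator(card.text, max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```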

## Framework Versions

- TRL 0.14.0
- Transformers 4.48.3
- PyTorch 2.6.0
- Datasets 3.2.0
- Tokenizers 0.21.0