library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
base_model: sentence-transformers/all-MiniLM-L12-v2
metrics:
  - accuracy
widget:
  - text: Could you provide the average temperature, annual rainfall in Paris?
  - text: >-
      Can you provide a summary of the key points discussed about urban
      development?
  - text: Compare ces deux documents
  - text: What are the steps required to apply for a passport?
  - text: What is the basic definition of seismic design?
pipeline_tag: text-classification
inference: true
model-index:
  - name: SetFit with sentence-transformers/all-MiniLM-L12-v2
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.7333333333333333
            name: Accuracy

SetFit with sentence-transformers/all-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
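
The first step can be pictured with a small sketch. Assuming the usual SetFit recipe, labeled sentences are combined into pairs: two sentences with the same label form a positive pair (target similarity 1.0), and two sentences with different labels form a negative pair (target 0.0). The function name `make_pairs` is an illustrative assumption and not part of the SetFit API; the library's own sampler also handles oversampling and shuffling, which this sketch omits.

```python
from itertools import combinations


def make_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    Same-label pairs get target 1.0, different-label pairs get 0.0.
    Illustrative only: this is not the SetFit sampler itself.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy examples taken from queries that appear in this card
examples = [
    ("Compare the methodologies", "compare"),
    ("Compare ces deux documents", "compare"),
    ("Quels sont les points principaux à retenir", "summary"),
    ("What is the primary theme of the document ?", "summary"),
]

pairs = make_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))  # 6 2 4
```

Even four labeled sentences yield six training pairs, which is why contrastive fine-tuning can work with few examples per class.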

Model Details

Model Description

Model Sources

Model Labels

Label Examples
sub_queries
  • 'How can I use 3D print to build a bridge and how much would it be?'
  • 'Pourriez-vous détailler les critères spécifiques utilisés pour évaluer la durabilité des matériaux de construction, les types de systèmes HVAC les plus efficaces actuellement en usage dans les bâtiments verts, et les différentes méthodes employées pour réduire les déchets pendant la phase de construction ?'
  • 'Comment faire une etude de marche? Quelles sont les meilleures sources?'
summary
  • 'Quelles informations primordiales me conseillez-vous de mémoriser de ce document'
  • 'Quels sont les points principaux à retenir'
  • 'What is the primary theme of the document ?'
exchange
  • 'Pourriez-vous me fournir un résumé des points clés abordés dans notre discussion précédente ?'
  • 'Quels sont les points clés abordés dans notre discussion précédente ?'
  • 'Could you restate the main points discussed about acoustic engineering?'
simple_questions
  • 'Quelle est le principal moteur de la croissance économique ? Fais un post linkedin sur le sujet'
  • 'Pourriez-vous résumer les bénéfices que les utilisateurs peuvent tirer des récentes avancées en matériel informatique ?'
  • 'What is the purpose of environmental impact assessments?'
compare
  • 'Compare the methodologies'
  • 'Compare the nutritional information provided on these food labels'
  • 'Analysez comment la structure narrative de ces manuscrits influence leur message'

Evaluation

Metrics

Label Accuracy
all 0.7333

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("egis-group/router_mini_lm_l6")
# Run inference
preds = model("Compare ces deux documents")
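
Since the labels name query types, a prediction can drive a dispatch step. The sketch below is a hypothetical illustration: the `route` helper and the placeholder handlers are invented for this example, and `fake_classify` is a stand-in for the loaded model so that the snippet runs without downloading anything. In practice the `model` object from the snippet above would take its place.

```python
LABELS = ["sub_queries", "summary", "exchange", "simple_questions", "compare"]


def make_handler(label):
    """Return a placeholder handler; a real one would call a downstream pipeline."""
    def handler(query):
        return f"[{label}] {query}"
    return handler


# Hypothetical dispatch table: one handler per label listed in this card
HANDLERS = {label: make_handler(label) for label in LABELS}


def route(query, classify):
    """Classify the query, then pass it to the handler for the predicted label."""
    label = str(classify(query))
    if label not in HANDLERS:
        raise ValueError(f"unexpected label: {label!r}")
    return label, HANDLERS[label](query)


def fake_classify(query):
    """Stand-in for the SetFit model so the sketch runs offline (not a real classifier)."""
    return "compare" if query.lower().startswith("compare") else "simple_questions"


label, result = route("Compare ces deux documents", fake_classify)
print(label, "->", result)
```

Raising on an unknown label, rather than falling back silently, makes a mismatch between the model's label set and the dispatch table visible early.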

Training Details

Training Set Metrics

Training set Min Median Max
Word count 3 13.4636 48

Label Training Sample Count
negative 0
positive 0

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
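
Several of the values above are pairs. In SetFit's `TrainingArguments`, a tuple for `batch_size`, `num_epochs`, or `body_learning_rate` gives the value for the embedding fine-tuning phase followed by the value for the classifier training phase. The helper below, `split_phases`, is a hypothetical name used only to make that reading explicit; it does not call SetFit.

```python
def split_phases(value):
    """Return (embedding_phase, classifier_phase) for a scalar or a 2-tuple."""
    if isinstance(value, tuple):
        embedding, classifier = value
        return embedding, classifier
    # A scalar applies to both phases
    return value, value


# Tuple-valued hyperparameters copied from the list above
hyperparameters = {
    "batch_size": (16, 16),
    "num_epochs": (4, 4),
    "body_learning_rate": (2e-05, 1e-05),
}

for name, value in hyperparameters.items():
    embedding, classifier = split_phases(value)
    print(f"{name}: embedding phase = {embedding}, classifier phase = {classifier}")
```

With `end_to_end: False` and a LogisticRegression head, the classifier-phase body learning rate and `head_learning_rate` most likely have no effect here, since the SetFit documentation describes them as applying only to a differentiable PyTorch head.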

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.3239 -
0.0152 50 0.3443 -
0.0304 100 0.2282 -
0.0456 150 0.2576 -
0.0608 200 0.2587 -
0.0760 250 0.1747 -
0.0912 300 0.1916 -
0.1064 350 0.1638 -
0.1216 400 0.1459 -
0.1368 450 0.1322 -
0.1520 500 0.038 -
0.1672 550 0.0636 -
0.1824 600 0.0613 -
0.1976 650 0.0322 -
0.2128 700 0.0159 -
0.2280 750 0.0029 -
0.2432 800 0.0012 -
0.2584 850 0.0019 -
0.2736 900 0.0025 -
0.2888 950 0.0028 -
0.3040 1000 0.001 -
0.3192 1050 0.0014 -
0.3344 1100 0.0007 -
0.3497 1150 0.001 -
0.3649 1200 0.0014 -
0.3801 1250 0.0003 -
0.3953 1300 0.0005 -
0.4105 1350 0.0003 -
0.4257 1400 0.0004 -
0.4409 1450 0.0003 -
0.4561 1500 0.0004 -
0.4713 1550 0.0003 -
0.4865 1600 0.0002 -
0.5017 1650 0.0004 -
0.5169 1700 0.0003 -
0.5321 1750 0.0003 -
0.5473 1800 0.0004 -
0.5625 1850 0.0002 -
0.5777 1900 0.0001 -
0.5929 1950 0.0001 -
0.6081 2000 0.0003 -
0.6233 2050 0.0002 -
0.6385 2100 0.0001 -
0.6537 2150 0.0002 -
0.6689 2200 0.0002 -
0.6841 2250 0.0001 -
0.6993 2300 0.0002 -
0.7145 2350 0.0003 -
0.7297 2400 0.0002 -
0.7449 2450 0.0002 -
0.7601 2500 0.0001 -
0.7753 2550 0.0002 -
0.7905 2600 0.0001 -
0.8057 2650 0.0001 -
0.8209 2700 0.0001 -
0.8361 2750 0.0001 -
0.8513 2800 0.0001 -
0.8665 2850 0.0001 -
0.8817 2900 0.0001 -
0.8969 2950 0.0001 -
0.9121 3000 0.0001 -
0.9273 3050 0.0001 -
0.9425 3100 0.0001 -
0.9577 3150 0.0001 -
0.9729 3200 0.0001 -
0.9881 3250 0.0001 -
1.0 3289 - 0.0982
1.0033 3300 0.0001 -
1.0185 3350 0.0001 -
1.0337 3400 0.0001 -
1.0490 3450 0.0001 -
1.0642 3500 0.0001 -
1.0794 3550 0.0249 -
1.0946 3600 0.0002 -
1.1098 3650 0.0001 -
1.1250 3700 0.0001 -
1.1402 3750 0.0001 -
1.1554 3800 0.0001 -
1.1706 3850 0.0001 -
1.1858 3900 0.0001 -
1.2010 3950 0.0001 -
1.2162 4000 0.0001 -
1.2314 4050 0.0 -
1.2466 4100 0.0001 -
1.2618 4150 0.0 -
1.2770 4200 0.0001 -
1.2922 4250 0.0 -
1.3074 4300 0.0001 -
1.3226 4350 0.0001 -
1.3378 4400 0.0001 -
1.3530 4450 0.0001 -
1.3682 4500 0.0001 -
1.3834 4550 0.0001 -
1.3986 4600 0.0001 -
1.4138 4650 0.0001 -
1.4290 4700 0.0001 -
1.4442 4750 0.0001 -
1.4594 4800 0.0001 -
1.4746 4850 0.0001 -
1.4898 4900 0.0 -
1.5050 4950 0.0 -
1.5202 5000 0.0 -
1.5354 5050 0.0 -
1.5506 5100 0.0 -
1.5658 5150 0.0001 -
1.5810 5200 0.0001 -
1.5962 5250 0.0 -
1.6114 5300 0.0 -
1.6266 5350 0.0001 -
1.6418 5400 0.0001 -
1.6570 5450 0.0 -
1.6722 5500 0.0001 -
1.6874 5550 0.0 -
1.7026 5600 0.0001 -
1.7178 5650 0.0 -
1.7330 5700 0.0001 -
1.7483 5750 0.0001 -
1.7635 5800 0.0001 -
1.7787 5850 0.0001 -
1.7939 5900 0.0 -
1.8091 5950 0.0001 -
1.8243 6000 0.0001 -
1.8395 6050 0.0 -
1.8547 6100 0.0001 -
1.8699 6150 0.0 -
1.8851 6200 0.0 -
1.9003 6250 0.0 -
1.9155 6300 0.0 -
1.9307 6350 0.0001 -
1.9459 6400 0.0 -
1.9611 6450 0.0 -
1.9763 6500 0.0001 -
1.9915 6550 0.0 -
2.0 6578 - 0.0939
2.0067 6600 0.0001 -
2.0219 6650 0.0001 -
2.0371 6700 0.0001 -
2.0523 6750 0.0001 -
2.0675 6800 0.0 -
2.0827 6850 0.0 -
2.0979 6900 0.0 -
2.1131 6950 0.0 -
2.1283 7000 0.0001 -
2.1435 7050 0.0001 -
2.1587 7100 0.0 -
2.1739 7150 0.0 -
2.1891 7200 0.0001 -
2.2043 7250 0.0001 -
2.2195 7300 0.0 -
2.2347 7350 0.0 -
2.2499 7400 0.0 -
2.2651 7450 0.0 -
2.2803 7500 0.0 -
2.2955 7550 0.0001 -
2.3107 7600 0.0 -
2.3259 7650 0.0001 -
2.3411 7700 0.0 -
2.3563 7750 0.0001 -
2.3715 7800 0.0 -
2.3867 7850 0.0001 -
2.4019 7900 0.0 -
2.4171 7950 0.0 -
2.4324 8000 0.0 -
2.4476 8050 0.0001 -
2.4628 8100 0.0001 -
2.4780 8150 0.0 -
2.4932 8200 0.0001 -
2.5084 8250 0.0001 -
2.5236 8300 0.0001 -
2.5388 8350 0.0 -
2.5540 8400 0.0 -
2.5692 8450 0.0 -
2.5844 8500 0.0 -
2.5996 8550 0.0 -
2.6148 8600 0.0 -
2.6300 8650 0.0 -
2.6452 8700 0.0 -
2.6604 8750 0.0 -
2.6756 8800 0.0 -
2.6908 8850 0.0 -
2.7060 8900 0.0001 -
2.7212 8950 0.0 -
2.7364 9000 0.0 -
2.7516 9050 0.0001 -
2.7668 9100 0.0 -
2.7820 9150 0.0 -
2.7972 9200 0.0 -
2.8124 9250 0.0 -
2.8276 9300 0.0 -
2.8428 9350 0.0 -
2.8580 9400 0.0 -
2.8732 9450 0.0 -
2.8884 9500 0.0 -
2.9036 9550 0.0 -
2.9188 9600 0.0 -
2.9340 9650 0.0 -
2.9492 9700 0.0 -
2.9644 9750 0.0 -
2.9796 9800 0.0 -
2.9948 9850 0.0 -
3.0 9867 - 0.0951
3.0100 9900 0.0 -
3.0252 9950 0.0 -
3.0404 10000 0.0 -
3.0556 10050 0.0 -
3.0708 10100 0.0 -
3.0860 10150 0.0 -
3.1012 10200 0.0 -
3.1164 10250 0.0 -
3.1317 10300 0.0 -
3.1469 10350 0.0 -
3.1621 10400 0.0 -
3.1773 10450 0.0001 -
3.1925 10500 0.0 -
3.2077 10550 0.0 -
3.2229 10600 0.0 -
3.2381 10650 0.0 -
3.2533 10700 0.0 -
3.2685 10750 0.0 -
3.2837 10800 0.0 -
3.2989 10850 0.0 -
3.3141 10900 0.0 -
3.3293 10950 0.0 -
3.3445 11000 0.0 -
3.3597 11050 0.0 -
3.3749 11100 0.0 -
3.3901 11150 0.0 -
3.4053 11200 0.0 -
3.4205 11250 0.0 -
3.4357 11300 0.0 -
3.4509 11350 0.0 -
3.4661 11400 0.0 -
3.4813 11450 0.0 -
3.4965 11500 0.0 -
3.5117 11550 0.0 -
3.5269 11600 0.0 -
3.5421 11650 0.0 -
3.5573 11700 0.0 -
3.5725 11750 0.0 -
3.5877 11800 0.0 -
3.6029 11850 0.0 -
3.6181 11900 0.0 -
3.6333 11950 0.0 -
3.6485 12000 0.0 -
3.6637 12050 0.0 -
3.6789 12100 0.0 -
3.6941 12150 0.0 -
3.7093 12200 0.0 -
3.7245 12250 0.0 -
3.7397 12300 0.0 -
3.7549 12350 0.0 -
3.7701 12400 0.0 -
3.7853 12450 0.0 -
3.8005 12500 0.0 -
3.8157 12550 0.0 -
3.8310 12600 0.0 -
3.8462 12650 0.0 -
3.8614 12700 0.0 -
3.8766 12750 0.0 -
3.8918 12800 0.0 -
3.9070 12850 0.0 -
3.9222 12900 0.0 -
3.9374 12950 0.0 -
3.9526 13000 0.0 -
3.9678 13050 0.0 -
3.9830 13100 0.0 -
3.9982 13150 0.0 -
4.0 13156 - 0.0954
  • The bold row denotes the saved checkpoint.
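
The Epoch column follows from the Step column: the validation rows place the end of each epoch at steps 3289, 6578, 9867, and 13156, so one epoch is 3289 steps. The sketch below recomputes a few epoch values and picks the validation row with the lowest loss. The numbers are copied from the table; the function and variable names are illustrative. With `load_best_model_at_end: True`, the lowest-loss row is a plausible candidate for the saved checkpoint, though that is an inference rather than something the table states.

```python
STEPS_PER_EPOCH = 3289  # step at which the table reports epoch 1.0


def epoch_at(step):
    """Fractional epoch for a global step, rounded to four decimals as in the table."""
    return round(step / STEPS_PER_EPOCH, 4)


# (step, validation loss) rows copied from the table
validation = [(3289, 0.0982), (6578, 0.0939), (9867, 0.0951), (13156, 0.0954)]

# Row with the lowest validation loss
best_step, best_loss = min(validation, key=lambda row: row[1])

print(epoch_at(50))    # 0.0152
print(epoch_at(3300))  # 1.0033
print(best_step, epoch_at(best_step), best_loss)  # 6578 2.0 0.0939
```

Training loss reaches about 0.0001 well before the end of the first epoch while validation loss stays near 0.09 to 0.10, so the later epochs change the validation result very little.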

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.3.0+cu121
  • Datasets: 2.19.2
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}