---
library_name: transformers
base_model:
  - answerdotai/ModernBERT-base
license: apache-2.0
language:
  - en
pipeline_tag: zero-shot-classification
datasets:
  - nyu-mll/glue
  - facebook/anli
tags:
  - instruct
  - natural-language-inference
  - nli
---

# Model Card for Model ID

ModernBERT multi-task fine-tuned on tasksource NLI tasks, including MNLI, ANLI, SICK, WANLI, doc-nli, LingNLI, FOLIO, FOL-NLI, LogicNLI, Label-NLI, and all datasets in the table below. This is the equivalent of an "instruct" version. The model was trained for 200k steps on an Nvidia A30 GPU.

| test_name | test_accuracy |
|---|---|
| glue/mnli | 0.87 |
| glue/qnli | 0.93 |
| glue/rte | 0.85 |
| glue/mrpc | 0.87 |
| glue/qqp | 0.9 |
| glue/cola | 0.86 |
| glue/sst2 | 0.96 |
| super_glue/boolq | 0.64 |
| super_glue/cb | 0.89 |
| super_glue/multirc | 0.82 |
| super_glue/wic | 0.67 |
| super_glue/axg | 0.89 |
| anli/a1 | 0.66 |
| anli/a2 | 0.49 |
| anli/a3 | 0.44 |
| sick/label | 0.93 |
| sick/entailment_AB | 0.91 |
| snli | 0.83 |
| scitail/snli_format | 0.94 |
| hans | 1 |
| WANLI | 0.74 |
| recast/recast_ner | 0.87 |
| recast/recast_sentiment | 0.99 |
| recast/recast_verbnet | 0.88 |
| recast/recast_megaveridicality | 0.88 |
| recast/recast_verbcorner | 0.94 |
| recast/recast_kg_relations | 0.91 |
| recast/recast_factuality | 0.94 |
| recast/recast_puns | 0.96 |
| probability_words_nli/reasoning_1hop | 0.99 |
| probability_words_nli/usnli | 0.72 |
| probability_words_nli/reasoning_2hop | 0.98 |
| nan-nli | 0.85 |
| nli_fever | 0.78 |
| breaking_nli | 0.99 |
| conj_nli | 0.74 |
| fracas | 0.86 |
| dialogue_nli | 0.93 |
| mpe | 0.74 |
| dnc | 0.92 |
| recast_white/fnplus | 0.82 |
| recast_white/sprl | 0.9 |
| recast_white/dpr | 0.68 |
| robust_nli/IS_CS | 0.79 |
| robust_nli/LI_LI | 0.99 |
| robust_nli/ST_WO | 0.85 |
| robust_nli/PI_SP | 0.74 |
| robust_nli/PI_CD | 0.8 |
| robust_nli/ST_SE | 0.81 |
| robust_nli/ST_NE | 0.86 |
| robust_nli/ST_LM | 0.87 |
| robust_nli_is_sd | 1 |
| robust_nli_li_ts | 0.89 |
| add_one_rte | 0.94 |
| paws/labeled_final | 0.95 |
| pragmeval/pdtb | 0.64 |
| lex_glue/scotus | 0.55 |
| lex_glue/ledgar | 0.8 |
| dynasent/dynabench.dynasent.r1.all/r1 | 0.81 |
| dynasent/dynabench.dynasent.r2.all/r2 | 0.75 |
| cycic_classification | 0.9 |
| lingnli | 0.84 |
| monotonicity-entailment | 0.97 |
| scinli | 0.8 |
| naturallogic | 0.96 |
| dynahate | 0.78 |
| syntactic-augmentation-nli | 0.92 |
| autotnli | 0.94 |
| defeasible-nli/atomic | 0.81 |
| defeasible-nli/snli | 0.78 |
| help-nli | 0.96 |
| nli-veridicality-transitivity | 0.98 |
| lonli | 0.97 |
| dadc-limit-nli | 0.69 |
| folio | 0.66 |
| tomi-nli | 0.48 |
| puzzte | 0.6 |
| temporal-nli | 0.92 |
| counterfactually-augmented-snli | 0.79 |
| cnli | 0.87 |
| boolq-natural-perturbations | 0.66 |
| equate | 0.63 |
| logiqa-2.0-nli | 0.52 |
| mindgames | 0.96 |
| ConTRoL-nli | 0.67 |
| logical-fallacy | 0.37 |
| cladder | 0.87 |
| conceptrules_v2 | 1 |
| zero-shot-label-nli | 0.82 |
| scone | 0.98 |
| monli | 1 |
| SpaceNLI | 1 |
| propsegment/nli | 0.88 |
| FLD.v2/default | 0.91 |
| FLD.v2/star | 0.76 |
| SDOH-NLI | 0.98 |
| scifact_entailment | 0.84 |
| AdjectiveScaleProbe-nli | 0.99 |
| resnli | 1 |
| semantic_fragments_nli | 0.99 |
| dataset_train_nli | 0.94 |
| nlgraph | 0.94 |
| ruletaker | 0.99 |
| PARARULE-Plus | 1 |
| logical-entailment | 0.86 |
| nope | 0.44 |
| LogicNLI | 0.86 |
| contract-nli/contractnli_a/seg | 0.87 |
| contract-nli/contractnli_b/full | 0.79 |
| nli4ct_semeval2024 | 0.67 |
| biosift-nli | 0.92 |
| SIGA-nli | 0.53 |
| FOL-nli | 0.8 |
| doc-nli | 0.77 |
| mctest-nli | 0.87 |
| natural-language-satisfiability | 0.9 |
| idioms-nli | 0.81 |
| lifecycle-entailment | 0.78 |
| MSciNLI | 0.85 |
| hover-3way/nli | 0.88 |
| seahorse_summarization_evaluation | 0.73 |
| missing-item-prediction/contrastive | 0.79 |
| Pol_NLI | 0.89 |
| synthetic-retrieval-NLI/count | 0.64 |
| synthetic-retrieval-NLI/position | 0.89 |
| synthetic-retrieval-NLI/binary | 0.91 |
| babi_nli | 0.97 |
| gen_debiased_nli | 0.91 |

# Usage

## [ZS] Zero-shot classification pipeline

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="tasksource/ModernBERT-base-nli")

text = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(text, candidate_labels)
```

The NLI training data of this model includes label-nli, an NLI dataset specifically constructed to improve this kind of zero-shot classification.
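The pipeline also accepts the standard zero-shot options such as `multi_label` and `hypothesis_template`. A minimal sketch; the labels and template below are illustrative, not part of the model card:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="tasksource/ModernBERT-base-nli")

# Score each label independently (multi-label) with a custom hypothesis template.
classifier(
    "one day I will see the world",
    candidate_labels=["travel", "cooking", "dancing"],
    hypothesis_template="This example is about {}.",
    multi_label=True,
)
```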

## [NLI] Natural language inference pipeline

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="tasksource/ModernBERT-base-nli")
pipe([dict(text='there is a cat',
           text_pair='there is a black cat')])  # list of (premise, hypothesis) pairs
```
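For finer control over the per-class scores, the checkpoint can also be loaded directly. A minimal sketch, assuming the standard `AutoTokenizer`/`AutoModelForSequenceClassification` interface and that the label names are stored in `model.config.id2label`:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "tasksource/ModernBERT-base-nli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

# Encode one (premise, hypothesis) pair and read out class probabilities.
inputs = tokenizer("there is a cat", "there is a black cat", return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(-1).squeeze()

for i, p in enumerate(probs.tolist()):
    print(model.config.id2label[i], round(p, 3))
```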

## Backbone for further fine-tuning

This checkpoint has stronger reasoning and fine-grained abilities than the base version and can be used for further fine-tuning.
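A minimal fine-tuning sketch with the `Trainer` API, assuming a classification dataset with `text` and `label` columns; the dataset name, label count, and hyperparameters below are placeholders, not recommendations from the model authors:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

name = "tasksource/ModernBERT-base-nli"
tokenizer = AutoTokenizer.from_pretrained(name)
# num_labels + ignore_mismatched_sizes replace the NLI head with a fresh one.
model = AutoModelForSequenceClassification.from_pretrained(
    name, num_labels=2, ignore_mismatched_sizes=True
)

dataset = load_dataset("imdb")  # placeholder dataset with `text`/`label` columns
dataset = dataset.map(lambda x: tokenizer(x["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments("modernbert-nli-finetuned",
                           per_device_train_batch_size=16,
                           num_train_epochs=1),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```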

# Citation

```bibtex
@inproceedings{sileo-2024-tasksource,
    title = "tasksource: A Large Collection of {NLP} tasks with a Structured Dataset Preprocessing Framework",
    author = "Sileo, Damien",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.1361",
    pages = "15655--15684",
}
```