answerdotai/ModernBERT-base trained on schema JSONL
This is a Cross Encoder model fine-tuned from answerdotai/ModernBERT-base using the sentence-transformers library. It computes a score for each pair of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: answerdotai/ModernBERT-base
- Maximum Sequence Length: 8192 tokens
- Number of Output Labels: 1 label
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("thanhdath/reranker-ModernBERT-base-schema-bce")

# Get scores for pairs of texts
pairs = [
    ['Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity.\nHint: released in the year 1945 refers to movie_release_year = 1945;', 'Column: movies.movie_popularity ; Column meaning: Number of Mubi users who love this movie ; Column type: INTEGER ; Column has values: "105" ; Column has null values: False'],
    ['Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity.\nHint: released in the year 1945 refers to movie_release_year = 1945;', 'Column: lists_users.user_has_payment_method ; Column meaning: user_has_payment_method ; Column type: TEXT ; Column has values: "1" ; Column has null values: False'],
    ['Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity.\nHint: released in the year 1945 refers to movie_release_year = 1945;', 'Column: lists.list_description ; Column meaning: List description made by the user ; Column type: TEXT ; Column has values: "<p>[sorted by the year released]</p>", "<p>Films sorted by release year.</p>" ; Column has null values: False'],
    ['Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity.\nHint: released in the year 1945 refers to movie_release_year = 1945;', 'Column: lists.list_second_image_url ; Column meaning: list_second_image_url ; Column type: TEXT ; Column has values: ; Column has null values: False'],
    ['Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity.\nHint: released in the year 1945 refers to movie_release_year = 1945;', 'Column: movies.movie_release_year ; Column meaning: Release year of the movie ; Column type: INTEGER ; Column has values: "1945" ; Column has null values: False'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity.\nHint: released in the year 1945 refers to movie_release_year = 1945;',
    [
        'Column: movies.movie_popularity ; Column meaning: Number of Mubi users who love this movie ; Column type: INTEGER ; Column has values: "105" ; Column has null values: False',
        'Column: lists_users.user_has_payment_method ; Column meaning: user_has_payment_method ; Column type: TEXT ; Column has values: "1" ; Column has null values: False',
        'Column: lists.list_description ; Column meaning: List description made by the user ; Column type: TEXT ; Column has values: "<p>[sorted by the year released]</p>", "<p>Films sorted by release year.</p>" ; Column has null values: False',
        'Column: lists.list_second_image_url ; Column meaning: list_second_image_url ; Column type: TEXT ; Column has values: ; Column has null values: False',
        'Column: movies.movie_release_year ; Column meaning: Release year of the movie ; Column type: INTEGER ; Column has values: "1945" ; Column has null values: False',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
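The candidate texts in the example follow a fixed `Column: ... ; Column meaning: ... ; Column type: ... ; Column has values: ... ; Column has null values: ...` layout. A minimal sketch of building such strings from schema metadata and keeping the highest-scoring columns might look like the following. The helper names `format_column` and `top_columns` and the cutoff `k` are hypothetical and are not part of this model or of Sentence Transformers. The `ranked` list is made-up illustration data shaped like the output of `model.rank`, not real model scores.

```python
def format_column(table, column, meaning, col_type, values, has_null):
    """Build a candidate string in the layout used by the examples above (hypothetical helper)."""
    quoted = ", ".join(f'"{v}"' for v in values)
    # An empty value list leaves the field blank, as in the list_second_image_url example.
    values_part = f" {quoted}" if quoted else ""
    return (
        f"Column: {table}.{column} ; Column meaning: {meaning} ; "
        f"Column type: {col_type} ; Column has values:{values_part} ; "
        f"Column has null values: {has_null}"
    )


def top_columns(ranked, candidates, k=2):
    """Return the k highest-scoring candidate strings from rank-style results."""
    ordered = sorted(ranked, key=lambda r: r["score"], reverse=True)
    return [candidates[r["corpus_id"]] for r in ordered[:k]]


candidates = [
    format_column("movies", "movie_popularity", "Number of Mubi users who love this movie", "INTEGER", ["105"], False),
    format_column("lists_users", "user_has_payment_method", "user_has_payment_method", "TEXT", ["1"], False),
    format_column("movies", "movie_release_year", "Release year of the movie", "INTEGER", ["1945"], False),
]

# Made-up scores for illustration only; in practice use model.rank(query, candidates).
ranked = [
    {"corpus_id": 0, "score": 0.9},
    {"corpus_id": 1, "score": 0.1},
    {"corpus_id": 2, "score": 0.8},
]

selected = top_columns(ranked, candidates, k=2)
print(selected)
```

Keeping only the top `k` columns is one possible way to shrink a schema before passing it to a downstream text-to-SQL step; a score threshold would be an equally plausible choice.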
Training Details
Training Dataset
Unnamed Dataset
- Size: 180,144 training samples
- Columns: sentence_A, sentence_B, and label
- Approximate statistics based on the first 1000 samples:
| | sentence_A | sentence_B | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 95 characters; mean: 264.54 characters; max: 551 characters | min: 127 characters; mean: 177.59 characters; max: 313 characters | 0: ~75.00%; 1: ~25.00% |
- Samples:
sentence_A | sentence_B | label |
---|---|---|
Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity. Hint: released in the year 1945 refers to movie_release_year = 1945; | Column: movies.movie_popularity ; Column meaning: Number of Mubi users who love this movie ; Column type: INTEGER ; Column has values: "105" ; Column has null values: False | 1 |
Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity. Hint: released in the year 1945 refers to movie_release_year = 1945; | Column: lists_users.user_has_payment_method ; Column meaning: user_has_payment_method ; Column type: TEXT ; Column has values: "1" ; Column has null values: False | 0 |
Name movie titles released in year 1945. Sort the listing by the descending order of movie popularity. Hint: released in the year 1945 refers to movie_release_year = 1945; | Column: lists.list_description ; Column meaning: List description made by the user ; Column type: TEXT ; Column has values: "<p>[sorted by the year released]</p>", "<p>Films sorted by release year.</p>" ; Column has null values: False | 0 |
- Loss: BinaryCrossEntropyLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": 2.9947667121887207 }
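With an identity activation, the loss is computed directly on raw logits, and `pos_weight` scales the positive-label term. The sketch below is a plain-Python rendering of the per-sample formula used by PyTorch's `BCEWithLogitsLoss`, on the assumption that this is what the loss wraps; the function name `weighted_bce_with_logits` is hypothetical. The reported `pos_weight` is close to 3, which is consistent with the roughly 75% / 25% split between labels 0 and 1 in the statistics above, although the card does not state how the value was derived.

```python
import math

POS_WEIGHT = 2.9947667121887207  # value reported in the loss parameters above


def weighted_bce_with_logits(logit, label, pos_weight=POS_WEIGHT):
    """Per-sample binary cross-entropy on a raw logit, with the positive term up-weighted."""
    # log(sigmoid(x)), written in a numerically stable form for both signs of x
    if logit >= 0:
        log_sig = -math.log1p(math.exp(-logit))
    else:
        log_sig = logit - math.log1p(math.exp(logit))
    # log(1 - sigmoid(x)) = log(sigmoid(x)) - x
    log_one_minus_sig = log_sig - logit
    return -(pos_weight * label * log_sig + (1 - label) * log_one_minus_sig)


# At logit 0 the model is undecided: a missed positive costs pos_weight times
# as much as a missed negative.
print(weighted_bce_with_logits(0.0, 1), weighted_bce_with_logits(0.0, 0))
```

Up-weighting positives this way keeps the minority class (relevant columns) from being dominated by the more numerous negatives during training.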
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- learning_rate: 1e-05
- weight_decay: 0.01
- num_train_epochs: 10
- warmup_ratio: 0.1
- seed: 12
- fp16: True
- dataloader_num_workers: 4
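Together with the dataset size, these settings determine the length of training and the learning-rate schedule. The sketch below works through the arithmetic, assuming a single device, no gradient accumulation, and the usual linear warmup followed by linear decay to zero for the `linear` scheduler; the function name `linear_lr` is hypothetical. The resulting 1,408 steps per epoch agrees with the training log, where step 1400 corresponds to epoch 0.9943.

```python
import math

NUM_SAMPLES = 180_144   # training set size reported above
BATCH_SIZE = 128        # per_device_train_batch_size
NUM_EPOCHS = 10         # num_train_epochs
PEAK_LR = 1e-05         # learning_rate
WARMUP_RATIO = 0.1      # warmup_ratio

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(steps_per_epoch, total_steps, warmup_steps)
# 1408 14080 1408
print(linear_lr(warmup_steps), linear_lr(total_steps))
# 1e-05 0.0
```

Under these assumptions the warmup phase spans exactly the first epoch, and the learning rate then decays over the remaining nine.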
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 4
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0007 | 1 | 1.1598 |
0.0710 | 100 | 1.0653 |
0.1420 | 200 | 1.023 |
0.2131 | 300 | 0.9502 |
0.2841 | 400 | 0.8251 |
0.3551 | 500 | 0.6595 |
0.4261 | 600 | 0.5691 |
0.4972 | 700 | 0.5061 |
0.5682 | 800 | 0.4711 |
0.6392 | 900 | 0.4209 |
0.7102 | 1000 | 0.3882 |
0.7812 | 1100 | 0.3778 |
0.8523 | 1200 | 0.3743 |
0.9233 | 1300 | 0.3248 |
0.9943 | 1400 | 0.3283 |
1.0653 | 1500 | 0.2973 |
1.1364 | 1600 | 0.2653 |
1.2074 | 1700 | 0.263 |
1.2784 | 1800 | 0.2548 |
1.3494 | 1900 | 0.2329 |
1.4205 | 2000 | 0.2345 |
1.4915 | 2100 | 0.2303 |
1.5625 | 2200 | 0.205 |
1.6335 | 2300 | 0.2077 |
1.7045 | 2400 | 0.1836 |
1.7756 | 2500 | 0.186 |
1.8466 | 2600 | 0.1877 |
1.9176 | 2700 | 0.1757 |
1.9886 | 2800 | 0.1742 |
2.0597 | 2900 | 0.1278 |
2.1307 | 3000 | 0.104 |
2.2017 | 3100 | 0.1135 |
2.2727 | 3200 | 0.1087 |
2.3438 | 3300 | 0.0998 |
2.4148 | 3400 | 0.103 |
2.4858 | 3500 | 0.1029 |
2.5568 | 3600 | 0.096 |
2.6278 | 3700 | 0.1021 |
2.6989 | 3800 | 0.0836 |
2.7699 | 3900 | 0.08 |
2.8409 | 4000 | 0.0858 |
2.9119 | 4100 | 0.0816 |
2.9830 | 4200 | 0.0724 |
3.0540 | 4300 | 0.0451 |
3.125 | 4400 | 0.0415 |
3.1960 | 4500 | 0.0396 |
3.2670 | 4600 | 0.0397 |
3.3381 | 4700 | 0.0406 |
3.4091 | 4800 | 0.0468 |
3.4801 | 4900 | 0.0395 |
3.5511 | 5000 | 0.0399 |
3.6222 | 5100 | 0.0498 |
3.6932 | 5200 | 0.0453 |
3.7642 | 5300 | 0.0376 |
3.8352 | 5400 | 0.0472 |
3.9062 | 5500 | 0.038 |
3.9773 | 5600 | 0.0323 |
4.0483 | 5700 | 0.0214 |
4.1193 | 5800 | 0.0173 |
4.1903 | 5900 | 0.0229 |
4.2614 | 6000 | 0.0218 |
4.3324 | 6100 | 0.0216 |
4.4034 | 6200 | 0.0135 |
4.4744 | 6300 | 0.015 |
4.5455 | 6400 | 0.0204 |
4.6165 | 6500 | 0.0201 |
4.6875 | 6600 | 0.0145 |
4.7585 | 6700 | 0.0146 |
4.8295 | 6800 | 0.0191 |
4.9006 | 6900 | 0.0204 |
4.9716 | 7000 | 0.0129 |
5.0426 | 7100 | 0.0158 |
5.1136 | 7200 | 0.0045 |
5.1847 | 7300 | 0.0033 |
5.2557 | 7400 | 0.0041 |
5.3267 | 7500 | 0.0082 |
5.3977 | 7600 | 0.0129 |
5.4688 | 7700 | 0.0055 |
5.5398 | 7800 | 0.0047 |
5.6108 | 7900 | 0.0076 |
5.6818 | 8000 | 0.0085 |
5.7528 | 8100 | 0.0129 |
5.8239 | 8200 | 0.0089 |
5.8949 | 8300 | 0.0074 |
5.9659 | 8400 | 0.0075 |
6.0369 | 8500 | 0.0061 |
6.1080 | 8600 | 0.0025 |
6.1790 | 8700 | 0.003 |
6.25 | 8800 | 0.0055 |
6.3210 | 8900 | 0.0048 |
6.3920 | 9000 | 0.0036 |
6.4631 | 9100 | 0.0052 |
6.5341 | 9200 | 0.0014 |
6.6051 | 9300 | 0.0045 |
6.6761 | 9400 | 0.0022 |
6.7472 | 9500 | 0.0043 |
6.8182 | 9600 | 0.0036 |
6.8892 | 9700 | 0.0062 |
6.9602 | 9800 | 0.0059 |
7.0312 | 9900 | 0.0018 |
7.1023 | 10000 | 0.0029 |
7.1733 | 10100 | 0.002 |
7.2443 | 10200 | 0.004 |
7.3153 | 10300 | 0.002 |
7.3864 | 10400 | 0.0016 |
7.4574 | 10500 | 0.0031 |
7.5284 | 10600 | 0.0032 |
7.5994 | 10700 | 0.0025 |
7.6705 | 10800 | 0.0016 |
7.7415 | 10900 | 0.0014 |
7.8125 | 11000 | 0.0011 |
7.8835 | 11100 | 0.0005 |
7.9545 | 11200 | 0.0001 |
8.0256 | 11300 | 0.0001 |
8.0966 | 11400 | 0.0003 |
8.1676 | 11500 | 0.0 |
8.2386 | 11600 | 0.0021 |
8.3097 | 11700 | 0.0001 |
8.3807 | 11800 | 0.0002 |
8.4517 | 11900 | 0.0002 |
8.5227 | 12000 | 0.0027 |
8.5938 | 12100 | 0.0 |
8.6648 | 12200 | 0.0024 |
8.7358 | 12300 | 0.0001 |
8.8068 | 12400 | 0.0004 |
8.8778 | 12500 | 0.0001 |
8.9489 | 12600 | 0.0008 |
9.0199 | 12700 | 0.0001 |
9.0909 | 12800 | 0.0 |
9.1619 | 12900 | 0.0 |
9.2330 | 13000 | 0.0 |
9.3040 | 13100 | 0.0 |
9.375 | 13200 | 0.0 |
9.4460 | 13300 | 0.0 |
9.5170 | 13400 | 0.0 |
9.5881 | 13500 | 0.0 |
9.6591 | 13600 | 0.0 |
9.7301 | 13700 | 0.0 |
9.8011 | 13800 | 0.0 |
9.8722 | 13900 | 0.0 |
9.9432 | 14000 | 0.0021 |
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 5.1.0
- Transformers: 4.55.2
- PyTorch: 2.2.2+cu121
- Accelerate: 1.10.0
- Datasets: 4.0.0
- Tokenizers: 0.21.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}