---
license: mit
---
|
This repository contains the Composer training checkpoints for BioClinical ModernBERT [base](https://huggingface.co/thomas-sounack/BioClinical-ModernBERT-base) and [large](https://huggingface.co/thomas-sounack/BioClinical-ModernBERT-large).

The training checkpoints for Bio ModernBERT [base](https://huggingface.co/thomas-sounack/Bio-ModernBERT-base) and [large](https://huggingface.co/thomas-sounack/Bio-ModernBERT-large) are also included.

**A step-by-step guide to reproducing our model training with your own data is available [here](https://huggingface.co/blog/thomas-sounack/bioclinical-modernbert-tutorial)!**

If you use BioClinical ModernBERT in your work, please cite our [preprint](https://arxiv.org/abs/2506.10896):
|
|
|
```
@misc{sounack2025bioclinicalmodernbertstateoftheartlongcontext,
      title={BioClinical ModernBERT: A State-of-the-Art Long-Context Encoder for Biomedical and Clinical NLP},
      author={Thomas Sounack and Joshua Davis and Brigitte Durieux and Antoine Chaffin and Tom J. Pollard and Eric Lehman and Alistair E. W. Johnson and Matthew McDermott and Tristan Naumann and Charlotta Lindvall},
      year={2025},
      eprint={2506.10896},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.10896},
}
```