oracat/bert-paper-classifier-arxiv

@@ -5,31 +5,17 @@ tags:
 metrics:
 - accuracy
 model-index:
-- name: bert-paper-classifier-arxiv
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# bert-paper-classifier-arxiv
-This model is a fine-tuned version of [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.8488
-- Accuracy: 0.6562
-## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
 ## Training procedure
@@ -37,27 +23,12 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 64
-- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 10
-### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 1.2783        | 1.0   | 579  | 1.2078          | 0.6502   |
-| 0.946         | 2.0   | 1158 | 1.1323          | 0.669    |
-| 0.9028        | 3.0   | 1737 | 1.1039          | 0.675    |
-| 0.6021        | 4.0   | 2316 | 1.1990          | 0.6765   |
-| 0.4469        | 5.0   | 2895 | 1.3144          | 0.6697   |
-| 0.3425        | 6.0   | 3474 | 1.4456          | 0.6647   |
-| 0.2306        | 7.0   | 4053 | 1.5915          | 0.6583   |
-| 0.1458        | 8.0   | 4632 | 1.7172          | 0.6555   |
-| 0.1209        | 9.0   | 5211 | 1.8114          | 0.6525   |
-| 0.0792        | 10.0  | 5790 | 1.8488          | 0.6562   |
 ### Framework versions
@@ -65,4 +36,4 @@ The following hyperparameters were used during training:
 - Transformers 4.28.1
 - Pytorch 2.0.0+cu117
 - Datasets 2.11.0
-- Tokenizers 0.13.3

 metrics:
 - accuracy
 model-index:
+- name: bert-paper-classifier
   results: []
 ---
+# bert-paper-classifier
+This model is a fine-tuned version of [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract), trained on [a Kaggle dataset](https://www.kaggle.com/datasets/neelshah18/arxivdataset) of titles and abstracts of papers from arXiv.
 ## Intended uses & limitations
+This model is intended to predict [arXiv taxonomy terms](https://arxiv.org/category_taxonomy) from a paper's title and, optionally, its abstract. This is relevant for papers in mathematics, computer science, physics, etc. For example, it is likely to predict `cs.CL` (Computation and Language) as the most likely category for [the original Transformer paper](https://arxiv.org/abs/1706.03762).
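A minimal usage sketch of the intended use described above, not the author's own inference code. Several things here are assumptions: the repo id `oracat/bert-paper-classifier-arxiv` is taken from the page header, the newline separator between title and abstract is a guess (the card does not say how the two were joined during training), and ranking with a softmax assumes a single-label classification head. The helpers `build_input`, `top_categories` and `classify` are hypothetical names.

```python
import math

# Assumption: repo id as shown in the page header, not confirmed by the card text.
MODEL_ID = "oracat/bert-paper-classifier-arxiv"


def build_input(title, abstract=None):
    """Join title and optional abstract into one string.

    The newline separator is an assumption; the card does not document it.
    """
    title = title.strip()
    if abstract:
        return f"{title}\n{abstract.strip()}"
    return title


def top_categories(logits, id2label, k=3):
    """Apply a softmax to raw logits and return the k most likely (label, probability) pairs."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify(title, abstract=None, k=3):
    """Run the model end to end; needs `transformers`, `torch` and network access."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_input(title, abstract), truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_categories(logits, model.config.id2label, k)
```

Calling `classify("Attention Is All You Need")` would download the weights and return the top three taxonomy terms with their probabilities. The two pure helpers can be reused with logits obtained any other way.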
 ## Training procedure
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 128
+- eval_batch_size: 32
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 3
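To make the `linear` scheduler entry concrete, here is a small sketch of how the learning rate would decay from the listed `learning_rate` to zero. It assumes no warmup steps (the card lists none) and uses a hypothetical `total_steps` value, since the card does not report the number of optimizer steps for this run. `linear_lr` is an illustrative helper, not part of any library.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 5e-05,
    "train_batch_size": 128,
    "eval_batch_size": 32,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_epochs": 3,
}


def linear_lr(step, total_steps, base_lr=HYPERPARAMS["learning_rate"]):
    """Learning rate at `step` under linear decay to zero, assuming zero warmup steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps)


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; the real value depends on dataset size and batch size
    for step in (0, 250, 500, 750, 1000):
        print(step, linear_lr(step, total_steps))
```

The rate starts at the configured value, is halved at the midpoint of training, and reaches zero at the final step.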
 ### Framework versions
 - Transformers 4.28.1
 - Pytorch 2.0.0+cu117
 - Datasets 2.11.0
+- Tokenizers 0.13.3