added model
- .gitattributes +3 -0
- README.md +138 -0
- added_tokens.json +104 -0
- config.json +26 -0
- configuration_wsl.py +45 -0
- model.safetensors +3 -0
- modeling_wsl.py +456 -0
- special_tokens_map.json +154 -0
- spm.model +3 -0
- tokenizer.json +0 -0
- tokenizer_config.json +970 -0
.gitattributes
CHANGED

```diff
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+model filter=lfs diff=lfs merge=lfs -text
+model.safetensors filter=lfs diff=lfs merge=lfs -text
+spm.model filter=lfs diff=lfs merge=lfs -text
```
README.md
CHANGED

@@ -1,3 +1,141 @@

```yaml
---
license: cc-by-nc-sa-4.0
---
---
license:
- cc-by-nc-sa-4.0
source_datasets:
- original
task_ids:
- word-sense-disambiguation
pretty_name: word-sense-linking-dataset
tags:
- word-sense-linking
- word-sense-disambiguation
- lexical-semantics
size_categories:
- 10K<n<100K
extra_gated_fields:
  Email: text
  Company: text
  Country: country
  I want to use this dataset for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
  I agree to use this dataset for non-commercial use ONLY: checkbox
extra_gated_heading: "Acknowledge our [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://github.com/Babelscape/WSL/wsl_data_license.txt) to access the repository"
extra_gated_description: "Our team may take 2-3 days to process your request"
extra_gated_button_content: "Acknowledge license"
---
---
```

# Word Sense Linking: Disambiguating Outside the Sandbox

[ACL 2024](https://2024.aclweb.org/)
[Paper (ACL Anthology)](https://aclanthology.org/)
[Hugging Face Collection](https://huggingface.co/collections/Babelscape/word-sense-linking-66ace2182bc45680964cefcb)

## Model Description

The Word Sense Linking model identifies spans of text and disambiguates them to their most suitable senses from a reference inventory. The annotations are provided as sense keys from WordNet, a large lexical database of English.

## Installation

Installation from source:

```bash
git clone https://github.com/Babelscape/WSL
cd WSL
pip install -r requirements.txt
```

## Usage

WSL is composed of two main components: a retriever and a reader.
The retriever is responsible for retrieving relevant senses from a sense inventory (e.g., WordNet),
while the reader is responsible for extracting spans from the input text and linking them to the retrieved senses.
WSL can be used with the `from_pretrained` method to load a pre-trained pipeline.

```python
from wsl import WSL
from wsl.inference.data.objects import WSLOutput

wsl_model = WSL.from_pretrained("Babelscape/wsl-base")
wsl_out: WSLOutput = wsl_model("Bus drivers drive busses for a living.")
```

The call returns a `WSLOutput` like the following:

```python
WSLOutput(
    text='Bus drivers drive busses for a living.',
    tokens=['Bus', 'drivers', 'drive', 'busses', 'for', 'a', 'living', '.'],
    id=0,
    spans=[
        Span(start=0, end=11, label='bus driver: someone who drives a bus', text='Bus drivers'),
        Span(start=12, end=17, label='drive: operate or control a vehicle', text='drive'),
        Span(start=18, end=24, label='bus: a vehicle carrying many passengers; used for public transport', text='busses'),
        Span(start=31, end=37, label='living: the financial means whereby one lives', text='living')
    ],
    candidates=Candidates(
        candidates=[
            {"text": "bus driver: someone who drives a bus", "id": "bus_driver%1:18:00::", "metadata": {}},
            {"text": "driver: the operator of a motor vehicle", "id": "driver%1:18:00::", "metadata": {}},
            {"text": "driver: someone who drives animals that pull a vehicle", "id": "driver%1:18:02::", "metadata": {}},
            {"text": "bus: a vehicle carrying many passengers; used for public transport", "id": "bus%1:06:00::", "metadata": {}},
            {"text": "living: the financial means whereby one lives", "id": "living%1:26:00::", "metadata": {}}
        ]
    ),
)
```

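Each span's `label` is the gloss of the chosen sense, while the WordNet sense keys live in the `candidates` block. Below is a minimal sketch of recovering the sense key for each predicted span by matching glosses; the field layout is read off the repr above, and the library may expose more direct accessors.

```python
# Hypothetical post-processing, assuming the `wsl_out` object from the Usage
# snippet: map each span's gloss label back to its WordNet sense key.
gloss_to_key = {c["text"]: c["id"] for c in wsl_out.candidates.candidates}

for span in wsl_out.spans:
    sense_key = gloss_to_key.get(span.label)  # None if the gloss is not among the candidates
    print(f"{span.text!r} [{span.start}:{span.end}] -> {sense_key}")
# e.g. 'Bus drivers' [0:11] -> bus_driver%1:18:00::
```
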
## Model Performance

Here you can find the performance of our model on the [WSL evaluation dataset](https://huggingface.co/datasets/Babelscape/wsl).

### Validation (SE07)

| Models       | P    | R      | F1     |
|--------------|------|--------|--------|
| BEM_SUP      | 67.6 | 40.9   | 51.0   |
| BEM_HEU      | 70.8 | 51.2   | 59.4   |
| ConSeC_SUP   | 76.4 | 46.5   | 57.8   |
| ConSeC_HEU   | **76.7** | 55.4 | 64.3 |
| **Our Model**| 73.8 | **74.9** | **74.4** |

### Test (ALL_FULL)

| Models       | P    | R      | F1     |
|--------------|------|--------|--------|
| BEM_SUP      | 74.8 | 50.7   | 60.4   |
| BEM_HEU      | 76.6 | 61.2   | 68.0   |
| ConSeC_SUP   | 78.9 | 53.1   | 63.5   |
| ConSeC_HEU   | **80.4** | 64.3 | 71.5 |
| **Our Model**| 75.2 | **76.7** | **75.9** |

## Additional Information

**Licensing Information**: Contents of this repository are restricted to only non-commercial research purposes under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright of the dataset contents belongs to Babelscape.

## Citation Information

```bibtex
@inproceedings{bejgu-etal-2024-wsl,
    title = "Word Sense Linking: Disambiguating Outside the Sandbox",
    author = "Bejgu, Andrei Stefan and Barba, Edoardo and Procopio, Luigi and Fern{\'a}ndez-Castro, Alberte and Navigli, Roberto",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
}
```

**Contributions**: Thanks to [@andreim14](https://github.com/andreim14), [@edobobo](https://github.com/edobobo), [@poccio](https://github.com/poccio) and [@navigli](https://github.com/navigli) for adding this model.

added_tokens.json
ADDED

@@ -0,0 +1,104 @@

```json
{
  "--NME--": 128001,
  "[E-0]": 128002, "[E-10]": 128012, "[E-11]": 128013, "[E-12]": 128014, "[E-13]": 128015,
  "[E-14]": 128016, "[E-15]": 128017, "[E-16]": 128018, "[E-17]": 128019, "[E-18]": 128020,
  "[E-19]": 128021, "[E-1]": 128003, "[E-20]": 128022, "[E-21]": 128023, "[E-22]": 128024,
  "[E-23]": 128025, "[E-24]": 128026, "[E-25]": 128027, "[E-26]": 128028, "[E-27]": 128029,
  "[E-28]": 128030, "[E-29]": 128031, "[E-2]": 128004, "[E-30]": 128032, "[E-31]": 128033,
  "[E-32]": 128034, "[E-33]": 128035, "[E-34]": 128036, "[E-35]": 128037, "[E-36]": 128038,
  "[E-37]": 128039, "[E-38]": 128040, "[E-39]": 128041, "[E-3]": 128005, "[E-40]": 128042,
  "[E-41]": 128043, "[E-42]": 128044, "[E-43]": 128045, "[E-44]": 128046, "[E-45]": 128047,
  "[E-46]": 128048, "[E-47]": 128049, "[E-48]": 128050, "[E-49]": 128051, "[E-4]": 128006,
  "[E-50]": 128052, "[E-51]": 128053, "[E-52]": 128054, "[E-53]": 128055, "[E-54]": 128056,
  "[E-55]": 128057, "[E-56]": 128058, "[E-57]": 128059, "[E-58]": 128060, "[E-59]": 128061,
  "[E-5]": 128007, "[E-60]": 128062, "[E-61]": 128063, "[E-62]": 128064, "[E-63]": 128065,
  "[E-64]": 128066, "[E-65]": 128067, "[E-66]": 128068, "[E-67]": 128069, "[E-68]": 128070,
  "[E-69]": 128071, "[E-6]": 128008, "[E-70]": 128072, "[E-71]": 128073, "[E-72]": 128074,
  "[E-73]": 128075, "[E-74]": 128076, "[E-75]": 128077, "[E-76]": 128078, "[E-77]": 128079,
  "[E-78]": 128080, "[E-79]": 128081, "[E-7]": 128009, "[E-80]": 128082, "[E-81]": 128083,
  "[E-82]": 128084, "[E-83]": 128085, "[E-84]": 128086, "[E-85]": 128087, "[E-86]": 128088,
  "[E-87]": 128089, "[E-88]": 128090, "[E-89]": 128091, "[E-8]": 128010, "[E-90]": 128092,
  "[E-91]": 128093, "[E-92]": 128094, "[E-93]": 128095, "[E-94]": 128096, "[E-95]": 128097,
  "[E-96]": 128098, "[E-97]": 128099, "[E-98]": 128100, "[E-99]": 128101, "[E-9]": 128011,
  "[MASK]": 128000
}
```
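These are the 101 extra symbols that `additional_special_symbols: 101` in config.json accounts for: `--NME--` plus the 100 candidate markers `[E-0]` through `[E-99]`, assigned ids 128001 to 128101 right after DeBERTa's own `[MASK]` (id 128000). A quick check with the tokenizer shipped in this repo, assuming the repo id used in the README:

```python
from transformers import AutoTokenizer

# Repo id assumed from the README's Usage section.
tok = AutoTokenizer.from_pretrained("Babelscape/wsl-base")
print(tok.convert_tokens_to_ids(["--NME--", "[E-0]", "[E-99]"]))
# Expected from added_tokens.json: [128001, 128002, 128101]
```
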
config.json
ADDED

@@ -0,0 +1,26 @@

```json
{
  "_name_or_path": "/mnt/data2/neural/wsl-dataset/relik/pretrained/relik-reader/relik-reader",
  "activation": "gelu",
  "add_entity_embedding": null,
  "additional_special_symbols": 101,
  "additional_special_symbols_types": 0,
  "architectures": [
    "WSLReaderSpanModel"
  ],
  "auto_map": {
    "AutoConfig": "configuration_wsl.WSLReaderConfig",
    "AutoModel": "modeling_wsl.WSLReaderSpanModel"
  },
  "binary_end_logits": false,
  "default_reader_class": null,
  "entity_type_loss": false,
  "linears_hidden_size": 512,
  "model_type": "wsl-reader",
  "num_layers": null,
  "threshold": 0.5,
  "torch_dtype": "float32",
  "training": true,
  "transformer_model": "microsoft/deberta-v3-base",
  "transformers_version": "4.41.2",
  "use_last_k_layers": 1
}
```
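Because `auto_map` wires the custom classes into transformers' Auto* machinery, the checkpoint can plausibly be loaded without installing the WSL package, at the price of executing repo-hosted code. A minimal sketch (repo id assumed; `trust_remote_code=True` is what lets the unknown `wsl-reader` model type resolve):

```python
from transformers import AutoConfig, AutoModel

# auto_map routes AutoConfig -> configuration_wsl.WSLReaderConfig and
# AutoModel -> modeling_wsl.WSLReaderSpanModel, both fetched from the repo.
config = AutoConfig.from_pretrained("Babelscape/wsl-base", trust_remote_code=True)
model = AutoModel.from_pretrained("Babelscape/wsl-base", trust_remote_code=True)
print(config.model_type, type(model).__name__)  # wsl-reader WSLReaderSpanModel
```
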
configuration_wsl.py
ADDED

@@ -0,0 +1,45 @@

```python
from typing import Optional

from transformers import AutoConfig
from transformers.configuration_utils import PretrainedConfig


class WSLReaderConfig(PretrainedConfig):
    model_type = "wsl-reader"

    def __init__(
        self,
        transformer_model: str = "microsoft/deberta-v3-base",
        additional_special_symbols: int = 101,
        additional_special_symbols_types: Optional[int] = 0,
        num_layers: Optional[int] = None,
        activation: str = "gelu",
        linears_hidden_size: Optional[int] = 512,
        use_last_k_layers: int = 1,
        entity_type_loss: bool = False,
        add_entity_embedding: bool = None,
        binary_end_logits: bool = False,
        training: bool = False,
        default_reader_class: Optional[str] = None,
        threshold: Optional[float] = 0.5,
        **kwargs
    ) -> None:
        # TODO: add name_or_path to kwargs
        self.transformer_model = transformer_model
        self.additional_special_symbols = additional_special_symbols
        self.additional_special_symbols_types = additional_special_symbols_types
        self.num_layers = num_layers
        self.activation = activation
        self.linears_hidden_size = linears_hidden_size
        self.use_last_k_layers = use_last_k_layers
        self.entity_type_loss = entity_type_loss
        self.add_entity_embedding = (
            True
            if add_entity_embedding is None and entity_type_loss
            else add_entity_embedding
        )
        self.threshold = threshold
        self.binary_end_logits = binary_end_logits
        self.training = training
        self.default_reader_class = default_reader_class
        super().__init__(**kwargs)
```
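One non-obvious default above: `add_entity_embedding` resolves to `True` only when it is left as `None` and `entity_type_loss` is enabled; an explicit value always wins. A quick standalone check, assuming the module is importable as written:

```python
from configuration_wsl import WSLReaderConfig  # local module from this repo

cfg = WSLReaderConfig(entity_type_loss=True)   # add_entity_embedding left as None
print(cfg.add_entity_embedding)                # True (derived from entity_type_loss)
print(WSLReaderConfig().add_entity_embedding)  # None: entity_type_loss is False
print(WSLReaderConfig(add_entity_embedding=False).add_entity_embedding)  # False (explicit wins)
```
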
model.safetensors
ADDED

@@ -0,0 +1,3 @@

```
version https://git-lfs.github.com/spec/v1
oid sha256:a34288a25277d1027d00b8f6b739e4b8efdc4bd46d968640f603a87901fc90f1
size 747233940
```
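This is a Git LFS pointer; the actual weights blob is roughly 747 MB. Once downloaded, the tensor names and shapes can be listed lazily without materializing the whole model; a sketch using the `safetensors` library (local file path assumed):

```python
from safetensors import safe_open

# Lazily open the checkpoint and inspect a few tensor names and shapes;
# only the header is parsed until an actual slice is read.
with safe_open("model.safetensors", framework="pt") as f:
    for name in list(f.keys())[:5]:
        print(name, f.get_slice(name).get_shape())
```
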
modeling_wsl.py
ADDED

@@ -0,0 +1,456 @@

```python
from typing import Any, Dict, Optional

import torch
from transformers import AutoModel, PreTrainedModel
from transformers.activations import ClippedGELUActivation, GELUActivation
from transformers.configuration_utils import PretrainedConfig
from transformers.modeling_utils import PoolerEndLogits

from .configuration_wsl import WSLReaderConfig


class WSLReaderSample:
    def __init__(self, **kwargs):
        super().__setattr__("_d", {})
        self._d = kwargs

    def __getattribute__(self, item):
        return super(WSLReaderSample, self).__getattribute__(item)

    def __getattr__(self, item):
        if item.startswith("__") and item.endswith("__"):
            # this is likely some python library-specific variable (such as __deepcopy__ for copy)
            # better follow standard behavior here
            raise AttributeError(item)
        elif item in self._d:
            return self._d[item]
        else:
            return None

    def __setattr__(self, key, value):
        if key in self._d:
            self._d[key] = value
        else:
            super().__setattr__(key, value)
            self._d[key] = value


activation2functions = {
    "relu": torch.nn.ReLU(),
    "gelu": GELUActivation(),
    "gelu_10": ClippedGELUActivation(-10, 10),
}


class PoolerEndLogitsBi(PoolerEndLogits):
    def __init__(self, config: PretrainedConfig):
        super().__init__(config)
        self.dense_1 = torch.nn.Linear(config.hidden_size, 2)

    def forward(
        self,
        hidden_states: torch.FloatTensor,
        start_states: Optional[torch.FloatTensor] = None,
        start_positions: Optional[torch.LongTensor] = None,
        p_mask: Optional[torch.FloatTensor] = None,
    ) -> torch.FloatTensor:
        if p_mask is not None:
            p_mask = p_mask.unsqueeze(-1)
        logits = super().forward(
            hidden_states,
            start_states,
            start_positions,
            p_mask,
        )
        return logits


class WSLReaderSpanModel(PreTrainedModel):
    config_class = WSLReaderConfig

    def __init__(self, config: WSLReaderConfig, *args, **kwargs):
        super().__init__(config)
        # Transformer model declaration
        self.config = config
        self.transformer_model = (
            AutoModel.from_pretrained(self.config.transformer_model)
            if self.config.num_layers is None
            else AutoModel.from_pretrained(
                self.config.transformer_model, num_hidden_layers=self.config.num_layers
            )
        )
        self.transformer_model.resize_token_embeddings(
            self.transformer_model.config.vocab_size
            + self.config.additional_special_symbols
        )

        self.activation = self.config.activation
        self.linears_hidden_size = self.config.linears_hidden_size
        self.use_last_k_layers = self.config.use_last_k_layers

        # named entity detection layers
        self.ned_start_classifier = self._get_projection_layer(
            self.activation, last_hidden=2, layer_norm=False
        )
        if self.config.binary_end_logits:
            self.ned_end_classifier = PoolerEndLogitsBi(self.transformer_model.config)
        else:
            self.ned_end_classifier = PoolerEndLogits(self.transformer_model.config)

        # END entity disambiguation layer
        self.ed_start_projector = self._get_projection_layer(self.activation)
        self.ed_end_projector = self._get_projection_layer(self.activation)

        self.training = self.config.training

        # criterion
        self.criterion = torch.nn.CrossEntropyLoss()

    def _get_projection_layer(
        self,
        activation: str,
        last_hidden: Optional[int] = None,
        input_hidden=None,
        layer_norm: bool = True,
    ) -> torch.nn.Sequential:
        head_components = [
            torch.nn.Dropout(0.1),
            torch.nn.Linear(
                (
                    self.transformer_model.config.hidden_size * self.use_last_k_layers
                    if input_hidden is None
                    else input_hidden
                ),
                self.linears_hidden_size,
            ),
            activation2functions[activation],
            torch.nn.Dropout(0.1),
            torch.nn.Linear(
                self.linears_hidden_size,
                self.linears_hidden_size if last_hidden is None else last_hidden,
            ),
        ]

        if layer_norm:
            head_components.append(
                torch.nn.LayerNorm(
                    self.linears_hidden_size if last_hidden is None else last_hidden,
                    self.transformer_model.config.layer_norm_eps,
                )
            )

        return torch.nn.Sequential(*head_components)

    def _mask_logits(self, logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        mask = mask.unsqueeze(-1)
        if next(self.parameters()).dtype == torch.float16:
            logits = logits * (1 - mask) - 65500 * mask
        else:
            logits = logits * (1 - mask) - 1e30 * mask
        return logits

    def _get_model_features(
        self,
        input_ids: torch.Tensor,
        attention_mask: torch.Tensor,
        token_type_ids: Optional[torch.Tensor],
    ):
        model_input = {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "output_hidden_states": self.use_last_k_layers > 1,
        }

        if token_type_ids is not None:
            model_input["token_type_ids"] = token_type_ids

        model_output = self.transformer_model(**model_input)

        if self.use_last_k_layers > 1:
            model_features = torch.cat(
                model_output[1][-self.use_last_k_layers :], dim=-1
            )
        else:
            model_features = model_output[0]

        return model_features

    def compute_ned_end_logits(
        self,
        start_predictions,
        start_labels,
        model_features,
        prediction_mask,
        batch_size,
    ) -> Optional[torch.Tensor]:
        # todo: maybe when constraining on the spans,
        # we should not use a prediction_mask for the end tokens.
        # at least we should not during training imo
        start_positions = start_labels if self.training else start_predictions
        start_positions_indices = (
            torch.arange(start_positions.size(1), device=start_positions.device)
            .unsqueeze(0)
            .expand(batch_size, -1)[start_positions > 0]
        ).to(start_positions.device)

        if len(start_positions_indices) > 0:
            expanded_features = model_features.repeat_interleave(
                torch.sum(start_positions > 0, dim=-1), dim=0
            )
            expanded_prediction_mask = prediction_mask.repeat_interleave(
                torch.sum(start_positions > 0, dim=-1), dim=0
            )
            end_logits = self.ned_end_classifier(
                hidden_states=expanded_features,
                start_positions=start_positions_indices,
                p_mask=expanded_prediction_mask,
            )

            return end_logits

        return None

    def compute_classification_logits(
        self,
        model_features_start,
        model_features_end,
        special_symbols_features,
    ) -> torch.Tensor:
        model_start_features = self.ed_start_projector(model_features_start)
        model_end_features = self.ed_end_projector(model_features_end)
        model_start_features_symbols = self.ed_start_projector(special_symbols_features)
        model_end_features_symbols = self.ed_end_projector(special_symbols_features)

        model_ed_features = torch.cat(
            [model_start_features, model_end_features], dim=-1
        )
        special_symbols_representation = torch.cat(
            [model_start_features_symbols, model_end_features_symbols], dim=-1
        )

        logits = torch.bmm(
            model_ed_features,
            torch.permute(special_symbols_representation, (0, 2, 1)),
        )

        logits = self._mask_logits(logits, (model_features_start == -100).all(2).long())
        return logits

    def forward(
        self,
        input_ids: torch.Tensor,
        attention_mask: torch.Tensor,
        token_type_ids: Optional[torch.Tensor] = None,
        prediction_mask: Optional[torch.Tensor] = None,
        special_symbols_mask: Optional[torch.Tensor] = None,
        start_labels: Optional[torch.Tensor] = None,
        end_labels: Optional[torch.Tensor] = None,
        use_predefined_spans: bool = False,
        *args,
        **kwargs,
    ) -> Dict[str, Any]:
        batch_size, seq_len = input_ids.shape

        model_features = self._get_model_features(
            input_ids, attention_mask, token_type_ids
        )

        ned_start_labels = None

        # named entity detection if required
        if use_predefined_spans:  # no need to compute spans
            ned_start_logits, ned_start_probabilities, ned_start_predictions = (
                None,
                None,
                (
                    torch.clone(start_labels)
                    if start_labels is not None
                    else torch.zeros_like(input_ids)
                ),
            )
            ned_end_logits, ned_end_probabilities, ned_end_predictions = (
                None,
                None,
                (
                    torch.clone(end_labels)
                    if end_labels is not None
                    else torch.zeros_like(input_ids)
                ),
            )
            ned_start_predictions[ned_start_predictions > 0] = 1
            ned_end_predictions[end_labels > 0] = 1
            ned_end_predictions = ned_end_predictions[~(end_labels == -100).all(2)]

        else:  # compute spans
            # start boundary prediction
            ned_start_logits = self.ned_start_classifier(model_features)
            ned_start_logits = self._mask_logits(ned_start_logits, prediction_mask)
            ned_start_probabilities = torch.softmax(ned_start_logits, dim=-1)
            ned_start_predictions = ned_start_probabilities.argmax(dim=-1)

            # end boundary prediction
            ned_start_labels = (
                torch.zeros_like(start_labels) if start_labels is not None else None
            )

            if ned_start_labels is not None:
                ned_start_labels[start_labels == -100] = -100
                ned_start_labels[start_labels > 0] = 1

            ned_end_logits = self.compute_ned_end_logits(
                ned_start_predictions,
                ned_start_labels,
                model_features,
                prediction_mask,
                batch_size,
            )

            if ned_end_logits is not None:
                ned_end_probabilities = torch.softmax(ned_end_logits, dim=-1)
                if not self.config.binary_end_logits:
                    ned_end_predictions = torch.argmax(
                        ned_end_probabilities, dim=-1, keepdim=True
                    )
                    ned_end_predictions = torch.zeros_like(
                        ned_end_probabilities
                    ).scatter_(1, ned_end_predictions, 1)
                else:
                    ned_end_predictions = torch.argmax(ned_end_probabilities, dim=-1)
            else:
                ned_end_logits, ned_end_probabilities = None, None
                ned_end_predictions = ned_start_predictions.new_zeros(
                    batch_size, seq_len
                )

            if not self.training:
                # if len(ned_end_predictions.shape) < 2:
                #     print(ned_end_predictions)
                end_preds_count = ned_end_predictions.sum(1)
                # If there are no end predictions for a start prediction, remove the start prediction
                if (end_preds_count == 0).any() and (ned_start_predictions > 0).any():
                    ned_start_predictions[ned_start_predictions == 1] = (
                        end_preds_count != 0
                    ).long()
                    ned_end_predictions = ned_end_predictions[end_preds_count != 0]

            if end_labels is not None:
                end_labels = end_labels[~(end_labels == -100).all(2)]

        start_position, end_position = (
            (start_labels, end_labels)
            if self.training
            else (ned_start_predictions, ned_end_predictions)
        )
        start_counts = (start_position > 0).sum(1)
        if (start_counts > 0).any():
            ned_end_predictions = ned_end_predictions.split(start_counts.tolist())
        # Entity disambiguation
        if (end_position > 0).sum() > 0:
            ends_count = (end_position > 0).sum(1)
            model_entity_start = torch.repeat_interleave(
                model_features[start_position > 0], ends_count, dim=0
            )
            model_entity_end = torch.repeat_interleave(
                model_features, start_counts, dim=0
            )[end_position > 0]
            ents_count = torch.nn.utils.rnn.pad_sequence(
                torch.split(ends_count, start_counts.tolist()),
                batch_first=True,
                padding_value=0,
            ).sum(1)

            model_entity_start = torch.nn.utils.rnn.pad_sequence(
                torch.split(model_entity_start, ents_count.tolist()),
                batch_first=True,
                padding_value=-100,
            )

            model_entity_end = torch.nn.utils.rnn.pad_sequence(
                torch.split(model_entity_end, ents_count.tolist()),
                batch_first=True,
                padding_value=-100,
            )

            ed_logits = self.compute_classification_logits(
                model_entity_start,
                model_entity_end,
                model_features[special_symbols_mask].view(
                    batch_size, -1, model_features.shape[-1]
                ),
            )
            ed_probabilities = torch.softmax(ed_logits, dim=-1)
            ed_predictions = torch.argmax(ed_probabilities, dim=-1)
        else:
            ed_logits, ed_probabilities, ed_predictions = (
                None,
                ned_start_predictions.new_zeros(batch_size, seq_len),
                ned_start_predictions.new_zeros(batch_size),
            )
        # output build
        output_dict = dict(
            batch_size=batch_size,
            ned_start_logits=ned_start_logits,
            ned_start_probabilities=ned_start_probabilities,
            ned_start_predictions=ned_start_predictions,
            ned_end_logits=ned_end_logits,
            ned_end_probabilities=ned_end_probabilities,
            ned_end_predictions=ned_end_predictions,
            ed_logits=ed_logits,
            ed_probabilities=ed_probabilities,
            ed_predictions=ed_predictions,
        )

        # compute loss if labels
        if start_labels is not None and end_labels is not None and self.training:
            # named entity detection loss

            # start
            if ned_start_logits is not None:
                ned_start_loss = self.criterion(
                    ned_start_logits.view(-1, ned_start_logits.shape[-1]),
                    ned_start_labels.view(-1),
                )
            else:
                ned_start_loss = 0

            # end
            # use ents_count to assign the labels to the correct positions i.e. using end_labels -> [[0,0,4,0], [0,0,0,2]] -> [4,2] (this is just an element, for batch we need to mask it with ents_count), ie -> [[4,2,-100,-100], [3,1,2,-100], [1,3,2,5]]

            if ned_end_logits is not None:
                ed_labels = end_labels.clone()
                ed_labels = torch.nn.utils.rnn.pad_sequence(
                    torch.split(ed_labels[ed_labels > 0], ents_count.tolist()),
                    batch_first=True,
                    padding_value=-100,
                )
                end_labels[end_labels > 0] = 1
                if not self.config.binary_end_logits:
                    # transform label to position in the sequence
                    end_labels = end_labels.argmax(dim=-1)
                    ned_end_loss = self.criterion(
                        ned_end_logits.view(-1, ned_end_logits.shape[-1]),
                        end_labels.view(-1),
                    )
                else:
                    ned_end_loss = self.criterion(
                        ned_end_logits.reshape(-1, ned_end_logits.shape[-1]),
                        end_labels.reshape(-1).long(),
                    )

                # entity disambiguation loss
                ed_loss = self.criterion(
                    ed_logits.view(-1, ed_logits.shape[-1]),
                    ed_labels.view(-1).long(),
                )

            else:
                ned_end_loss = 0
                ed_loss = 0

            output_dict["ned_start_loss"] = ned_start_loss
            output_dict["ned_end_loss"] = ned_end_loss
            output_dict["ed_loss"] = ed_loss

            output_dict["loss"] = ned_start_loss + ned_end_loss + ed_loss

        return output_dict
```
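One detail worth noting in this file: `WSLReaderSample` at the top is a permissive attribute bag, where reads of missing, non-dunder attributes return `None` instead of raising and writes are mirrored into the backing `_d` dict. A minimal demonstration of that contract, using the class exactly as defined above (the attribute names here are illustrative, not part of the model's schema):

```python
# Hypothetical usage of WSLReaderSample from modeling_wsl.py.
sample = WSLReaderSample(text="Bus drivers drive busses.", id=0)

print(sample.text)   # "Bus drivers drive busses."
print(sample.spans)  # None: missing non-dunder attributes resolve to None, not AttributeError
sample.spans = []    # writes go through __setattr__ and also land in the backing dict
print(sample._d["spans"])  # []
```
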
special_tokens_map.json
ADDED

@@ -0,0 +1,154 @@

```json
{
  "additional_special_tokens": [
    "--NME--",
    "[E-0]", "[E-1]", "[E-2]", "[E-3]", "[E-4]", "[E-5]", "[E-6]", "[E-7]", "[E-8]", "[E-9]",
    "[E-10]", "[E-11]", "[E-12]", "[E-13]", "[E-14]", "[E-15]", "[E-16]", "[E-17]", "[E-18]", "[E-19]",
    "[E-20]", "[E-21]", "[E-22]", "[E-23]", "[E-24]", "[E-25]", "[E-26]", "[E-27]", "[E-28]", "[E-29]",
    "[E-30]", "[E-31]", "[E-32]", "[E-33]", "[E-34]", "[E-35]", "[E-36]", "[E-37]", "[E-38]", "[E-39]",
    "[E-40]", "[E-41]", "[E-42]", "[E-43]", "[E-44]", "[E-45]", "[E-46]", "[E-47]", "[E-48]", "[E-49]",
    "[E-50]", "[E-51]", "[E-52]", "[E-53]", "[E-54]", "[E-55]", "[E-56]", "[E-57]", "[E-58]", "[E-59]",
    "[E-60]", "[E-61]", "[E-62]", "[E-63]", "[E-64]", "[E-65]", "[E-66]", "[E-67]", "[E-68]", "[E-69]",
    "[E-70]", "[E-71]", "[E-72]", "[E-73]", "[E-74]", "[E-75]", "[E-76]", "[E-77]", "[E-78]", "[E-79]",
    "[E-80]", "[E-81]", "[E-82]", "[E-83]", "[E-84]", "[E-85]", "[E-86]", "[E-87]", "[E-88]", "[E-89]",
    "[E-90]", "[E-91]", "[E-92]", "[E-93]", "[E-94]", "[E-95]", "[E-96]", "[E-97]", "[E-98]", "[E-99]"
  ],
  "bos_token": {"content": "[CLS]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "cls_token": {"content": "[CLS]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "eos_token": {"content": "[SEP]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "mask_token": {"content": "[MASK]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "pad_token": {"content": "[PAD]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "sep_token": {"content": "[SEP]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "unk_token": {"content": "[UNK]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false}
}
```
spm.model
ADDED

@@ -0,0 +1,3 @@

```
version https://git-lfs.github.com/spec/v1
oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
size 2464616
```
tokenizer.json
ADDED

The diff for this file is too large to render. See raw diff.
tokenizer_config.json
ADDED

@@ -0,0 +1,970 @@

```json
{
  "add_prefix_space": true,
  "added_tokens_decoder": {
    "0":      {"content": "[PAD]",   "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "1":      {"content": "[CLS]",   "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "2":      {"content": "[SEP]",   "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "3":      {"content": "[UNK]",   "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128000": {"content": "[MASK]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128001": {"content": "--NME--", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128002": {"content": "[E-0]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128003": {"content": "[E-1]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128004": {"content": "[E-2]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128005": {"content": "[E-3]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128006": {"content": "[E-4]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128007": {"content": "[E-5]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128008": {"content": "[E-6]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128009": {"content": "[E-7]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128010": {"content": "[E-8]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128011": {"content": "[E-9]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128012": {"content": "[E-10]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128013": {"content": "[E-11]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128014": {"content": "[E-12]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128015": {"content": "[E-13]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128016": {"content": "[E-14]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128017": {"content": "[E-15]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128018": {"content": "[E-16]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128019": {"content": "[E-17]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128020": {"content": "[E-18]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128021": {"content": "[E-19]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128022": {"content": "[E-20]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128023": {"content": "[E-21]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128024": {"content": "[E-22]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128025": {"content": "[E-23]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128026": {"content": "[E-24]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128027": {"content": "[E-25]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128028": {"content": "[E-26]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128029": {"content": "[E-27]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128030": {"content": "[E-28]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128031": {"content": "[E-29]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128032": {"content": "[E-30]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128033": {"content": "[E-31]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128034": {"content": "[E-32]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128035": {"content": "[E-33]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128036": {"content": "[E-34]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128037": {"content": "[E-35]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128038": {"content": "[E-36]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128039": {"content": "[E-37]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128040": {"content": "[E-38]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128041": {"content": "[E-39]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128042": {"content": "[E-40]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128043": {"content": "[E-41]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128044": {"content": "[E-42]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128045": {"content": "[E-43]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128046": {"content": "[E-44]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128047": {"content": "[E-45]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128048": {"content": "[E-46]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128049": {"content": "[E-47]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128050": {"content": "[E-48]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128051": {"content": "[E-49]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128052": {"content": "[E-50]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128053": {"content": "[E-51]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128054": {"content": "[E-52]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "128055": {"content": "[E-53]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
```
"128056": {
|
| 485 |
+
"content": "[E-54]",
|
| 486 |
+
"lstrip": false,
|
| 487 |
+
"normalized": false,
|
| 488 |
+
"rstrip": false,
|
| 489 |
+
"single_word": false,
|
| 490 |
+
"special": true
|
| 491 |
+
},
|
| 492 |
+
"128057": {
|
| 493 |
+
"content": "[E-55]",
|
| 494 |
+
"lstrip": false,
|
| 495 |
+
"normalized": false,
|
| 496 |
+
"rstrip": false,
|
| 497 |
+
"single_word": false,
|
| 498 |
+
"special": true
|
| 499 |
+
},
|
| 500 |
+
"128058": {
|
| 501 |
+
"content": "[E-56]",
|
| 502 |
+
"lstrip": false,
|
| 503 |
+
"normalized": false,
|
| 504 |
+
"rstrip": false,
|
| 505 |
+
"single_word": false,
|
| 506 |
+
"special": true
|
| 507 |
+
},
|
| 508 |
+
"128059": {
|
| 509 |
+
"content": "[E-57]",
|
| 510 |
+
"lstrip": false,
|
| 511 |
+
"normalized": false,
|
| 512 |
+
"rstrip": false,
|
| 513 |
+
"single_word": false,
|
| 514 |
+
"special": true
|
| 515 |
+
},
|
| 516 |
+
"128060": {
|
| 517 |
+
"content": "[E-58]",
|
| 518 |
+
"lstrip": false,
|
| 519 |
+
"normalized": false,
|
| 520 |
+
"rstrip": false,
|
| 521 |
+
"single_word": false,
|
| 522 |
+
"special": true
|
| 523 |
+
},
|
| 524 |
+
"128061": {
|
| 525 |
+
"content": "[E-59]",
|
| 526 |
+
"lstrip": false,
|
| 527 |
+
"normalized": false,
|
| 528 |
+
"rstrip": false,
|
| 529 |
+
"single_word": false,
|
| 530 |
+
"special": true
|
| 531 |
+
},
|
| 532 |
+
"128062": {
|
| 533 |
+
"content": "[E-60]",
|
| 534 |
+
"lstrip": false,
|
| 535 |
+
"normalized": false,
|
| 536 |
+
"rstrip": false,
|
| 537 |
+
"single_word": false,
|
| 538 |
+
"special": true
|
| 539 |
+
},
|
| 540 |
+
"128063": {
|
| 541 |
+
"content": "[E-61]",
|
| 542 |
+
"lstrip": false,
|
| 543 |
+
"normalized": false,
|
| 544 |
+
"rstrip": false,
|
| 545 |
+
"single_word": false,
|
| 546 |
+
"special": true
|
| 547 |
+
},
|
| 548 |
+
"128064": {
|
| 549 |
+
"content": "[E-62]",
|
| 550 |
+
"lstrip": false,
|
| 551 |
+
"normalized": false,
|
| 552 |
+
"rstrip": false,
|
| 553 |
+
"single_word": false,
|
| 554 |
+
"special": true
|
| 555 |
+
},
|
| 556 |
+
"128065": {
|
| 557 |
+
"content": "[E-63]",
|
| 558 |
+
"lstrip": false,
|
| 559 |
+
"normalized": false,
|
| 560 |
+
"rstrip": false,
|
| 561 |
+
"single_word": false,
|
| 562 |
+
"special": true
|
| 563 |
+
},
|
| 564 |
+
"128066": {
|
| 565 |
+
"content": "[E-64]",
|
| 566 |
+
"lstrip": false,
|
| 567 |
+
"normalized": false,
|
| 568 |
+
"rstrip": false,
|
| 569 |
+
"single_word": false,
|
| 570 |
+
"special": true
|
| 571 |
+
},
|
| 572 |
+
"128067": {
|
| 573 |
+
"content": "[E-65]",
|
| 574 |
+
"lstrip": false,
|
| 575 |
+
"normalized": false,
|
| 576 |
+
"rstrip": false,
|
| 577 |
+
"single_word": false,
|
| 578 |
+
"special": true
|
| 579 |
+
},
|
| 580 |
+
"128068": {
|
| 581 |
+
"content": "[E-66]",
|
| 582 |
+
"lstrip": false,
|
| 583 |
+
"normalized": false,
|
| 584 |
+
"rstrip": false,
|
| 585 |
+
"single_word": false,
|
| 586 |
+
"special": true
|
| 587 |
+
},
|
| 588 |
+
"128069": {
|
| 589 |
+
"content": "[E-67]",
|
| 590 |
+
"lstrip": false,
|
| 591 |
+
"normalized": false,
|
| 592 |
+
"rstrip": false,
|
| 593 |
+
"single_word": false,
|
| 594 |
+
"special": true
|
| 595 |
+
},
|
| 596 |
+
"128070": {
|
| 597 |
+
"content": "[E-68]",
|
| 598 |
+
"lstrip": false,
|
| 599 |
+
"normalized": false,
|
| 600 |
+
"rstrip": false,
|
| 601 |
+
"single_word": false,
|
| 602 |
+
"special": true
|
| 603 |
+
},
|
| 604 |
+
"128071": {
|
| 605 |
+
"content": "[E-69]",
|
| 606 |
+
"lstrip": false,
|
| 607 |
+
"normalized": false,
|
| 608 |
+
"rstrip": false,
|
| 609 |
+
"single_word": false,
|
| 610 |
+
"special": true
|
| 611 |
+
},
|
| 612 |
+
"128072": {
|
| 613 |
+
"content": "[E-70]",
|
| 614 |
+
"lstrip": false,
|
| 615 |
+
"normalized": false,
|
| 616 |
+
"rstrip": false,
|
| 617 |
+
"single_word": false,
|
| 618 |
+
"special": true
|
| 619 |
+
},
|
| 620 |
+
"128073": {
|
| 621 |
+
"content": "[E-71]",
|
| 622 |
+
"lstrip": false,
|
| 623 |
+
"normalized": false,
|
| 624 |
+
"rstrip": false,
|
| 625 |
+
"single_word": false,
|
| 626 |
+
"special": true
|
| 627 |
+
},
|
| 628 |
+
"128074": {
|
| 629 |
+
"content": "[E-72]",
|
| 630 |
+
"lstrip": false,
|
| 631 |
+
"normalized": false,
|
| 632 |
+
"rstrip": false,
|
| 633 |
+
"single_word": false,
|
| 634 |
+
"special": true
|
| 635 |
+
},
|
| 636 |
+
"128075": {
|
| 637 |
+
"content": "[E-73]",
|
| 638 |
+
"lstrip": false,
|
| 639 |
+
"normalized": false,
|
| 640 |
+
"rstrip": false,
|
| 641 |
+
"single_word": false,
|
| 642 |
+
"special": true
|
| 643 |
+
},
|
| 644 |
+
"128076": {
|
| 645 |
+
"content": "[E-74]",
|
| 646 |
+
"lstrip": false,
|
| 647 |
+
"normalized": false,
|
| 648 |
+
"rstrip": false,
|
| 649 |
+
"single_word": false,
|
| 650 |
+
"special": true
|
| 651 |
+
},
|
| 652 |
+
"128077": {
|
| 653 |
+
"content": "[E-75]",
|
| 654 |
+
"lstrip": false,
|
| 655 |
+
"normalized": false,
|
| 656 |
+
"rstrip": false,
|
| 657 |
+
"single_word": false,
|
| 658 |
+
"special": true
|
| 659 |
+
},
|
| 660 |
+
"128078": {
|
| 661 |
+
"content": "[E-76]",
|
| 662 |
+
"lstrip": false,
|
| 663 |
+
"normalized": false,
|
| 664 |
+
"rstrip": false,
|
| 665 |
+
"single_word": false,
|
| 666 |
+
"special": true
|
| 667 |
+
},
|
| 668 |
+
"128079": {
|
| 669 |
+
"content": "[E-77]",
|
| 670 |
+
"lstrip": false,
|
| 671 |
+
"normalized": false,
|
| 672 |
+
"rstrip": false,
|
| 673 |
+
"single_word": false,
|
| 674 |
+
"special": true
|
| 675 |
+
},
|
| 676 |
+
"128080": {
|
| 677 |
+
"content": "[E-78]",
|
| 678 |
+
"lstrip": false,
|
| 679 |
+
"normalized": false,
|
| 680 |
+
"rstrip": false,
|
| 681 |
+
"single_word": false,
|
| 682 |
+
"special": true
|
| 683 |
+
},
|
| 684 |
+
"128081": {
|
| 685 |
+
"content": "[E-79]",
|
| 686 |
+
"lstrip": false,
|
| 687 |
+
"normalized": false,
|
| 688 |
+
"rstrip": false,
|
| 689 |
+
"single_word": false,
|
| 690 |
+
"special": true
|
| 691 |
+
},
|
| 692 |
+
"128082": {
|
| 693 |
+
"content": "[E-80]",
|
| 694 |
+
"lstrip": false,
|
| 695 |
+
"normalized": false,
|
| 696 |
+
"rstrip": false,
|
| 697 |
+
"single_word": false,
|
| 698 |
+
"special": true
|
| 699 |
+
},
|
| 700 |
+
"128083": {
|
| 701 |
+
"content": "[E-81]",
|
| 702 |
+
"lstrip": false,
|
| 703 |
+
"normalized": false,
|
| 704 |
+
"rstrip": false,
|
| 705 |
+
"single_word": false,
|
| 706 |
+
"special": true
|
| 707 |
+
},
|
| 708 |
+
"128084": {
|
| 709 |
+
"content": "[E-82]",
|
| 710 |
+
"lstrip": false,
|
| 711 |
+
"normalized": false,
|
| 712 |
+
"rstrip": false,
|
| 713 |
+
"single_word": false,
|
| 714 |
+
"special": true
|
| 715 |
+
},
|
| 716 |
+
"128085": {
|
| 717 |
+
"content": "[E-83]",
|
| 718 |
+
"lstrip": false,
|
| 719 |
+
"normalized": false,
|
| 720 |
+
"rstrip": false,
|
| 721 |
+
"single_word": false,
|
| 722 |
+
"special": true
|
| 723 |
+
},
|
| 724 |
+
"128086": {
|
| 725 |
+
"content": "[E-84]",
|
| 726 |
+
"lstrip": false,
|
| 727 |
+
"normalized": false,
|
| 728 |
+
"rstrip": false,
|
| 729 |
+
"single_word": false,
|
| 730 |
+
"special": true
|
| 731 |
+
},
|
| 732 |
+
"128087": {
|
| 733 |
+
"content": "[E-85]",
|
| 734 |
+
"lstrip": false,
|
| 735 |
+
"normalized": false,
|
| 736 |
+
"rstrip": false,
|
| 737 |
+
"single_word": false,
|
| 738 |
+
"special": true
|
| 739 |
+
},
|
| 740 |
+
"128088": {
|
| 741 |
+
"content": "[E-86]",
|
| 742 |
+
"lstrip": false,
|
| 743 |
+
"normalized": false,
|
| 744 |
+
"rstrip": false,
|
| 745 |
+
"single_word": false,
|
| 746 |
+
"special": true
|
| 747 |
+
},
|
| 748 |
+
"128089": {
|
| 749 |
+
"content": "[E-87]",
|
| 750 |
+
"lstrip": false,
|
| 751 |
+
"normalized": false,
|
| 752 |
+
"rstrip": false,
|
| 753 |
+
"single_word": false,
|
| 754 |
+
"special": true
|
| 755 |
+
},
|
| 756 |
+
"128090": {
|
| 757 |
+
"content": "[E-88]",
|
| 758 |
+
"lstrip": false,
|
| 759 |
+
"normalized": false,
|
| 760 |
+
"rstrip": false,
|
| 761 |
+
"single_word": false,
|
| 762 |
+
"special": true
|
| 763 |
+
},
|
| 764 |
+
"128091": {
|
| 765 |
+
"content": "[E-89]",
|
| 766 |
+
"lstrip": false,
|
| 767 |
+
"normalized": false,
|
| 768 |
+
"rstrip": false,
|
| 769 |
+
"single_word": false,
|
| 770 |
+
"special": true
|
| 771 |
+
},
|
| 772 |
+
"128092": {
|
| 773 |
+
"content": "[E-90]",
|
| 774 |
+
"lstrip": false,
|
| 775 |
+
"normalized": false,
|
| 776 |
+
"rstrip": false,
|
| 777 |
+
"single_word": false,
|
| 778 |
+
"special": true
|
| 779 |
+
},
|
| 780 |
+
"128093": {
|
| 781 |
+
"content": "[E-91]",
|
| 782 |
+
"lstrip": false,
|
| 783 |
+
"normalized": false,
|
| 784 |
+
"rstrip": false,
|
| 785 |
+
"single_word": false,
|
| 786 |
+
"special": true
|
| 787 |
+
},
|
| 788 |
+
"128094": {
|
| 789 |
+
"content": "[E-92]",
|
| 790 |
+
"lstrip": false,
|
| 791 |
+
"normalized": false,
|
| 792 |
+
"rstrip": false,
|
| 793 |
+
"single_word": false,
|
| 794 |
+
"special": true
|
| 795 |
+
},
|
| 796 |
+
"128095": {
|
| 797 |
+
"content": "[E-93]",
|
| 798 |
+
"lstrip": false,
|
| 799 |
+
"normalized": false,
|
| 800 |
+
"rstrip": false,
|
| 801 |
+
"single_word": false,
|
| 802 |
+
"special": true
|
| 803 |
+
},
|
| 804 |
+
"128096": {
|
| 805 |
+
"content": "[E-94]",
|
| 806 |
+
"lstrip": false,
|
| 807 |
+
"normalized": false,
|
| 808 |
+
"rstrip": false,
|
| 809 |
+
"single_word": false,
|
| 810 |
+
"special": true
|
| 811 |
+
},
|
| 812 |
+
"128097": {
|
| 813 |
+
"content": "[E-95]",
|
| 814 |
+
"lstrip": false,
|
| 815 |
+
"normalized": false,
|
| 816 |
+
"rstrip": false,
|
| 817 |
+
"single_word": false,
|
| 818 |
+
"special": true
|
| 819 |
+
},
|
| 820 |
+
"128098": {
|
| 821 |
+
"content": "[E-96]",
|
| 822 |
+
"lstrip": false,
|
| 823 |
+
"normalized": false,
|
| 824 |
+
"rstrip": false,
|
| 825 |
+
"single_word": false,
|
| 826 |
+
"special": true
|
| 827 |
+
},
|
| 828 |
+
"128099": {
|
| 829 |
+
"content": "[E-97]",
|
| 830 |
+
"lstrip": false,
|
| 831 |
+
"normalized": false,
|
| 832 |
+
"rstrip": false,
|
| 833 |
+
"single_word": false,
|
| 834 |
+
"special": true
|
| 835 |
+
},
|
| 836 |
+
"128100": {
|
| 837 |
+
"content": "[E-98]",
|
| 838 |
+
"lstrip": false,
|
| 839 |
+
"normalized": false,
|
| 840 |
+
"rstrip": false,
|
| 841 |
+
"single_word": false,
|
| 842 |
+
"special": true
|
| 843 |
+
},
|
| 844 |
+
"128101": {
|
| 845 |
+
"content": "[E-99]",
|
| 846 |
+
"lstrip": false,
|
| 847 |
+
"normalized": false,
|
| 848 |
+
"rstrip": false,
|
| 849 |
+
"single_word": false,
|
| 850 |
+
"special": true
|
| 851 |
+
}
|
| 852 |
+
},
|
| 853 |
+
"additional_special_tokens": [
|
| 854 |
+
"--NME--",
|
| 855 |
+
"[E-0]",
|
| 856 |
+
"[E-1]",
|
| 857 |
+
"[E-2]",
|
| 858 |
+
"[E-3]",
|
| 859 |
+
"[E-4]",
|
| 860 |
+
"[E-5]",
|
| 861 |
+
"[E-6]",
|
| 862 |
+
"[E-7]",
|
| 863 |
+
"[E-8]",
|
| 864 |
+
"[E-9]",
|
| 865 |
+
"[E-10]",
|
| 866 |
+
"[E-11]",
|
| 867 |
+
"[E-12]",
|
| 868 |
+
"[E-13]",
|
| 869 |
+
"[E-14]",
|
| 870 |
+
"[E-15]",
|
| 871 |
+
"[E-16]",
|
| 872 |
+
"[E-17]",
|
| 873 |
+
"[E-18]",
|
| 874 |
+
"[E-19]",
|
| 875 |
+
"[E-20]",
|
| 876 |
+
"[E-21]",
|
| 877 |
+
"[E-22]",
|
| 878 |
+
"[E-23]",
|
| 879 |
+
"[E-24]",
|
| 880 |
+
"[E-25]",
|
| 881 |
+
"[E-26]",
|
| 882 |
+
"[E-27]",
|
| 883 |
+
"[E-28]",
|
| 884 |
+
"[E-29]",
|
| 885 |
+
"[E-30]",
|
| 886 |
+
"[E-31]",
|
| 887 |
+
"[E-32]",
|
| 888 |
+
"[E-33]",
|
| 889 |
+
"[E-34]",
|
| 890 |
+
"[E-35]",
|
| 891 |
+
"[E-36]",
|
| 892 |
+
"[E-37]",
|
| 893 |
+
"[E-38]",
|
| 894 |
+
"[E-39]",
|
| 895 |
+
"[E-40]",
|
| 896 |
+
"[E-41]",
|
| 897 |
+
"[E-42]",
|
| 898 |
+
"[E-43]",
|
| 899 |
+
"[E-44]",
|
| 900 |
+
"[E-45]",
|
| 901 |
+
"[E-46]",
|
| 902 |
+
"[E-47]",
|
| 903 |
+
"[E-48]",
|
| 904 |
+
"[E-49]",
|
| 905 |
+
"[E-50]",
|
| 906 |
+
"[E-51]",
|
| 907 |
+
"[E-52]",
|
| 908 |
+
"[E-53]",
|
| 909 |
+
"[E-54]",
|
| 910 |
+
"[E-55]",
|
| 911 |
+
"[E-56]",
|
| 912 |
+
"[E-57]",
|
| 913 |
+
"[E-58]",
|
| 914 |
+
"[E-59]",
|
| 915 |
+
"[E-60]",
|
| 916 |
+
"[E-61]",
|
| 917 |
+
"[E-62]",
|
| 918 |
+
"[E-63]",
|
| 919 |
+
"[E-64]",
|
| 920 |
+
"[E-65]",
|
| 921 |
+
"[E-66]",
|
| 922 |
+
"[E-67]",
|
| 923 |
+
"[E-68]",
|
| 924 |
+
"[E-69]",
|
| 925 |
+
"[E-70]",
|
| 926 |
+
"[E-71]",
|
| 927 |
+
"[E-72]",
|
| 928 |
+
"[E-73]",
|
| 929 |
+
"[E-74]",
|
| 930 |
+
"[E-75]",
|
| 931 |
+
"[E-76]",
|
| 932 |
+
"[E-77]",
|
| 933 |
+
"[E-78]",
|
| 934 |
+
"[E-79]",
|
| 935 |
+
"[E-80]",
|
| 936 |
+
"[E-81]",
|
| 937 |
+
"[E-82]",
|
| 938 |
+
"[E-83]",
|
| 939 |
+
"[E-84]",
|
| 940 |
+
"[E-85]",
|
| 941 |
+
"[E-86]",
|
| 942 |
+
"[E-87]",
|
| 943 |
+
"[E-88]",
|
| 944 |
+
"[E-89]",
|
| 945 |
+
"[E-90]",
|
| 946 |
+
"[E-91]",
|
| 947 |
+
"[E-92]",
|
| 948 |
+
"[E-93]",
|
| 949 |
+
"[E-94]",
|
| 950 |
+
"[E-95]",
|
| 951 |
+
"[E-96]",
|
| 952 |
+
"[E-97]",
|
| 953 |
+
"[E-98]",
|
| 954 |
+
"[E-99]"
|
| 955 |
+
],
|
| 956 |
+
"bos_token": "[CLS]",
|
| 957 |
+
"clean_up_tokenization_spaces": true,
|
| 958 |
+
"cls_token": "[CLS]",
|
| 959 |
+
"do_lower_case": false,
|
| 960 |
+
"eos_token": "[SEP]",
|
| 961 |
+
"mask_token": "[MASK]",
|
| 962 |
+
"model_max_length": 1000000000000000019884624838656,
|
| 963 |
+
"pad_token": "[PAD]",
|
| 964 |
+
"sep_token": "[SEP]",
|
| 965 |
+
"sp_model_kwargs": {},
|
| 966 |
+
"split_by_punct": false,
|
| 967 |
+
"tokenizer_class": "DebertaV2Tokenizer",
|
| 968 |
+
"unk_token": "[UNK]",
|
| 969 |
+
"vocab_type": "spm"
|
| 970 |
+
}
|
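
This configuration registers `--NME--` and the span markers `[E-0]` through `[E-99]` as additional special tokens of a `DebertaV2Tokenizer`, mapped to ids 128002–128101 in `added_tokens_decoder`. As a minimal sketch of how that might be exercised (the repository id below is a placeholder, not confirmed by this commit; substitute the actual model id):

```python
# Sketch: load the tokenizer defined by this config and inspect its [E-n] markers.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Babelscape/wsl-base")  # hypothetical id

# --NME-- and [E-0] ... [E-99] are additional special tokens, so each one
# is kept as a single, unsplit token id during tokenization.
print(tokenizer.additional_special_tokens[:4])    # ['--NME--', '[E-0]', '[E-1]', '[E-2]']
print(tokenizer.convert_tokens_to_ids("[E-23]"))  # 128025, matching added_tokens_decoder

# Because each entry sets "special": true and "normalized": false, the marker
# text is matched verbatim rather than being normalized or merged with neighbors.
ids = tokenizer("[E-0] bank [E-1]")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))
```

With markers of this form, a model can delimit up to 100 candidate spans per input when linking text to sense inventory entries.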