madhavgohel committed on

Commit 3f0b7ec · verified · 1 Parent(s): 8abb442

Add README

Files changed (1):
1. README.md +57 -0
README.md ADDED
---
base_model:
- google-bert/bert-base-uncased
---

# 🔍 BERT Token Classification – Important Chunk Extractor (ONNX)

This model identifies and extracts the important parts of input sentences using BERT-based token classification, exported to the ONNX format for optimized inference.

---

## 🧠 Use Case

This model is designed for **context engineering**: extracting semantically important words or chunks from sentences or chat messages, enabling better personalization in downstream applications such as AI assistants or dialogue systems.

Example:

```text
Input: I’ll be unavailable tomorrow due to a team offsite.
Output: [unavailable, tomorrow, team offsite]
```

---

## 🛠️ Model Details

* **Architecture**: BERT (`bert-base-uncased`) fine-tuned for token classification
* **Exported to**: ONNX for efficient runtime inference via [Optimum](https://huggingface.co/docs/optimum/onnxruntime) (see the export sketch below)
* **Labels**:
  * `0`: Not Important
  * `1`: Important
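
The export step referenced above is not included in this repository, but a minimal sketch of how a fine-tuned checkpoint can be converted with Optimum is shown here. The local path `./bert-important-chunks` and the output directory are placeholders for illustration, not files shipped with this model.

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForTokenClassification

# Placeholder path to the fine-tuned PyTorch checkpoint (illustrative only)
checkpoint = "./bert-important-chunks"

# export=True converts the PyTorch weights to ONNX while loading
ort_model = ORTModelForTokenClassification.from_pretrained(checkpoint, export=True)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Writes model.onnx plus the config and tokenizer files listed under "Files Included"
ort_model.save_pretrained("bert-token-onnx")
tokenizer.save_pretrained("bert-token-onnx")
```

The `optimum-cli export onnx` command can produce the same set of files if a command-line workflow is preferred.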

---

## 📦 How to Use (with 🤗 Transformers + Optimum)

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForTokenClassification
import torch

# Load the ONNX model and tokenizer (replace with the actual repo id)
model = ORTModelForTokenClassification.from_pretrained("your-username/bert-token-onnx", file_name="model.onnx")
tokenizer = AutoTokenizer.from_pretrained("your-username/bert-token-onnx")

text = "The server will go down at midnight for maintenance."

# Run inference and take the highest-scoring label (0 or 1) for each token
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
predictions = torch.argmax(outputs.logits, dim=-1)

# Keep the tokens labelled as important (1); note that this list still contains
# special tokens and WordPiece subwords, which you may want to filter or merge
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
important_tokens = [tok for tok, label in zip(tokens, predictions[0]) if label == 1]
print("Important tokens:", important_tokens)
```
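
The snippet above prints raw WordPiece tokens, while the example earlier in this card shows word-level chunks such as `team offsite`. A minimal post-processing sketch in that spirit is shown below; the `extract_chunks` helper is illustrative (not part of this repository) and simply merges runs of consecutive tokens labelled `1`, decoding each run back into text.

```python
import torch

def extract_chunks(text, model, tokenizer):
    """Illustrative helper: return consecutive important-token runs as text chunks."""
    inputs = tokenizer(text, return_tensors="pt")
    labels = torch.argmax(model(**inputs).logits, dim=-1)[0].tolist()
    ids = inputs["input_ids"][0].tolist()
    special = set(tokenizer.all_special_ids)

    chunks, current = [], []
    for tok_id, label in zip(ids, labels):
        if tok_id in special:        # skip [CLS], [SEP], [PAD]
            continue
        if label == 1:               # token is "Important": extend the current chunk
            current.append(tok_id)
        elif current:                # chunk ended: decode the collected subword ids
            chunks.append(tokenizer.decode(current))
            current = []
    if current:
        chunks.append(tokenizer.decode(current))
    return chunks

print(extract_chunks("I’ll be unavailable tomorrow due to a team offsite.", model, tokenizer))
# Expected shape of the output (actual chunks depend on the model's predictions):
# ['unavailable', 'tomorrow', 'team offsite']
```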

---

## 📁 Files Included

| File                      | Purpose                           |
| ------------------------- | --------------------------------- |
| `model.onnx`              | Exported ONNX model               |
| `config.json`             | Model config                      |
| `tokenizer_config.json`   | Tokenizer config                  |
| `vocab.txt`               | Vocabulary for the BERT tokenizer |
| `special_tokens_map.json` | Special tokens mapping            |
| `README.md`               | Model usage documentation         |