---
language:
- en
---
# Text Classification of conversation flow
This is an ONNX quantized model, a fine-tuned version of [nreimers/MiniLMv2-L6-H384-distilled-from-RoBERTa-Large](https://huggingface.co/nreimers/MiniLMv2-L6-H384-distilled-from-RoBERTa-Large).
The original (non-quantized) model can be found [here](https://huggingface.co/minuva/MiniLMv2-agentflow-v2).
A flow label is orthogonal to the main conversation goal: it categorizes the agent's actions or responses independently of the conversation's primary objective.
This model should be used *only* for agent dialogs.
# Usage
## Installation
```bash
pip install tokenizers
pip install onnxruntime
git clone https://huggingface.co/minuva/MiniLMv2-agentflow-v2-onnx
```
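If you prefer not to clone with git, the same files can be downloaded with the `huggingface_hub` client. This is an optional alternative of ours, not part of the steps above, and it requires an extra `pip install huggingface_hub`:
```py
# Optional alternative to `git clone`, assuming huggingface_hub is installed
# (`pip install huggingface_hub`); the snippets below are unchanged either way.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="minuva/MiniLMv2-agentflow-v2-onnx",
    local_dir="MiniLMv2-agentflow-v2-onnx",  # same folder name the code below expects
)
```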
## Run the Model
```py
import os
import json

import numpy as np
from tokenizers import Tokenizer
from onnxruntime import InferenceSession

model_name = "minuva/MiniLMv2-agentflow-v2-onnx"

# Load the tokenizer from the Hub and configure padding/truncation
tokenizer = Tokenizer.from_pretrained(model_name)
tokenizer.enable_padding(
    pad_token="<pad>",
    pad_id=1,
)
tokenizer.enable_truncation(max_length=256)
batch_size = 16

texts = ["thats my mistake"]
outputs = []

# Load the quantized ONNX model and its config from the cloned repository
model = InferenceSession(
    "MiniLMv2-agentflow-v2-onnx/model_optimized_quantized.onnx",
    providers=["CPUExecutionProvider"],
)
with open(os.path.join("MiniLMv2-agentflow-v2-onnx", "config.json"), "r") as f:
    config = json.load(f)

output_names = [output.name for output in model.get_outputs()]
input_names = [input.name for input in model.get_inputs()]

# Tokenize and run inference in batches of at most `batch_size` texts
for subtexts in np.array_split(np.array(texts), len(texts) // batch_size + 1):
    encodings = tokenizer.encode_batch(list(subtexts))
    inputs = {
        "input_ids": np.vstack([encoding.ids for encoding in encodings]),
        "attention_mask": np.vstack([encoding.attention_mask for encoding in encodings]),
        "token_type_ids": np.vstack([encoding.type_ids for encoding in encodings]),
    }

    for input_name in input_names:
        if input_name not in inputs:
            raise ValueError(f"Input name {input_name} not found in inputs")

    # Keep only (and order) the inputs the model actually expects
    inputs = {input_name: inputs[input_name] for input_name in input_names}
    output = np.squeeze(
        np.stack(model.run(output_names=output_names, input_feed=inputs)),
        axis=0,
    )
    outputs.append(output)

outputs = np.concatenate(outputs, axis=0)

# Sigmoid over the logits, then pair every score with its label name
scores = 1 / (1 + np.exp(-outputs))
results = []
for item in scores:
    labels = []
    item_scores = []
    for idx, s in enumerate(item):
        labels.append(config["id2label"][str(idx)])
        item_scores.append(float(s))
    results.append({"labels": labels, "scores": item_scores})

# Keep only the highest-scoring label for each text
res = []
for result in results:
    joined = list(zip(result["labels"], result["scores"]))
    max_score = max(joined, key=lambda x: x[1])
    res.append(max_score)

res
# [('agent_apology_error_mistake', 0.9991708993911743)]
```
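The snippet above keeps only the top label per text. The `results` list it builds already holds every label/score pair, so a small sorting step (our own addition, not part of the repository's example) makes the full distribution easy to inspect:
```py
# Our own addition: inspect all categories for the first text, sorted by score (descending).
ranked = sorted(
    zip(results[0]["labels"], results[0]["scores"]),
    key=lambda x: x[1],
    reverse=True,
)
for label, score in ranked[:3]:
    print(f"{label}: {score:.4f}")
```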
# Categories Explanation
<details>
<summary>Click to expand!</summary>

- OTHER: Responses or actions by the agent that do not fit into the predefined categories or are outside the scope of the specific interactions listed.
- agent_apology_error_mistake: The agent acknowledges an error or mistake in the information provided or in the handling of the request.
- agent_apology_unsatisfactory: The agent apologizes for providing an unsatisfactory response or for any dissatisfaction experienced by the user.
- agent_didnt_understand: The agent did not understand the user's request or question.
- agent_limited_capabilities: The agent communicates its limitations in addressing certain requests or providing certain types of information.
- agent_refuses_answer: The agent explicitly refuses to answer a question or fulfill a request, due to policy restrictions or ethical considerations.
- image_limitations: The agent points out limitations related to handling or interpreting images.
- no_information_doesnt_know: The agent indicates that it has no information available or does not know the answer to the user's question.
- success_and_followup_assistance: The agent successfully provides the requested information or service and offers further assistance or follow-up actions if needed.
</details>
<br>
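Because the model produces a score for every category, a common pattern is to treat low-confidence predictions as `OTHER`. The sketch below is our own illustration (the 0.5 threshold is an arbitrary assumption, not a value recommended by the model authors) and reuses the `results` list from the snippet above:
```py
# Sketch only: fall back to OTHER when the top score is below a threshold.
# The 0.5 cut-off is an illustrative assumption, not a tuned or recommended value.
def top_label_or_other(labels, scores, threshold=0.5):
    label, score = max(zip(labels, scores), key=lambda x: x[1])
    return (label, score) if score >= threshold else ("OTHER", score)

top_label_or_other(results[0]["labels"], results[0]["scores"])
# e.g. ('agent_apology_error_mistake', 0.99...)
```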
# Metrics on our private test dataset
| Model (params) | Loss | Accuracy | F1 |
|--------------------|-------------|----------|--------|
| minuva/MiniLMv2-agentflow-v2 (33M) | 0.1462 | 0.9773 | 0.9774 |
| minuva/MiniLMv2-agentflow-v2-onnx (33M) | - | 0.97394 | 0.97392 |
# Deployment
Check [our repository](https://github.com/minuva/flow-cloudrun) to see how to easily deploy this (quantized) model in a serverless environment with fast CPU inference and light resource utilization.