tensorplex-labs/Sumo-T9-7B-v0.1

 - English
 - text-generation
 - base-model
+- bittensor
+- decentralized AI
+- Web3
 datasets:
 - tiiuae/falcon-refinedweb
 ---
+# 🏯 Sumo-Qyuu-7B-v0.1
+Sumo is a family of models developed by [Tensorplex](https://tensorplex.ai). "Sumo-Qyuu" is the name given to the best model developed for Bittensor Subnet 9.
+## Model Details
+### Model Description
+- **Developed by:** [Tensorplex Labs](https://tensorplex.ai)
+- **Model type:** Pretrained Foundational Language Model
+- **Language(s) (NLP):** Primarily English
+- **License:** MIT
+### Model Sources
+- **Bittensor Subnet9 Leaderboard:** [https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard](https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard)
+- **Bittensor Subnet9 Repository:** [https://github.com/RaoFoundation/pretraining/tree/main](https://github.com/RaoFoundation/pretraining/tree/main)
+## Usage
+⛔ **This is a pretrained base model that has not been aligned yet. Use it with caution, or fine-tune it on downstream tasks before deployment.**
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+import torch
+import transformers
+from transformers import AutoTokenizer
+model_id = "tensorplex-labs/Sumo-Qyuu-7B-v0.1"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+# Build a text-generation pipeline that loads the weights in bfloat16.
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    tokenizer=tokenizer,
+    torch_dtype=torch.bfloat16,
+)
+# Sample one completion; max_length counts the prompt tokens as well.
+sequences = pipeline(
+    "Tell me about decentralization.",
+    max_length=256,
+    do_sample=True,
+    temperature=0.6,
+    top_p=0.9,
+    num_return_sequences=1,
+    eos_token_id=tokenizer.eos_token_id,
+    bos_token_id=tokenizer.bos_token_id,
+)
+for seq in sequences:
+    print(f"Result: {seq['generated_text']}")
+```
+## Training Details
+### Training Data
+This model was trained on the [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) dataset.
+## Evaluation
+| Benchmark (metric, shots)        |      tensorplex-labs/Sumo-Qyuu-7B-v0.1 |   NousResearch/Llama-2-7b-hf |   yahma/llama-7b-hf |   tiiuae/falcon-7b |
+|----------------------------------|----------------------------------------|------------------------------|---------------------|--------------------|
+| **avg**                          |                            **47.85**   |                      47.31   |               44.22 |             42.03  |
+| arc_challenge (acc_norm, 0-shot) |                                47.53   |                      46.16   |               44.88 |             43.43  |
+| gsm8k (exact_match, 5-shot)      |                                10.46   |                      13.27   |               10.39 |              5.23  |
+| hellaswag (acc_norm, 0-shot)     |                                76.66   |                      75.97   |               76.19 |             76.33  |
+| mmlu (acc, 0-shot)               |                                44.26   |                      40.78   |               29.68 |             25.72  |
+| truthfulqa_mc2 (acc, 0-shot)     |                                37.29   |                      39.00   |               34.01 |             34.27  |
+| winogrande (acc, 0-shot)         |                                70.88   |                      68.67   |               70.17 |             67.17  |
+[LM Evaluation Harness Repository](https://github.com/EleutherAI/lm-evaluation-harness)
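The **avg** row in the table above appears to be the unweighted mean of the six benchmark scores in each column. A minimal sketch that recomputes it from the table values is shown here; the `scores` dictionary and the `column_average` helper are illustrative names and not part of any evaluation tooling.

```python
# Per-benchmark scores copied from the evaluation table, in row order:
# arc_challenge, gsm8k, hellaswag, mmlu, truthfulqa_mc2, winogrande.
scores = {
    "tensorplex-labs/Sumo-Qyuu-7B-v0.1": [47.53, 10.46, 76.66, 44.26, 37.29, 70.88],
    "NousResearch/Llama-2-7b-hf": [46.16, 13.27, 75.97, 40.78, 39.00, 68.67],
    "yahma/llama-7b-hf": [44.88, 10.39, 76.19, 29.68, 34.01, 70.17],
    "tiiuae/falcon-7b": [43.43, 5.23, 76.33, 25.72, 34.27, 67.17],
}


def column_average(values):
    """Unweighted mean of one model's benchmark scores (hypothetical helper)."""
    return sum(values) / len(values)


averages = {name: column_average(vals) for name, vals in scores.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

Each recomputed mean agrees with the reported **avg** value to within rounding, which suggests the benchmarks are weighted equally.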
+## Model Architecture and Objective
+The model has 6.9B parameters and is based on the `LlamaForCausalLM` architecture, with a larger vocabulary (100k tokens) that matches the GPT-4 tokenizer.
+The training objective was standard next-token prediction.
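For readers unfamiliar with the objective: next-token prediction minimizes the cross-entropy between the distribution the model predicts at position t and the token that actually appears at position t+1. The sketch below is framework-free and purely illustrative; the toy logits, the four-token vocabulary, and the `next_token_loss` helper are assumptions made for the example and are not the training code used for this model.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def next_token_loss(logits_per_position, token_ids):
    """Mean cross-entropy where position t is scored against token t+1.

    logits_per_position[t] holds the vocabulary logits produced after
    reading token_ids[: t + 1]. The last position has no target and is dropped.
    """
    targets = token_ids[1:]                 # shift labels left by one
    predictions = logits_per_position[:-1]  # drop the final position
    losses = [
        -log_softmax(logits)[target]
        for logits, target in zip(predictions, targets)
    ]
    return sum(losses) / len(losses)


# Toy example: 3 tokens, vocabulary of 4 (all values are made up).
token_ids = [0, 2, 1]
logits = [
    [0.0, 0.0, 0.0, 0.0],  # uniform guess for the token after position 0
    [0.0, 5.0, 0.0, 0.0],  # confident, correct guess (token 1) after position 1
    [1.0, 1.0, 1.0, 1.0],  # unused: nothing follows the last token
]
loss = next_token_loss(logits, token_ids)
print(f"mean next-token loss: {loss:.4f}")
```

In this toy case the uniform guess contributes a loss of ln 4, while the confident correct guess contributes almost nothing, so the mean sits roughly halfway between the two.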
+## Model Card Authors
+- [email protected]
+## Model Card Contact
+Should you have any inquiries, contact [email protected].