tags:
- English
- text-generation
- base-model
- bittensor
- decentralized AI
- Web3
datasets:
- tiiuae/falcon-refinedweb
---
18

# 🏯 Sumo-Qyuu-7B-v0.1

Sumo is a family of models developed by [Tensorplex](https://tensorplex.ai). "Sumo-Qyuu" is the flagship model that Tensorplex has developed for Bittensor Subnet 9.

## Model Details

### Model Description

- **Developed by:** [Tensorplex Labs](https://tensorplex.ai)
- **Model type:** Pretrained Foundational Language Model
- **Language(s) (NLP):** Primarily English
- **License:** MIT

### Model Sources

- **Bittensor Subnet 9 Leaderboard:** [https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard](https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard)
- **Bittensor Subnet 9 Repository:** [https://github.com/RaoFoundation/pretraining/tree/main](https://github.com/RaoFoundation/pretraining/tree/main)

## Usage

⛔ **This is a pretrained base model that has not been aligned. Use with caution, or fine-tune it on downstream tasks before deployment.**

## How to Get Started with the Model

Use the code below to get started with the model.
45
```python
from transformers import AutoTokenizer
import transformers
import torch

model = "tensorplex-labs/Sumo-Qyuu-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
)
sequences = pipeline(
    "Tell me about decentralization.",
    max_length=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
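
The `top_p=0.9` setting above enables nucleus sampling: at each step, the next token is drawn only from the smallest set of tokens whose cumulative probability reaches `top_p`. A minimal, framework-free sketch of that filtering step (illustrative only — not the `transformers` implementation; the toy probabilities are made up):

```python
def top_p_filter(probs, top_p=0.9):
    """Zero out tokens outside the smallest set whose cumulative
    probability reaches top_p, then renormalize the rest."""
    # Sort token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:  # nucleus reached: stop adding tokens
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

# With top_p=0.9, the nucleus is {0.5, 0.3, 0.15} (cumulative 0.95);
# the 0.05 tail token is dropped and the rest are renormalized.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9))
```

Lower `top_p` truncates the distribution more aggressively, trading diversity for safety against low-probability tokens.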

## Training Details

### Training Data

This model was trained on the [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) dataset.

## Evaluation

|                                  | tensorplex-labs/Sumo-Qyuu-7B-v0.1 | NousResearch/Llama-2-7b-hf | yahma/llama-7b-hf | tiiuae/falcon-7b |
|----------------------------------|-----------------------------------|----------------------------|-------------------|------------------|
| **avg**                          | **47.85**                         | 47.31                      | 44.22             | 42.03            |
| arc_challenge (acc_norm, 0-shot) | 47.53                             | 46.16                      | 44.88             | 43.43            |
| gsm8k (exact_match, 5-shot)      | 10.46                             | 13.27                      | 10.39             | 5.23             |
| hellaswag (acc_norm, 0-shot)     | 76.66                             | 75.97                      | 76.19             | 76.33            |
| mmlu (acc, 0-shot)               | 44.26                             | 40.78                      | 29.68             | 25.72            |
| truthfulqa_mc2 (acc, 0-shot)     | 37.29                             | 39.00                      | 34.01             | 34.27            |
| winogrande (acc, 0-shot)         | 70.88                             | 68.67                      | 70.17             | 67.17            |

Evaluations were run with the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).
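
The `avg` row is the unweighted mean of the six benchmark scores in each column. A quick arithmetic check (values copied from the table; agreement is within rounding):

```python
# Benchmark scores per model, in the table's row order:
# arc_challenge, gsm8k, hellaswag, mmlu, truthfulqa_mc2, winogrande
scores = {
    "Sumo-Qyuu-7B-v0.1": [47.53, 10.46, 76.66, 44.26, 37.29, 70.88],
    "Llama-2-7b-hf":     [46.16, 13.27, 75.97, 40.78, 39.00, 68.67],
    "llama-7b-hf":       [44.88, 10.39, 76.19, 29.68, 34.01, 70.17],
    "falcon-7b":         [43.43,  5.23, 76.33, 25.72, 34.27, 67.17],
}
stated_avg = {
    "Sumo-Qyuu-7B-v0.1": 47.85,
    "Llama-2-7b-hf":     47.31,
    "llama-7b-hf":       44.22,
    "falcon-7b":         42.03,
}
for name, vals in scores.items():
    avg = sum(vals) / len(vals)
    print(f"{name}: {avg:.2f}")
    # Each computed mean agrees with the table's avg row to within rounding.
    assert abs(avg - stated_avg[name]) < 0.006
```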

## Model Architecture and Objective

The model is a 6.9B-parameter model based on the `LlamaForCausalLM` architecture, with a larger vocabulary (~100k tokens) that matches the GPT-4 tokenizer.
The training objective was standard next-token prediction.
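
Next-token prediction minimizes the average cross-entropy between the model's predicted distribution at each position and the token that actually came next. A toy, framework-free sketch of the loss computation (illustrative only; the vocabulary size and probabilities are made up):

```python
import math

# Toy predicted distributions over a 4-token vocabulary, one per position,
# and the id of the token that actually came next at each position.
predicted = [
    [0.7, 0.1, 0.1, 0.1],  # position 0: model assigns 0.7 to the true token
    [0.2, 0.6, 0.1, 0.1],  # position 1: model assigns 0.6 to the true token
]
next_tokens = [0, 1]

# Cross-entropy loss: average negative log-probability of the true next token.
loss = -sum(math.log(p[t]) for p, t in zip(predicted, next_tokens)) / len(next_tokens)
print(f"loss = {loss:.4f}")  # prints loss = 0.4338
```

Training drives this quantity down by pushing probability mass onto the observed next tokens across the corpus.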

## Model Card Authors

## Model Card Contact

Should you have any inquiries, please contact [email protected].