tensorplex-labs committed
Commit 61a5408 · verified · 1 Parent(s): 28efced

Update README.md

Files changed (1):
  1. README.md +30 -6
README.md CHANGED

@@ -17,11 +17,21 @@ datasets:
---

- # 🏯 Sumo-Qyuu-7B-v0.1
+ # Sumo-Qyuu-7B-v0.1

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65a8a4c5539e211436ef5485/RXiIpU1BwTpvUdhzv-XK9.png)

- 🏯 Sumo is a family of models developed by [Tensorplex](https://tensorplex.ai). Specifically, "Sumo-Qyuu" stands for the best model developed for the bittensor subnet 9.
+ **Tensorplex Labs Unveils Sumo-Qyuu-7B: Beating Notable 7B Pretrained Models**
+
+ [Tensorplex Labs](https://tensorplex.ai) is proud to announce that its latest top-performing model on Bittensor Subnet 9, Sumo-Qyuu-7B, has outperformed notable models such as TII's Falcon-7B and Meta's Llama-2-7b-hf. This achievement highlights the potential of decentralized networks like Bittensor and underscores Tensorplex Labs' commitment to advancing open-source AI technologies.
+
+ "Sumo" represents the family of models developed by Tensorplex, and "Qyuu" designates the top-performing model trained specifically for Bittensor Subnet 9.
+
+ Bittensor Subnet 9 serves a unique role within the Bittensor ecosystem: it rewards miners who produce pretrained foundational models on the Falcon RefinedWeb dataset. The subnet functions as a continuous benchmark in which miners are incentivized to achieve the best evaluation metrics with a model under the parameter limit, and its competitive nature drives rapid advances in large language model training.
+
+ Since the parameter limit was raised to 7 billion on April 19, 2024, Tensorplex Labs has published the top-performing model on the subnet, surpassing notable models such as Falcon-7B and Llama-2-7B in less than a month.

## Model Details

@@ -31,6 +41,11 @@ datasets:
- **Model type:** Pretrained Foundational Language Model
- **Language(s) (NLP):** Primarily English
- **License:** MIT
+ - **Architecture:** Llama-style (`LlamaForCausalLM`) with 6.9 billion parameters
+ - **Training Data:** Trained on the tiiuae/falcon-refinedweb dataset
+ - **Training Objective:** Causal language modeling (next-token prediction)
+
+ Sumo-Qyuu-7B-v0.1 features a larger vocabulary (100k tokens) that matches the GPT-4 tokenizer, keeping it versatile across a wide range of natural language processing tasks.
 
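Taken together with the details above, the following is a minimal loading sketch using `transformers`. The dtype choice is an assumption for illustration; the card itself only specifies the architecture, parameter count, and vocabulary size.

```python
# Minimal loading sketch; bfloat16 is an assumption, not a requirement from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "tensorplex-labs/Sumo-Qyuu-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Sanity-check the figures reported on the card: ~100k vocabulary, ~6.9B parameters.
print("vocab size:", len(tokenizer))
print("parameters: %.1fB" % (sum(p.numel() for p in model.parameters()) / 1e9))
```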

### Model Sources
 
@@ -78,10 +93,15 @@ for seq in sequences:

### Training Data

- This model has been trained with [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) dataset.
+ This model is continuously trained on the [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) dataset.
 
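For a quick look at this corpus, the sketch below streams a few documents with the Hugging Face `datasets` library. The `content` field name follows the public falcon-refinedweb schema and is an assumption rather than something stated on this card.

```python
# Stream a handful of documents from the pretraining corpus without a full download.
from datasets import load_dataset

refinedweb = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)

for i, example in enumerate(refinedweb):
    # "content" holds the raw web text in the public dataset schema (assumed).
    print(example["content"][:200].replace("\n", " "))
    if i == 2:  # stop after three sample documents
        break
```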

## Evaluation

+ Sumo-Qyuu-7B-v0.1 has outperformed notable models such as TII's Falcon-7B, Meta's Llama-2-7b, and Llama-1-7b in zero-shot evaluation, establishing itself as the leading model on aggregate across the benchmark suite, which includes ARC Challenge, GSM8K, HellaSwag, MMLU, TruthfulQA MC2, and Winogrande.

| Benchmark | tensorplex-labs/Sumo-Qyuu-7B-v0.1 | NousResearch/Llama-2-7b-hf | yahma/llama-7b-hf | tiiuae/falcon-7b |
|-----------|-----------------------------------|----------------------------|-------------------|------------------|
  | **avg** | **47.85** | 47.31 | 44.22 | 42.03 |
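
The card links the EleutherAI LM Evaluation Harness just below. A minimal sketch of running the listed zero-shot tasks with it follows, assuming lm-eval v0.4+ and its standard task names; the exact harness version and arguments behind the table above are not stated on the card.

```python
# Zero-shot evaluation sketch with the EleutherAI LM Evaluation Harness (assumed v0.4+).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=tensorplex-labs/Sumo-Qyuu-7B-v0.1,dtype=bfloat16",
    tasks=["arc_challenge", "gsm8k", "hellaswag", "mmlu", "truthfulqa_mc2", "winogrande"],
    num_fewshot=0,
    batch_size=8,
)

# Print per-task metrics as reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```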

@@ -94,11 +114,15 @@

[LM Evaluation Harness Repository](https://github.com/EleutherAI/lm-evaluation-harness)

- ## Model Architecture and Objective
-
- The model is a 6.9B parameter model based on `LlamaForCausalLM` architecture, with larger vocab size (100k) that matches with the gpt-4 tokenizer.
- The training objective was the standard next token prediction.
+ ## Future Plans
+
+ Tensorplex Labs will continue pushing the limits of what is possible on Subnet 9, and will also work on fine-tuning state-of-the-art models for Web3 domain-specific use cases.
+
+ One of the most ambitious projects is the development of a new data-collection subnet. This will enable open and incentivized contributions of intelligence from a diverse pool of participants. The subnet will function as a collaborative platform where individuals can provide human preference or training data, which will be used to train, fine-tune, and evaluate AI models and miners across various subnets on Bittensor.
+
+ ## About Tensorplex Labs
+
+ Tensorplex Labs is an AI and Web3 startup building the decentralized AI of the future. The company's mission is to decentralize AI, democratize access to data and intelligence, and build a more open, transparent, and equitable future for AI. Tensorplex Labs develops open-source capital and intelligence infrastructure and applications designed to grow decentralized AI, Web3, and crypto ecosystems by making them more capital efficient, intelligent, and trustworthy. The company is currently developing a novel way to better incentivize human input for training AI models, opening up access to new pools of human contributors and new income opportunities. Founded in 2023 and headquartered in Singapore, Tensorplex Labs' investors include Canonical Crypto, Collab+Currency, and Digital Currency Group, among others. For more information, visit Tensorplex.

## Model Card Authors