|
--- |
|
language: |
|
- en |
|
license: mit |
|
library_name: transformers |
|
tags: |
|
- pretrained |
|
- 7B |
|
- English |
|
- text-generation |
|
- base-model |
|
- bittensor |
|
- decentralized AI |
|
- Web3 |
|
datasets: |
|
- tiiuae/falcon-refinedweb |
|
--- |
|
|
|
|
|
# 🏯 Sumo-Qyuu-7B-v0.1 |
|
|
|
 |
|
|
|
🏯 Sumo is a family of models developed by [Tensorplex](https://tensorplex.ai). "Sumo-Qyuu" is the name of the best-performing model in the family, developed for Bittensor Subnet 9.
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
- **Developed by:** [Tensorplex Labs](https://tensorplex.ai) |
|
- **Model type:** Pretrained Foundational Language Model |
|
- **Language(s) (NLP):** Primarily English |
|
- **License:** MIT |
|
|
|
### Model Sources |
|
|
|
- **Bittensor Subnet9 Leaderboard:** [https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard](https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard) |
|
- **Bittensor Subnet9 Repository:** [https://github.com/RaoFoundation/pretraining/tree/main](https://github.com/RaoFoundation/pretraining/tree/main) |
|
|
|
## Usage |
|
|
|
⛔ **This is a pretrained base model that has not yet been aligned. Use it with caution, or fine-tune it further on downstream tasks before deployment.**
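
If you do fine-tune, a minimal sketch using the `transformers` `Trainer` is shown below. The in-memory placeholder texts, output directory, and hyperparameters are illustrative assumptions only, not the recipe used to train this model.

```python
# Minimal causal-LM fine-tuning sketch. Replace the placeholder texts and
# hyperparameters with your own downstream task data and settings.
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "tensorplex-labs/Sumo-Qyuu-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for padding during batching
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Placeholder corpus; substitute your own downstream data.
train_ds = Dataset.from_dict(
    {"text": ["Example downstream document one.", "Example downstream document two."]}
).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sumo-qyuu-finetuned",
        per_device_train_batch_size=1,
        num_train_epochs=1,
        bf16=True,
        logging_steps=1,
    ),
    train_dataset=train_ds,
    # mlm=False selects the causal (next-token) objective and builds labels from input_ids.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```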
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
```python
from transformers import AutoTokenizer
import transformers
import torch

model = "tensorplex-labs/Sumo-Qyuu-7B-v0.1"

# Build a text-generation pipeline that runs the model in bfloat16.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
)

# Sample a single completion for the prompt.
sequences = pipeline(
    "What is Yokozuna?",
    max_length=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
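
Alternatively, here is a sketch that loads the model directly instead of going through `transformers.pipeline`. The `device_map="auto"` argument assumes the `accelerate` package is installed and can be dropped otherwise.

```python
# Direct-loading variant of the pipeline example above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tensorplex-labs/Sumo-Qyuu-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # assumes `accelerate` is installed; remove to load on a single device
)

inputs = tokenizer("What is Yokozuna?", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```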
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
This model was trained on the [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) dataset.
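
To take a quick look at the corpus without downloading it in full, a minimal streaming sketch is shown below; it assumes the raw text lives in the dataset's `content` column.

```python
# Stream a handful of RefinedWeb documents for inspection.
from datasets import load_dataset

ds = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
for i, row in enumerate(ds):
    print(row["content"][:200])  # "content" is assumed to hold the raw text
    if i >= 2:
        break
```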
|
|
|
## Evaluation |
|
|
|
| Benchmark                        | tensorplex-labs/Sumo-Qyuu-7B-v0.1 | NousResearch/Llama-2-7b-hf | yahma/llama-7b-hf | tiiuae/falcon-7b |
|----------------------------------|-----------------------------------|----------------------------|-------------------|------------------|
| **avg**                          | **47.85**                         | 47.31                      | 44.22             | 42.03            |
| arc_challenge (acc_norm, 0-shot) | 47.53                             | 46.16                      | 44.88             | 43.43            |
| gsm8k (exact_match, 5-shot)      | 10.46                             | 13.27                      | 10.39             | 5.23             |
| hellaswag (acc_norm, 0-shot)     | 76.66                             | 75.97                      | 76.19             | 76.33            |
| mmlu (acc, 0-shot)               | 44.26                             | 40.78                      | 29.68             | 25.72            |
| truthfulqa_mc2 (acc, 0-shot)     | 37.29                             | 39.00                      | 34.01             | 34.27            |
| winogrande (acc, 0-shot)         | 70.88                             | 68.67                      | 70.17             | 67.17            |
|
|
|
Benchmarks were run with the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).
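
The scores above can be approximately reproduced with the harness's Python API. The sketch below assumes a recent (v0.4.x) release; exact task names and arguments may differ between versions.

```python
# Sketch of re-running the benchmark suite with lm-evaluation-harness (v0.4.x assumed).
import lm_eval

model_args = "pretrained=tensorplex-labs/Sumo-Qyuu-7B-v0.1,dtype=bfloat16"

# Zero-shot tasks from the table above.
zero_shot = lm_eval.simple_evaluate(
    model="hf",
    model_args=model_args,
    tasks=["arc_challenge", "hellaswag", "mmlu", "truthfulqa_mc2", "winogrande"],
    num_fewshot=0,
)

# gsm8k was evaluated with 5-shot prompting, so it runs separately.
gsm8k = lm_eval.simple_evaluate(
    model="hf",
    model_args=model_args,
    tasks=["gsm8k"],
    num_fewshot=5,
)

print(zero_shot["results"])
print(gsm8k["results"]["gsm8k"])
```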
|
|
|
|
|
## Model Architecture and Objective |
|
|
|
The model has 6.9B parameters and follows the `LlamaForCausalLM` architecture, with a larger vocabulary (~100k tokens) that matches the GPT-4 tokenizer.

The training objective was standard next-token prediction.
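
As a small illustration of that objective (not the actual training code), passing the input ids back as labels makes `transformers` compute the shifted next-token cross-entropy:

```python
# Next-token prediction: with labels == input_ids, the model internally shifts the
# labels by one position and returns the average cross-entropy over next tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "tensorplex-labs/Sumo-Qyuu-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

batch = tokenizer("The wrestler entered the ring.", return_tensors="pt")
with torch.no_grad():
    out = model(**batch, labels=batch["input_ids"])
print(float(out.loss))  # average next-token cross-entropy for this sequence
```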
|
|
|
## Model Card Authors |
|
|
|
- [email protected] |
|
|
|
## Model Card Contact |
|
|
|
Should you have any inquiries, contact [email protected]. |
|
|
|
|
|
|