Fork of sosier's nanoGPT - Character-Level Shakespeare
This is a fork of sosier/nanoGPT-shakespear-char-tied-weights for demonstration purposes.
Quickstart
Load the model:
from transformers import AutoModel
model = AutoModel.from_pretrained("n8cha/nanoGPT-shakespeare-char", trust_remote_code=True)
Set up inference:
import torch

class CharTokenizer:
    def __init__(self):
        self.token_map = {'\n': 0, ' ': 1, '!': 2, '$': 3, '&': 4, "'": 5, ',': 6, '-': 7, '.': 8, '3': 9, ':': 10, ';': 11, '?': 12, 'A': 13, 'B': 14, 'C': 15, 'D': 16, 'E': 17, 'F': 18, 'G': 19, 'H': 20, 'I': 21, 'J': 22, 'K': 23, 'L': 24, 'M': 25, 'N': 26, 'O': 27, 'P': 28, 'Q': 29, 'R': 30, 'S': 31, 'T': 32, 'U': 33, 'V': 34, 'W': 35, 'X': 36, 'Y': 37, 'Z': 38, 'a': 39, 'b': 40, 'c': 41, 'd': 42, 'e': 43, 'f': 44, 'g': 45, 'h': 46, 'i': 47, 'j': 48, 'k': 49, 'l': 50, 'm': 51, 'n': 52, 'o': 53, 'p': 54, 'q': 55, 'r': 56, 's': 57, 't': 58, 'u': 59, 'v': 60, 'w': 61, 'x': 62, 'y': 63, 'z': 64}
        self.rev_map = {v: k for k, v in self.token_map.items()}

    def encode(self, text):
        try:
            return [self.token_map[c] for c in text]
        except KeyError as e:
            raise ValueError(f"Character not in vocabulary: {e.args[0]}")

    def decode(self, tokens):
        try:
            return ''.join(self.rev_map[t] for t in tokens)
        except KeyError as e:
            raise ValueError(f"Token not in vocabulary: {e.args[0]}")

tokenizer = CharTokenizer()

def generate(prompt):
    prompt_encoded = tokenizer.encode(prompt)
    x = torch.tensor(prompt_encoded, dtype=torch.long, device="cpu")[None, ...]
    with torch.no_grad():
        y = model.generate(
            x,
            max_new_tokens=1000,
            temperature=0.8,
            top_k=200
        )
    return tokenizer.decode(y[0].tolist())
Run inference:
response = generate("O Romeo, Romeo, ")
print(response)
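The generate() helper above keeps everything on the CPU, which is fine for a model this small. If you would rather run on a GPU, one possible adjustment (assuming CUDA is available) is sketched below; the device string inside generate() would need to change to match.

```python
# Optional: move the model to a GPU if one is available (assumes CUDA).
# The input tensor built inside generate() must use the same device.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()
```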
Below is the original README.
nanoGPT - Character-Level Shakespeare - Tied Weights
A small character-level, GPT-style language model trained on the works of Shakespeare using Andrej Karpathy's nanoGPT repo, from my project LLMs Universally Learn a Feature Representing Token Frequency / Rarity.
Versions
This model has two versions:
- With tied embedding / unembedding weights (in true GPT fashion; see the sketch after this list) - THIS PAGE
- Without tied embedding / unembedding weights
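For readers unfamiliar with the distinction, the sketch below shows how weight tying is typically done in nanoGPT-style models: the token embedding matrix and the output projection share a single parameter tensor. The class and attribute names here are illustrative, not taken from this repository's code.

```python
import torch.nn as nn

class TinyCharGPT(nn.Module):
    """Illustrative skeleton only, not the actual NanoGPT implementation."""
    def __init__(self, vocab_size=65, n_embd=384):
        super().__init__()
        self.wte = nn.Embedding(vocab_size, n_embd)                # embedding
        self.lm_head = nn.Linear(n_embd, vocab_size, bias=False)   # unembedding
        # Weight tying: both modules reference the same parameter tensor,
        # saving vocab_size * n_embd parameters. The untied version of this
        # model simply omits the line below.
        self.lm_head.weight = self.wte.weight
```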
Usage
The model can be loaded using AutoModel from Hugging Face's transformers package:
>>> from transformers import AutoModel
>>> model = AutoModel.from_pretrained("n8cha/nanoGPT-shakespeare-char", trust_remote_code=True)
>>> model
number of parameters: 10.65M
NanoGPT(
  (transformer): ModuleDict(
    (wte): Embedding(65, 384)
    (wpe): Embedding(256, 384)
    (drop): Dropout(p=0.2, inplace=False)
    (h): ModuleList(
      (0-5): 6 x Block(
        (ln_1): LayerNorm()
        (attn): CausalSelfAttention(
          (c_attn): Linear(in_features=384, out_features=1152, bias=False)
          (c_proj): Linear(in_features=384, out_features=384, bias=False)
          (attn_dropout): Dropout(p=0.2, inplace=False)
          (resid_dropout): Dropout(p=0.2, inplace=False)
        )
        (ln_2): LayerNorm()
        (mlp): MLP(
          (c_fc): Linear(in_features=384, out_features=1536, bias=False)
          (gelu): GELU(approximate='none')
          (c_proj): Linear(in_features=1536, out_features=384, bias=False)
          (dropout): Dropout(p=0.2, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm()
  )
  (lm_head): Linear(in_features=384, out_features=65, bias=False)
)
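For reference, the module shapes in the printout map onto the following hyperparameters. This is just a restatement of what is visible above, except for the head count, which is assumed from nanoGPT's shakespeare-char defaults rather than read from the printout.

```python
# Hyperparameters implied by the printout above (n_head is an assumption).
config = dict(
    vocab_size=65,   # wte: Embedding(65, 384)
    block_size=256,  # wpe: Embedding(256, 384)
    n_embd=384,      # embedding width throughout
    n_layer=6,       # (0-5): 6 x Block
    n_head=6,        # assumed nanoGPT shakespeare-char default
    dropout=0.2,     # Dropout(p=0.2)
    bias=False,      # every Linear has bias=False
)
```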
Training Data / Token Counts
The training data token counts can be found on my GitHub repo here and can be loaded using the instructions here.
Tokenizer
As this is a character-level model, the tokenizer is simply a mapping from each character to its token ID, as given in the token counts (see the section above).
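As a concrete illustration, encoding and decoding a short string with the CharTokenizer from the Quickstart section above looks like this (the token IDs come straight from its token_map dictionary):

```python
tokenizer = CharTokenizer()                     # defined in the Quickstart above
print(tokenizer.encode("Romeo"))                # [30, 53, 51, 43, 53]
print(tokenizer.decode([30, 53, 51, 43, 53]))   # Romeo
```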