Kaggle error

#38
by legolasyiu - opened

Hello

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="unsloth/gpt-oss-20b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

Error : "# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="unsloth/gpt-oss-20b")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)"

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("unsloth/gpt-oss-20b")
model = AutoModelForCausalLM.from_pretrained("unsloth/gpt-oss-20b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
Getting this error: "ValueError: Could not find GptOssForCausalLM neither in <module 'transformers.models.gpt_oss' from '/usr/local/lib/python3.11/dist-packages/transformers/models/gpt_oss/__init__.py'> nor in <module 'transformers' from '/usr/local/lib/python3.11/dist-packages/transformers/__init__.py'>!"

Please fix.
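
(For anyone else hitting this: the ValueError usually means the installed transformers build does not actually expose GptOssForCausalLM, so the usual first step is to upgrade transformers and restart the runtime. A minimal sketch for a Kaggle/Colab cell; the minimum version is my assumption, not something stated in this thread.)

# Notebook cell: upgrade transformers, then restart the runtime so the
# freshly installed version is the one that actually gets imported.
%pip install -U transformers

import transformers
print(transformers.__version__)  # assumption: gpt_oss support needs roughly 4.55+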

Kaggle is not enough to run this model, I guess.

Can you try Google Colab?

Same here; I am using an A100 and hitting the same problem. I will try nested 4-bit quantization.
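
By "nested 4 bit" I mean bitsandbytes double quantization; a rough, untested sketch of the loading code, assuming bitsandbytes can handle this architecture:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Nested (double) 4-bit quantization: the quantization constants are
# themselves quantized, which saves a little more memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,   # the "nested" part
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gpt-oss-20b",
    quantization_config=bnb_config,
    device_map="auto",  # spill layers to CPU if the GPU is too small
)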

Are you trying the 20B model?

yes, I will try again

I am currently trying with a T4, then I am going to try with an A100.

let me know how it goes, thanks

It's a bit of a complicated model to run. I think I'll give up for today; if you manage it, let me know and send the notebook, thanks.

Yes, I think so too. It seems it really needs a single H100 GPU with 80 GB of memory.
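
A rough back-of-the-envelope for why 80 GB comes up, assuming about 21B parameters (my approximation) and no quantization:

params = 21e9  # approximate parameter count (assumption)
for name, bytes_per_param in [("bf16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 2**30
    print(f"{name}: ~{gib:.0f} GiB for weights alone")
# bf16 weights alone are ~39 GiB before activations and KV cache,
# which is why a 16 GB T4 struggles and an 80 GB H100 is comfortable.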

There is currently a cookbook being added that should help with running on a T4:
https://github.com/openai/openai-cookbook/pull/2003

Totally awesome, you guys are the best! I am going to try it in Colab.

There is another error when running the notebook from the cookbook on Google Colab:
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.98 GiB. GPU 0 has a total capacity of 14.74 GiB of which 1.96 GiB is free. Process 72549 has 12.77 GiB memory in use. Of the allocated memory 10.15 GiB is allocated by PyTorch, and 2.51 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
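
The message itself points at the workaround; a small sketch of applying it in a notebook. Note it only helps when a lot of memory is reserved but unallocated (fragmentation), not when the model simply does not fit:

import os
# Must be set before torch initializes CUDA, i.e. before "import torch".
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch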

I got the same problem

It's now merged; feel free to lift the code, and more specifically the dependencies, from here: https://cookbook.openai.com/articles/gpt-oss/run-colab

Were you guys able to fine-tune gpt-oss-20b? I am facing the same issues you have mentioned above.

Can you tell me about gpt-oss-20b? Can we train it for any type of use case, especially something like customer care?

Also, with the link you provided I am getting this error, buddy:
AcceleratorError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
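
As that traceback says, the reported location can be wrong for asynchronous CUDA errors; a tiny sketch for getting an accurate stack trace, set at the very top of the notebook before CUDA is initialized:

import os
# Make CUDA calls synchronous so the stack trace points at the real failing op.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"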

I will have to let Unsloth know. Where in the program does it happen?

from transformers import TextStreamer

messages = [
    {"role": "user", "content": "Solve x^5 + 3x^4 - 10 = 3."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
    reasoning_effort = "low",  # NEW! Set reasoning effort to low, medium or high
).to(model.device)

_ = model.generate(**inputs, max_new_tokens = 512, streamer = TextStreamer(tokenizer))

In this block of code.

I will let Unsloth know; I also have the same problem.

I talked to Unsloth. It is working now; try again.
