Kaggle error
Hello
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="unsloth/gpt-oss-20b")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
Error : "# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="unsloth/gpt-oss-20b")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)"
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("unsloth/gpt-oss-20b")
model = AutoModelForCausalLM.from_pretrained("unsloth/gpt-oss-20b")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)  # <- this is the line that errors
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
Getting error: "ValueError: Could not find GptOssForCausalLM neither in <module 'transformers.models.gpt_oss' from '/usr/local/lib/python3.11/dist-packages/transformers/models/gpt_oss/__init__.py'> nor in <module 'transformers' from '/usr/local/lib/python3.11/dist-packages/transformers/__init__.py'>!"
Please fix
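For what it's worth, that ValueError usually means the installed transformers build predates (or only partially ships) gpt-oss support, since the GptOssForCausalLM class is quite new. A minimal first thing to try, assuming a fresh Kaggle/Colab kernel:
# Upgrade transformers so the GptOss* classes are actually importable,
# then restart the kernel before re-running the load.
!pip install -U transformers accelerate

import transformers
print(transformers.__version__)  # gpt-oss support landed around v4.55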
Kaggle is not enough to run this model, I guess.
Can you try Google Colab?
Same here, I am using an A100 and hitting the same problem. I will try nested 4-bit quantization next.
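For reference, this is the kind of config I mean by nested 4-bit, i.e. bitsandbytes double quantization; whether it plays nicely with this MXFP4 checkpoint is another question:
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit NF4 with nested (double) quantization to shave off a bit more memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,  # the "nested" part
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gpt-oss-20b",
    quantization_config=bnb_config,
    device_map="auto",
)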
Are you trying the 20B model?
yes, I will try again
I am currently trying with a T4, then I am going to try an A100.
let me know how it goes, thanks
It's a bit of a complicated model to run. I think I'll give up for today; if you can manage it, let me know and send the notebook too, thanks.
yes, I think so too. It seems it really needs a single H100 GPU with 80 GB of memory.
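Rough back-of-envelope on the weights alone (ignoring activations and KV cache), assuming the ~21B total parameters from the model card:
# Very rough weight-memory estimate for gpt-oss-20b.
params = 21e9
gb = 1024**3
print(params * 2 / gb)    # bf16:  ~39 GB -> hence the 80 GB H100
print(params * 0.5 / gb)  # 4-bit: ~10 GB -> barely squeezes onto a 16 GB T4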
There is a cookbook currently being added that should help with running on a T4:
https://github.com/openai/openai-cookbook/pull/2003
Totally awesome. You guys are the best! I am going to try it in Colab.
There is another error when running the notebook from the cookbook on Google Colab:
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.98 GiB. GPU 0 has a total capacity of 14.74 GiB of which 1.96 GiB is free. Process 72549 has 12.77 GiB memory in use. Of the allocated memory 10.15 GiB is allocated by PyTorch, and 2.51 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
I got the same problem
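One thing worth trying before giving up is the env var the error message itself suggests. It has to be set before CUDA is initialized, so put it in the very first cell of the notebook; a sketch:
import os

# Must run before torch touches CUDA (i.e. before the first import of torch
# in a fresh kernel) to reduce fragmentation-related OOMs.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch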
It's now merged; feel free to lift the code, and more specifically the dependencies, from here: https://cookbook.openai.com/articles/gpt-oss/run-colab
Were you guys able to fine-tune gpt-oss-20b? I am facing the same issues you have mentioned above.
This model uses MXFP4; to avoid OOM, please use the Unsloth alternative: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-Fine-tuning.ipynb
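For context, MXFP4 is a 4-bit microscaling float format, and running it natively needs fairly recent GPU kernels. If I remember the transformers API correctly (worth double-checking against the docs, this is from memory), you can also ask it to dequantize the MXFP4 weights to bf16 on load, at the cost of far more memory:
from transformers import AutoModelForCausalLM, Mxfp4Config

# Assumption: Mxfp4Config(dequantize=True) exists in your transformers
# version; it unpacks the MXFP4 weights to bf16 (~40 GB for this model).
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gpt-oss-20b",
    quantization_config=Mxfp4Config(dequantize=True),
    device_map="auto",
)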
Can you tell me about gpt-oss-20b? Can we train it for any type of use case, especially something like customer care?
Also, with the link you provided I am getting this error: AcceleratorError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
I have to let Unsloth know. Where in the program does it happen?
from transformers import TextStreamer

messages = [
    {"role": "user", "content": "Solve x^5 + 3x^4 - 10 = 3."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
    reasoning_effort = "low",  # NEW! Set reasoning effort to low, medium or high
).to(model.device)
_ = model.generate(**inputs, max_new_tokens = 512, streamer = TextStreamer(tokenizer))
In this block of code.
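Before reporting it, it might help to rerun with synchronous kernel launches so the stack trace points at the real line, as the error text itself suggests; set this before anything touches CUDA:
import os

# Forces synchronous CUDA launches so the device-side assert surfaces at the
# offending call instead of some later API call. Slow; for debugging only.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"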
I will let Unsloth know; I have the same problem too.
I talked to Unsloth. It is working now; try again.