Kaggle error

#38
by legolasyiu - opened

Hello

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="unsloth/gpt-oss-20b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

Error : "# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="unsloth/gpt-oss-20b")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)"

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("unsloth/gpt-oss-20b")
model = AutoModelForCausalLM.from_pretrained("unsloth/gpt-oss-20b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
Getting this error: "ValueError: Could not find GptOssForCausalLM neither in <module 'transformers.models.gpt_oss' from '/usr/local/lib/python3.11/dist-packages/transformers/models/gpt_oss/__init__.py'> nor in <module 'transformers' from '/usr/local/lib/python3.11/dist-packages/transformers/__init__.py'>!"

Please fix.
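
(For anyone else hitting this: the ValueError usually means the installed transformers build does not actually expose GptOssForCausalLM, so the usual first step is to upgrade transformers and restart the runtime. A minimal sketch for a Kaggle/Colab cell; the minimum version is my assumption, not something stated in this thread.)

# Notebook cell: upgrade transformers, then restart the runtime so the
# freshly installed version is the one that actually gets imported.
%pip install -U transformers

import transformers
print(transformers.__version__)  # assumption: gpt_oss support needs roughly 4.55+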

Kaggle is not enough to run this model, I guess.

Can you try Google Colab?

Same here; I am using an A100 and hitting the same problem. I will try nested 4-bit quantization.
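
By "nested 4 bit" I mean bitsandbytes double quantization; a rough, untested sketch of the loading code, assuming bitsandbytes can handle this architecture:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Nested (double) 4-bit quantization: the quantization constants are
# themselves quantized, which saves a little more memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,   # the "nested" part
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gpt-oss-20b",
    quantization_config=bnb_config,
    device_map="auto",  # spill layers to CPU if the GPU is too small
)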

Are you trying the 20B model?

yes, I will try again

I am currently trying with a T4, then I am going to try with an A100.

let me know how it goes, thanks

It's a bit of a complicated model to run. I think I'll give up for today; if you manage it, let me know and send the notebook, thanks.

Yes, I think so too. It seems it really needs a single H100 GPU with 80 GB of memory.
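
A rough back-of-the-envelope for why 80 GB comes up, assuming about 21B parameters (my approximation) and no quantization:

params = 21e9  # approximate parameter count (assumption)
for name, bytes_per_param in [("bf16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 2**30
    print(f"{name}: ~{gib:.0f} GiB for weights alone")
# bf16 weights alone are ~39 GiB before activations and KV cache,
# which is why a 16 GB T4 struggles and an 80 GB H100 is comfortable.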

There is currently a cookbook being added that should help with running on a T4:
https://github.com/openai/openai-cookbook/pull/2003

Totally awesome, you guys are the best! I am going to try it in Colab.

There is another error when running the notebook from the cookbook on Google Colab:
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.98 GiB. GPU 0 has a total capacity of 14.74 GiB of which 1.96 GiB is free. Process 72549 has 12.77 GiB memory in use. Of the allocated memory 10.15 GiB is allocated by PyTorch, and 2.51 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
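
The message itself points at the workaround; a small sketch of applying it in a notebook. Note it only helps when a lot of memory is reserved but unallocated (fragmentation), not when the model simply does not fit:

import os
# Must be set before torch initializes CUDA, i.e. before "import torch".
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch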

I got the same problem

It's now merged; feel free to lift the code, and more specifically the dependencies, from here: https://cookbook.openai.com/articles/gpt-oss/run-colab

Were you guys able to fine-tune gpt-oss-20b? I am facing the same issues you have mentioned above.

Can you tell me about gpt-oss-20b? Can we train it for any type of use case, especially something like customer care?

Also, with the link you provided I am getting this error, buddy:
AcceleratorError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
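
As that traceback says, the reported location can be wrong for asynchronous CUDA errors; a tiny sketch for getting an accurate stack trace, set at the very top of the notebook before CUDA is initialized:

import os
# Make CUDA calls synchronous so the stack trace points at the real failing op.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"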

I will have to let Unsloth know. Where in the program does it happen?

from transformers import TextStreamer

messages = [
    {"role": "user", "content": "Solve x^5 + 3x^4 - 10 = 3."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
    reasoning_effort = "low",  # NEW! Set reasoning effort to low, medium or high
).to(model.device)

_ = model.generate(**inputs, max_new_tokens = 512, streamer = TextStreamer(tokenizer))

In this block of code.

I will let Unsloth know; I also have the same problem.

I talked to Unsloth. It is working now; try again.
