Proposal to revise multimodality statement

#12

by dkleine - opened 6 days ago

The current sentence in the model card

"Gemma 3 models are multimodal, handling text and image input and generating text output"

appears overly broad, as not all Gemma 3 model sizes support image input (the smaller 270M and 1B variants are text-only).
