Instructions to use google/gemma-4-E2B-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use google/gemma-4-E2B-it with Transformers:
# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("google/gemma-4-E2B-it") model = AutoModelForImageTextToText.from_pretrained("google/gemma-4-E2B-it") - Notebooks
- Google Colab
- Kaggle
Update README.md
Browse files
README.md
CHANGED
|
@@ -380,7 +380,10 @@ Compared to Gemma 3, the models use standard `system`, `assistant`, and `user` r
|
|
| 380 |
|
| 381 |
### 4. Modality order
|
| 382 |
|
| 383 |
-
|
|
|
|
|
|
|
|
|
|
| 384 |
|
| 385 |
### 5. Variable Image Resolution
|
| 386 |
|
|
|
|
| 380 |
|
| 381 |
### 4. Modality order
|
| 382 |
|
| 383 |
+
For optimal performance with multimodal inputs, place:
|
| 384 |
+
|
| 385 |
+
* Image content **before** the text in your prompt.
|
| 386 |
+
* Audio content **after** the text in your prompt.
|
| 387 |
|
| 388 |
### 5. Variable Image Resolution
|
| 389 |
|