Update README.md
added context on the tokens
README.md
CHANGED
@@ -70,6 +70,10 @@ generation_text = processor.batch_decode(generation_output[:, -7:], skip_special
assert generation_text == ['A bus parked on the side of a road.']
```
+N.B.: The token `|SPEAKER|` is a placeholder for image patch embeddings, so it appears in the model context (e.g., in the portion of `generation_output` that represents the model context).
+`|NEWLINE|` is the "image newline" token; it denotes a new row in the raster-scan ordering of the input image patches.
+`\x04` is the "beginning of answer" token.
+
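As a rough illustration of the note above, the sketch below separates the answer from the model context in a decoded string. The three token strings are taken from the note; the helper name `extract_answer` and the sample `decoded` string are hypothetical and made up for this example, not real model output or part of the Fuyu API.

```python
# Token strings as described in the note above.
IMAGE_PLACEHOLDER = "|SPEAKER|"  # stands in for image patch embeddings
IMAGE_NEWLINE = "|NEWLINE|"      # denotes a new row of image patches
ANSWER_START = "\x04"            # "beginning of answer" token


def extract_answer(decoded: str) -> str:
    """Hypothetical helper: keep only the text after the last answer marker."""
    # rpartition returns the whole string as the last element
    # when the marker is absent, so unmarked text passes through.
    _, _, answer = decoded.rpartition(ANSWER_START)
    # Defensive cleanup in case image tokens remain in the kept text.
    for token in (IMAGE_PLACEHOLDER, IMAGE_NEWLINE):
        answer = answer.replace(token, "")
    return answer.strip()


# Made-up decoded string for illustration only.
decoded = (
    "|SPEAKER||SPEAKER||NEWLINE||SPEAKER||SPEAKER||NEWLINE|"
    "Generate a coco-style caption.\n\x04 A bus parked on the side of a road."
)
print(extract_answer(decoded))
```

Slicing `generation_output` before decoding, as the README example does with `[:, -7:]`, avoids the context tokens altogether; string post-processing like this is only an alternative when the full sequence has already been decoded.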
Fuyu can also perform some question answering on natural images and charts/diagrams (though fine-tuning may be required for good performance):
```python
text_prompt = "What color is the bus?\n"