ydshieh committed · Commit 9a97c24 · Parent(s): 212ecaa · improve doc

run_image_captioning_flax.py CHANGED
@@ -114,10 +114,14 @@ class TrainingArguments:
     )
     _block_size_doc = \
     """
-
-
-
-
+    The default value `0` will preprocess (tokenization + feature extraction) the whole dataset before training and
+    cache the results. This uses more disk space, but avoids (repeated) processing time during training. This is a
+    good option if your disk space is large enough to store the whole processed dataset.
+    If a positive value is given, the captions in the dataset will be tokenized before training and the results are
+    cached. During training, it iterates the dataset in chunks of size `block_size`. On each block, images are
+    transformed by the feature extractor with the results being kept in memory (no cache), and batches of size
+    `batch_size` are yielded before processing the next block. This could avoid the heavy disk usage when the
+    dataset is large.
     """
     block_size: int = field(
         default=0,
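The block-wise iteration the new docstring describes can be sketched roughly as follows. This is a minimal illustration of the idea only, not the actual implementation in run_image_captioning_flax.py; the names `dataset`, `extract_features`, and `iterate_in_blocks` are hypothetical.

```python
def iterate_in_blocks(dataset, block_size, batch_size, extract_features):
    """Yield batches, feature-extracting one block of examples at a time.

    Sketch of the behavior described in `_block_size_doc` (illustrative
    names, not the script's real helpers).
    """
    n = len(dataset)
    if block_size <= 0:
        # block_size == 0: treat the whole dataset as a single block
        # (the script instead preprocesses and caches everything up front).
        block_size = n
    for start in range(0, n, block_size):
        # Feature-extract only this block; results stay in memory (no cache).
        end = min(start + block_size, n)
        block = [extract_features(dataset[i]) for i in range(start, end)]
        # Yield batches of size `batch_size` before moving to the next block,
        # so at most one block's features are held in memory at a time.
        for b in range(0, len(block), batch_size):
            yield block[b:b + batch_size]
```

With a positive `block_size`, only one block's worth of extracted features lives in memory at a time, which trades repeated per-epoch computation for the disk space a full preprocessed cache would need.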