ibm-ai-platform/codellama-13b-accelerator


JRosenkranz committed on Apr 21, 2024

Commit 5e4a937 (verified) · 1 Parent(s): 1efe76f

Update README.md

Files changed (1)

README.md +3 -3
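
Why the three changed lines matter: in a POSIX shell, a backslash immediately before a newline continues the command onto the next line. Without it after `--prompt_type=code`, the command ends there, and `--compile` is parsed as a separate command instead of being passed to the script. A minimal sketch of that behavior, using a hypothetical `count_args` function as a stand-in for the real script (it is not part of fms-extras):

```shell
# Hypothetical stand-in for the real script: prints how many arguments it got.
count_args() { echo "$#"; }

# With a trailing backslash, both flags reach the same command.
with_backslash=$(count_args --prompt_type=code \
    --compile)

# Without it, the command ends at the newline and the next line runs as a
# separate command (which fails here, since "--compile" is not a program).
without_backslash=$(count_args --prompt_type=code
    --compile 2>/dev/null || true)

echo "with: $with_backslash, without: $without_backslash"
# prints "with: 2, without: 1"
```

In the README commands this means the flags after `--prompt_type=code` were silently dropped from the invocation before this commit.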


@@ -95,7 +95,7 @@ python fms-extras/scripts/paged_speculative_inference.py \
     --speculator_path=ibm-fms/codellama-13b-accelerator \
     --speculator_source=hf \
     --top_k_tokens_per_head=4,3,2,2,2,2,2 \
-    --prompt_type=code
+    --prompt_type=code \
     --compile \
     --compile_mode=reduce-overhead
 ```
@@ -111,7 +111,7 @@ python fms-extras/scripts/paged_speculative_inference.py \
     --speculator_path=ibm-fms/codellama-13b-accelerator \
     --speculator_source=hf \
     --top_k_tokens_per_head=4,3,2,2,2,2,2 \
-    --prompt_type=code
+    --prompt_type=code \
     --compile \
 ```
@@ -127,7 +127,7 @@ python fms-extras/scripts/paged_speculative_inference.py \
     --speculator_source=hf \
     --batch_input \
     --top_k_tokens_per_head=4,3,2,2,2,2,2 \
-    --prompt_type=code
+    --prompt_type=code \
     --compile \
 ```