## 📈 Performance

- Video Benchmarks

| Model | MVBench | LongVideoBench | VideoMME (w/o sub) |
| --- | --- | --- | --- |
| InternVideo2.5 | 75.7 | 60.6 | 65.1 |

- Inference Speed

We measured the average inference speed (tokens/s) of generating 1024 new tokens and 5194 (8192 - 2998) new tokens with the context of a video (which takes 2998 tokens) under BF16 precision.

| Quantization | Speed (4022 total tokens) | Speed (8192 total tokens) |
| --- | --- | --- |
| BF16 | 33.40 | 31.91 |
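
Figures like these come from a simple wall-clock measurement around `generate`. The sketch below illustrates the general methodology and is an assumption, not the exact benchmark script; `model` and `inputs` stand for any BF16 causal LM and a pre-tokenized video-plus-prompt context.

```python
import time

import torch


def measure_tokens_per_second(model, inputs, max_new_tokens):
    """Rough tokens/s measurement around `generate` (a sketch, not the
    official benchmark script). `inputs` holds a pre-tokenized context,
    e.g. video tokens plus the text prompt."""
    torch.cuda.synchronize()  # make sure prior GPU work is done before timing
    start = time.perf_counter()
    with torch.inference_mode():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            min_new_tokens=max_new_tokens,  # force a fixed generation length
            do_sample=False,
        )
    torch.cuda.synchronize()  # wait for generation to actually finish
    elapsed = time.perf_counter() - start
    generated = output.shape[1] - inputs["input_ids"].shape[1]
    return generated / elapsed


# The two table settings: 1024 and 5194 new tokens on a 2998-token context.
# speed_1k = measure_tokens_per_second(model, inputs, 1024)
# speed_5k = measure_tokens_per_second(model, inputs, 5194)
```

Pinning `min_new_tokens` to `max_new_tokens` keeps the generated length fixed, so the 1024-token and 5194-token runs are directly comparable.
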
## 🚀 How to use the model
First, you need to install [flash attention2](https://github.com/Dao-AILab/flash-attention) and some other modules. We provide a simple installation example below:
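
A minimal sketch of such an installation, assuming a pip-based environment; apart from the `flash-attn` command, which follows that project's own documentation, the companion modules listed here are typical video-inference dependencies and are assumptions rather than the repository's pinned list:

```bash
# Flash attention2 (install command from the flash-attention repository)
pip install flash-attn --no-build-isolation

# Typical companion modules for video inference (illustrative; consult
# the repository for the exact packages and pinned versions)
pip install transformers torch torchvision decord imageio opencv-python
```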