prithivida/Splade_PP_en_v1


prithivida committed on Feb 16, 2024

Commit b2893b9 (verified) · 1 Parent(s): ab4e07e

Update README.md

Files changed (1)

README.md +6 -13


@@ -29,19 +29,12 @@ SPLADE models are a fine balance between retrieval effectiveness (quality) and r
 *(Pure MLE folks should not conflate efficiency with model inference efficiency. Our main focus is on retrieval efficiency. Hereinafter, efficiency is shorthand for retrieval efficiency unless explicitly qualified otherwise. Not that inference efficiency is unimportant; we will address that subsequently.)*
 **TL;DR of Our attempt & results**
-1. FLOPS tuning:
-   - Separate **seq len for doc and query** unlike Official SPLADE++.
-   - **Severely restrictive token budget** doc(128) & query(24) NOT 256 unlike Official SPLADE++.
-   - Idea inspired by **SparseEmbed** (instead of 2 models for query & doc).
-2. Init Weights: **MLM adapted on MS MARCO corpus**.
-3. Achieves a modest yet competitive effectiveness - **MRR@10 37.22** in ID data (& OOD).
-2. and a retrieval latency of - **47.27ms**. (multi-threaded)
-3. On **mono-GPU** with **only 5 negatives per query**.
-4. For Industry setting
-   - Effectiveness on custom domains needs more than just **Trading FLOPS for tiny gains**.
-   - The Premise "SPLADE++ are not well suited to mono-cpu retrieval" does not hold.
-<img src="./ID.png" width=500 height=350/>
+1. FLOPS tuning: Separate **seq lens and a severely restrictive token budget**, doc(128) & query(24), NOT 256 as in Official SPLADE++. Inspired by **SparseEmbed** (instead of 2 models for query & doc).
+2. Init Weights: **MLM adapted on MS MARCO corpus**.
+3. Achieves a modest yet competitive effectiveness, **MRR@10 37.22** in ID data (& OOD), and a retrieval latency of **47.27ms** (multi-threaded), all on **mono-GPU** with **only 5 negatives per query**.
+4. For Industry setting: Effectiveness on custom domains needs more than just **Trading FLOPS for tiny gains**, and the premise "SPLADE++ are not well suited to mono-cpu retrieval" does not hold.
+<img src="./ID.png" width=550 height=450/>
 *Note: The paper refers to the best performing models as SPLADE++, hence for consistency we are reusing the same.*
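
The TL;DR above quotes an **MRR@10** figure without showing how the metric is computed. As a minimal sketch under the standard definition (the mean, over queries, of the reciprocal rank of the first relevant document within the top 10), it could look like the following. The function name `mrr_at_k`, the dictionary layout, and the toy IDs are assumptions made for illustration; they are not taken from the README or from any evaluation script of this model.

```python
from typing import Dict, List, Set


def mrr_at_k(
    ranked: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank of the first relevant document within the top k.

    ranked:   query id -> document ids, best first (hypothetical layout)
    relevant: query id -> set of relevant document ids
    """
    if not ranked:
        return 0.0
    total = 0.0
    for qid, doc_ids in ranked.items():
        gold = relevant.get(qid, set())
        # Only the top-k results count; a query with no hit contributes 0.
        for rank, doc_id in enumerate(doc_ids[:k], start=1):
            if doc_id in gold:
                total += 1.0 / rank
                break
    return total / len(ranked)


# Toy data, invented purely for illustration.
ranked = {
    "q1": ["d3", "d1", "d7"],  # first relevant hit at rank 2 -> 1/2
    "q2": ["d9", "d2"],        # first relevant hit at rank 1 -> 1
    "q3": ["d4"],              # no relevant hit             -> 0
}
relevant = {"q1": {"d1"}, "q2": {"d9"}, "q3": {"d5"}}

score = mrr_at_k(ranked, relevant)
print(f"MRR@10 = {score:.4f}")
```

This function returns a value in [0, 1]. A figure such as 37.22 is presumably the same quantity scaled by 100, though the diff does not state the scale explicitly.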