BM-K committed on
Commit c702c15 · verified · 1 Parent(s): ba81b69

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -46,7 +46,7 @@ We report **Normalized Discounted Cumulative Gain (NDCG)** scores, which measure
  - **Avg. NDCG**: Average of NDCG@1, @3, @5, and @10 across all benchmark datasets.
  - **NDCG@k**: Relevance quality of the top-*k* retrieved results.
 
- #### Korean Retrieval Benchmarks
+ #### 7 Datasets of MTEB (Korean)
  Our model, **telepix/PIXIE-Rune-Preview**, achieves state-of-the-art performance across most metrics and benchmarks, demonstrating strong generalization across domains such as multi-hop QA, long-document retrieval, public health, and e-commerce.
 
  | Model Name | # params | Avg. NDCG | NDCG@1 | NDCG@3 | NDCG@5 | NDCG@10 |
@@ -78,7 +78,7 @@ Descriptions of the benchmark datasets used for evaluation are as follows:
  - **XPQARetrieval**
  A real-world dataset constructed from user queries and relevant product documents in a Korean e-commerce platform.
 
- #### English Retrieval Benchmarks
+ #### 7 Datasets of BEIR (English)
  Our model, **telepix/PIXIE-Rune-Preview**, achieves strong performance on a wide range of tasks, including fact verification, multi-hop question answering, financial QA, and scientific document retrieval, demonstrating competitive generalization across diverse domains.
 
  | Model Name | # params | Avg. NDCG | NDCG@1 | NDCG@3 | NDCG@5 | NDCG@10 |
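
Note on the metrics named in the context lines: the README defines NDCG@k and Avg. NDCG only in words, so here is a minimal single-query sketch of how such scores could be computed. It is an illustration under assumptions, not the evaluation code behind the reported tables. The function names are hypothetical, the gain is assumed to be linear in the relevance grade (some implementations use `2**rel - 1`), and `judged` is assumed to hold the grades of every judged document for the query.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance grades, in ranked order."""
    # Rank 0 gets discount log2(2) = 1, rank 1 gets log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked, k, judged=None):
    """NDCG@k: DCG@k of the ranking divided by DCG@k of the ideal ordering.

    `ranked` holds the grades of the retrieved documents in rank order.
    `judged` (assumption) holds the grades of all judged documents for the
    query; if omitted, the ideal ordering is built from `ranked` itself.
    """
    pool = ranked if judged is None else judged
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal if ideal > 0 else 0.0

def avg_ndcg(ranked, ks=(1, 3, 5, 10), judged=None):
    """Mean of NDCG@k over the cutoffs listed in the README (1, 3, 5, 10)."""
    return sum(ndcg_at_k(ranked, k, judged) for k in ks) / len(ks)
```

A benchmark-level score would additionally average these per-query values over all queries in a dataset, and then over datasets, which this sketch leaves out.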