@ashercn97 on AI快站: "does anyone know what the SOTA in text embedding is? Specifically for like…"

posted an update Mar 10, 2025

Does anyone know what the SOTA in text embedding is? Specifically for sentence similarity and clustering?

I think that the MTEB leaderboard is super complex. I feel lost looking at it (what metric should I judge by?)

monsoon-nlp

Mar 10, 2025

I would say sort by "Mean (task)" and pick one of the top models. Or, if you can, compare three of the best on your own data. That holds unless you need a longer context, or you work in medicine or a similar field where domain-specific models exist.
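To make "compare three of the best on your data" concrete, here is a rough sketch of one way to do it for sentence similarity: score each candidate by the Spearman correlation between its cosine similarities and your own human similarity ratings. Everything here is an assumption for illustration, not something from the leaderboard: the helper names (`score_model`, `rank_models`), the shape of `pairs`, and the two toy encoders, which only stand in for real embedding models so the script runs without downloads.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Encoder = Callable[[str], Vector]
Pair = Tuple[str, str, float]  # (text_a, text_b, human similarity rating)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


def score_model(encode: Encoder, pairs: Sequence[Pair]) -> float:
    """How well the model's similarities track the human ratings."""
    sims = [cosine(encode(a), encode(b)) for a, b, _ in pairs]
    gold = [rating for _, _, rating in pairs]
    return spearman(sims, gold)


def rank_models(models: Dict[str, Encoder], pairs: Sequence[Pair]):
    """Return (name, score) tuples, best model first."""
    scores = {name: score_model(enc, pairs) for name, enc in models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy stand-ins for real embedding models (hypothetical, for illustration).
def letter_counts(text: str) -> List[float]:
    return [float(text.lower().count(c)) for c in "abcdefghijklmnopqrstuvwxyz"]


def constant(text: str) -> List[float]:
    return [1.0]


if __name__ == "__main__":
    my_pairs = [("a cat sat", "a cat sat", 5.0), ("abc", "xyz", 0.0), ("aab", "abb", 3.0)]
    for name, score in rank_models({"letters": letter_counts, "constant": constant}, my_pairs):
        print(f"{name}: {score:.3f}")
```

To try real models, each toy encoder could be swapped for something like `SentenceTransformer(model_name).encode` from the sentence-transformers library, assuming the model you picked from the leaderboard loads that way. A few dozen labeled pairs from your own domain will probably tell you more than the leaderboard average. For clustering you would want a different score (for example, agreement between cluster assignments and known labels), which this sketch does not cover.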