Shirin Yamani (ShirinYamani)
25 followers · 18 following
shirinyamani
AI & ML interests: Core ML
Recent Activity
* Updated a dataset 2 days ago: trl-lib/documentation-images
* Reacted to arthurbresnu's post with 🚀 about 2 months ago:
‼️ Sentence Transformers v5.0 is out! The biggest update yet introduces Sparse Embedding models, encode-method improvements, the Router module & much more. Sparse + Dense = 🔥 hybrid search performance!

1️⃣ Sparse Encoder Models - new support for sparse embeddings (30k+ dims, <1% non-zero)
* Full SPLADE, Inference-free SPLADE, and CSR support
* 4 new modules, 12 losses, 9 evaluators
* Integration with elastic, opensearch-project, Qdrant, ibm-granite
* Decode interpretable embeddings
* Hybrid search integration

2️⃣ Enhanced Encode Methods
* encode_query & encode_document with automatic prompts
* Direct device-list passing to encode()
* Cleaner multi-processing

3️⃣ Router Module & Training
* Different paths for queries vs. documents
* Custom learning rates per parameter group
* Composite loss logging
* Perfect for two-tower architectures

4️⃣ Documentation & Training
* New Training/Loss Overview docs
* 6 training example pages
* Search-engine integration examples

Read the comprehensive blog post about training sparse embedding models: https://huggingface.co/blog/train-sparse-encoder
See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v5.0.0

What's next? We would love to hear from the community! What sparse encoder models would you like to see? And what new capabilities should Sentence Transformers handle - multimodal embeddings, late interaction models, or something else? Your feedback shapes our roadmap!

I'm incredibly excited to see the community explore sparse embeddings and hybrid search! The interpretability alone makes this a game-changer for understanding what your models are actually doing.

🙏 Thanks to @tomaarsen for this incredible opportunity!
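As context for the post above, here is a minimal sketch of what the sparse-encoder workflow might look like, assuming Sentence Transformers ≥ 5.0. The checkpoint name (naver/splade-cocondenser-ensembledistil), the example texts, and the exact similarity/decode calls are illustrative assumptions rather than details taken from the post itself.

```python
# Hedged sketch: assumes sentence-transformers >= 5.0 and uses
# "naver/splade-cocondenser-ensembledistil" purely as an example SPLADE checkpoint.
from sentence_transformers import SparseEncoder

model = SparseEncoder("naver/splade-cocondenser-ensembledistil")

queries = ["what is hybrid search?"]
documents = [
    "Hybrid search combines sparse (lexical) and dense (semantic) retrieval.",
    "SPLADE produces high-dimensional, mostly-zero embeddings over the vocabulary.",
]

# encode_query / encode_document apply the appropriate prompts/paths automatically.
query_emb = model.encode_query(queries)
doc_emb = model.encode_document(documents)

# Score each query against each document.
scores = model.similarity(query_emb, doc_emb)
print(scores)

# Sparse embeddings are interpretable: inspect the top weighted vocabulary tokens.
print(model.decode(query_emb[0], top_k=10))
```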
* Upvoted an article 2 months ago: No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL
Organizations
Articles (1)
* 🐯 Liger GRPO meets TRL · 49
spaces (3)
* 🐢 Strl Sft · No application file
* 💻 SFT Job · testing SFT script · No application file
* 👁 Compute · Sleeping
models (8)
* ShirinYamani/Qwen3-4B-Base-SFT · Text Generation · 4B · Updated May 25 · 16
* ShirinYamani/Qwen2.5-0.5B-SFT-model · Updated Dec 20, 2024
* ShirinYamani/chronos-t5-small-fine-tuned · 0.0B · Updated Sep 5, 2024 · 6
* ShirinYamani/llama-2-7b-fine-tuned · Updated Jun 24, 2024 · 1
* ShirinYamani/huggyllama-llama-7b-finetuned · Text Generation · Updated Jun 20, 2024
* ShirinYamani/llama-3-8B-fine-tuned-dora · Updated Jun 12, 2024
* ShirinYamani/mistral7b-fine-tuned-qlora · Text Generation · Updated Jun 12, 2024 · 5
* ShirinYamani/NLTK-tokenizer · Updated Jan 3, 2024 · 1
datasets (2)
* ShirinYamani/2011-2017-load · Viewer · Updated Nov 30, 2024 · 52.6k · 6
* ShirinYamani/ts · Updated Aug 7, 2024 · 3