---
library_name: sentence-transformers
pipeline_tag: sentence-similarity
license: apache-2.0
tags:
- embeddings
- semantic-search
- sentence-transformers
- presentation-templates
- information-retrieval
---
# Field-adaptive-bi-encoder
## Model Details
### Model Description
A fine-tuned SentenceTransformers bi-encoder for semantic similarity and information retrieval. It is trained specifically to retrieve relevant presentation templates by matching user queries against template descriptions and metadata (industries, categories, tags).
**Developed by:** Mudasir Syed (mudasir13cs)
**Model type:** SentenceTransformer (Bi-encoder)
**Language(s) (NLP):** English
**License:** Apache 2.0
**Finetuned from model:** microsoft/MiniLM-L12-H384-uncased
### Model Sources
**Repository:** https://github.com/mudasir13cs/hybrid-search
## Uses
### Direct Use
This model is designed for semantic search and information retrieval tasks, specifically for finding relevant presentation templates based on natural language queries.
### Downstream Use
- Presentation template recommendation systems
- Content discovery platforms
- Semantic search engines
- Information retrieval systems
### Out-of-Scope Use
- Text generation
- Question answering
- Machine translation
- Any task not related to semantic similarity
## Bias, Risks, and Limitations
- The model is trained on presentation template data and may not generalize well to other domains
- Performance may vary based on the quality and diversity of training data
- The model inherits biases present in the base model and training data
## How to Get Started with the Model
```python
from sentence_transformers import SentenceTransformer, util

# Load the fine-tuned bi-encoder
model = SentenceTransformer("mudasir13cs/Field-adaptive-bi-encoder")

# Encode two texts into 384-dimensional embeddings
texts = ["business presentation template", "marketing slides for startups"]
embeddings = model.encode(texts)

# Cosine similarity between the two embeddings
cosine_scores = util.cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {cosine_scores.item():.4f}")
```
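Since the model targets retrieval rather than pairwise scoring, here is a minimal search sketch; the template descriptions below are hypothetical stand-ins for a real catalog:
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("mudasir13cs/Field-adaptive-bi-encoder")

# Hypothetical template descriptions standing in for a real catalog
templates = [
    "Minimal pitch deck for SaaS startups with traction and roadmap slides",
    "Colorful birthday party invitation with photo placeholders",
    "Quarterly business review deck for corporate finance teams",
]
template_embeddings = model.encode(templates, convert_to_tensor=True)

# Embed the query and retrieve the top matching templates
query_embedding = model.encode("marketing slides for startups", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, template_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.4f}  {templates[hit['corpus_id']]}")
```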
## Training Details
### Training Data
- **Dataset:** Presentation template dataset with descriptions and queries
- **Size:** Custom dataset of presentation templates with metadata
- **Source:** Curated presentation template collection
### Training Procedure
- **Architecture:** SentenceTransformer with triplet loss
- **Loss Function:** Triplet loss with hard negative mining
- **Optimizer:** AdamW
- **Learning Rate:** 2e-5
- **Batch Size:** 16
- **Epochs:** 3
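The procedure above corresponds to the standard SentenceTransformers `fit` loop. A minimal sketch under the stated hyperparameters (the triplet texts and `warmup_steps` value are illustrative assumptions, not taken from the actual training run):
```python
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Hypothetical (anchor, positive, negative) triplet; real negatives
# would come from hard negative mining as described above
train_examples = [
    InputExample(texts=[
        "pitch deck for a tech startup",                  # anchor: user query
        "Modern startup pitch deck with product slides",  # positive: relevant template
        "Elegant wedding photo album template",           # negative: irrelevant template
    ]),
]

model = SentenceTransformer("microsoft/MiniLM-L12-H384-uncased")
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.TripletLoss(model=model)

# AdamW is the library default optimizer; lr and epochs match the card
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=3,
    optimizer_params={"lr": 2e-5},
    warmup_steps=100,  # assumption: not stated in this card
)
```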
### Training Hyperparameters
- **Training regime:** Supervised learning with triplet loss
- **Hardware:** GPU (NVIDIA)
- **Training time:** ~2 hours
## Evaluation
### Testing Data, Factors & Metrics
- **Testing Data:** Validation split from presentation template dataset
- **Factors:** Query-description similarity, template relevance
- **Metrics:**
- MAP@K (Mean Average Precision at K)
- MRR@K (Mean Reciprocal Rank at K)
- Cosine similarity scores
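These metrics can be computed with the built-in `InformationRetrievalEvaluator`; a sketch with illustrative placeholder data (the actual validation split is not published with this card):
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("mudasir13cs/Field-adaptive-bi-encoder")

# Placeholder held-out data: query id -> text, doc id -> text,
# and the set of relevant doc ids per query
queries = {"q1": "business presentation template"}
corpus = {
    "d1": "Clean corporate slide deck for quarterly reviews",
    "d2": "Hand-drawn birthday invitation template",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries, corpus, relevant_docs,
    map_at_k=[10], mrr_at_k=[10],
    name="template-retrieval",
)
score = evaluator(model)  # logs MAP@10 / MRR@10 per score function
```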
### Results
- **MAP@10:** ~0.85
- **MRR@10:** ~0.90
- **Performance:** Optimized for presentation template retrieval
## Environmental Impact
- **Hardware Type:** NVIDIA GPU
- **Hours used:** ~2 hours
- **Cloud Provider:** None (trained locally)
- **Carbon Emitted:** Minimal (~2 GPU-hours of local training)
## Technical Specifications
### Model Architecture and Objective
- **Architecture:** Transformer-based bi-encoder
- **Objective:** Learn semantic representations for similarity search
- **Input:** Text sequences (queries and descriptions)
- **Output:** 384-dimensional embeddings
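These properties can be verified directly from the loaded model:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("mudasir13cs/Field-adaptive-bi-encoder")
print(model.get_sentence_embedding_dimension())  # 384
print(model.max_seq_length)  # maximum input length in tokens
```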
### Compute Infrastructure
- **Hardware:** NVIDIA GPU
- **Software:** PyTorch, SentenceTransformers, Transformers
## Citation
**BibTeX:**
```bibtex
@misc{field-adaptive-bi-encoder,
  title={Field-adaptive Bi-encoder for Presentation Template Search},
  author={Syed, Mudasir},
  year={2024},
  url={https://huggingface.co/mudasir13cs/Field-adaptive-bi-encoder}
}
```
**APA:**
Syed, M. (2024). Field-adaptive Bi-encoder for Presentation Template Search. Hugging Face. https://huggingface.co/mudasir13cs/Field-adaptive-bi-encoder
## Model Card Authors
Mudasir Syed (mudasir13cs)
## Model Card Contact
- **GitHub:** https://github.com/mudasir13cs
- **Hugging Face:** https://huggingface.co/mudasir13cs
## Framework versions
- SentenceTransformers: 2.2.2
- Transformers: 4.35.0
- PyTorch: 2.0.0