	---
	library_name: sentence-transformers
	pipeline_tag: sentence-similarity
	license: apache-2.0
	tags:
	- embeddings
	- semantic-search
	- sentence-transformers
	- presentation-templates
	- information-retrieval
	---

	# Field-adaptive-bi-encoder

	## Model Details

	### Model Description
	A SentenceTransformers bi-encoder fine-tuned for semantic similarity and information retrieval. It is trained to find relevant presentation templates from user queries, using template descriptions and metadata (industries, categories, tags).

	Developed by: Mudasir Syed (mudasir13cs)

	Model type: SentenceTransformer (Bi-encoder)

	Language(s) (NLP): English

	License: Apache 2.0

	Finetuned from model: microsoft/MiniLM-L12-H384-uncased

	### Model Sources
	Repository: https://github.com/mudasir13cs/hybrid-search

	## Uses

	### Direct Use
	The model is intended for semantic search: retrieving relevant presentation templates from natural language queries.
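
A template usually has several fields (description, industries, categories, tags), while a bi-encoder takes a single string per item. The sketch below shows one possible way to flatten a template record into text before encoding. The field names, the `template_to_text` helper, and the "label: value" layout are assumptions made for illustration; the card does not document how fields were combined during training.

```python
def template_to_text(template: dict) -> str:
    """Flatten a template record into one string for encoding.

    Hypothetical helper: the field names and layout are assumptions,
    not the format used to train the model.
    """
    parts = []
    description = template.get("description", "").strip()
    if description:
        parts.append(description)
    # List-valued metadata fields are joined with commas; empty fields are skipped.
    for field in ("industries", "categories", "tags"):
        values = [v.strip() for v in template.get(field, []) if v.strip()]
        if values:
            parts.append(f"{field}: {', '.join(values)}")
    return ". ".join(parts)


example = {
    "description": "Minimal pitch deck with financial charts",
    "industries": ["finance", "startups"],
    "categories": ["pitch deck"],
    "tags": [],
}
text = template_to_text(example)
print(text)
```

The resulting string can be passed to `model.encode` like any other sentence. If the training format is known, matching it is likely to give better results than this generic layout.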

	### Downstream Use
	- Presentation template recommendation systems
	- Content discovery platforms
	- Semantic search engines
	- Information retrieval systems

	### Out-of-Scope Use
	- Text generation
	- Question answering
	- Machine translation
	- Any task not related to semantic similarity

	## Bias, Risks, and Limitations
	- The model is trained on presentation template data and may not generalize well to other domains
	- Performance may vary based on the quality and diversity of training data
	- The model inherits biases present in the base model and training data

	## How to Get Started with the Model

	```python
	from sentence_transformers import SentenceTransformer, util

	# Load the fine-tuned bi-encoder from the Hugging Face Hub
	model = SentenceTransformer("mudasir13cs/Field-adaptive-bi-encoder")

	# Encode two example queries into dense embeddings
	queries = ["business presentation template", "marketing slides for startups"]
	embeddings = model.encode(queries)

	# Compute the cosine similarity between the two embeddings
	cosine_scores = util.cos_sim(embeddings[0], embeddings[1])
	print(f"Similarity: {cosine_scores.item():.4f}")
	```
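
For retrieval, the same similarity is computed between one query and many template embeddings, and the highest-scoring templates are returned. The following is a minimal sketch of that ranking step in plain Python, using toy 3-dimensional vectors in place of real embeddings so that it runs without downloading the model. The `rank_templates` helper is hypothetical and not part of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_templates(query_emb, template_embs, top_k=3):
    """Return (index, score) pairs for the top_k most similar templates."""
    scored = [(i, cosine(query_emb, emb)) for i, emb in enumerate(template_embs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors stand in for the model's embeddings.
query = [1.0, 0.0, 0.0]
templates = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_templates(query, templates, top_k=2)
print(ranking)
```

With real embeddings from `model.encode`, `sentence_transformers.util.semantic_search` performs the same kind of top-k cosine ranking and is the more practical choice for large corpora.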

	## Training Details

	### Training Data
	- Dataset: Presentation template dataset with descriptions and queries
	- Size: Not reported (custom dataset of presentation templates with metadata)
	- Source: Curated presentation template collection

	### Training Procedure
	- Architecture: SentenceTransformer bi-encoder
	- Loss Function: Triplet loss with hard negative mining
	- Optimizer: AdamW
	- Learning Rate: 2e-5
	- Batch Size: 16
	- Epochs: 3
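
To make the objective concrete, here is a small sketch of a triplet loss and of picking the hardest negative for an anchor. The distance metric (Euclidean), the margin of 1.0, and the "closest negative" mining rule are assumptions chosen for illustration; the card does not state which settings were used in training.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(d(a, p) - d(a, n) + margin, 0); margin value is an assumption."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def hardest_negative(anchor, negatives):
    """Pick the negative closest to the anchor (one simple mining rule)."""
    return min(negatives, key=lambda n: euclidean(anchor, n))


anchor = [0.0, 0.0]
positive = [0.0, 1.0]                 # distance 1.0 from the anchor
negatives = [[3.0, 4.0], [0.0, 1.5]]  # distances 5.0 and 1.5 from the anchor

hard = hardest_negative(anchor, negatives)
loss_easy = triplet_loss(anchor, positive, negatives[0])
loss_hard = triplet_loss(anchor, positive, hard)
print(loss_easy, loss_hard)
```

The far negative already satisfies the margin and contributes zero loss, while the near one yields a positive loss. This is why mining hard negatives tends to give a more useful training signal than sampling negatives at random.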

	### Training Hyperparameters
	- Training regime: Supervised learning with triplet loss
	- Hardware: GPU (NVIDIA)
	- Training time: ~2 hours

	## Evaluation

	### Testing Data, Factors & Metrics
	- Testing Data: Validation split from presentation template dataset
	- Factors: Query-description similarity, template relevance
	- Metrics:
	  - MAP@K (Mean Average Precision at K)
	  - MRR@K (Mean Reciprocal Rank at K)
	  - Cosine similarity scores
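
The two ranking metrics can be computed as sketched below. This is an illustrative implementation, not the evaluation script behind the reported numbers. It assumes AP@K is normalised by `min(K, number of relevant items)`, which is one common convention among several, and the template IDs are made up.

```python
def average_precision_at_k(ranked_ids, relevant_ids, k):
    """AP@K for one query, normalised by min(k, number of relevant items)."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / min(k, len(relevant_ids))


def reciprocal_rank_at_k(ranked_ids, relevant_ids, k):
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_over_queries(metric, runs, k):
    """Average a per-query metric over (ranked_ids, relevant_ids) pairs."""
    return sum(metric(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Two toy queries with hypothetical template IDs.
runs = [
    (["t1", "t2", "t3"], {"t1", "t3"}),
    (["t4", "t5", "t6"], {"t5"}),
]
map_at_3 = mean_over_queries(average_precision_at_k, runs, k=3)
mrr_at_3 = mean_over_queries(reciprocal_rank_at_k, runs, k=3)
print(f"MAP@3={map_at_3:.4f} MRR@3={mrr_at_3:.4f}")
```

MRR@K only rewards the first relevant hit, whereas MAP@K also accounts for where the remaining relevant templates appear, so the two can differ noticeably on queries with several relevant results.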

	### Results
	- MAP@10: ~0.85
	- MRR@10: ~0.90
	- Performance: Optimized for presentation template retrieval

	## Environmental Impact
	- Hardware Type: NVIDIA GPU
	- Hours used: ~2 hours
	- Cloud Provider: Local/Cloud
	- Carbon Emitted: Minimal (local training)

	## Technical Specifications

	### Model Architecture and Objective
	- Architecture: Transformer-based bi-encoder
	- Objective: Learn semantic representations for similarity search
	- Input: Text sequences (queries and descriptions)
	- Output: 384-dimensional embeddings

	### Compute Infrastructure
	- Hardware: NVIDIA GPU
	- Software: PyTorch, SentenceTransformers, Transformers

	## Citation

	BibTeX:
	```bibtex
	@misc{field-adaptive-bi-encoder,
	title={Field-adaptive Bi-encoder for Presentation Template Search},
	author={Mudasir Syed},
	year={2024},
	url={https://huggingface.co/mudasir13cs/Field-adaptive-bi-encoder}
	}
	```

	APA:
	Syed, M. (2024). Field-adaptive Bi-encoder for Presentation Template Search. Hugging Face. https://huggingface.co/mudasir13cs/Field-adaptive-bi-encoder

	## Model Card Authors
	Mudasir Syed (mudasir13cs)

	## Model Card Contact
	- GitHub: https://github.com/mudasir13cs
	- Hugging Face: https://huggingface.co/mudasir13cs

	## Framework versions
	- SentenceTransformers: 2.2.2
	- Transformers: 4.35.0
	- PyTorch: 2.0.0
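
A quick way to compare a local environment against the versions listed above is sketched here using only the standard library. The PyPI distribution names (`sentence-transformers`, `transformers`, `torch`) are the usual ones for these libraries. The `compare_versions` helper is hypothetical, and a prefix match is used so that local build suffixes do not count as a mismatch.

```python
from importlib import metadata

# Versions listed in this model card.
EXPECTED = {
    "sentence-transformers": "2.2.2",
    "transformers": "4.35.0",
    "torch": "2.0.0",
}


def compare_versions(expected):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed and installed.startswith(wanted) else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```

A version that differs is not necessarily a problem; the listed versions are simply the ones the card reports.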