---
language:
- en
tags:
- computer-vision
- segmentation
- few-shot-learning
- zero-shot-learning
- sam2
- clip
- pytorch
license: apache-2.0
datasets:
- custom
metrics:
- iou
- dice
- precision
- recall
library_name: pytorch
pipeline_tag: image-segmentation
---

# SAM 2 Few-Shot/Zero-Shot Segmentation

This repository contains a research framework for combining Segment Anything Model 2 (SAM 2) with few-shot and zero-shot learning techniques for domain-specific segmentation tasks.

## 🎯 Overview

This project investigates how minimal supervision can adapt SAM 2 to new object categories across three distinct domains:
- **Satellite Imagery**: Buildings, roads, vegetation, water
- **Fashion**: Shirts, pants, dresses, shoes
- **Robotics**: Robots, tools, safety equipment

## 🏗️ Architecture

### Few-Shot Learning Framework
- **Memory Bank**: Stores CLIP-encoded support examples for each class (a minimal sketch follows this list)
- **Similarity-Based Prompting**: Uses visual similarity to the stored examples to generate SAM 2 prompts
- **Episodic Training**: Standard few-shot learning protocol
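
The pieces above can fit together as in the sketch below: a memory bank that stores normalized CLIP image embeddings per (domain, class) and retrieves the most similar support example for a query. This is a minimal, hypothetical illustration using the Hugging Face `transformers` CLIP API; the `ClipMemoryBank` class, its methods, and the checkpoint name are assumptions for exposition, not the actual interface of `models/sam2_fewshot.py`.

```python
# Hypothetical sketch of a CLIP memory bank; class name, methods, and
# checkpoint are illustrative assumptions, not the repository's actual API.
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor


class ClipMemoryBank:
    """Stores normalized CLIP image embeddings of support examples per (domain, class)."""

    def __init__(self, model_name="openai/clip-vit-base-patch32", device="cpu"):
        self.device = device
        self.model = CLIPModel.from_pretrained(model_name).to(device).eval()
        self.processor = CLIPProcessor.from_pretrained(model_name)
        self.bank = {}  # (domain, class) -> list of embedding tensors

    @torch.no_grad()
    def _embed(self, image):
        inputs = self.processor(images=image, return_tensors="pt").to(self.device)
        features = self.model.get_image_features(**inputs)
        return F.normalize(features, dim=-1).squeeze(0)

    def add_example(self, domain, cls, image):
        # Encode a support image and store its embedding under (domain, class).
        self.bank.setdefault((domain, cls), []).append(self._embed(image))

    def best_similarity(self, domain, cls, query_image):
        # Cosine similarity between the query and the closest support example;
        # a high score suggests the class is present and worth prompting SAM 2 for.
        query = self._embed(query_image)
        support = torch.stack(self.bank[(domain, cls)])
        return (support @ query).max().item()
```

In the full model, the support mask paired with the best-matching example would guide where point or box prompts are placed for SAM 2.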
					
						
### Zero-Shot Learning Framework
- **Advanced Prompt Engineering**: Four strategies (basic, descriptive, contextual, detailed); see the sketch after this list
- **Attention-Based Localization**: Uses CLIP attention maps for prompt generation
- **Multi-Strategy Prompting**: Combines the different prompt types
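
To make the four strategies concrete, the sketch below shows one plausible set of templates; the actual wording used in `models/sam2_zeroshot.py` may differ.

```python
# Plausible templates for the four prompt strategies; the repository's
# actual wording may differ.
PROMPT_TEMPLATES = {
    "basic": "{cls}",
    "descriptive": "a photo of a {cls}",
    "contextual": "a {cls} in a {domain} scene",
    "detailed": "a clear, well-lit {domain} image of a {cls} with visible boundaries",
}


def build_prompts(domain, cls):
    """Render every strategy's template for one (domain, class) pair."""
    return {name: tpl.format(cls=cls, domain=domain) for name, tpl in PROMPT_TEMPLATES.items()}


print(build_prompts("satellite", "building"))
# {'basic': 'building', 'descriptive': 'a photo of a building', ...}
```

Multi-strategy prompting can then score each rendered prompt against the image (for example, via CLIP text-image similarity) and combine or select among the strategies.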
					
						
## 📊 Performance

### Few-Shot Learning (5 shots)

| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 65% | 71% | Building (78%) | Water (52%) |
| Fashion | 62% | 68% | Shirt (75%) | Shoes (48%) |
| Robotics | 59% | 65% | Robot (72%) | Safety (45%) |

### Zero-Shot Learning (Best Strategy)

| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 42% | 48% | Building (62%) | Water (28%) |
| Fashion | 38% | 45% | Shirt (58%) | Shoes (25%) |
| Robotics | 35% | 42% | Robot (55%) | Safety (22%) |
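
For reference, the IoU and Dice numbers above can be computed from binary masks as follows. This is a minimal standalone implementation; the repository's `utils/metrics.py` may differ in details such as smoothing terms or handling of empty masks.

```python
# Minimal reference implementations of the two headline metrics; the
# repository's utils/metrics.py may handle edge cases differently.
import numpy as np


def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over Union between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, target).sum() / union)


def dice(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / total)
```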
					
						
## 🚀 Quick Start

### Installation
```bash
pip install -r requirements.txt
python scripts/download_sam2.py
```

### Few-Shot Experiment
```python
from models.sam2_fewshot import SAM2FewShot

# Initialize the model
model = SAM2FewShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda",
)

# Add a support example: an image and its binary mask for the target class
model.add_few_shot_example("satellite", "building", image, mask)

# Segment a query image using the stored support examples
predictions = model.segment(
    query_image,
    "satellite",
    ["building"],
    use_few_shot=True,
)
```

### Zero-Shot Experiment
```python
from models.sam2_zeroshot import SAM2ZeroShot

# Initialize the model
model = SAM2ZeroShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda",
)

# Segment using text-derived prompts only; no support examples needed
predictions = model.segment(
    image,
    "fashion",
    ["shirt", "pants", "dress", "shoes"],
)
```

## 📁 Project Structure

```
├── models/
│   ├── sam2_fewshot.py         # Few-shot learning model
│   └── sam2_zeroshot.py        # Zero-shot learning model
├── experiments/
│   ├── few_shot_satellite.py   # Satellite experiments
│   └── zero_shot_fashion.py    # Fashion experiments
├── utils/
│   ├── data_loader.py          # Domain-specific data loaders
│   ├── metrics.py              # Evaluation metrics (IoU, Dice, precision, recall)
│   └── visualization.py        # Visualization tools
├── scripts/
│   └── download_sam2.py        # Setup script
└── notebooks/
    └── analysis.ipynb          # Interactive analysis
```

## 🔬 Research Contributions

1. **Novel Architecture**: Combines SAM 2 with CLIP for few-shot and zero-shot segmentation
2. **Domain-Specific Prompting**: Prompt engineering tailored to each target domain
3. **Attention-Based Prompt Generation**: Leverages CLIP attention for object localization
4. **Cross-Domain Evaluation**: Experiments spanning satellite, fashion, and robotics imagery
5. **Open-Source Implementation**: Complete codebase for reproducibility

## 📚 Citation

If you use this work in your research, please cite:

```bibtex
@misc{sam2_fewshot_zeroshot_2024,
  title={SAM 2 Few-Shot/Zero-Shot Segmentation: Domain Adaptation with Minimal Supervision},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/esalguero/Segmentation}
}
```

## 🤝 Contributing

We welcome contributions! Please feel free to submit issues, pull requests, or suggestions for improvements.

## 📄 License

This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

## 🔗 Links

- **GitHub Repository**: [https://github.com/ParallelLLC/Segmentation](https://github.com/ParallelLLC/Segmentation)
- **Research Paper**: See `research_paper.md` for the complete methodology
- **Interactive Analysis**: Use `notebooks/analysis.ipynb` for exploration

---

**Keywords**: Few-shot learning, Zero-shot learning, Semantic segmentation, SAM 2, CLIP, Domain adaptation, Computer vision