File size: 6,018 Bytes
b4740c6
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
Metadata-Version: 2.4
Name: multi-model-orchestrator
Version: 1.0.0
Summary: A sophisticated multi-model orchestration system for parent-child LLM relationships
Home-page: https://github.com/kunaliitkgp09/multi-model-orchestrator
Author: Kunal Dhanda
Author-email: [email protected]
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=1.12.0
Requires-Dist: transformers>=4.21.0
Requires-Dist: diffusers>=0.14.0
Requires-Dist: Pillow>=8.0.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: accelerate>=0.12.0
Requires-Dist: safetensors>=0.3.0
Requires-Dist: xformers>=0.0.16
Requires-Dist: ftfy>=6.0.0
Requires-Dist: regex>=2021.8.3
Requires-Dist: tqdm>=4.64.0
Requires-Dist: requests>=2.25.0
Requires-Dist: huggingface-hub>=0.10.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: black>=21.0; extra == "dev"
Requires-Dist: flake8>=3.8; extra == "dev"
Requires-Dist: mypy>=0.800; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

---
language:
- en
license: mit
library_name: multi-model-orchestrator
tags:
- ai
- machine-learning
- multimodal
- image-captioning
- text-to-image
- orchestration
- transformers
- pytorch
---

# Multi-Model Orchestrator

A sophisticated multi-model orchestration system that manages parent-child LLM relationships, specifically integrating CLIP-GPT2 image captioner and Flickr30k text-to-image models.

## 🚀 Features

### **Parent Orchestrator**
- **Intelligent Task Routing**: Automatically routes tasks to appropriate child models
- **Model Management**: Handles loading, caching, and lifecycle of child models
- **Error Handling**: Robust error handling and recovery mechanisms
- **Task History**: Comprehensive logging and monitoring of all operations
- **Async Support**: Both synchronous and asynchronous processing modes

### **Child Models**
- **CLIP-GPT2 Image Captioner**: Converts images to descriptive text captions
- **Flickr30k Text-to-Image**: Generates images from text descriptions
- **Extensible Architecture**: Easy to add new child models

## 📦 Installation

```bash
pip install git+https://huggingface.co/kunaliitkgp09/multi-model-orchestrator
```

## 🎯 Quick Start

```python
from multi_model_orchestrator import SimpleMultiModelOrchestrator

# Initialize orchestrator
orchestrator = SimpleMultiModelOrchestrator()
orchestrator.initialize_models()

# Generate caption from image
caption = orchestrator.generate_caption("sample_image.jpg")
print(f"Caption: {caption}")

# Generate image from text
image_path = orchestrator.generate_image("A beautiful sunset over mountains")
print(f"Generated image: {image_path}")
```

## 🔗 Model Integration

### **Child Model 1: CLIP-GPT2 Image Captioner**
- **Model**: `kunaliitkgp09/clip-gpt2-image-captioner`
- **Task**: Image-to-text captioning
- **Performance**: ~40% accuracy on test samples

### **Child Model 2: Flickr30k Text-to-Image**
- **Model**: `kunaliitkgp09/flickr30k-text-to-image`
- **Task**: Text-to-image generation
- **Performance**: Fine-tuned on Flickr30k dataset

## 📊 Usage Examples

### **Multimodal Processing**
```python
# Process both image and text together
results = orchestrator.process_multimodal_task(
    image_path="sample_image.jpg",
    text_prompt="A serene landscape with mountains"
)

print("Caption:", results["caption"])
print("Generated Image:", results["generated_image"])
```

### **Async Processing**
```python
from multi_model_orchestrator import AsyncMultiModelOrchestrator
import asyncio

async def async_example():
    orchestrator = AsyncMultiModelOrchestrator()
    orchestrator.initialize_models()
    
    results = await orchestrator.process_multimodal_async(
        image_path="sample_image.jpg",
        text_prompt="A futuristic cityscape"
    )
    return results

asyncio.run(async_example())
```

## 🎯 Use Cases

- **Content Creation**: Generate captions and images for social media
- **Research and Development**: Model performance comparison and prototyping
- **Production Systems**: Automated content generation pipelines
- **Educational Applications**: AI model demonstration and learning

## 📈 Performance Metrics

- **Processing Time**: Optimized for real-time applications
- **Memory Usage**: Efficient GPU/CPU memory management
- **Success Rate**: Robust error handling and recovery
- **Extensibility**: Easy integration of new child models

## 🤝 Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for:
- New child model integrations
- Performance improvements
- Bug fixes
- Documentation enhancements

## 📄 License

This project is licensed under the MIT License.

## 🙏 Acknowledgments

- **CLIP-GPT2 Model**: [kunaliitkgp09/clip-gpt2-image-captioner](https://huggingface.co/kunaliitkgp09/clip-gpt2-image-captioner)
- **Stable Diffusion Model**: [kunaliitkgp09/flickr30k-text-to-image](https://huggingface.co/kunaliitkgp09/flickr30k-text-to-image)
- **Hugging Face**: For providing the model hosting platform
- **PyTorch**: For the deep learning framework
- **Transformers**: For the model loading and processing utilities

---

**Happy Orchestrating! 🚀**