Bilge Yücel
bilgeyucel
18 followers · 6 following
bilgeyucl
bilgeyucel
bilge-yucel
AI & ML interests
NLP, Semantic Search, LLMs
Recent Activity
Reacted to anakin87's post with ❤️, 4 days ago:
Haystack can now see 👀

The latest release of the Haystack OSS LLM framework adds a long-requested feature: image support! 📓 Notebooks below

This isn't just about passing images to an LLM. We built several features to enable practical multimodal use cases.

What's new?
🧠 Support for multiple LLM providers: OpenAI, Amazon Bedrock, Google Gemini, Mistral, NVIDIA, OpenRouter, Ollama and more (support for Hugging Face API coming 🔜)
🎛️ Prompt template language to handle structured inputs, including images
📄 PDF and image converters
🔍 Image embedders using CLIP-like models
🧾 LLM-based extractor to pull text from images
🧩 Components to build multimodal RAG pipelines and Agents

I had the chance of leading this effort with @sjrhuschlee (great collab).

📓 Below you can find two notebooks to explore the new features:
• Introduction to Multimodal Text Generation https://haystack.deepset.ai/cookbook/multimodal_intro
• Creating Vision+Text RAG Pipelines https://haystack.deepset.ai/tutorials/46_multimodal_rag

(🖼️ image by @bilgeyucel)
Reacted to anakin87's post with 🔥, 4 days ago:
🕵️🌐 Building Browser Agents - notebook

No API? No problem. Browser Agents can use websites like you do: click, type, wait, read.

📓 Step-by-step notebook: https://colab.research.google.com/github/deepset-ai/haystack-cookbook/blob/main/notebooks/browser_agents.ipynb

🎥 In the video, the Agent:
- Goes to Hugging Face Spaces
- Finds https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
- Expands a short prompt ("my holiday on Lake Como") into a detailed image generation prompt
- Waits for the image
- Returns the image URL

## What else can it do?
Great for information gathering and summarization
🗞️🗞️ Compare news websites and create a table of shared stories with links
▶️ Find content creator social profiles from YouTube videos
🛍️ Find a product's price range on Amazon
🚂 🚌 Gather public transportation travel options

## How is it built?
🏗️ Haystack → Agent execution logic
🧠 Google Gemini 2.5 Flash → Good and fast LLM with a generous free tier
🛠️ Playwright MCP server → Browser automation tools: navigate, click, type, wait...

Even without vision capabilities, this setup can get quite far.

## Next steps
- Try a local open model
- Move from notebook to real deployment
- Incorporate vision

And you? Have you built something similar? What's in your stack?
Liked a model about 1 month ago: Trendyol/Trendyol-LLM-8B-T1
Organizations
Spaces (1)
Captionate 📸 • Running • 15
Generate Instagram captions from images
Models (0)
None public yet
Datasets (1)
bilgeyucel/seven-wonders • Updated Mar 9, 2023 • 151 • 1.47k • 5