Search foundation: embeddings, rerankers, and small LMs for better search
Generate token-wise heatmaps for images
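
One common way to build a token-wise heatmap is to score every image patch against each text token and lay the scores out on the patch grid. The sketch below illustrates that idea only; it is not taken from this project. It assumes the token and patch embeddings already exist and share one vector space. The function name `token_heatmaps`, its parameters, and the per-token min-max scaling are all hypothetical choices made for the example.

```python
import math


def _normalize(vec):
    """Return vec scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def token_heatmaps(token_embs, patch_embs, grid_shape):
    """Build one heatmap per text token over a grid of image patches.

    Hypothetical helper, not the project's API.
    token_embs: list of T vectors, one per text token.
    patch_embs: list of rows * cols vectors, one per image patch, row-major.
    grid_shape: (rows, cols) of the patch grid.
    Returns a list of T grids, each rows x cols, with values in [0, 1].
    """
    rows, cols = grid_shape
    if len(patch_embs) != rows * cols:
        raise ValueError("grid_shape does not match the number of patches")

    patches = [_normalize(p) for p in patch_embs]
    heatmaps = []
    for tok in token_embs:
        unit_tok = _normalize(tok)
        # Cosine similarity between this token and every patch.
        sims = [sum(a * b for a, b in zip(unit_tok, p)) for p in patches]
        # Min-max scale per token so each map uses the full [0, 1] range.
        low, high = min(sims), max(sims)
        span = high - low
        scaled = [(s - low) / span if span > 0 else 0.0 for s in sims]
        # Reshape the flat patch scores back into the 2D grid.
        heatmaps.append(
            [scaled[r * cols:(r + 1) * cols] for r in range(rows)]
        )
    return heatmaps
```

Each returned grid has the resolution of the patch grid, so overlaying it on the image would still need upsampling to pixel size, which this sketch leaves out.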