Shehan Munasinghe

shehan97

https://shehanmunasinghe.github.io/

AI & ML interests

Computer Vision, Multi-modal learning

Recent Activity

authored a paper about 1 month ago

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

upvoted a paper 6 months ago

Sekai: A Video Dataset towards World Exploration

upvoted a paper 7 months ago

CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

authored a paper about 1 month ago

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Paper • 2411.04923 • Published Nov 7, 2024 • 23

authored a paper about 2 years ago

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

Paper • 2311.13435 • Published Nov 22, 2023 • 18