Kanchana Ranasinghe

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Jun 28, 2024

Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA

Jun 17, 2024

Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs

Apr 11, 2024

Understanding Long Videos in One Multimodal Language Model Pass

Mar 25, 2024

Language Repository for Long Video Understanding

Mar 21, 2024

Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning

Mar 21, 2024

Diffusion Illusions: Hiding Images in Plain Sight

Dec 06, 2023

Language-based Action Concept Spaces Improve Video Self-Supervised Learning

Jul 20, 2023

Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors

Nov 23, 2022

Perceptual Grouping in Vision-Language Models

Oct 18, 2022