Kanchana Ranasinghe

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Jun 28, 2024
Via arXiv

Too Many Frames, not all Useful: Efficient Strategies for Long-Form Video QA

Jun 17, 2024
Via arXiv

Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs

Apr 11, 2024
Via arXiv

Understanding Long Videos in One Multimodal Language Model Pass

Mar 25, 2024
Via arXiv

Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning

Mar 21, 2024
Via arXiv

Language Repository for Long Video Understanding

Mar 21, 2024
Via arXiv

Diffusion Illusions: Hiding Images in Plain Sight

Dec 06, 2023
Via arXiv

Language-based Action Concept Spaces Improve Video Self-Supervised Learning

Jul 20, 2023
Via arXiv

Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors

Nov 23, 2022
Via arXiv

Perceptual Grouping in Vision-Language Models

Oct 18, 2022
Via arXiv