Lorenzo Torresani

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Apr 17, 2025

Learning Activity View-invariance Under Extreme Viewpoint Changes via Curriculum Knowledge Distillation

Apr 07, 2025

VITED: Video Temporal Evidence Distillation

Mar 17, 2025

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Mar 13, 2025

TimeRefine: Temporal Grounding with Time Refining Video LLM

Dec 12, 2024

Semantic Compositions Enhance Vision-Language Contrastive Learning

Jul 01, 2024

Step Differences in Instructional Video

Apr 24, 2024

Video ReCap: Recursive Captioning of Hour-Long Videos

Feb 28, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023

Multiscale Video Pretraining for Long-Term Activity Forecasting

Jul 24, 2023