Andrea Madotto

Shammie

Proactive Assistant Dialogue Generation from Streaming Egocentric Videos

Jun 06, 2025

Perception Encoder: The best visual embeddings are not at the output of the network

Apr 17, 2025

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Apr 17, 2025

SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM

Mar 07, 2024

Fine-Tuned Language Models Generate Stable Inorganic Materials as Text

Feb 06, 2024

AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

Sep 27, 2023

Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts

Jul 06, 2023

Continual Dialogue State Tracking via Example-Guided Question Answering

May 23, 2023

IMU2CLIP: Multimodal Contrastive Learning for IMU Motion Sensors from Egocentric Videos and Text

Oct 26, 2022

Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values

Oct 14, 2022