Alexander Hauptmann

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

Apr 02, 2024
Ruohong Zhang, Liangke Gui, Zhiqing Sun, Yihao Feng, Keyang Xu, Yuanhan Zhang, Di Fu, Chunyuan Li, Alexander Hauptmann, Yonatan Bisk, Yiming Yang

Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin

Sep 18, 2023
Gabriel Moreira, Manuel Marques, João Paulo Costeira, Alexander Hauptmann

STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition

Mar 31, 2023
Xiaoyu Zhu, Po-Yao Huang, Junwei Liang, Celso M. de Melo, Alexander Hauptmann

GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement

Sep 01, 2022
Zhi-Qi Cheng, Qi Dai, Siyao Li, Teruko Mitamura, Alexander Hauptmann

Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models

Apr 15, 2021
Po-Yao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alexander Hauptmann

Spatial-Temporal Alignment Network for Action Recognition and Detection

Dec 04, 2020
Junwei Liang, Liangliang Cao, Xuehan Xiong, Ting Yu, Alexander Hauptmann

Event-Related Bias Removal for Real-time Disaster Events

Nov 02, 2020
Evangelia Spiliopoulou, Salvador Medina Maza, Eduard Hovy, Alexander Hauptmann

Support-set bottlenecks for video-text representation learning

Oct 06, 2020
Mandela Patrick, Po-Yao Huang, Yuki Asano, Florian Metze, Alexander Hauptmann, João Henriques, Andrea Vedaldi
