Alexander Hauptmann

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

Jun 17, 2024

Learning Visual-Semantic Subspace Representations for Propositional Reasoning

May 25, 2024

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

Apr 02, 2024

Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin

Sep 18, 2023

STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition

Mar 31, 2023

GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement

Sep 01, 2022

Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models

Apr 15, 2021

Spatial-Temporal Alignment Network for Action Recognition and Detection

Dec 04, 2020

Event-Related Bias Removal for Real-time Disaster Events

Nov 02, 2020

Support-set bottlenecks for video-text representation learning

Oct 06, 2020