Frames Dataset


VividListener: Expressive and Controllable Listener Dynamics Modeling for Multi-Modal Responsive Interaction

Apr 30, 2025

Recursive KL Divergence Optimization: A Dynamic Framework for Representation Learning

Apr 30, 2025

TesserAct: Learning 4D Embodied World Models

Apr 29, 2025

UNet with Axial Transformer : A Neural Weather Model for Precipitation Nowcasting

Apr 28, 2025

Full-field surrogate modeling of cardiac function encoding geometric variability

Apr 29, 2025

Large-scale visual SLAM for in-the-wild videos

Apr 29, 2025

Exploiting Inter-Sample Correlation and Intra-Sample Redundancy for Partially Relevant Video Retrieval

Apr 28, 2025

CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition

Apr 28, 2025

MASR: Self-Reflective Reasoning through Multimodal Hierarchical Attention Focusing for Agent-based Video Understanding

Apr 28, 2025

Enhancing Surgical Documentation through Multimodal Visual-Temporal Transformers and Generative AI

Apr 28, 2025