"Information": models, code, and papers

TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages

Feb 25, 2024
Minsu Kim, Jee-weon Jung, Hyeongseop Rha, Soumi Maiti, Siddhant Arora, Xuankai Chang, Shinji Watanabe, Yong Man Ro

Via arXiv

LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding

Feb 25, 2024
Yuxuan Wang, Yueqian Wang, Pengfei Wu, Jianxin Liang, Dongyan Zhao, Zilong Zheng

Via arXiv

Reward Design for Justifiable Sequential Decision-Making

Feb 24, 2024
Aleksa Sukovic, Goran Radanovic

Via arXiv

MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning

Feb 21, 2024
Wanqing Cui, Keping Bi, Jiafeng Guo, Xueqi Cheng

Via arXiv

A Multi-Fidelity Methodology for Reduced Order Models with High-Dimensional Inputs

Feb 26, 2024
Bilal Mufti, Christian Perron, Dimitri N. Mavris

Via arXiv

Set the Clock: Temporal Alignment of Pretrained Language Models

Feb 26, 2024
Bowen Zhao, Zander Brumbaugh, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith

Via arXiv

Follow the Footprints: Self-supervised Traversability Estimation for Off-road Vehicle Navigation based on Geometric and Visual Cues

Feb 23, 2024
Yurim Jeon, E In Son, Seung-Woo Seo

Via arXiv

Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation

Feb 21, 2024
Kihong Kim, Haneol Lee, Jihye Park, Seyeon Kim, Kwanghee Lee, Seungryong Kim, Jaejun Yoo

Via arXiv

Hierarchical Bayes Approach to Personalized Federated Unsupervised Learning

Feb 25, 2024
Kaan Ozkara, Bruce Huang, Ruida Zhou, Suhas Diggavi

Via arXiv

Plausible Extractive Rationalization through Semi-Supervised Entailment Signal

Feb 25, 2024
Wei Jie Yeo, Ranjan Satapathy, Erik Cambria

Via arXiv