"Information": models, code, and papers

A Multimodal Approach to Device-Directed Speech Detection with Large Language Models

Mar 26, 2024
Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

Mar 29, 2024
Runhao Zeng, Xiaoyong Chen, Jiaming Liang, Huisi Wu, Guangzhong Cao, Yong Guo

Dia-LLaMA: Towards Large Language Model-driven CT Report Generation

Mar 25, 2024
Zhixuan Chen, Luyang Luo, Yequan Bie, Hao Chen

Fisher-Rao Gradient Flows of Linear Programs and State-Action Natural Policy Gradients

Mar 28, 2024
Johannes Müller, Semih Çaycı, Guido Montúfar

Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond

Mar 28, 2024
Katherine Xu, Lingzhi Zhang, Jianbo Shi

Semantic Map-based Generation of Navigation Instructions

Mar 28, 2024
Chengzu Li, Chao Zhang, Simone Teufel, Rama Sanand Doddipatla, Svetlana Stoyanchev

Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction

Mar 28, 2024
Chenming Tang, Fanyi Qu, Yunfang Wu

Hierarchical Deep Learning for Intention Estimation of Teleoperation Manipulation in Assembly Tasks

Mar 28, 2024
Mingyu Cai, Karankumar Patel, Soshi Iba, Songpo Li

Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

Mar 27, 2024
Tianxin Wei, Bowen Jin, Ruirui Li, Hansi Zeng, Zhengyang Wang, Jianhui Sun, Qingyu Yin, Hanqing Lu, Suhang Wang, Jingrui He, Xianfeng Tang

TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes

Mar 27, 2024
Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Yongqiang Mao, Hanbo Bi, Chenglong Liu, Xian Sun, Kun Fu
