"Text": models, code, and papers

Investigating the Catastrophic Forgetting in Multimodal Large Language Models

Sep 19, 2023
Yuexiang Zhai, Shengbang Tong, Xiao Li, Mu Cai, Qing Qu, Yong Jae Lee, Yi Ma

Via arXiv

Can Large Language Models Understand Real-World Complex Instructions?

Sep 17, 2023
Qianyu He, Jie Zeng, Wenhao Huang, Lina Chen, Jin Xiao, Qianxi He, Xunzhe Zhou, Lida Chen, Xintao Wang, Yuncheng Huang, Haoning Ye, Zihan Li, Shisong Chen, Yikai Zhang, Zhouhong Gu, Jiaqing Liang, Yanghua Xiao

Via arXiv

Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping

Sep 18, 2023
Adam Rashid, Satvik Sharma, Chung Min Kim, Justin Kerr, Lawrence Chen, Angjoo Kanazawa, Ken Goldberg

Via arXiv

BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing

Sep 02, 2023
Chen Wang, Minpeng Liao, Zhongqiang Huang, Jinliang Lu, Junhong Wu, Yuchen Liu, Chengqing Zong, Jiajun Zhang

Via arXiv

EnCodecMAE: Leveraging neural codecs for universal audio representation learning

Sep 14, 2023
Leonardo Pepino, Pablo Riera, Luciana Ferrer

Via arXiv

EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior

Aug 25, 2023
Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Zhipeng Hu, Changjie Fan, Xin Yu

Via arXiv

Generative AI

Sep 13, 2023
Stefan Feuerriegel, Jochen Hartmann, Christian Janiesch, Patrick Zschech

Via arXiv

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

Aug 23, 2023
Junyi Chen, Longteng Guo, Jia Sun, Shuai Shao, Zehuan Yuan, Liang Lin, Dongyu Zhang

Via arXiv

Audio Generation with Multiple Conditional Diffusion Model

Aug 23, 2023
Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang

Via arXiv

Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning

Jul 21, 2023
Jian Ma, Junhao Liang, Chen Chen, Haonan Lu

Via arXiv