Dongxu Li

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

Jan 03, 2024
David Junhao Zhang, Dongxu Li, Hung Le, Mike Zheng Shou, Caiming Xiong, Doyen Sahoo

Fundamental Limitation of Semantic Communications: Neural Estimation for Rate-Distortion

Jan 02, 2024
Dongxu Li, Jianhao Huang, Chuan Huang, Xiaoqi Qin, Han Zhang, Ping Zhang

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning

Nov 30, 2023
Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles

Linearized Relative Positional Encoding

Jul 18, 2023
Zhen Qin, Weixuan Sun, Kaiyue Lu, Hui Deng, Dongxu Li, Xiaodong Han, Yuchao Dai, Lingpeng Kong, Yiran Zhong

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

May 24, 2023
Dongxu Li, Junnan Li, Steven C. H. Hoi

InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

May 11, 2023
Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi

Toeplitz Neural Network for Sequence Modeling

May 08, 2023
Zhen Qin, Xiaodong Han, Weixuan Sun, Bowen He, Dong Li, Dongxu Li, Yuchao Dai, Lingpeng Kong, Yiran Zhong

Joint Task and Data Oriented Semantic Communications: A Deep Separate Source-channel Coding Scheme

Feb 27, 2023
Jianhao Huang, Dongxu Li, Chuan Huang, Xiaoqi Qin, Wei Zhang

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Jan 30, 2023
Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi

From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models

Dec 21, 2022
Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, Dacheng Tao, Steven C. H. Hoi
