Shaoxiang Chen

Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models

Mar 12, 2024
Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang

LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs

Jan 30, 2024
Shaoxiang Chen, Zequn Jie, Lin Ma

Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning

Dec 13, 2023
Yang Jiao, Zequn Jie, Shaoxiang Chen, Lechao Cheng, Jingjing Chen, Lin Ma, Yu-Gang Jiang

Prompting Large Language Models to Reformulate Queries for Moment Localization

Jun 06, 2023
Wenfeng Yan, Shaoxiang Chen, Zuxuan Wu, Yu-Gang Jiang

MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection

Sep 07, 2022
Yang Jiao, Zequn Jie, Shaoxiang Chen, Jingjing Chen, Xiaolin Wei, Lin Ma, Yu-Gang Jiang

MT-Net Submission to the Waymo 3D Detection Leaderboard

Jul 11, 2022
Shaoxiang Chen, Zequn Jie, Xiaolin Wei, Lin Ma

MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes

Mar 10, 2022
Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang

Self-supervised Learning for Semi-supervised Temporal Language Grounding

Sep 23, 2021
Fan Luo, Shaoxiang Chen, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang

FT-TDR: Frequency-guided Transformer and Top-Down Refinement Network for Blind Face Inpainting

Aug 10, 2021
Junke Wang, Shaoxiang Chen, Zuxuan Wu, Yu-Gang Jiang

Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos

Jul 28, 2020
Shaoxiang Chen, Wenhao Jiang, Wei Liu, Yu-Gang Jiang
