Jianjian Sun

Focus Anywhere for Fine-grained Multi-page Document Understanding

May 23, 2024
Chenglong Liu, Haoran Wei, Jinyue Chen, Lingyu Kong, Zheng Ge, Zining Zhu, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Apr 15, 2024
Jinyue Chen, Lingyu Kong, Haoran Wei, Chenglong Liu, Zheng Ge, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang

Small Language Model Meets with Reinforced Vision Vocabulary

Jan 23, 2024
Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, En Yu, Jianjian Sun, Chunrui Han, Xiangyu Zhang

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Dec 11, 2023
Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang, Jianjian Sun, Chunrui Han, Xiangyu Zhang

DreamLLM: Synergistic Multimodal Comprehension and Creation

Sep 20, 2023
Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, Hongyu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

Jul 18, 2023
Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Haoran Wei, Hongyu Zhou, Jianjian Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang

The 1st-place Solution for CVPR 2023 OpenLane Topology in Autonomous Driving Challenge

Jun 16, 2023
Dongming Wu, Fan Jia, Jiahao Chang, Zhuoling Li, Jianjian Sun, Chunrui Han, Shuailin Li, Yingfei Liu, Zheng Ge, Tiancai Wang

BEVStereo++: Accurate Depth Estimation in Multi-view 3D Object Detection via Dynamic Temporal Stereo

Apr 09, 2023
Yinhao Li, Jinrong Yang, Jianjian Sun, Han Bao, Zheng Ge, Li Xiao

Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

Mar 13, 2023
Chunrui Han, Jianjian Sun, Zheng Ge, Jinrong Yang, Runpei Dong, Hongyu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang

Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Jan 03, 2023
Junjie Yan, Yingfei Liu, Jianjian Sun, Fan Jia, Shuailin Li, Tiancai Wang, Xiangyu Zhang
