Jinrong Yang

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Dec 11, 2023
Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang, Jianjian Sun, Chunrui Han, Xiangyu Zhang

Merlin: Empowering Multimodal LLMs with Foresight Minds

Nov 30, 2023
En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao

DreamLLM: Synergistic Multimodal Comprehension and Creation

Sep 20, 2023
Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, Hongyu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

Jul 18, 2023
Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Haoran Wei, Hongyu Zhou, Jianjian Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang

GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping

Jul 18, 2023
Zhuoling Li, Chunrui Han, Zheng Ge, Jinrong Yang, En Yu, Haoqian Wang, Hengshuang Zhao, Xiangyu Zhang

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection

Jun 30, 2023
Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, Hongyu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie

BEVStereo++: Accurate Depth Estimation in Multi-view 3D Object Detection via Dynamic Temporal Stereo

Apr 09, 2023
Yinhao Li, Jinrong Yang, Jianjian Sun, Han Bao, Zheng Ge, Li Xiao

Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

Mar 13, 2023
Chunrui Han, Jianjian Sun, Zheng Ge, Jinrong Yang, Runpei Dong, Hongyu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang

Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Representation

Dec 03, 2022
En Yu, Songtao Liu, Zhuoling Li, Jinrong Yang, Zeming Li, Shoudong Han, Wenbing Tao

Towards 3D Object Detection with 2D Supervision

Nov 15, 2022
Jinrong Yang, Tiancai Wang, Zheng Ge, Weixin Mao, Xiaoping Li, Xiangyu Zhang
