Hongsheng Li

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Nov 16, 2024
Via arXiv

ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving

Nov 08, 2024
Via arXiv

A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding

Nov 04, 2024
Via arXiv

BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events

Oct 27, 2024
Via arXiv

Stable Consistency Tuning: Understanding and Improving Consistency Models

Oct 24, 2024
Via arXiv

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Oct 17, 2024
Via arXiv

SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction

Oct 11, 2024
Via arXiv

A Foundation Model for Generalizable Disease Diagnosis in Chest X-ray Images

Oct 11, 2024
Via arXiv

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Oct 10, 2024
Via arXiv

CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection

Oct 10, 2024
Via arXiv