Bin Sun

Member, IEEE

SGDrive: Scene-to-Goal Hierarchical World Cognition for Autonomous Driving

Jan 12, 2026, via arXiv

LatentVLA: Efficient Vision-Language Models for Autonomous Driving via Latent Action Prediction

Jan 09, 2026, via arXiv

Multimodal Prompt Alignment for Facial Expression Recognition

Jun 26, 2025, via arXiv

FocalAD: Local Motion Planning for End-to-End Autonomous Driving

Jun 13, 2025, via arXiv

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

May 22, 2025, via arXiv

Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance

May 18, 2025, via arXiv

Do Multimodal Language Models Really Understand Direction? A Benchmark for Compass Direction Reasoning

Dec 21, 2024, via arXiv

DrVideo: Document Retrieval Based Long Video Understanding

Jun 18, 2024, via arXiv

Dynamic Stochastic Decoding Strategy for Open-Domain Dialogue Generation

Jun 12, 2024, via arXiv

GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering

Feb 04, 2024, via arXiv