Shuo Xing

SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems

Jun 09, 2025
Via arXiv

mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation

May 29, 2025
Via arXiv

Generative AI for Autonomous Driving: Frontiers and Opportunities

May 13, 2025
Via arXiv

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving

Mar 31, 2025
Via arXiv

Can Large Vision Language Models Read Maps Like a Human?

Mar 18, 2025
Via arXiv

DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning

Mar 14, 2025
Via arXiv

Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization

Feb 18, 2025
Via arXiv

OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving

Dec 19, 2024
Via arXiv

AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving

Dec 19, 2024
Via arXiv

Video Quality Assessment: A Comprehensive Survey

Dec 04, 2024
Via arXiv