Yuxi Chen

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

Jun 01, 2026
Via arXiv

TOC-Bench: A Temporal Object Consistency Benchmark for Video Large Language Models

May 11, 2026
Via arXiv

CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training

Mar 23, 2026
Via arXiv

Multi-Integration of Labels across Categories for Component Identification (MILCCI)

Feb 04, 2026
Via arXiv

Whole-Body Proprioceptive Morphing: A Modular Soft Gripper for Robust Cross-Scale Grasping

Oct 31, 2025
Via arXiv

Top-$k$ Feature Importance Ranking

Sep 18, 2025
Via arXiv

Knowing or Guessing? Robust Medical Visual Question Answering via Joint Consistency and Contrastive Learning

Aug 26, 2025
Via arXiv

Practical Poisoning Attacks against Retrieval-Augmented Generation

Apr 04, 2025
Via arXiv

MT-PCR: Leveraging Modality Transformation for Large-Scale Point Cloud Registration with Limited Overlap

Mar 17, 2025
Via arXiv

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Jan 29, 2024
Via arXiv