
Bingxuan Li

MagiC: Evaluating Multimodal Cognition Toward Grounded Visual Reasoning

Jul 09, 2025
Via arXiv

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Jun 18, 2025
Via arXiv

Nano-3D: Metasurface-Based Neural Depth Imaging

Mar 20, 2025
Via arXiv

Contrastive Visual Data Augmentation

Feb 24, 2025
Via arXiv

METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling

Feb 24, 2025
Via arXiv

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Dec 03, 2024
Via arXiv

st-DTPM: Spatial-Temporal Guided Diffusion Transformer Probabilistic Model for Delayed Scan PET Image Prediction

Oct 30, 2024
Via arXiv

Control Large Language Models via Divide and Conquer

Oct 06, 2024
Via arXiv

Latent Feature Mining for Predictive Model Enhancement with Large Language Models

Oct 06, 2024
Via arXiv

REFFLY: Melody-Constrained Lyrics Editing Model

Aug 30, 2024
Via arXiv