Yang Li

Shanghai Center for Systems Biomedicine, Key Laboratory of Systems Biomedicine

Multimodal Classification Network Guided Trajectory Planning for Four-Wheel Independent Steering Autonomous Parking Considering Obstacle Attributes

Dec 21, 2025

PhysFire-WM: A Physics-Informed World Model for Emulating Fire Spread Dynamics

Dec 19, 2025

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

Dec 19, 2025

Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

Dec 18, 2025

In Pursuit of Pixel Supervision for Visual Pre-training

Dec 17, 2025

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

Dec 16, 2025

HiFi-Portrait: Zero-shot Identity-preserved Portrait Generation with High-fidelity Multi-face Fusion

Dec 16, 2025

Unifying Dynamic Tool Creation and Cross-Task Experience Sharing through Cognitive Memory Architecture

Dec 12, 2025

MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation

Dec 08, 2025

BBox DocVQA: A Large Scale Bounding Box Grounded Dataset for Enhancing Reasoning in Document Visual Question Answer

Nov 19, 2025