Zheng Wu

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

Mar 31, 2025

CardiacMamba: A Multimodal RGB-RF Fusion Framework with State Space Models for Remote Physiological Measurement

Feb 19, 2025

Physics-Aware Robotic Palletization with Online Masking Inference

Feb 19, 2025

Flaming-hot Initiation with Regular Execution Sampling for Large Language Models

Oct 28, 2024

Process Supervision-Guided Policy Optimization for Code Generation

Oct 23, 2024

Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization

Oct 11, 2024

Efficient Reinforcement Learning of Task Planners for Robotic Palletization through Iterative Action Masking Learning

Apr 07, 2024

DBPF: A Framework for Efficient and Robust Dynamic Bin-Picking

Mar 25, 2024

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023

Efficient Sim-to-real Transfer of Contact-Rich Manipulation Skills with Online Admittance Residual Learning

Oct 16, 2023