Qianlong Xie

Reassessing the Role of Supervised Fine-Tuning: An Empirical Study in VLM Reasoning

Dec 14, 2025

AI-Salesman: Towards Reliable Large Language Model Driven Telemarketing

Nov 15, 2025

Off-Policy Primal-Dual Safe Reinforcement Learning

Jan 26, 2024

HiBid: A Cross-Channel Constrained Bidding System with Budget Allocation by Hierarchical Offline Deep Reinforcement Learning

Dec 29, 2023

RL-MPCA: A Reinforcement Learning Based Multi-Phase Computation Allocation Approach for Recommender Systems

Dec 27, 2023

Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Jun 01, 2023