Wendi Li

LLM-based Human-like Traffic Simulation for Self-driving Tests

Aug 23, 2025
Via arXiv

Process Reinforcement through Implicit Rewards

Feb 03, 2025
Via arXiv

Free Process Rewards without Process Labels

Dec 02, 2024
Via arXiv

FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks

Oct 28, 2024
Via arXiv

Process Reward Model with Q-Value Rankings

Oct 15, 2024
Via arXiv

Position Debiasing Fine-Tuning for Causal Perception in Long-Term Dialogue

Jun 04, 2024
Via arXiv

Reinforcement Learning with Token-level Feedback for Controllable Text Generation

Mar 18, 2024
Via arXiv

TREA: Tree-Structure Reasoning Schema for Conversational Recommendation

Jul 20, 2023
Via arXiv

Towards Hierarchical Policy Learning for Conversational Recommendation with Hypergraph-based Reinforcement Learning

May 04, 2023
Via arXiv

DDG-DA: Data Distribution Generation for Predictable Concept Drift Adaptation

Jan 11, 2022
Via arXiv