Yu Yang

Celine

LLM-Guided Reinforcement Learning: Addressing Training Bottlenecks through Policy Modulation

May 27, 2025
Via arXiv

O$^2$-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering

May 22, 2025
Via arXiv

Predicting Student Dropout Risk With A Dual-Modal Abrupt Behavioral Changes Approach

May 16, 2025
Via arXiv

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Apr 17, 2025
Via arXiv

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Mar 20, 2025
Via arXiv

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Feb 07, 2025
Via arXiv

Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency

Feb 07, 2025
Via arXiv

Combinatorial Optimization Perspective based Framework for Multi-behavior Recommendation

Feb 04, 2025
Via arXiv

Data-Efficient Model for Psychological Resilience Prediction based on Neurological Data

Feb 03, 2025
Via arXiv

Pre-train and Fine-tune: Recommenders as Large Models

Jan 24, 2025
Via arXiv