Xiaoying Zhang

Building Task Bots with Self-learning for Enhanced Adaptability, Extensibility, and Factuality

Aug 27, 2025

Seed1.5-VL Technical Report

May 11, 2025

Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs

Mar 16, 2025

Conversational Dueling Bandits in Generalized Linear Models

Jul 26, 2024

User-Creator Feature Dynamics in Recommender Systems with Dual Influence

Jul 19, 2024

Toward Optimal LLM Alignments Using Two-Player Games

Jun 16, 2024

Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching

Jun 11, 2024

GI-Free Pilot-Aided Channel Estimation for Affine Frequency Division Multiplexing Systems

Apr 01, 2024

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

Mar 14, 2024

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

Mar 08, 2024