Ziniu Li

Quality-Diversity Red-Teaming: Automated Generation of High-Quality and Diverse Attackers for Large Language Models

Jun 08, 2025
Via arXiv

Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO

May 16, 2025
Via arXiv

Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment

May 07, 2025
Via arXiv

Controlling Large Language Model with Latent Actions

Mar 27, 2025
Via arXiv

RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques

Jan 24, 2025
Via arXiv

Enabling Scalable Oversight via Self-Evolving Critic

Jan 10, 2025
Via arXiv

Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity

Aug 29, 2024
Via arXiv

Adam-mini: Use Fewer Learning Rates To Gain More

Jun 26, 2024
Via arXiv

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation

May 27, 2024
Via arXiv

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization

May 26, 2024
Via arXiv