Lijun Zhang

Michigan State University

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

Jan 13, 2026

Constrained Language Model Policy Optimization via Risk-aware Stepwise Alignment

Dec 30, 2025

Deep But Reliable: Advancing Multi-turn Reasoning for Thinking with Images

Dec 19, 2025

Parameter-Free Clustering via Self-Supervised Consensus Maximization (Extended Version)

Nov 13, 2025

BadThink: Triggered Overthinking Attacks on Chain-of-Thought Reasoning in Large Language Models

Nov 13, 2025

Goal-Guided Efficient Exploration via Large Language Model in Reinforcement Learning

Sep 26, 2025

Convergence Analysis of the Lion Optimizer in Centralized and Distributed Settings

Aug 17, 2025

Topology Enhanced MARL for Multi-Vehicle Cooperative Decision-Making of CAVs

Jul 16, 2025

Improved Analysis for Sign-based Methods with Momentum Updates

Jul 16, 2025

Risk-aware Direct Preference Optimization under Nested Risk Measure

May 29, 2025