Huaimin Wang

Decoupling Constraint from Two Direction in Evolutionary Constrained Multi-objective Optimization

Dec 30, 2025

Nightjar: Dynamic Adaptive Speculative Decoding for Large Language Models Serving

Dec 27, 2025

Evolutionary training-free guidance in diffusion model for 3D multi-objective molecular generation

May 16, 2025

Pay More Attention to the Robustness of Prompt for Instruction Data Mining

Mar 31, 2025

NebulaFL: Effective Asynchronous Federated Learning for JointCloud Computing

Dec 06, 2024

AutoFeedback: An LLM-based Framework for Efficient and Accurate API Request Generation

Oct 09, 2024

Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models

Sep 14, 2024

Online Self-Preferring Language Models

May 23, 2024

Optimistic Model Rollouts for Pessimistic Offline Policy Optimization

Jan 11, 2024

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

Dec 30, 2023