Hongkun Yu

Department of Biomedical Engineering, University of Wisconsin-Madison, Madison, WI, USA

Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning

Oct 27, 2025
Via arXiv

3D Nephrographic Image Synthesis in CT Urography with the Diffusion Model and Swin Transformer

Feb 26, 2025
Via arXiv

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning

Jul 22, 2024
Via arXiv

ResNCT: A Deep Learning Model for the Synthesis of Nephrographic Phase Images in CT Urography

May 07, 2024
Via arXiv

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

Multitask Multilingual Model Adaptation with Featurized Low-Rank Mixtures

Feb 27, 2024
Via arXiv

Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision

Feb 05, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Enable Language Models to Implicitly Learn Self-Improvement From Data

Oct 05, 2023
Via arXiv

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts

May 24, 2023
Via arXiv