Jianing Qi

Policy Gradient Guidance Enables Test Time Control

Oct 02, 2025
Via arXiv

Learning to Reason Across Parallel Samples for LLM Reasoning

Jun 10, 2025
Via arXiv

Beyond Semantics: Rediscovering Spatial Awareness in Vision-Language Models

Mar 21, 2025
Via arXiv

VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers

Oct 10, 2024
Via arXiv