Zhenting Qi

Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering

May 29, 2025
Via arXiv

Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision

May 21, 2025
Via arXiv

Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models

May 19, 2025
Via arXiv

ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Mar 31, 2025
Via arXiv

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Feb 04, 2025
Via arXiv

Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models

Dec 31, 2024
Via arXiv

Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge

Nov 05, 2024
Via arXiv

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains

Oct 11, 2024
Via arXiv

Quantifying Generalization Complexity for Large Language Models

Oct 02, 2024
Via arXiv

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Aug 12, 2024
Via arXiv