Boxi Cao

P^2O: Joint Policy and Prompt Optimization

Mar 23, 2026
Via arXiv

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

Mar 10, 2026
Via arXiv

Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning

Apr 01, 2025
Via arXiv

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

Nov 18, 2024
Via arXiv

Multi-Facet Counterfactual Learning for Content Quality Evaluation

Oct 10, 2024
Via arXiv

Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic

Aug 29, 2024
Via arXiv

StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation

Aug 07, 2024
Via arXiv

Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models

Jul 16, 2024
Via arXiv

Towards Scalable Automated Alignment of LLMs: A Survey

Jun 03, 2024
Via arXiv

Towards Universal Dense Blocking for Entity Resolution

Apr 25, 2024
Via arXiv