Yanxu Zhu

Not Just the Destination, But the Journey: Reasoning Traces Causally Shape Generalization Behaviors

Mar 12, 2026

Reasoning Shapes Alignment: Investigating Cultural Alignment in Large Reasoning Models with Cultural Norms

Nov 17, 2025

KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions

Jul 08, 2024

Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning

Feb 01, 2024

CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models

Nov 28, 2023