Jiacheng Chen

Sherman

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Sep 10, 2025
Via arXiv

InternBootcamp Technical Report: Boosting LLM Reasoning with Verifiable Task Scaling

Aug 12, 2025
Via arXiv

PairEdit: Learning Semantic Variations for Exemplar-based Image Editing

Jun 09, 2025
Via arXiv

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Jun 04, 2025
Via arXiv

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

May 28, 2025
Via arXiv

Safety2Drive: Safety-Critical Scenario Benchmark for the Evaluation of Autonomous Driving

May 20, 2025
Via arXiv

Feedback-Free Resource Scheduling: Towards Flexible Multi-BS Cooperation in FD-RAN

Feb 25, 2025
Via arXiv

ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning

Dec 10, 2024
Via arXiv

Reciprocal Point Learning Network with Large Electromagnetic Kernel for SAR Open-Set Recognition

Nov 07, 2024
Via arXiv

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Oct 14, 2024
Via arXiv