Jing Xiong

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Jan 20, 2026
Via arXiv

MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

Jan 18, 2026
Via arXiv

MMFormalizer: Multimodal Autoformalization in the Wild

Jan 06, 2026
Via arXiv

DoPE: Denoising Rotary Position Embedding

Nov 12, 2025
Via arXiv

A1: Asynchronous Test-Time Scaling via Conformal Prediction

Sep 18, 2025
Via arXiv

LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction

Sep 09, 2025
Via arXiv

SAS: Simulated Attention Score

Jul 10, 2025
Via arXiv

Mathesis: Towards Formal Theorem Proving from Natural Languages

Jun 08, 2025
Via arXiv

From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes

Jun 05, 2025
Via arXiv

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving

May 29, 2025
Via arXiv