Junzhuo Li

Semantic Consistency Policy Optimization for Reinforcement Learning of LLM Agents

Jun 24, 2026

Unveiling Language Routing Isolation in Multilingual MoE Models for Interpretable Subnetwork Adaptation

Apr 04, 2026

Optimal Expert-Attention Allocation in Mixture-of-Experts: A Scalable Law for Dynamic Model Design

Mar 11, 2026

Deconstructing Pre-training: Knowledge Attribution Analysis in MoE and Dense Models

Jan 13, 2026

Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis

May 30, 2025

Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities

May 27, 2025

LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning

May 24, 2025

Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs

May 20, 2025

Capturing Nuanced Preferences: Preference-Aligned Distillation for Small Language Models

Feb 20, 2025

The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis

Apr 01, 2024