Hongli Xu

Utility-Guided Agent Orchestration for Efficient LLM Tool Use

Mar 20, 2026

Xray-Visual Models: Scaling Vision models on Industry Scale Data

Feb 18, 2026

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

SABlock: Semantic-Aware KV Cache Eviction with Adaptive Compression Block Size

Oct 26, 2025

Enabling Reconfiguration-Communication Overlap for Collective Communication in Optical Networks

Oct 22, 2025

Accelerating Mixture-of-Expert Inference with Adaptive Expert Split Mechanism

Sep 10, 2025

Mitigating Catastrophic Forgetting with Adaptive Transformer Block Expansion in Federated Fine-Tuning

Jun 06, 2025

PRISM: Probabilistic Representation for Integrated Shape Modeling and Generation

Apr 06, 2025

Resource-Efficient Federated Fine-Tuning Large Language Models for Heterogeneous Data

Mar 27, 2025

Efficient Federated Fine-Tuning of Large Language Models with Layer Dropout

Mar 13, 2025