Xiaowen Chu

Efficient MoE Inference with Fine-Grained Scheduling of Disaggregated Expert Parallelism

Dec 25, 2025

Venus: An Efficient Edge Memory-and-Retrieval System for VLM-based Online Video Understanding

Dec 08, 2025

Towards Universal Video Retrieval: Generalizing Video Embedding via Synthesized Multimodal Pyramid Curriculum

Oct 31, 2025

SGMAGNet: A Baseline Model for 3D Cloud Phase Structure Reconstruction on a New Passive Active Satellite Benchmark

Sep 19, 2025

AnTKV: Anchor Token-Aware Sub-Bit Vector Quantization for KV Cache in Large Language Models

Jun 24, 2025

RA-NeRF: Robust Neural Radiance Field Reconstruction with Accurate Camera Pose Estimation under Complex Trajectories

Jun 18, 2025

Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression

May 26, 2025

FlowKV: Enhancing Multi-Turn Conversational Coherence in LLMs via Isolated Key-Value Cache Management

May 21, 2025

Jupiter: Fast and Resource-Efficient Collaborative Inference of Generative LLMs on Edge Devices

Apr 11, 2025

MRD-RAG: Enhancing Medical Diagnosis with Multi-Round Retrieval-Augmented Generation

Apr 10, 2025