Xin Jin

MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism

Apr 03, 2025
Via arXiv

DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution

Mar 30, 2025
Via arXiv

Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning

Mar 18, 2025
Via arXiv

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Mar 15, 2025
Via arXiv

Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning

Mar 11, 2025
Via arXiv

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning

Mar 08, 2025
Via arXiv

Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices

Mar 08, 2025
Via arXiv

Unified Arbitrary-Time Video Frame Interpolation and Prediction

Mar 04, 2025
Via arXiv

Exploring Simple Siamese Network for High-Resolution Video Quality Assessment

Mar 04, 2025
Via arXiv

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025
Via arXiv