Ling Shao

Terminus Group, Beijing, China

Enhanced Multi-Scale Cross-Attention for Person Image Generation

Jan 15, 2025

Multimodal 3D Reasoning Segmentation with Complex Scenes

Nov 21, 2024

Novel View Extrapolation with Video Diffusion Priors

Nov 21, 2024

AllRestorer: All-in-One Transformer for Image Restoration under Composite Degradations

Nov 16, 2024

Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey

Nov 05, 2024

GWQ: Gradient-Aware Weight Quantization for Large Language Models

Oct 30, 2024

Historical Test-time Prompt Tuning for Vision Foundation Models

Oct 27, 2024

LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models

Oct 15, 2024

MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders

May 13, 2024

StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting

Mar 12, 2024