Rynson W. H. Lau

Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding

Sep 18, 2025
Via arXiv

StyleSculptor: Zero-Shot Style-Controllable 3D Asset Generation with Texture-Geometry Dual Guidance

Sep 16, 2025
Via arXiv

HOComp: Interaction-Aware Human-Object Composition

Jul 22, 2025
Via arXiv

Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation

Jun 04, 2025
Via arXiv

Hierarchical Cross-Modal Alignment for Open-Vocabulary 3D Object Detection

Mar 10, 2025
Via arXiv

Do Multimodal Large Language Models See Like Humans?

Dec 12, 2024
Via arXiv

Revisiting the Integration of Convolution and Attention for Vision Backbone

Nov 21, 2024
Via arXiv

LuSh-NeRF: Lighting up and Sharpening NeRFs for Low-light Scenes

Nov 11, 2024
Via arXiv

Boosting Weakly-Supervised Referring Image Segmentation via Progressive Comprehension

Oct 02, 2024
Via arXiv

Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Sep 17, 2024
Via arXiv