Gordon Wetzstein

Towards Vision-Language-Garment Models For Web Knowledge Garment Understanding and Generation

Jun 05, 2025
Via arXiv

Video World Models with Long-term Spatial Memory

Jun 05, 2025
Via arXiv

Multiscale guidance of AlphaFold3 with heterogeneous cryo-EM data

Jun 04, 2025
Via arXiv

Long-Context State-Space Video World Models

May 26, 2025
Via arXiv

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

May 23, 2025
Via arXiv

Dual Ascent Diffusion for Inverse Problems

May 23, 2025
Via arXiv

Interspatial Attention for Efficient 4D Human Video Generation

May 21, 2025
Via arXiv

R-Bench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation

May 04, 2025
Via arXiv

Neural Ganglion Sensors: Learning Task-specific Event Cameras Inspired by the Neural Circuit of the Human Retina

Apr 18, 2025
Via arXiv

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Apr 14, 2025
Via arXiv