Sanghyuk Chun

Toward Interactive Regional Understanding in Vision-Large Language Models

Mar 27, 2024
Via arXiv

Language-only Efficient Training of Zero-shot Composed Image Retrieval

Dec 04, 2023
Via arXiv

Longer-range Contextualized Masked Autoencoder

Oct 20, 2023
Via arXiv

Improved Probabilistic Image-Text Representations

May 29, 2023
Via arXiv

RoCOCO: Robust Benchmark MS-COCO to Stress-test Robustness of Image-Text Matching Models

Apr 21, 2023
Via arXiv

Three Recipes for Better 3D Pseudo-GTs of 3D Human Mesh Estimation in the Wild

Apr 10, 2023
Via arXiv

CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion

Mar 21, 2023
Via arXiv

SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage

Mar 20, 2023
Via arXiv

Re-weighting Based Group Fairness Regularization via Classwise Robust Optimization

Mar 01, 2023
Via arXiv

Group Generalized Mean Pooling for Vision Transformer

Dec 08, 2022
Via arXiv