"Image": models, code, and papers

Structure Preserving Diffusion Models

Feb 29, 2024
Haoye Lu, Spencer Szabados, Yaoliang Yu

Via arXiv

Neural Radiance Fields in Medical Imaging: Challenges and Next Steps

Mar 02, 2024
Xin Wang, Shu Hu, Heng Fan, Hongtu Zhu, Xin Li

Via arXiv

Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing

Mar 04, 2024
Ling Yang, Zhilong Zhang, Zhaochen Yu, Jingwei Liu, Minkai Xu, Stefano Ermon, Bin Cui

Via arXiv

RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features

Mar 04, 2024
Howard H. Qian, Yangxiao Lu, Kejia Ren, Gaotian Wang, Ninad Khargonkar, Yu Xiang, Kaiyu Hang

Via arXiv

3D Hand Reconstruction via Aggregating Intra and Inter Graphs Guided by Prior Knowledge for Hand-Object Interaction Scenario

Mar 04, 2024
Feng Shuang, Wenbo He, Shaodong Li

Via arXiv

xT: Nested Tokenization for Larger Context in Large Images

Mar 04, 2024
Ritwik Gupta, Shufan Li, Tyler Zhu, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam

Via arXiv

Evaluating and Mitigating Number Hallucinations in Large Vision-Language Models: A Consistency Perspective

Mar 03, 2024
Huixuan Zhang, Junzhe Zhang, Xiaojun Wan

Via arXiv

Analysis of the Two-Step Heterogeneous Transfer Learning for Laryngeal Blood Vessel Classification: Issue and Improvement

Mar 05, 2024
Xinyi Fang, Chak Fong Chong, Kei Long Wong, Yapeng Wang, Wei Ke, Tiankui Zhang, Sio-Kei Im

Via arXiv

ADS: Approximate Densest Subgraph for Novel Image Discovery

Feb 13, 2024
Shanfeng Hu

Via arXiv

A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection

Feb 29, 2024
Chao Hao, Zitong Yu, Xin Liu, Jun Xu, Huanjing Yue, Jingyu Yang

Via arXiv