"Image": models, code, and papers

Towards More Unified In-context Visual Understanding

Dec 05, 2023
Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu

BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks

Dec 05, 2023
Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah

DiffusionAtlas: High-Fidelity Consistent Diffusion Video Editing

Dec 05, 2023
Shao-Yu Chang, Hwann-Tzong Chen, Tyng-Luh Liu

Breast Ultrasound Report Generation using LangChain

Dec 05, 2023
Jaeyoung Huh, Hyun Jeong Park, Jong Chul Ye

Segment (Almost) Nothing: Prompt-Agnostic Adversarial Attacks on Segmentation Models

Nov 24, 2023
Francesco Croce, Matthias Hein

Text Augmented Spatial-aware Zero-shot Referring Image Segmentation

Oct 27, 2023
Yucheng Suo, Linchao Zhu, Yi Yang

Residual Graph Convolutional Network for Bird's-Eye-View Semantic Segmentation

Dec 07, 2023
Qiuxiao Chen, Xiaojun Qi

CapsFusion: Rethinking Image-Text Data at Scale

Nov 02, 2023
Qiying Yu, Quan Sun, Xiaosong Zhang, Yufeng Cui, Fan Zhang, Yue Cao, Xinlong Wang, Jingjing Liu

Likelihood-Aware Semantic Alignment for Full-Spectrum Out-of-Distribution Detection

Dec 04, 2023
Fan Lu, Kai Zhu, Kecheng Zheng, Wei Zhai, Yang Cao

APoLLo: Unified Adapter and Prompt Learning for Vision Language Models

Dec 04, 2023
Sanjoy Chowdhury, Sayan Nag, Dinesh Manocha
