"Image": models, code, and papers

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

Jan 24, 2024
Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried

Research about the Ability of LLM in the Tamper-Detection Area

Jan 24, 2024
Xinyu Yang, Jizhe Zhou

Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models

Jan 24, 2024
Hongzhan Lin, Ziyang Luo, Wei Gao, Jing Ma, Bo Wang, Ruichao Yang

Cross-Level Multi-Instance Distillation for Self-Supervised Fine-Grained Visual Categorization

Jan 16, 2024
Qi Bi, Wei Ji, Jingjun Yi, Haolan Zhan, Gui-Song Xia

Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Jan 18, 2024
Xin Yuan, Jinoo Baek, Keyang Xu, Omer Tov, Hongliang Fei

SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Jan 18, 2024
Andreas Engelhardt, Amit Raj, Mark Boss, Yunzhi Zhang, Abhishek Kar, Yuanzhen Li, Deqing Sun, Ricardo Martin Brualla, Jonathan T. Barron, Hendrik P. A. Lensch, Varun Jampani

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Jan 15, 2024
Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai

Enhancing Small Object Encoding in Deep Neural Networks: Introducing Fast&Focused-Net with Volume-wise Dot Product Layer

Jan 18, 2024
Ali Tofik, Roy Partha Pratim

Statistical Test for Attention Map in Vision Transformer

Jan 16, 2024
Tomohiro Shiraishi, Daiki Miwa, Teruyuki Katsuoka, Vo Nguyen Le Duy, Koichi Taji, Ichiro Takeuchi

UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures

Jan 20, 2024
Mingyuan Zhou, Rakib Hyder, Ziwei Xuan, Guojun Qi
