"Image": models, code, and papers

SPC-NeRF: Spatial Predictive Compression for Voxel Based Radiance Field

Feb 26, 2024
Zetian Song, Wenhong Duan, Yuhuai Zhang, Shiqi Wang, Siwei Ma, Wen Gao

MMSR: Symbolic Regression is a Multimodal Task

Feb 28, 2024
Yanjie Li, Jingyi Liu, Weijun Li, Lina Yu, Min Wu, Wenqiang Li, Meilan Hao, Su Wei, Yusong Deng

OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web

Feb 28, 2024
Raghav Kapoor, Yash Parag Butala, Melisa Russak, Jing Yu Koh, Kiran Kamble, Waseem Alshikh, Ruslan Salakhutdinov

Automatic Description of Rock Thin Sections: A Web Application

Feb 23, 2024
Stalyn Paucar, Christian Mejía-Escobar, Víctor Collaguazo

MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion

Feb 22, 2024
Xin-Yang Zheng, Hao Pan, Yu-Xiao Guo, Xin Tong, Yang Liu

Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?

Feb 14, 2024
Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan

ConVQG: Contrastive Visual Question Generation with Multimodal Guidance

Feb 20, 2024
Li Mi, Syrielle Montariol, Javiera Castillo-Navarro, Xianjie Dai, Antoine Bosselut, Devis Tuia

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

Feb 13, 2024
Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin

$λ$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

Feb 07, 2024
Maitreya Patel, Sangmin Jung, Chitta Baral, Yezhou Yang

Semi-Mamba-UNet: Pixel-Level Contrastive Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation

Feb 11, 2024
Ziyang Wang, Chao Ma
