"Image": models, code, and papers

A Survey on Monocular Re-Localization: From the Perspective of Scene Map Representation

Nov 27, 2023
Jinyu Miao, Kun Jiang, Tuopu Wen, Yunlong Wang, Peijing Jia, Xuhe Zhao, Zhongyang Xiao, Jin Huang, Zhihua Zhong, Diange Yang

Via arXiv

Geometric Data Augmentations to Mitigate Distribution Shifts in Pollen Classification from Microscopic Images

Nov 18, 2023
Nam Cao, Olga Saukh

Via arXiv

Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation

Nov 18, 2023
Nurbanu Aksoy, Serge Sharoff, Selcuk Baser, Nishant Ravikumar, Alejandro F Frangi

Via arXiv

Bayesian Methods for Media Mix Modelling with shape and funnel effects

Nov 21, 2023
Javier Marin

Via arXiv

Scattering Vision Transformer: Spectral Mixing Matters

Nov 20, 2023
Badri N. Patro, Vijay Srinivas Agneeswaran

Via arXiv

Energy efficiency in Edge TPU vs. embedded GPU for computer-aided medical imaging segmentation and classification

Nov 20, 2023
José María Rodríguez Corral, Javier Civit-Masot, Francisco Luna-Perejón, Ignacio Díaz-Cano, Arturo Morgado-Estévez, Manuel Domínguez-Morales

Via arXiv

DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model

Nov 19, 2023
Zhenghao Pan, Haijin Zeng, Jiezhang Cao, Kai Zhang, Yongyong Chen

Via arXiv

M$^{2}$UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models

Nov 19, 2023
Atin Sakkeer Hussain, Shansong Liu, Chenshuo Sun, Ying Shan

Via arXiv

Asynchronous Bioplausible Neuron for Spiking Neural Networks for Event-Based Vision

Nov 20, 2023
Sanket Kachole, Hussain Sajwani, Fariborz Baghaei Naeini, Dimitrios Makris, Yahya Zweiri

Via arXiv

Wonder3D: Single Image to 3D using Cross-Domain Diffusion

Oct 23, 2023
Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, Wenping Wang

Via arXiv