"Image": models, code, and papers

Exploring Vision-Language Models for Imbalanced Learning

Apr 04, 2023
Yidong Wang, Zhuohao Yu, Jindong Wang, Qiang Heng, Hao Chen, Wei Ye, Rui Xie, Xing Xie, Shikun Zhang

Uncertainty estimation in Deep Learning for Panoptic segmentation

Apr 04, 2023
Michael Smith, Frank Ferrie

From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

Apr 04, 2023
Yong-Lu Li, Xiaoqian Wu, Xinpeng Liu, Yiming Dou, Yikun Ji, Junyi Zhang, Yixing Li, Jingru Tan, Xudong Lu, Cewu Lu

Stare at What You See: Masked Image Modeling without Reconstruction

Nov 16, 2022
Hongwei Xue, Peng Gao, Hongyang Li, Yu Qiao, Hao Sun, Houqiang Li, Jiebo Luo

SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation

Mar 21, 2023
Yuzhi Zhao, Lai-Man Po, Kangcheng Liu, Xuehui Wang, Wing-Yin Yu, Pengfei Xian, Yujia Zhang, Mengyang Liu

Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

Dec 13, 2022
Renrui Zhang, Liuhui Wang, Yu Qiao, Peng Gao, Hongsheng Li

Neuromorphic-P2M: Processing-in-Pixel-in-Memory Paradigm for Neuromorphic Image Sensors

Jan 22, 2023
Md Abdullah-Al Kaiser, Gourav Datta, Zixu Wang, Ajey P. Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

Making Vision Transformers Efficient from A Token Sparsification View

Mar 15, 2023
Shuning Chang, Pichao Wang, Ming Lin, Fan Wang, David Junhao Zhang, Rong Jin, Mike Zheng Shou

Leveraging per Image-Token Consistency for Vision-Language Pre-training

Nov 20, 2022
Yunhao Gou, Tom Ko, Hansi Yang, James Kwok, Yu Zhang, Mingxuan Wang

Optimizing Prompts for Text-to-Image Generation

Dec 19, 2022
Yaru Hao, Zewen Chi, Li Dong, Furu Wei
