Jie Lei

Zhejiang University of Technology

Toward matrix multiplication for deep learning inference on the Xilinx Versal

Feb 15, 2023

Guided Hybrid Quantization for Object detection in Multimodal Remote Sensing Imagery via One-to-one Self-teaching

Dec 31, 2022

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Dec 15, 2022

VindLU: A Recipe for Effective Video-and-Language Pretraining

Dec 09, 2022

Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention

Nov 21, 2022

SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery

Sep 27, 2022

Mid-level Representation Enhancement and Graph Embedded Uncertainty Suppressing for Facial Expression Recognition

Jul 27, 2022

Revealing Single Frame Bias for Video-and-Language Learning

Jun 07, 2022

Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners

May 29, 2022

ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound

Apr 06, 2022