Shen Yan

Streaming Dense Video Captioning

Apr 01, 2024
Via arXiv

VideoPrism: A Foundational Visual Encoder for Video Understanding

Feb 20, 2024
Via arXiv

PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter

Feb 16, 2024
Via arXiv

UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization

Jan 11, 2024
Via arXiv

Efficient Large Language Models: A Survey

Dec 23, 2023
Via arXiv

Pixel Aligned Language Models

Dec 14, 2023
Via arXiv

UnLoc: A Unified Framework for Video Localization Tasks

Aug 21, 2023
Via arXiv

AutoTaskFormer: Searching Vision Transformers for Multi-task Learning

Apr 20, 2023
Via arXiv

Long-term Visual Localization with Mobile Sensors

Apr 16, 2023
Via arXiv

Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks

Mar 07, 2023
Via arXiv