Xingyi Zhou

STT: Stateful Tracking with Transformers for Autonomous Driving

Apr 30, 2024

Streaming Dense Video Captioning

Apr 01, 2024

Distilling Vision-Language Models on Millions of Videos

Jan 11, 2024

Pixel Aligned Language Models

Dec 14, 2023

MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation

Dec 11, 2023

Does Visual Pretraining Help End-to-End Reasoning?

Jul 17, 2023

How can objects help action recognition?

Jun 20, 2023

Dense Video Object Captioning from Disjoint Supervision

Jun 20, 2023

DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Jun 02, 2023

NMS Strikes Back

Dec 12, 2022