Song Bai

Alibaba Group, University of Oxford

The Runner-up Solution for YouTube-VIS Long Video Challenge 2022

Nov 18, 2022

Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning

Oct 24, 2022

Is synthetic data from generative models ready for image recognition?

Oct 14, 2022

Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning

Oct 01, 2022

1st Place Solution to ECCV 2022 Challenge on Out of Vocabulary Scene Text Understanding: Cropped Word Recognition

Aug 04, 2022

Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation

Jul 29, 2022

Contextual Text Block Detection towards Scene Text Understanding

Jul 26, 2022

In Defense of Online Models for Video Instance Segmentation

Jul 21, 2022

CenterNet++ for Object Detection

Apr 18, 2022

An Empirical Study of End-to-End Temporal Action Detection

Apr 06, 2022