Kun Yao

FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition

Feb 05, 2024
Xiaohu Huang, Hao Zhou, Kun Yao, Kai Han

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

Oct 31, 2023
Junkun Yuan, Xinyu Zhang, Hao Zhou, Jian Wang, Zhongwei Qiu, Zhiyin Shao, Shaofeng Zhang, Sifan Long, Kun Kuang, Kun Yao, Junyu Han, Errui Ding, Lanfen Lin, Fei Wu, Jingdong Wang

GridFormer: Towards Accurate Table Structure Recognition via Grid Prediction

Sep 26, 2023
Pengyuan Lyu, Weihong Ma, Hongyi Wang, Yuechen Yu, Chengquan Zhang, Kun Yao, Yang Xue, Jingdong Wang

Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation

Aug 14, 2023
Huan Liu, Qiang Chen, Zichang Tan, Jiang-Jiang Liu, Jian Wang, Xiangbo Su, Xiaolong Li, Kun Yao, Junyu Han, Errui Ding, Yao Zhao, Jingdong Wang

Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning

Aug 14, 2023
Xugong Qin, Pengyuan Lyu, Chengquan Zhang, Yu Zhou, Kun Yao, Peng Zhang, Hailun Lin, Weiping Wang

MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary

Jul 24, 2023
Beiya Dai, Xing Li, Qunyi Xie, Yulin Li, Xiameng Qin, Chengquan Zhang, Kun Yao, Junyu Han

Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation

Jun 29, 2023
Zhongwei Qiu, Qiansheng Yang, Jian Wang, Xiyu Wang, Chang Xu, Dongmei Fu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

Jun 05, 2023
Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai

Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding

May 19, 2023
Mingliang Zhai, Yulin Li, Xiameng Qin, Chen Yi, Qunyi Xie, Chengquan Zhang, Kun Yao, Yuwei Wu, Yunde Jia

StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training

Mar 01, 2023
Yuechen Yu, Yulin Li, Chengquan Zhang, Xiaoqiang Zhang, Zengyuan Guo, Xiameng Qin, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
