Bei Liu

Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space

May 31, 2023
Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama

Via arXiv

AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

May 30, 2023
Chuhao Jin, Wenhui Tan, Jiange Yang, Bei Liu, Ruihua Song, Limin Wang, Jianlong Fu

Via arXiv

Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR

May 18, 2023
Hang Shao, Wei Wang, Bei Liu, Xun Gong, Haoyu Wang, Yanmin Qian

Via arXiv

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

Dec 19, 2022
Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo

Via arXiv

Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022

Nov 02, 2022
Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian

Via arXiv

Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning

Oct 12, 2022
Yuchong Sun, Hongwei Xue, Ruihua Song, Bei Liu, Huan Yang, Jianlong Fu

Via arXiv

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment

Sep 23, 2022
Hongwei Xue, Yuchong Sun, Bei Liu, Jianlong Fu, Ruihua Song, Houqiang Li, Jiebo Luo

Via arXiv

SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022

Sep 20, 2022
Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian

Via arXiv

AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation

Sep 08, 2022
Yiyang Ma, Huan Yang, Bei Liu, Jianlong Fu, Jiaying Liu

Via arXiv