Bei Liu

ViCo: Engaging Video Comment Generation with Human Preference Rewards

Aug 22, 2023

Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations

Aug 21, 2023

Revisiting Latent Space of GAN Inversion for Real Image Editing

Jul 18, 2023

SINC: Self-Supervised In-Context Learning for Vision-Language Tasks

Jul 15, 2023

Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots

Jun 25, 2023

Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space

May 31, 2023

AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

May 30, 2023

Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR

May 18, 2023

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

Dec 19, 2022

Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022

Nov 02, 2022