Po-Yao Huang

Perception Encoder: The best visual embeddings are not at the output of the network

Apr 17, 2025

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Apr 17, 2025

Altogether: Image Captioning via Re-aligning Alt-text

Oct 22, 2024

Self-Supervised Audio-Visual Soundscape Stylization

Sep 22, 2024

Text Quality-Based Pruning for Efficient Training of Language Models

Apr 26, 2024

MoDE: CLIP Data Experts via Clustering

Apr 24, 2024

VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild

Mar 25, 2024

Adversarially Masked Video Consistency for Unsupervised Domain Adaptation

Mar 24, 2024

FLAP: Fast Language-Audio Pre-training

Nov 02, 2023

Demystifying CLIP Data

Oct 02, 2023