Yan Lu

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis

Apr 14, 2025
Via arXiv

SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation

Apr 08, 2025
Via arXiv

GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting

Mar 24, 2025
Via arXiv

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations

Mar 15, 2025
Via arXiv

StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams

Mar 08, 2025
Via arXiv

DLF: Extreme Image Compression with Dual-generative Latent Fusion

Mar 03, 2025
Via arXiv

Towards Practical Real-Time Neural Video Compression

Feb 28, 2025
Via arXiv

UVRM: A Scalable 3D Reconstruction Model from Unposed Videos

Jan 16, 2025
Via arXiv

Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers

Dec 23, 2024
Via arXiv

GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs

Dec 22, 2024
Via arXiv