Yan Lu

UI-E2I-Synth: Advancing GUI Grounding with Large-Scale Instruction Synthesis

Apr 16, 2025

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis

Apr 14, 2025

SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation

Apr 08, 2025

GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting

Mar 24, 2025

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations

Mar 15, 2025

StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams

Mar 08, 2025

DLF: Extreme Image Compression with Dual-generative Latent Fusion

Mar 03, 2025

Towards Practical Real-Time Neural Video Compression

Feb 28, 2025

UVRM: A Scalable 3D Reconstruction Model from Unposed Videos

Jan 16, 2025

Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers

Dec 23, 2024