Text


Show-o2: Improved Native Unified Multimodal Models

Jun 18, 2025
Via arXiv

PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning

Jun 18, 2025
Via arXiv

Control and Realism: Best of Both Worlds in Layout-to-Image without Training

Jun 18, 2025
Via arXiv

PredGen: Accelerated Inference of Large Language Models through Input-Time Speculation for Real-Time Speech Interaction

Jun 18, 2025
Via arXiv

Approximating Language Model Training Data from Weights

Jun 18, 2025
Via arXiv

Diff-TONE: Timestep Optimization for iNstrument Editing in Text-to-Music Diffusion Models

Jun 18, 2025
Via arXiv

GenHOI: Generalizing Text-driven 4D Human-Object Interaction Synthesis for Unseen Objects

Jun 18, 2025
Via arXiv

Context-Informed Grounding Supervision

Jun 18, 2025
Via arXiv

Creating User-steerable Projections with Interactive Semantic Mapping

Jun 18, 2025
Via arXiv

Factorized RVQ-GAN For Disentangled Speech Tokenization

Jun 18, 2025
Via arXiv