Text To Image Generation


Text-to-image generation is the process of generating images from textual descriptions using deep learning techniques.

RAIGen: Rare Attribute Identification in Text-to-Image Generative Models

Feb 06, 2026
Via arXiv

Weak to Strong: VLM-Based Pseudo-Labeling as a Weakly Supervised Training Strategy in Multimodal Video-based Hidden Emotion Understanding Tasks

Feb 08, 2026
Via arXiv

WristMIR: Coarse-to-Fine Region-Aware Retrieval of Pediatric Wrist Radiographs with Radiology Report-Driven Learning

Feb 08, 2026
Via arXiv

SIGMA: Selective-Interleaved Generation with Multi-Attribute Tokens

Feb 07, 2026
Via arXiv

From Dead Pixels to Editable Slides: Infographic Reconstruction into Native Google Slides via Vision-Language Region Understanding

Feb 07, 2026
Via arXiv

ChatUMM: Robust Context Tracking for Conversational Interleaved Generation

Feb 06, 2026
Via arXiv

Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

Feb 06, 2026
Via arXiv

AEGPO: Adaptive Entropy-Guided Policy Optimization for Diffusion Models

Feb 06, 2026
Via arXiv

Efficient Table Retrieval and Understanding with Multimodal Large Language Models

Feb 07, 2026
Via arXiv

Di3PO -- Diptych Diffusion DPO for Targeted Improvements in Image

Feb 06, 2026
Via arXiv