Zhenguo Li

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation

Oct 07, 2024

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

Oct 02, 2024

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024

CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration

Sep 17, 2024

T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation

Jul 19, 2024

GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing

Jul 08, 2024

Jailbreaking as a Reward Misspecification Problem

Jun 20, 2024

QuickLLaMA: Query-aware Inference Acceleration for Large Language Models

Jun 11, 2024

Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data

Jun 06, 2024

Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion

May 24, 2024