Chao Zhang

refer to the report for detailed contributions

Adversarial Attacks Using Differentiable Rendering: A Survey

Nov 14, 2024

Fast Disentangled Slim Tensor Learning for Multi-view Clustering

Nov 12, 2024

Matryoshka: Learning to Drive Black-Box LLMs with LLMs

Oct 28, 2024

Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization

Oct 09, 2024

BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models

Oct 09, 2024

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer

Oct 07, 2024

LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy

Oct 04, 2024

SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding

Sep 30, 2024

LW2G: Learning Whether to Grow for Prompt-based Continual Learning

Sep 27, 2024

MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events

Sep 25, 2024