Long Chen

University of Kaiserslautern-Landau, MODE Collaboration

Improving Diffusion-based Data Augmentation with Inversion Spherical Interpolation

Aug 29, 2024

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation

Aug 28, 2024

Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement

Aug 15, 2024

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding

Aug 02, 2024

From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation

Jul 12, 2024

SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning

Jul 10, 2024

LightStereo: Channel Boost Is All Your Need for Efficient 2D Cost Aggregation

Jun 28, 2024

TemPrompt: Multi-Task Prompt Learning for Temporal Relation Extraction in RAG-based Crowdsourcing Systems

Jun 25, 2024

MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding

Jun 15, 2024

CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation

Jun 15, 2024