Picture for Peng Jin

Peng Jin

Senior member, IEEE

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

Add code
Jul 15, 2024
Viaarxiv icon

LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference

Add code
Jun 26, 2024
Viaarxiv icon

RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter

Add code
May 29, 2024
Viaarxiv icon

LLMBind: A Unified Modality-Task Integration Framework

Add code
Mar 08, 2024
Viaarxiv icon

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models

Add code
Feb 08, 2024
Viaarxiv icon

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Add code
Feb 04, 2024
Viaarxiv icon

Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation

Add code
Jan 18, 2024
Viaarxiv icon

Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting

Add code
Dec 27, 2023
Figure 1 for Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
Figure 2 for Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
Figure 3 for Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
Figure 4 for Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
Viaarxiv icon

FreestyleRet: Retrieving Images from Style-Diversified Queries

Add code
Dec 08, 2023
Viaarxiv icon

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Add code
Nov 21, 2023
Viaarxiv icon