Peng Jin

Senior Member, IEEE

Dual-view Spatio-Temporal Feature Fusion with CNN-Transformer Hybrid Network for Chinese Isolated Sign Language Recognition

Jun 08, 2025
Via arXiv

OpenPros: A Large-Scale Dataset for Limited View Prostate Ultrasound Computed Tomography

May 18, 2025
Via arXiv

Uni-AIMS: AI-Powered Microscopy Image Analysis

May 11, 2025
Via arXiv

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

Mar 18, 2025
Via arXiv

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Mar 10, 2025
Via arXiv

Hierarchical Banzhaf Interaction for General Video-Language Representation Learning

Dec 30, 2024
Via arXiv

Next Patch Prediction for Autoregressive Visual Generation

Dec 19, 2024
Via arXiv

LLaVA-CoT: Let Vision Language Models Reason Step-by-Step

Nov 25, 2024
Via arXiv

Effort: Efficient Orthogonal Modeling for Generalizable AI-Generated Image Detection

Nov 23, 2024
Via arXiv

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Nov 15, 2024
Via arXiv