Qin Jin

Renmin University of China

RTime-QA: A Benchmark for Atomic Temporal Event Understanding in Large Multi-modal Models

May 25, 2025
Via arXiv

EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining

Mar 19, 2025
Via arXiv

TimeZero: Temporal Video Grounding with Reasoning-Guided LVLM

Mar 17, 2025
Via arXiv

WritingBench: A Comprehensive Benchmark for Generative Writing

Mar 07, 2025
Via arXiv

Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval

Dec 26, 2024
Via arXiv

Quo Vadis, Motion Generation? From Large Language Models to Large Motion Models

Oct 04, 2024
Via arXiv

Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues

Sep 29, 2024
Via arXiv

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Sep 24, 2024
Via arXiv

Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm

Sep 11, 2024
Via arXiv

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

Sep 05, 2024
Via arXiv