Yitian Yuan

InstructionBench: An Instructional Video Understanding Benchmark

Apr 07, 2025
Via arXiv

Deep Learning-Based Diffusion MRI Tractography: Integrating Spatial and Anatomical Information

Mar 05, 2025
Via arXiv

TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability

Nov 27, 2024
Via arXiv

VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models

Oct 15, 2024
Via arXiv

3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance

Jul 13, 2024
Via arXiv

Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models

Jun 12, 2024
Via arXiv

Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment

Dec 15, 2023
Via arXiv

A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach

Mar 10, 2022
Via arXiv

Controllable Video Captioning with an Exemplar Sentence

Dec 02, 2021
Via arXiv

Syntax Customized Video Captioning by Imitating Exemplar Sentences

Dec 02, 2021
Via arXiv