Xinyu Chen

VideoVista: A Versatile Benchmark for Video Understanding and Reasoning

Jun 17, 2024
Via arXiv

On Unified Prompt Tuning for Request Quality Assurance in Public Code Review

Apr 11, 2024
Via arXiv

LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs

Feb 21, 2024
Via arXiv

Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment

Feb 21, 2024
Via arXiv

A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering

Nov 13, 2023
Via arXiv

QwenGrasp: A Usage of Large Vision-Language Model for Target-Oriented Grasping

Oct 08, 2023
Via arXiv

Understanding Deep Neural Networks via Linear Separability of Hidden Layers

Jul 26, 2023
Via arXiv

Optimal Weighted Random Forests

May 17, 2023
Via arXiv

A Multi-Modal Context Reasoning Approach for Conditional Inference on Joint Textual and Visual Clues

May 08, 2023
Via arXiv

LMEye: An Interactive Perception Network for Large Language Models

May 05, 2023
Via arXiv