Kaiqi Huang

VS-LLM: Visual-Semantic Depression Assessment based on LLM for Drawing Projection Test

Aug 07, 2025

CausalStep: A Benchmark for Explicit Stepwise Causal Reasoning in Videos

Jul 22, 2025

WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models

Jun 12, 2025

Finger in Camera Speaks Everything: Unconstrained Air-Writing for Real-World

Dec 27, 2024

How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking

Nov 23, 2024

DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM

Oct 03, 2024

Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark

Sep 13, 2024

Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness

Jul 12, 2024

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling

May 21, 2024

DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM

May 20, 2024