Jin Tang

Text-RGBT Person Retrieval: Multilevel Global-Local Cross-Modal Alignment and A High-quality Benchmark

Mar 11, 2025
Via arXiv

Sign Language Translation using Frame and Event Stream: Benchmark Dataset and Algorithms

Mar 09, 2025
Via arXiv

EventSTR: A Benchmark Dataset and Baselines for Event Stream based Scene Text Recognition

Feb 13, 2025
Via arXiv

Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark

Feb 08, 2025
Via arXiv

XiHeFusion: Harnessing Large Language Models for Science Communication in Nuclear Fusion

Feb 08, 2025
Via arXiv

LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks

Jan 17, 2025
Via arXiv

Activating Associative Disease-Aware Vision Token Memory for LLM-Based X-ray Report Generation

Jan 07, 2025
Via arXiv

Dynamic Disentangled Fusion Network for RGBT Tracking

Dec 11, 2024
Via arXiv

Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering

Nov 24, 2024
Via arXiv

UnityGraph: Unified Learning of Spatio-temporal features for Multi-person Motion Prediction

Nov 06, 2024
Via arXiv