Qiang Chen

APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs

Jun 09, 2025

Tweedie Regression for Video Recommendation System

May 09, 2025

Adversarial Attack for RGB-Event based Visual Object Tracking

Apr 19, 2025

RGB-Event based Pedestrian Attribute Recognition: A Benchmark Dataset and An Asymmetric RWKV Fusion Framework

Apr 14, 2025

XiHeFusion: Harnessing Large Language Models for Science Communication in Nuclear Fusion

Feb 08, 2025

Implicit Location-Caption Alignment via Complementary Masking for Weakly-Supervised Dense Video Captioning

Dec 17, 2024

Continual SFT Matches Multimodal RLHF with Negative Supervision

Nov 22, 2024

Improving Multi-modal Large Language Model through Boosting Vision Capabilities

Oct 17, 2024

Automated Quantification of Hyperreflective Foci in SD-OCT With Diabetic Retinopathy

Jul 31, 2024

OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer

Jul 15, 2024