
Junyu Gao

FusAD: Time-Frequency Fusion with Adaptive Denoising for General Time Series Analysis

Dec 16, 2025

Exploring the Underwater World Segmentation without Extra Training

Nov 11, 2025

Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security

Jul 29, 2025

LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges

Jun 09, 2025

WebUIBench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in WebUI-to-Code

Jun 09, 2025

Scale Efficient Training for Large Datasets

Mar 17, 2025

From Captions to Rewards (CAREVL): Leveraging Large Language Model Experts for Enhanced Reward Modeling in Large Vision-Language Models

Mar 08, 2025

A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning

Mar 06, 2025

FGAseg: Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation

Jan 03, 2025

SignEye: Traffic Sign Interpretation from Vehicle First-Person View

Nov 18, 2024