Kai Ye

Evolving, Not Training: Zero-Shot Reasoning Segmentation via Evolutionary Prompting

Dec 31, 2025

E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis

Nov 10, 2025

Understanding What Is Not Said: Referring Remote Sensing Image Segmentation with Scarce Expressions

Oct 26, 2025

How Far Are We from True Unlearnability?

Sep 09, 2025

ImportSnare: Directed "Code Manual" Hijacking in Retrieval-Augmented Code Generation

Sep 09, 2025

RIS-LAD: A Benchmark and Model for Referring Low-Altitude Drone Image Segmentation

Jul 28, 2025

The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images without Any 3D Knowledge

Jun 11, 2025

Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning

May 24, 2025

More Clear, More Flexible, More Precise: A Comprehensive Oriented Object Detection benchmark for UAV

Apr 28, 2025

Cross-Frequency Implicit Neural Representation with Self-Evolving Parameters

Apr 15, 2025