Zhaoyang Li

Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning

Jan 08, 2026

All Changes May Have Invariant Principles: Improving Ever-Shifting Harmful Meme Detection via Design Concept Reproduction

Jan 08, 2026

ReSPIRe: Informative and Reusable Belief Tree Search for Robot Probabilistic Search and Tracking in Unknown Environments

Dec 31, 2025

OmniSparse: Training-Aware Fine-Grained Sparse Attention for Long-Video MLLMs

Nov 18, 2025

ACT as Human: Multimodal Large Language Model Data Annotation with Critical Thinking

Nov 13, 2025

Mono3DVG-EnSD: Enhanced Spatial-aware and Dimension-decoupled Text Encoding for Monocular 3D Visual Grounding

Nov 10, 2025

A Dual-stage Prompt-driven Privacy-preserving Paradigm for Person Re-Identification

Nov 07, 2025

ORIC: Benchmarking Object Recognition in Incongruous Context for Large Vision-Language Models

Sep 19, 2025

OCELOT 2023: Cell Detection from Cell-Tissue Interaction Challenge

Sep 11, 2025

Dual Enhancement on 3D Vision-Language Perception for Monocular 3D Visual Grounding

Aug 26, 2025