Yu Qiao

ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society

SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding

Oct 15, 2024

Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

Oct 14, 2024

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

Oct 10, 2024

Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation

Oct 10, 2024

ToMiE: Towards Modular Growth in Enhanced SMPL Skeleton for 3D Human with Animatable Garments

Oct 10, 2024

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

Oct 07, 2024

MinerU: An Open-Source Solution for Precise Document Content Extraction

Sep 27, 2024

Reasoning Multi-Agent Behavioral Topology for Interactive Autonomous Driving

Sep 26, 2024

Inference-Time Language Model Alignment via Integrated Value Guidance

Sep 26, 2024

CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation

Sep 24, 2024