Ziyang Zhang

MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm

Jun 05, 2025
Via arXiv

AhaKV: Adaptive Holistic Attention-Driven KV Cache Eviction for Efficient Inference of Large Language Models

Jun 04, 2025
Via arXiv

TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine

May 29, 2025
Via arXiv

Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs

May 07, 2025
Via arXiv

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Apr 14, 2025
Via arXiv

Organ-aware Multi-scale Medical Image Segmentation Using Text Prompt Engineering

Mar 18, 2025
Via arXiv

HIF: Height Interval Filtering for Efficient Dynamic Points Removal

Mar 10, 2025
Via arXiv

MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations

Mar 06, 2025
Via arXiv

E4: Energy-Efficient DNN Inference for Edge Video Analytics Via Early-Exit and DVFS

Mar 06, 2025
Via arXiv

NavG: Risk-Aware Navigation in Crowded Environments Based on Reinforcement Learning with Guidance Points

Mar 03, 2025
Via arXiv