Hao Yang

DIMT25@ICDAR2025: HW-TSC's End-to-End Document Image Machine Translation System Leveraging Large Vision-Language Model

Apr 24, 2025
Via arXiv

Evaluating Menu OCR and Translation: A Benchmark for Aligning Human and Automated Evaluations in Large Vision-Language Models

Apr 22, 2025
Via arXiv

Automatic Evaluation Metrics for Document-level Translation: Overview, Challenges and Trends

Apr 21, 2025
Via arXiv

NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

Apr 19, 2025
Via arXiv

PathVLM-R1: A Reinforcement Learning-Driven Reasoning Model for Pathology Visual-Language Tasks

Apr 12, 2025
Via arXiv

Kimi-VL Technical Report

Apr 10, 2025
Via arXiv

Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement

Apr 08, 2025
Via arXiv

DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation

Apr 07, 2025
Via arXiv

MAVERIX: Multimodal Audio-Visual Evaluation Reasoning IndeX

Mar 27, 2025
Via arXiv

Adaptive Weighted Parameter Fusion with CLIP for Class-Incremental Learning

Mar 25, 2025
Via arXiv