Abstract: Medical document OCR is challenging due to complex layouts, domain-specific terminology, and noisy annotations, while downstream applications require strict field-level exact matching. Existing OCR systems and general-purpose vision-language models often fail to parse such documents reliably. We propose MeDocVL, a post-trained vision-language model for query-driven medical document parsing. Our framework combines Training-driven Label Refinement, which constructs high-quality supervision from noisy annotations, with a Noise-aware Hybrid Post-training strategy that integrates reinforcement learning and supervised fine-tuning to achieve robust and precise extraction. Experiments on medical invoice benchmarks show that MeDocVL consistently outperforms conventional OCR systems and strong VLM baselines, achieving state-of-the-art performance under noisy supervision.
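The strict field-level exact-matching criterion mentioned above can be made concrete with a small scoring routine. The sketch below is illustrative only: `predict_field`, the document dictionary layout, and the normalization rules are assumptions for the example, not part of MeDocVL.

```python
# Minimal sketch of field-level exact-match scoring for query-driven parsing.
# `predict_field` is a hypothetical stand-in for the model call; the
# normalization rules below are illustrative assumptions.
from typing import Callable, Dict, List


def normalize(value: str) -> str:
    """Trim whitespace and unify full-width digits before comparison."""
    table = str.maketrans("０１２３４５６７８９", "0123456789")
    return value.translate(table).strip()


def exact_match_accuracy(
    documents: List[Dict],
    predict_field: Callable[[str, str], str],
) -> float:
    """Fraction of (document, query) pairs whose prediction matches the gold value exactly."""
    hits, total = 0, 0
    for doc in documents:
        for query, gold in doc["fields"].items():
            pred = predict_field(doc["image_path"], query)
            hits += int(normalize(pred) == normalize(gold))
            total += 1
    return hits / max(total, 1)
```

Under this metric, a prediction earns credit only when the entire field value matches after normalization, which is why partial or noisy extractions score zero.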




Abstract: To meet the requirements of real-world applications, it is essential to control the generation of large language models (LLMs). Prior research has introduced reinforcement learning (RL) into controllable text generation, yet most existing methods suffer from overfitting (finetuning-based methods) or semantic collapse (post-processing methods). However, current RL methods are generally guided by coarse-grained (sentence/paragraph-level) feedback, which may lead to suboptimal performance owing to semantic twists or progressions within sentences. To tackle this, we propose a novel reinforcement learning algorithm named TOLE, which formulates TOken-LEvel rewards for controllable text generation and employs a "first-quantize-then-noise" paradigm to enhance the robustness of the RL algorithm. Furthermore, TOLE can be flexibly extended to multiple constraints with little computational expense. Experimental results show that our algorithm achieves superior performance on both single-attribute and multi-attribute control tasks. Our code is available at https://github.com/WindyLee0822/CTG.
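Since the abstract only names the "first-quantize-then-noise" paradigm, the following is a minimal sketch of one plausible reading: per-token reward scores are first discretized into a few levels and then lightly perturbed, the kind of smoothing the robustness claim suggests. The function name, bin count, and noise scale are assumptions, not the paper's actual procedure.

```python
# Minimal sketch of a "first-quantize-then-noise" treatment of token-level
# rewards. The bin count, noise scale, and names are illustrative assumptions.
import random
from typing import List


def quantize_then_noise(
    token_rewards: List[float],
    num_bins: int = 5,
    noise_std: float = 0.05,
) -> List[float]:
    """Map raw per-token scores to discrete levels, then perturb them slightly."""
    lo, hi = min(token_rewards), max(token_rewards)
    span = (hi - lo) or 1.0
    shaped = []
    for r in token_rewards:
        # Quantize to one of `num_bins` evenly spaced levels in [0, 1].
        level = round((r - lo) / span * (num_bins - 1)) / (num_bins - 1)
        # Add small Gaussian noise so the policy does not overfit exact levels.
        shaped.append(level + random.gauss(0.0, noise_std))
    return shaped
```

The intuition behind such shaping is that quantization removes spurious precision from the per-token attribute scores, while the added noise prevents the policy from latching onto the discrete levels themselves.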