Yi Tu

LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding

Dec 22, 2025
Via arXiv

SparseRM: A Lightweight Preference Modeling with Sparse Autoencoder

Nov 11, 2025
Via arXiv

Video-LevelGauge: Investigating Contextual Positional Bias in Large Video Language Models

Aug 28, 2025
Via arXiv

Keep the General, Inject the Specific: Structured Dialogue Fine-Tuning for Knowledge Injection without Catastrophic Forgetting

Apr 27, 2025
Via arXiv

InsightVision: A Comprehensive, Multi-Level Chinese-based Benchmark for Evaluating Implicit Visual Semantics in Large Vision Language Models

Feb 19, 2025
Via arXiv

Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding

Sep 29, 2024
Via arXiv

UNER: A Unified Prediction Head for Named Entity Recognition in Visually-rich Documents

Aug 02, 2024
Via arXiv

SAFETY-J: Evaluating Safety with Critique

Jul 25, 2024
Via arXiv

Rethinking the Evaluation of Pre-trained Text-and-Layout Models from an Entity-Centric Perspective

Feb 04, 2024
Via arXiv

Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction

Oct 17, 2023
Via arXiv