Pengfei Hu

t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving

Oct 13, 2024
Via arXiv

See then Tell: Enhancing Key Information Extraction with Vision Grounding

Sep 29, 2024
Via arXiv

DocMamba: Efficient Document Pre-training with State Space Model

Sep 18, 2024
Via arXiv

Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios

Jun 21, 2024
Via arXiv

SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

Jun 13, 2024
Via arXiv

SEMv3: A Fast and Robust Approach to Table Separation Line Detection

May 20, 2024
Via arXiv

Poisson-Gamma Dynamical Systems with Non-Stationary Transition Dynamics

Feb 26, 2024
Via arXiv

Bidirectional Trained Tree-Structured Decoder for Handwritten Mathematical Expression Recognition

Dec 31, 2023
Via arXiv

Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

Nov 17, 2023
Via arXiv

Incentivizing Massive Unknown Workers for Budget-Limited Crowdsensing: From Off-Line and On-Line Perspectives

Sep 21, 2023
Via arXiv