Hongtao Xie

ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement

Dec 15, 2025, via arXiv

LRANet++: Low-Rank Approximation Network for Accurate and Efficient Text Spotting

Nov 08, 2025, via arXiv

DMA: Online RAG Alignment with Human Feedback

Nov 06, 2025, via arXiv

RegionRAG: Region-level Retrieval-Augumented Generation for Visually-Rich Documents

Oct 31, 2025, via arXiv

UpSafe$^\circ$C: Upcycling for Controllable Safety in Large Language Models

Oct 02, 2025, via arXiv

Test-Time Scaling with Reflective Generative Model

Jul 02, 2025, via arXiv

From Evaluation to Defense: Advancing Safety in Video Large Language Models

May 22, 2025, via arXiv

PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering

Apr 09, 2025, via arXiv

Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Mar 25, 2025, via arXiv

Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models

Mar 20, 2025, via arXiv