Tao Zhang

Ocean-OCR: Towards General OCR Application via a Vision-Language Model

Jan 26, 2025

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Jan 08, 2025

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Jan 07, 2025

Generative Regression Based Watch Time Prediction for Video Recommendation: Model and Performance

Dec 28, 2024

RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement

Dec 17, 2024

THESAURUS: Contrastive Graph Clustering by Swapping Fused Gromov-Wasserstein Couplings

Dec 16, 2024

Wavelet Diffusion Neural Operator

Dec 06, 2024

Compositional Generative Multiphysics and Multi-component Simulation

Dec 05, 2024

Detection of Performance Interference Among Network Slices in 5G/6G Systems

Dec 02, 2024

Self-supervised Video Instance Segmentation Can Boost Geographic Entity Alignment in Historical Maps

Nov 26, 2024