Jing Zhang

The University of Sydney, Australia

Rethink Sparse Signals for Pose-guided Text-to-image Generation

Jun 26, 2025

Genome-Anchored Foundation Model Embeddings Improve Molecular Prediction from Histology Images

Jun 24, 2025

Progressive Modality Cooperation for Multi-Modality Domain Adaptation

Jun 24, 2025

ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models

Jun 11, 2025

Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation

May 30, 2025

The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector

May 29, 2025

GoMatching++: Parameter- and Data-Efficient Arbitrary-Shaped Video Text Spotting and Benchmarking

May 28, 2025

What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

May 28, 2025

Prototype Embedding Optimization for Human-Object Interaction Detection in Livestreaming

May 28, 2025

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

May 27, 2025