Jing Zhang

The University of Sydney, Australia

Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation

May 30, 2025
Via arXiv

The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector

May 29, 2025
Via arXiv

Prototype Embedding Optimization for Human-Object Interaction Detection in Livestreaming

May 28, 2025
Via arXiv

What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

May 28, 2025
Via arXiv

GoMatching++: Parameter- and Data-Efficient Arbitrary-Shaped Video Text Spotting and Benchmarking

May 28, 2025
Via arXiv

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

May 27, 2025
Via arXiv

PathBench: A comprehensive comparison benchmark for pathology foundation models towards precision oncology

May 26, 2025
Via arXiv

Reasoning-OCR: Can Large Multimodal Models Solve Complex Logical Reasoning Problems from OCR Cues?

May 19, 2025
Via arXiv

LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images?

May 18, 2025
Via arXiv

Advances in Radiance Field for Dynamic Scene: From Neural Field to Gaussian Field

May 15, 2025
Via arXiv