Ziyang Gong

CrossEarth-SAR: A SAR-Centric and Billion-Scale Geospatial Foundation Model for Domain Generalizable Semantic Segmentation

Mar 12, 2026
Via arXiv

ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments

Mar 03, 2026
Via arXiv

Out of the Memory Barrier: A Highly Memory Efficient Training System for LLMs with Million-Token Contexts

Feb 02, 2026
Via arXiv

MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique

Nov 12, 2025
Via arXiv

ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

Oct 22, 2025
Via arXiv

Spotlight Attention: Towards Efficient LLM Generation via Non-linear Hashing-based KV Cache Retrieval

Aug 27, 2025
Via arXiv

Partial Weakly-Supervised Oriented Object Detection

Jul 03, 2025
Via arXiv

Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

Jun 12, 2025
Via arXiv

SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence

Jun 09, 2025
Via arXiv

Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions

May 04, 2025
Via arXiv