Ziyang Chu

Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity

Aug 07, 2025

Yuhan Zhang, Long Zhuo, Ziyang Chu, Tong Wu, Zhibing Li, Liang Pan, Dahua Lin, Ziwei Liu

Abstract: Despite rapid advances in 3D content generation, quality assessment for the generated 3D assets remains challenging. Existing methods mainly rely on image-based metrics and operate solely at the object level, limiting their ability to capture spatial coherence, material authenticity, and high-fidelity local details. 1) To address these challenges, we introduce Hi3DEval, a hierarchical evaluation framework tailored for 3D generative content. It combines object-level and part-level evaluation, enabling holistic assessments across multiple dimensions as well as fine-grained quality analysis. Additionally, we extend texture evaluation beyond aesthetic appearance by explicitly assessing material realism, focusing on attributes such as albedo, saturation, and metallicness. 2) To support this framework, we construct Hi3DBench, a large-scale dataset comprising diverse 3D assets and high-quality annotations, accompanied by a reliable multi-agent annotation pipeline. We further propose a 3D-aware automated scoring system based on hybrid 3D representations. Specifically, we leverage video-based representations for object-level and material-subject evaluations to enhance modeling of spatio-temporal consistency, and employ pretrained 3D features for part-level perception. Extensive experiments demonstrate that our approach outperforms existing image-based metrics in modeling 3D characteristics and achieves superior alignment with human preference, providing a scalable alternative to manual evaluations. The project page is available at https://zyh482.github.io/Hi3DEval/.

* Page: https://zyh482.github.io/Hi3DEval/

X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models

Dec 02, 2024

Zeyi Sun, Ziyang Chu, Pan Zhang, Tong Wu, Xiaoyi Dong, Yuhang Zang, Yuanjun Xiong, Dahua Lin, Jiaqi Wang

Abstract: In-context generation is a key component of large language models' (LLMs) open-task generalization capability. By leveraging a few examples as context, LLMs can perform both in-domain and out-of-domain tasks. Recent advancements in auto-regressive vision-language models (VLMs) built upon LLMs have showcased impressive performance in text-to-image generation. However, the potential of in-context learning for general image generation tasks remains largely unexplored. To address this, we introduce X-Prompt, a purely auto-regressive large vision-language model designed to deliver competitive performance across a wide range of both seen and unseen image generation tasks, all within a unified in-context learning framework. X-Prompt incorporates a specialized design that efficiently compresses valuable features from in-context examples, supporting longer in-context token sequences and improving its ability to generalize to unseen tasks. A unified training task for both text and image prediction enables X-Prompt to handle general image generation with enhanced task awareness from in-context examples. Extensive experiments validate the model's performance across diverse seen image generation tasks and its capacity to generalize to previously unseen tasks.

* Code: https://github.com/SunzeY/X-Prompt
