Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xujie Zhu

X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning

Aug 11, 2025

Jian Ma, Xujie Zhu, Zihao Pan, Qirong Peng, Xu Guo, Chen Chen, Haonan Lu

Abstract:Existing open-source datasets for arbitrary-instruction image editing remain suboptimal, while a plug-and-play editing module compatible with community-prevalent generative models is notably absent. In this paper, we first introduce the X2Edit Dataset, a comprehensive dataset covering 14 diverse editing tasks, including subject-driven generation. We utilize the industry-leading unified image generation models and expert models to construct the data. Meanwhile, we design reasonable editing instructions with the VLM and implement various scoring mechanisms to filter the data. As a result, we construct 3.7 million high-quality data with balanced categories. Second, to better integrate seamlessly with community image generation models, we design task-aware MoE-LoRA training based on FLUX.1, with only 8\% of the parameters of the full model. To further improve the final performance, we utilize the internal representations of the diffusion model and define positive/negative samples based on image editing types to introduce contrastive learning. Extensive experiments demonstrate that the model's editing performance is competitive among many excellent models. Additionally, the constructed dataset exhibits substantial advantages over existing open-source datasets. The open-source code, checkpoints, and datasets for X2Edit can be found at the following link: https://github.com/OPPO-Mente-Lab/X2Edit.

* https://github.com/OPPO-Mente-Lab/X2Edit

Via

Access Paper or Ask Questions

A Novel WaveInst-based Network for Tree Trunk Structure Extraction and Pattern Analysis in Forest Inventory

May 03, 2025

Chenyang Fan, Xujie Zhu, Taige Luo, Sheng Xu, Zhulin Chen, Hongxin Yang

Abstract:The pattern analysis of tree structure holds significant scientific value for genetic breeding and forestry management. The current trunk and branch extraction technologies are mainly LiDAR-based or UAV-based. The former approaches obtain high-precision 3D data, but its equipment cost is high and the three-dimensional (3D) data processing is complex. The latter approaches efficiently capture canopy information, but they miss the 3-D structure of trees. In order to deal with the branch information extraction from the complex background interference and occlusion, this work proposes a novel WaveInst instance segmentation framework, involving a discrete wavelet transform, to enhance multi-scale edge information for accurately improving tree structure extraction. Experimental results of the proposed model show superior performance on SynthTree43k, CaneTree100, Urban Street and our PoplarDataset. Moreover, we present a new Phenotypic dataset PoplarDataset, which is dedicated to extract tree structure and pattern analysis from artificial forest. The proposed method achieves a mean average precision of 49.6 and 24.3 for the structure extraction of mature and juvenile trees, respectively, surpassing the existing state-of-the-art method by 9.9. Furthermore, by in tegrating the segmentation model within the regression model, we accurately achieve significant tree grown parameters, such as the location of trees, the diameter-at-breast-height of individual trees, and the plant height, from 2D images directly. This study provides a scientific and plenty of data for tree structure analysis in related to the phenotype research, offering a platform for the significant applications in precision forestry, ecological monitoring, and intelligent breeding.

Via

Access Paper or Ask Questions