Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Menglu Wang

PhotoAgent: Agentic Photo Editing with Exploratory Visual Aesthetic Planning

Feb 26, 2026

Mingde Yao, Zhiyuan You, Tam-King Man, Menglu Wang, Tianfan Xue

Abstract:With the recent fast development of generative models, instruction-based image editing has shown great potential in generating high-quality images. However, the quality of editing highly depends on carefully designed instructions, placing the burden of task decomposition and sequencing entirely on the user. To achieve autonomous image editing, we present PhotoAgent, a system that advances image editing through explicit aesthetic planning. Specifically, PhotoAgent formulates autonomous image editing as a long-horizon decision-making problem. It reasons over user aesthetic intent, plans multi-step editing actions via tree search, and iteratively refines results through closed-loop execution with memory and visual feedback, without requiring step-by-step user prompts. To support reliable evaluation in real-world scenarios, we introduce UGC-Edit, an aesthetic evaluation benchmark consisting of 7,000 photos and a learned aesthetic reward model. We also construct a test set containing 1,017 photos to systematically assess autonomous photo editing performance. Extensive experiments demonstrate that PhotoAgent consistently improves both instruction adherence and visual quality compared with baseline methods. The project page is https://github.com/mdyao/PhotoAgent.

* A fully automated, intelligent photo-editing agent that autonomously plans multi-step aesthetic enhancements, smartly chooses diverse editing tools, and enables everyday users to achieve professional-looking results without crafting complex prompts. Project page: https://github.com/mdyao/PhotoAgent

Via

Access Paper or Ask Questions

FilDeep: Learning Large Deformations of Elastic-Plastic Solids with Multi-Fidelity Data

Jan 15, 2026

Jianheng Tang, Shilong Tao, Zhe Feng, Haonan Sun, Menglu Wang, Zhanxing Zhu, Yunhuai Liu

Abstract:The scientific computation of large deformations in elastic-plastic solids is crucial in various manufacturing applications. Traditional numerical methods exhibit several inherent limitations, prompting Deep Learning (DL) as a promising alternative. The effectiveness of current DL techniques typically depends on the availability of high-quantity and high-accuracy datasets, which are yet difficult to obtain in large deformation problems. During the dataset construction process, a dilemma stands between data quantity and data accuracy, leading to suboptimal performance in the DL models. To address this challenge, we focus on a representative application of large deformations, the stretch bending problem, and propose FilDeep, a Fidelity-based Deep Learning framework for large Deformation of elastic-plastic solids. Our FilDeep aims to resolve the quantity-accuracy dilemma by simultaneously training with both low-fidelity and high-fidelity data, where the former provides greater quantity but lower accuracy, while the latter offers higher accuracy but in less quantity. In FilDeep, we provide meticulous designs for the practical large deformation problem. Particularly, we propose attention-enabled cross-fidelity modules to effectively capture long-range physical interactions across MF data. To the best of our knowledge, our FilDeep presents the first DL framework for large deformation problems using MF data. Extensive experiments demonstrate that our FilDeep consistently achieves state-of-the-art performance and can be efficiently deployed in manufacturing.

* Accepted in Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1 (KDD '26)

Via

Access Paper or Ask Questions

Crop Lodging Prediction from UAV-Acquired Images of Wheat and Canola using a DCNN Augmented with Handcrafted Texture Features

Jun 18, 2019

Sara Mardanisamani, Farhad Maleki, Sara Hosseinzadeh Kassani, Sajith Rajapaksa, Hema Duddu, Menglu Wang, Steve Shirtliffe, Seungbum Ryu, Anique Josuttes, Ti Zhang(+5 more)

Figure 1 for Crop Lodging Prediction from UAV-Acquired Images of Wheat and Canola using a DCNN Augmented with Handcrafted Texture Features

Figure 2 for Crop Lodging Prediction from UAV-Acquired Images of Wheat and Canola using a DCNN Augmented with Handcrafted Texture Features

Figure 3 for Crop Lodging Prediction from UAV-Acquired Images of Wheat and Canola using a DCNN Augmented with Handcrafted Texture Features

Figure 4 for Crop Lodging Prediction from UAV-Acquired Images of Wheat and Canola using a DCNN Augmented with Handcrafted Texture Features

Abstract:Lodging, the permanent bending over of food crops, leads to poor plant growth and development. Consequently, lodging results in reduced crop quality, lowers crop yield, and makes harvesting difficult. Plant breeders routinely evaluate several thousand breeding lines, and therefore, automatic lodging detection and prediction is of great value aid in selection. In this paper, we propose a deep convolutional neural network (DCNN) architecture for lodging classification using five spectral channel orthomosaic images from canola and wheat breeding trials. Also, using transfer learning, we trained 10 lodging detection models using well-established deep convolutional neural network architectures. Our proposed model outperforms the state-of-the-art lodging detection methods in the literature that use only handcrafted features. In comparison to 10 DCNN lodging detection models, our proposed model achieves comparable results while having a substantially lower number of parameters. This makes the proposed model suitable for applications such as real-time classification using inexpensive hardware for high-throughput phenotyping pipelines. The GitHub repository at https://github.com/FarhadMaleki/LodgedNet contains code and models.

Via

Access Paper or Ask Questions