Xijun Liu

LongWebBench: Evaluating Structural and Functional Webpage Generation in Long-Horizon Settings

Jun 16, 2026

Yi Zhao, Zhen Yang, Mengpan Chen, Mingde Xu, Shanghui Gong, Xijun Liu, Jibing Gong, Jie Tang

Abstract: Recent vision-language models (VLMs) have shown promising progress in generating webpages from visual inputs, yet existing evaluations mainly focus on short, single-screen, and largely static webpages. We introduce LongWebBench, a benchmark for evaluating long-horizon webpage generation from both structural and functional perspectives. LongWebBench contains 490 real-world long webpages for structural fidelity evaluation and 507 goal-oriented interaction tasks over 129 webpages for functional evaluation. It employs two complementary protocols: a multi-dimensional VLM-based metric for assessing long-range structural coherence, and a DOM-augmented agent-based pipeline for end-to-end functional verification. We further examine the automatic evaluation protocols through human agreement analysis. Experiments with state-of-the-art open-source and proprietary VLMs under single-image and multi-image settings reveal that structural fidelity degrades as webpage length increases, while visually plausible generations often fail to support executable multi-step interactions. These results highlight the need to evaluate long webpage generation beyond visual similarity, with executable interaction as a core criterion. Our code and data are available at https://github.com/zheny2751-dotcom/LongWebBench.

* 49 pages, 38 figures

An Immersive Multi-Elevation Multi-Seasonal Dataset for 3D Reconstruction and Visualization

Dec 19, 2024

Xijun Liu, Yifan Zhou, Yuxiang Guo, Rama Chellappa, Cheng Peng

Abstract: Significant progress has been made in photo-realistic scene reconstruction over recent years. Various disparate efforts have enabled capabilities such as multi-appearance or large-scale modeling; however, a well-designed dataset that can evaluate the holistic progress of scene reconstruction is still lacking. We introduce a collection of imagery of the Johns Hopkins Homewood Campus, acquired in different seasons, at different times of day, at multiple elevations, and across a large scale. We perform a multi-stage calibration process, which efficiently recovers camera parameters from phone and drone cameras. This dataset can enable researchers to rigorously explore challenges in unconstrained settings, including the effects of inconsistent illumination and reconstruction at large scale and from significantly different perspectives.

* 4 pages, 3 figures

BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling

Mar 07, 2024

Cheng Peng, Yutao Tang, Yifan Zhou, Nengyu Wang, Xijun Liu, Deming Li, Rama Chellappa

Abstract: Recent efforts in using 3D Gaussians for scene reconstruction and novel view synthesis can achieve impressive results on curated benchmarks; however, images captured in real life are often blurry. In this work, we analyze the robustness of Gaussian-Splatting-based methods against various types of image blur, such as motion blur, defocus blur, and downscaling blur. Under these degradations, Gaussian-Splatting-based methods tend to overfit and produce worse results than Neural-Radiance-Field-based methods. To address this issue, we propose Blur Agnostic Gaussian Splatting (BAGS). BAGS introduces additional 2D modeling capacities such that a 3D-consistent and high-quality scene can be reconstructed despite image-wise blur. Specifically, we model blur by estimating per-pixel convolution kernels from a Blur Proposal Network (BPN). BPN is designed to consider spatial, color, and depth variations of the scene to maximize modeling capacity. Additionally, BPN proposes a quality-assessing mask, which indicates regions where blur occurs. Finally, we introduce a coarse-to-fine kernel optimization scheme; this optimization scheme is fast and avoids sub-optimal solutions due to a sparse point cloud initialization, which often occurs when we apply Structure-from-Motion on blurry images. We demonstrate that BAGS achieves photorealistic renderings under various challenging blur conditions and imaging geometry, while significantly improving upon existing approaches.
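To make the idea of per-pixel convolution kernels concrete, here is a minimal sketch of applying a different blur kernel at every pixel of a grayscale image. It is not the BAGS implementation: the function names, the nested-list image layout, and the edge-clamping rule are all assumptions made for illustration, and in the paper the kernels are estimated by the BPN rather than supplied by hand.

```python
def apply_per_pixel_kernels(image, kernels):
    """Blur a grayscale image using a separate k x k kernel at each pixel.

    image:   H x W nested list of floats.
    kernels: H x W nested list; kernels[y][x] is a k x k nested list (k odd).
    Samples outside the image reuse the nearest edge pixel (an assumption).
    """
    height, width = len(image), len(image[0])
    k = len(kernels[0][0])
    r = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            kernel = kernels[y][x]
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # Clamp sample coordinates to the image bounds.
                    sy = min(max(y + dy, 0), height - 1)
                    sx = min(max(x + dx, 0), width - 1)
                    acc += kernel[dy + r][dx + r] * image[sy][sx]
            out[y][x] = acc
    return out


def identity_kernel(k):
    """Hypothetical helper: a kernel that leaves the pixel unchanged (no blur)."""
    kernel = [[0.0] * k for _ in range(k)]
    kernel[k // 2][k // 2] = 1.0
    return kernel


def box_kernel(k):
    """Hypothetical helper: a uniform averaging kernel (a simple blur)."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


# A single bright pixel; only the center location is marked as blurry.
image = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
kernels = [[identity_kernel(3) for _ in range(3)] for _ in range(3)]
kernels[1][1] = box_kernel(3)
blurred = apply_per_pixel_kernels(image, kernels)
```

Because each pixel carries its own kernel, sharp regions can keep an identity kernel while blurry regions receive a wider one, which is the kind of spatially varying blur a single global kernel cannot express.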
