Abstract: Visualization, a domain-specific yet widely used form of imagery, is an effective way to turn complex datasets into intuitive insights, and its value depends on whether data are faithfully represented, clearly communicated, and aesthetically designed. However, evaluating visualization quality is challenging: unlike natural images, it requires simultaneous judgment across data encoding accuracy, information expressiveness, and visual aesthetics. Although multimodal large language models (MLLMs) have shown promising performance in aesthetic assessment of natural images, no systematic benchmark exists for measuring their capabilities in evaluating visualizations. To address this, we propose VisJudge-Bench, the first comprehensive benchmark for evaluating MLLMs' performance in assessing visualization aesthetics and quality. It contains 3,090 expert-annotated samples from real-world scenarios, covering single visualizations, multiple visualizations, and dashboards across 32 chart types. Systematic testing on this benchmark reveals that even the most advanced MLLMs (such as GPT-5) still exhibit a significant gap in judgment compared to human experts, with a Mean Absolute Error (MAE) of 0.551 and a correlation with human ratings of only 0.429. To address this issue, we propose VisJudge, a model specifically designed for visualization aesthetics and quality assessment. Experimental results demonstrate that VisJudge significantly narrows the gap with human judgment, reducing the MAE to 0.442 (a 19.8% reduction) and increasing the consistency with human experts to 0.681 (a 58.7% improvement) compared to GPT-5. The benchmark is available at https://github.com/HKUSTDial/VisJudgeBench.
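A minimal sketch of how the agreement statistics reported above (MAE and a correlation with human ratings) can be computed from model and expert scores. The 1-5 rating scale, the example scores, and the use of Spearman's rank correlation are illustrative assumptions, not details taken from the benchmark.

```python
# Illustrative only: hypothetical expert and model quality scores on an assumed 1-5 scale.
import numpy as np
from scipy.stats import spearmanr

def mean_absolute_error(pred, gold):
    """MAE between model-assigned scores and expert ratings."""
    pred, gold = np.asarray(pred, dtype=float), np.asarray(gold, dtype=float)
    return float(np.mean(np.abs(pred - gold)))

expert_scores = [4.5, 3.0, 2.5, 4.0, 1.5]   # hypothetical expert annotations
model_scores  = [4.0, 3.5, 3.0, 3.5, 2.5]   # hypothetical MLLM judgments

print(f"MAE: {mean_absolute_error(model_scores, expert_scores):.3f}")
rho, _ = spearmanr(model_scores, expert_scores)
print(f"Rank correlation with experts: {rho:.3f}")
```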
Abstract: Chart annotations enhance visualization accessibility but suffer from fragmented, non-standardized representations that limit cross-platform reuse. We propose ChartMark, a structured grammar that separates annotation semantics from visualization implementations. ChartMark features a hierarchical framework that maps onto annotation dimensions (e.g., task, chart context), supporting both abstract intents and precise visual details. Our toolkit demonstrates converting ChartMark specifications into Vega-Lite visualizations, highlighting its flexibility, expressiveness, and practical applicability.
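A minimal sketch of the separation the abstract describes: a declarative annotation intent lowered into a concrete Vega-Lite layer. The spec shape, the field names (task, axis, value), and the annotation_to_vegalite_layer helper are hypothetical; the real ChartMark grammar is not reproduced here.

```python
# Hedged sketch: translate a simple, hypothetical annotation spec into a Vega-Lite rule layer.

def annotation_to_vegalite_layer(annotation: dict) -> dict:
    """Lower a 'reference-line' annotation intent into a Vega-Lite layer."""
    if annotation["task"] == "reference-line":
        return {
            "mark": {"type": "rule", "strokeDash": [4, 4], "color": "red"},
            "encoding": {
                annotation["axis"]: {"datum": annotation["value"]}
            },
        }
    raise NotImplementedError(f"Unsupported annotation task: {annotation['task']}")

# Hypothetical ChartMark-style intent: highlight a threshold on the y axis.
spec = {"task": "reference-line", "axis": "y", "value": 100}

base_chart = {
    "data": {"url": "data/sales.json"},
    "mark": "line",
    "encoding": {
        "x": {"field": "month", "type": "temporal"},
        "y": {"field": "revenue", "type": "quantitative"},
    },
}

# Compose the base chart and the annotation as Vega-Lite layers.
layered = {"layer": [base_chart, annotation_to_vegalite_layer(spec)]}
```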
Abstract: The growing importance of data visualization in business intelligence and data science emphasizes the need for tools that can efficiently generate meaningful visualizations from large datasets. Existing tools fall into two main categories: human-powered tools (e.g., Tableau and PowerBI), which require intensive expert involvement, and AI-powered automated tools (e.g., Draco and Table2Charts), which often fall short of anticipating specific user needs. In this paper, we aim to achieve the best of both worlds. Our key idea is to initially auto-generate a set of high-quality visualizations to minimize manual effort, then refine this process iteratively with user feedback to more closely align with their needs. To this end, we present HAIChart, a reinforcement learning-based framework designed to iteratively recommend good visualizations for a given dataset by incorporating user feedback. Specifically, we propose a Monte Carlo Graph Search-based visualization generation algorithm paired with a composite reward function to efficiently explore the visualization space and automatically generate good visualizations. We devise a visualization hints mechanism to actively incorporate user feedback, thus progressively refining the visualization generation module. We further prove that the top-k visualization hints selection problem is NP-hard and design an efficient algorithm. We conduct both quantitative evaluations and user studies, showing that HAIChart significantly outperforms state-of-the-art human-powered tools (21% higher recall and 1.8 times faster) and AI-powered automatic tools (25.1% and 14.9% better in terms of Hit@3 and R10@30, respectively).
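The abstract states that top-k visualization-hint selection is NP-hard and that an efficient algorithm is designed, without detailing it. Below is a generic greedy sketch for a coverage-style top-k selection, a common way such problems are approximated; the candidate hints, the covered_by utility, and the greedy strategy are illustrative assumptions, not HAIChart's actual algorithm.

```python
# Hedged sketch: greedy top-k hint selection under an assumed coverage-style utility.

def greedy_top_k_hints(candidates, covered_by, k):
    """Pick k hints greedily, each time taking the hint that covers the most
    not-yet-covered candidate visualizations."""
    selected, covered = [], set()
    for _ in range(k):
        best = max(
            (c for c in candidates if c not in selected),
            key=lambda c: len(covered_by[c] - covered),
            default=None,
        )
        if best is None or not (covered_by[best] - covered):
            break  # no remaining hint adds new coverage
        selected.append(best)
        covered |= covered_by[best]
    return selected

# Hypothetical hints and the candidate visualizations each one would refine.
covered_by = {
    "aggregate=mean": {1, 2, 3},
    "x=month":        {2, 4},
    "chart=bar":      {3, 4, 5},
}
print(greedy_top_k_hints(list(covered_by), covered_by, k=2))
# -> ['aggregate=mean', 'chart=bar']
```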




Abstract: Looming detection plays an important role in insect collision-prevention systems. As a vital capability for evolutionary survival, it has been extensively studied in neuroscience and is attracting increasing research interest in robotics due to its close relationship with collision detection and navigation. Visual cues such as angular size, angular velocity, and expansion have been widely studied for looming detection via optic flow or elementary neural computing. However, a critical visual motion cue, radial-opponent-motion (ROM), has long been neglected because it is easily confused with expansion. Recent research on LPLC2, a ROM-sensitive neuron discovered in Drosophila, has revealed its ultra-selectivity: it responds only to stimuli with focal, outward movement. This ROM sensitivity is well matched to collision detection, because focal outward motion is strongly associated with dangerous objects looming toward the center of the observer's view. In this paper, we therefore investigate extending a well-studied, image velocity-based looming detector, the lobula giant movement detector (LGMD), with ROM sensitivity in order to enhance both robustness and accuracy. To achieve this, we propose a mathematical definition of ROM and its main property, radial motion opponency (RMO). We then propose a synaptic neuropile that analogizes the synaptic processing of LPLC2 in the form of lateral inhibition and attention. The proposed model is thus the first to exhibit both image-velocity selectivity and ROM sensitivity. Systematic experiments are conducted to demonstrate the potential of the proposed bio-inspired looming detector.
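A minimal NumPy sketch of the radial-motion-opponency idea described above: project an optic-flow field onto radial directions around a focus and let inward motion inhibit outward motion, so that focal expansion yields a strong response while pure translation does not. The flow fields, the inhibition weight, and the rom_response helper are illustrative assumptions, not the paper's LGMD/LPLC2 model.

```python
# Hedged sketch of radial motion opponency (RMO) on a dense optic-flow field.
import numpy as np

def rom_response(u, v, focus, inhibition=1.0):
    """Outward-minus-inward radial flow around `focus` (x, y); positive = looming-like."""
    h, w = u.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dx, dy = xs - focus[0], ys - focus[1]
    norm = np.hypot(dx, dy) + 1e-6
    radial = (u * dx + v * dy) / norm          # signed radial component of the flow
    outward = np.clip(radial, 0, None).sum()   # excitatory: expansion away from the focus
    inward = np.clip(-radial, 0, None).sum()   # inhibitory: contraction / translation
    return outward - inhibition * inward

h = w = 32
ys, xs = np.mgrid[0:h, 0:w]

# Hypothetical expanding flow centered at (16, 16): strong, positive ROM response.
u_exp, v_exp = (xs - 16) * 0.1, (ys - 16) * 0.1
print(rom_response(u_exp, v_exp, focus=(16, 16)))

# Pure translation (no expansion) yields a much weaker, near-zero response.
u_tr, v_tr = np.full((h, w), 1.0), np.zeros((h, w))
print(rom_response(u_tr, v_tr, focus=(16, 16)))
```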