Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yiyuan Wang

QuadBox: Accelerating 3D Gaussian Splatting with Geometry-Aware Boxes

May 06, 2026

Xinze Li, Bohan Yang, Pengxu Chen, Yiyuan Wang, Hongcheng Luo, Wentao Cheng, Weifeng Su

Abstract:3D Gaussian Splatting (3DGS) has emerged as an advanced technique for real-time novel view synthesis by representing scene geometry and appearance using differentiable Gaussian primitives. However, efficiently computing precise Gaussian-tile intersections remains a critical task in the rasterization pipeline. To this end, we propose QuadBox, a method that leverages four axis-aligned bounding boxes to tightly encapsulate projected Gaussians in a discrete manner. First, we derive a geometry-aware stretching factor that enables the construction of a tile-aligned QuadBox, which covers the elliptical projection and largely excludes irrelevant tiles. Second, we introduce QPass, a single-pass tile traversal algorithm that exhaustively exploits the discrete nature of QuadBox, ensuring that the tile intersection check is performed with simple interval tests. Experiments on public datasets show that our method accelerates the rendering speed of 3DGS by 1.85$\times$. Code is available at \href{https://github.com/Powertony102/QuadBox}{https://github.com/Powertony102/QuadBox}.

* 6 pages, 4 figures. Accepted by ICIP 26

Via

Access Paper or Ask Questions

S-VGGT: Structure-Aware Subscene Decomposition for Scalable 3D Foundation Models

Mar 18, 2026

Xinze Li, Pengxu Chen, Yiyuan Wang, Weifeng Su, Wentao Cheng

Abstract:Feed-forward 3D foundation models face a key challenge: the quadratic computational cost introduced by global attention, which severely limits scalability as input length increases. Concurrent acceleration methods, such as token merging, operate at the token level. While they offer local savings, the required nearest-neighbor searches introduce undesirable overhead. Consequently, these techniques fail to tackle the fundamental issue of structural redundancy dominant in dense capture data. In this work, we introduce \textbf{S-VGGT}, a novel approach that addresses redundancy at the structural frame level, drastically shifting the optimization focus. We first leverage the initial features to build a dense scene graph, which characterizes structural scene redundancy and guides the subsequent scene partitioning. Using this graph, we softly assign frames to a small number of subscenes, guaranteeing balanced groups and smooth geometric transitions. The core innovation lies in designing the subscenes to share a common reference frame, establishing a parallel geometric bridge that enables independent and highly efficient processing without explicit geometric alignment. This structural reorganization provides strong intrinsic acceleration by cutting the global attention cost at its source. Crucially, S-VGGT is entirely orthogonal to token-level acceleration methods, allowing the two to be seamlessly combined for compounded speedups without compromising reconstruction fidelity. Code is available at https://github.com/Powertony102/S-VGGT.

* 7 pages, 5 figures. Accepted by ICME 2026

Via

Access Paper or Ask Questions

Conversational AI-Enhanced Exploration System to Query Large-Scale Digitised Collections of Natural History Museums

Mar 11, 2026

Yiyuan Wang, Andrew Johnston, Zoë Sadokierski, Rhiannon Stephens, Shane T. Ahyong

Abstract:Recent digitisation efforts in natural history museums have produced large volumes of collection data, yet their scale and scientific complexity often hinder public access and understanding. Conventional data management tools, such as databases, restrict exploration through keyword-based search or require specialised schema knowledge. This paper presents a system design that uses conversational AI to query nearly 1.7 million digitised specimen records from the life-science collections of the Australian Museum. Designed and developed through a human-centred design process, the system contains an interactive map for visual-spatial exploration and a natural-language conversational agent that retrieves detailed specimen data and answers collection-specific questions. The system leverages function-calling capabilities of contemporary large language models to dynamically retrieve structured data from external APIs, enabling fast, real-time interaction with extensive yet frequently updated datasets. Our work provides a new approach of connecting large museum collections with natural language-based queries and informs future designs of scientific AI agents for natural history museums.

* 25 pages, 9 figures

Via

Access Paper or Ask Questions

Robots in the Wild: Contextually-Adaptive Human-Robot Interactions in Urban Public Environments

Dec 10, 2024

Xinyan Yu, Yiyuan Wang, Tram Thi Minh Tran, Yi Zhao, Julie Stephany Berrio Perez, Marius Hoggenmuller, Justine Humphry, Lian Loke, Lynn Masuda, Callum Parker(+2 more)

Figure 1 for Robots in the Wild: Contextually-Adaptive Human-Robot Interactions in Urban Public Environments

Figure 2 for Robots in the Wild: Contextually-Adaptive Human-Robot Interactions in Urban Public Environments

Figure 3 for Robots in the Wild: Contextually-Adaptive Human-Robot Interactions in Urban Public Environments

Abstract:The increasing transition of human-robot interaction (HRI) context from controlled settings to dynamic, real-world public environments calls for enhanced adaptability in robotic systems. This can go beyond algorithmic navigation or traditional HRI strategies in structured settings, requiring the ability to navigate complex public urban systems containing multifaceted dynamics and various socio-technical needs. Therefore, our proposed workshop seeks to extend the boundaries of adaptive HRI research beyond predictable, semi-structured contexts and highlight opportunities for adaptable robot interactions in urban public environments. This half-day workshop aims to explore design opportunities and challenges in creating contextually-adaptive HRI within these spaces and establish a network of interested parties within the OzCHI research community. By fostering ongoing discussions, sharing of insights, and collaborations, we aim to catalyse future research that empowers robots to navigate the inherent uncertainties and complexities of real-world public interactions.

Via

Access Paper or Ask Questions

Learning with Partial Labels from Semi-supervised Perspective

Nov 30, 2022

Ximing Li, Yuanzhi Jiang, Changchun Li, Yiyuan Wang, Jihong Ouyang

Abstract:Partial Label (PL) learning refers to the task of learning from the partially labeled data, where each training instance is ambiguously equipped with a set of candidate labels but only one is valid. Advances in the recent deep PL learning literature have shown that the deep learning paradigms, e.g., self-training, contrastive learning, or class activate values, can achieve promising performance. Inspired by the impressive success of deep Semi-Supervised (SS) learning, we transform the PL learning problem into the SS learning problem, and propose a novel PL learning method, namely Partial Label learning with Semi-supervised Perspective (PLSP). Specifically, we first form the pseudo-labeled dataset by selecting a small number of reliable pseudo-labeled instances with high-confidence prediction scores and treating the remaining instances as pseudo-unlabeled ones. Then we design a SS learning objective, consisting of a supervised loss for pseudo-labeled instances and a semantic consistency regularization for pseudo-unlabeled instances. We further introduce a complementary regularization for those non-candidate labels to constrain the model predictions on them to be as small as possible. Empirical results demonstrate that PLSP significantly outperforms the existing PL baseline methods, especially on high ambiguity levels. Code available: https://github.com/changchunli/PLSP.

Via

Access Paper or Ask Questions

Generating Long Financial Report using Conditional Variational Autoencoders with Knowledge Distillation

Oct 23, 2020

Yunpeng Ren, Ziao Wang, Yiyuan Wang, Xiaofeng Zhang

Figure 1 for Generating Long Financial Report using Conditional Variational Autoencoders with Knowledge Distillation

Figure 2 for Generating Long Financial Report using Conditional Variational Autoencoders with Knowledge Distillation

Figure 3 for Generating Long Financial Report using Conditional Variational Autoencoders with Knowledge Distillation

Figure 4 for Generating Long Financial Report using Conditional Variational Autoencoders with Knowledge Distillation

Abstract:Automatically generating financial report from a piece of news is quite a challenging task. Apparently, the difficulty of this task lies in the lack of sufficient background knowledge to effectively generate long financial report. To address this issue, this paper proposes the conditional variational autoencoders (CVAE) based approach which distills external knowledge from a corpus of news-report data. Particularly, we choose Bi-GRU as the encoder and decoder component of CVAE, and learn the latent variable distribution from input news. A higher level latent variable distribution is learnt from a corpus set of news-report data, respectively extr acted for each input news, to provide background knowledge to previously learnt latent variable distribution. Then, a teacher-student network is employed to distill knowledge to refine theoutput of the decoder component. To evaluate the model performance of the proposed approach, extensive experiments are preformed on a public dataset and two widely adopted evaluation criteria, i.e., BLEU and ROUGE, are chosen in the experiment. The promising experimental results demonstrate that the proposed approach is superior to the rest compared methods.

Via

Access Paper or Ask Questions

Local Search for Minimum Weight Dominating Set with Two-Level Configuration Checking and Frequency Based Scoring Function

Feb 15, 2017

Yiyuan Wang, Shaowei Cai, Minghao Yin

Figure 1 for Local Search for Minimum Weight Dominating Set with Two-Level Configuration Checking and Frequency Based Scoring Function

Figure 2 for Local Search for Minimum Weight Dominating Set with Two-Level Configuration Checking and Frequency Based Scoring Function

Figure 3 for Local Search for Minimum Weight Dominating Set with Two-Level Configuration Checking and Frequency Based Scoring Function

Figure 4 for Local Search for Minimum Weight Dominating Set with Two-Level Configuration Checking and Frequency Based Scoring Function

Abstract:The Minimum Weight Dominating Set (MWDS) problem is an important generalization of the Minimum Dominating Set (MDS) problem with extensive applications. This paper proposes a new local search algorithm for the MWDS problem, which is based on two new ideas. The first idea is a heuristic called two-level configuration checking (CC2), which is a new variant of a recent powerful configuration checking strategy (CC) for effectively avoiding the recent search paths. The second idea is a novel scoring function based on the frequency of being uncovered of vertices. Our algorithm is called CC2FS, according to the names of the two ideas. The experimental results show that, CC2FS performs much better than some state-of-the-art algorithms in terms of solution quality on a broad range of MWDS benchmarks.

* JAIR 58 (2017) 267-295
* 29 pages, 1 figure

Via

Access Paper or Ask Questions