Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiaohang Yu

PRIMA: Boosting Animal Mesh Recovery with Biological Priors and Test-Time Adaptation

Jun 01, 2026

Xiaohang Yu, Ti Wang, Mackenzie Weygandt Mathis

Abstract:We present PRIMA (*PRI*ors for *M*esh *A*daptation), a framework for robust 3D quadruped mesh recovery under severe species and pose imbalance. Existing animal reconstruction methods often regress toward mean shapes and poses due to limited 3D supervision and long-tailed species distributions, resulting in poor generalization to underrepresented animals and rare articulations. PRIMA addresses this challenge through three key contributions. First, we incorporate BioCLIP embeddings as biological priors to inject semantic and morphological knowledge into the reconstruction process, enabling more accurate and generalizable shape prediction across diverse quadrupeds. Second, we introduce a test-time adaptation (TTA) strategy that refines SMAL predictions using 2D reprojection constraints together with auxiliary keypoint guidance, improving pose and shape estimation while enabling the generation of high-quality pseudo-3D annotations from existing 2D datasets. Third, leveraging this TTA framework, we construct Quadruped3D, a large-scale pseudo-3D dataset that covers diverse species and pose variations to systematically improve model performance. Extensive experiments on Animal3D, CtrlAni3D, Quadruped2D, and Animal Kingdom demonstrate that PRIMA achieves state-of-the-art results, with particularly strong improvements on underrepresented species and challenging poses. Our results highlight the importance of biological priors and adaptation-driven data expansion for scalable and generalizable animal mesh recovery. Code is available at https://github.com/AdaptiveMotorControlLab/PRIMA.

Via

Access Paper or Ask Questions

SUDP: Secret-Use Delegation Protocol for Agentic Systems

Apr 27, 2026

Xiaohang Yu, Hejia Geng, William Knottenbelt

Abstract:Agentic systems increasingly act with user secrets for APIs, messaging platforms, and cloud services. Today's bearer-secret interfaces implement authorization by exposure: enabling action often means placing a reusable secret, or a reusable artifact derived from it, within a model-steerable boundary, so a transient prompt-injection or tool-side compromise becomes durable account compromise. Existing defenses cover adjacent pieces such as secret storage, scoped delegation, sender-constrained tokens, and runtime monitoring, but leave the combined agentic obligation without a common specification: an untrusted autonomous requester should be able to cause a user-authorized secret-backed operation without exposing reusable authority to the requester. We formalize this problem as Agent Secret Use (ASU). From ASU we derive a security-property taxonomy that separates the problem's structural obligations from the realization-level robustness conditions any concrete construction must establish, enabling principled comparison of existing agentic-secret defenses against a problem-grounded specification. We propose the Secret-Use Delegation Protocol (SUDP), a three-role protocol realizing ASU: a requester proposes a canonical operation; the user authorizes it with a fresh authenticator-backed grant; and a custodian redeems the grant once to perform the bounded use, so reusable authority never crosses the requester boundary. We specialize SUDP for agentic deployments: agents propose operations; they do not retrieve secrets. Under explicit assumptions, we show that SUDP satisfies the ASU requirements: authorization is verifiable, operation-bound, and single-use. SUDP also provides storage confidentiality and wrapping-epoch key isolation under stated sealing and erasure assumptions; plaintext-level forward secrecy of the underlying secret additionally requires the environment to rotate and revoke it.

Via

Access Paper or Ask Questions

LOCARD: An Agentic Framework for Blockchain Forensics

Apr 05, 2026

Xiaohang Yu, William Knottenbelt

Abstract:Blockchain forensics inherently involves dynamic and iterative investigations, while many existing approaches primarily model it through static inference pipelines. We propose a paradigm shift towards Agentic Blockchain Forensics (ABF), modeling forensic investigation as a sequential decision-making process. To instantiate this paradigm, we introduce LOCARD, the first agentic framework for blockchain forensics. LOCARD operationalizes this perspective through a Tri-Core Cognitive Architecture that decouples strategic planning, operational execution, and evaluative validation. Unlike generic LLM-based agents, it incorporates a Structured Belief State mechanism to enforce forensic rigor and guide exploration under explicit state constraints. To demonstrate the efficacy of the ABF paradigm, we apply LOCARD to the inherently complex domain of cross-chain transaction tracing. We introduce Thor25, a benchmark dataset comprising over 151k real-world cross-chain forensic records, and evaluate LOCARD on the Group-Transfer Tracing task for dismantling Sybil clusters. Validated against representative laundering sub-flows from the Bybit hack, LOCARD achieves high-fidelity tracing results, providing empirical evidence that modeling blockchain forensics as an autonomous agentic task is both viable and effective. These results establish a concrete foundation for future agentic approaches to large-scale blockchain forensic analysis. Code and dataset are publicly available at https://github.com/xhyumiracle/locard and https://github.com/xhyumiracle/thorchain-crosschain-data.

Via

Access Paper or Ask Questions

FMPose3D: monocular 3D pose estimation via flow matching

Feb 05, 2026

Ti Wang, Xiaohang Yu, Mackenzie Weygandt Mathis

Abstract:Monocular 3D pose estimation is fundamentally ill-posed due to depth ambiguity and occlusions, thereby motivating probabilistic methods that generate multiple plausible 3D pose hypotheses. In particular, diffusion-based models have recently demonstrated strong performance, but their iterative denoising process typically requires many timesteps for each prediction, making inference computationally expensive. In contrast, we leverage Flow Matching (FM) to learn a velocity field defined by an Ordinary Differential Equation (ODE), enabling efficient generation of 3D pose samples with only a few integration steps. We propose a novel generative pose estimation framework, FMPose3D, that formulates 3D pose estimation as a conditional distribution transport problem. It continuously transports samples from a standard Gaussian prior to the distribution of plausible 3D poses conditioned only on 2D inputs. Although ODE trajectories are deterministic, FMPose3D naturally generates various pose hypotheses by sampling different noise seeds. To obtain a single accurate prediction from those hypotheses, we further introduce a Reprojection-based Posterior Expectation Aggregation (RPEA) module, which approximates the Bayesian posterior expectation over 3D hypotheses. FMPose3D surpasses existing methods on the widely used human pose estimation benchmarks Human3.6M and MPI-INF-3DHP, and further achieves state-of-the-art performance on the 3D animal pose datasets Animal3D and CtrlAni3D, demonstrating strong performance across both 3D pose domains. The code is available at https://github.com/AdaptiveMotorControlLab/FMPose3D.

Via

Access Paper or Ask Questions

Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience

Mar 15, 2024

Xiaohang Yu, Zhengxian Yang, Shi Pan, Yuqi Han, Haoxiang Wang, Jun Zhang, Shi Yan, Borong Lin, Lei Yang, Tao Yu(+1 more)

Figure 1 for Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience

Figure 2 for Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience

Figure 3 for Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience

Figure 4 for Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience

Abstract:We have built a custom mobile multi-camera large-space dense light field capture system, which provides a series of high-quality and sufficiently dense light field images for various scenarios. Our aim is to contribute to the development of popular 3D scene reconstruction algorithms such as IBRnet, NeRF, and 3D Gaussian splitting. More importantly, the collected dataset, which is much denser than existing datasets, may also inspire space-oriented light field reconstruction, which is potentially different from object-centric 3D reconstruction, for immersive VR/AR experiences. We utilized a total of 40 GoPro 10 cameras, capturing images of 5k resolution. The number of photos captured for each scene is no less than 1000, and the average density (view number within a unit sphere) is 134.68. It is also worth noting that our system is capable of efficiently capturing large outdoor scenes. Addressing the current lack of large-space and dense light field datasets, we made efforts to include elements such as sky, reflections, lights and shadows that are of interest to researchers in the field of 3D reconstruction during the data capture process. Finally, we validated the effectiveness of our provided dataset on three popular algorithms and also integrated the reconstructed 3DGS results into the Unity engine, demonstrating the potential of utilizing our datasets to enhance the realism of virtual reality (VR) and create feasible interactive spaces. The dataset is available at our project website.

Via

Access Paper or Ask Questions

opp/ai: Optimistic Privacy-Preserving AI on Blockchain

Feb 22, 2024

Cathie So, KD Conway, Xiaohang Yu, Suning Yao, Kartin Wong

Figure 1 for opp/ai: Optimistic Privacy-Preserving AI on Blockchain

Figure 2 for opp/ai: Optimistic Privacy-Preserving AI on Blockchain

Figure 3 for opp/ai: Optimistic Privacy-Preserving AI on Blockchain

Figure 4 for opp/ai: Optimistic Privacy-Preserving AI on Blockchain

Abstract:The convergence of Artificial Intelligence (AI) and blockchain technology is reshaping the digital world, offering decentralized, secure, and efficient AI services on blockchain platforms. Despite the promise, the high computational demands of AI on blockchain raise significant privacy and efficiency concerns. The Optimistic Privacy-Preserving AI (opp/ai) framework is introduced as a pioneering solution to these issues, striking a balance between privacy protection and computational efficiency. The framework integrates Zero-Knowledge Machine Learning (zkML) for privacy with Optimistic Machine Learning (opML) for efficiency, creating a hybrid model tailored for blockchain AI services. This study presents the opp/ai framework, delves into the privacy features of zkML, and assesses the framework's performance and adaptability across different scenarios.

Via

Access Paper or Ask Questions

ImmersiveNeRF: Hybrid Radiance Fields for Unbounded Immersive Light Field Reconstruction

Sep 04, 2023

Xiaohang Yu, Haoxiang Wang, Yuqi Han, Lei Yang, Tao Yu, Qionghai Dai

Abstract:This paper proposes a hybrid radiance field representation for unbounded immersive light field reconstruction which supports high-quality rendering and aggressive view extrapolation. The key idea is to first formally separate the foreground and the background and then adaptively balance learning of them during the training process. To fulfill this goal, we represent the foreground and background as two separate radiance fields with two different spatial mapping strategies. We further propose an adaptive sampling strategy and a segmentation regularizer for more clear segmentation and robust convergence. Finally, we contribute a novel immersive light field dataset, named THUImmersive, with the potential to achieve much larger space 6DoF immersive rendering effects compared with existing datasets, by capturing multiple neighboring viewpoints for the same scene, to stimulate the research and AR/VR applications in the immersive light field domain. Extensive experiments demonstrate the strong performance of our method for unbounded immersive light field reconstruction.

Via

Access Paper or Ask Questions

Super-NeRF: View-consistent Detail Generation for NeRF super-resolution

Apr 26, 2023

Yuqi Han, Tao Yu, Xiaohang Yu, Yuwang Wang, Qionghai Dai

Figure 1 for Super-NeRF: View-consistent Detail Generation for NeRF super-resolution

Figure 2 for Super-NeRF: View-consistent Detail Generation for NeRF super-resolution

Figure 3 for Super-NeRF: View-consistent Detail Generation for NeRF super-resolution

Figure 4 for Super-NeRF: View-consistent Detail Generation for NeRF super-resolution

Abstract:The neural radiance field (NeRF) achieved remarkable success in modeling 3D scenes and synthesizing high-fidelity novel views. However, existing NeRF-based methods focus more on the make full use of the image resolution to generate novel views, but less considering the generation of details under the limited input resolution. In analogy to the extensive usage of image super-resolution, NeRF super-resolution is an effective way to generate the high-resolution implicit representation of 3D scenes and holds great potential applications. Up to now, such an important topic is still under-explored. In this paper, we propose a NeRF super-resolution method, named Super-NeRF, to generate high-resolution NeRF from only low-resolution inputs. Given multi-view low-resolution images, Super-NeRF constructs a consistency-controlling super-resolution module to generate view-consistent high-resolution details for NeRF. Specifically, an optimizable latent code is introduced for each low-resolution input image to control the 2D super-resolution images to converge to the view-consistent output. The latent codes of each low-resolution image are optimized synergistically with the target Super-NeRF representation to fully utilize the view consistency constraint inherent in NeRF construction. We verify the effectiveness of Super-NeRF on synthetic, real-world, and AI-generated NeRF datasets. Super-NeRF achieves state-of-the-art NeRF super-resolution performance on high-resolution detail generation and cross-view consistency.

Via

Access Paper or Ask Questions