Yao Wang

Low Latency Point Cloud Rendering with Learned Splatting

Sep 24, 2024

Yueyu Hu, Ran Gong, Qi Sun, Yao Wang

Abstract:Point clouds are a critical 3D representation with many emerging applications. Because of point sparsity and irregularity, high-quality rendering of point clouds is challenging and often requires complex computations to recover the continuous surface representation. On the other hand, to avoid visual discomfort, the motion-to-photon latency has to be very short, under 10 ms. Existing rendering solutions fall short in either quality or speed. To tackle these challenges, we present a framework that unlocks interactive, free-viewing and high-fidelity point cloud rendering. We train a generic neural network to estimate 3D elliptical Gaussians from arbitrary point clouds and use differentiable surface splatting to render smooth texture and surface normals for arbitrary views. Our approach does not require per-scene optimization, and enables real-time rendering of dynamic point clouds. Experimental results demonstrate that the proposed solution enjoys superior visual quality and speed, as well as generalizability to different scene content and robustness to compression artifacts. The code is available at https://github.com/huzi96/gaussian-pcloud-render.

* Published at CVPR 2024 Workshop on AIS: Vision, Graphics and AI for Streaming (https://ai4streaming-workshop.github.io/)

GroupCDL: Interpretable Denoising and Compressed Sensing MRI via Learned Group-Sparsity and Circulant Attention

Jul 19, 2024

Nikola Janjusevic, Amirhossein Khalilian-Gourtani, Adeen Flinker, Li Feng, Yao Wang

Abstract:Nonlocal self-similarity within images has become an increasingly popular prior in deep-learning models. Despite their successful image restoration performance, such models remain largely uninterpretable due to their black-box construction. Our previous studies have shown that interpretable construction of a fully convolutional denoiser (CDLNet), with performance on par with state-of-the-art black-box counterparts, is achievable by unrolling a convolutional dictionary learning algorithm. In this manuscript, we seek an interpretable construction of a convolutional network with a nonlocal self-similarity prior that performs on par with black-box nonlocal models. We show that such an architecture can be effectively achieved by upgrading the L1 sparsity prior (soft-thresholding) of CDLNet to an image-adaptive group-sparsity prior (group-thresholding). The proposed learned group-thresholding makes use of nonlocal attention to perform spatially varying soft-thresholding on the latent representation. To enable effective training and inference on large images with global artifacts, we propose a novel circulant-sparse attention. We achieve competitive natural-image denoising performance compared to black-box nonlocal DNNs and transformers. The interpretable construction of our network allows for a straightforward extension to Compressed Sensing MRI (CS-MRI), yielding state-of-the-art performance. Lastly, we show robustness to noise-level mismatches between training and inference for denoising and CS-MRI reconstruction.

* 13 pages, 8 figures. arXiv admin note: substantial text overlap with arXiv:2306.01950
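
The step from soft-thresholding to group-thresholding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `weights` matrix is a hypothetical row-normalized similarity matrix standing in for the learned nonlocal attention, and the exact functional form of the shrinkage is an assumption chosen so that identity weights recover ordinary soft-thresholding.

```python
import math


def soft_threshold(z, tau):
    """Elementwise soft-thresholding: shrink each value toward zero by tau."""
    return [math.copysign(max(abs(v) - tau, 0.0), v) for v in z]


def group_threshold(z, tau, weights):
    """Shrink z[i] by a factor that depends on a similarity-weighted group energy.

    weights[i][j] is a hypothetical row-normalized similarity between
    positions i and j (a stand-in for a learned attention matrix).
    """
    out = []
    for i, v in enumerate(z):
        # Group energy: weighted root-mean-square of the coefficients tied to i.
        energy = math.sqrt(sum(w * u * u for w, u in zip(weights[i], z)))
        scale = max(1.0 - tau / energy, 0.0) if energy > 0 else 0.0
        out.append(v * scale)
    return out


if __name__ == "__main__":
    z = [0.5, 3.0]
    uniform = [[0.5, 0.5], [0.5, 0.5]]
    print(soft_threshold(z, 1.0))            # the small coefficient is zeroed
    print(group_threshold(z, 1.0, uniform))  # the small coefficient survives
```

In this sketch a weak coefficient that is grouped with a strong, similar one is attenuated rather than removed, which is the spatially varying behavior the abstract attributes to learned group-thresholding.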

Standard compliant video coding using low complexity, switchable neural wrappers

Jul 10, 2024

Yueyu Hu, Chenhao Zhang, Onur G. Guleryuz, Debargha Mukherjee, Yao Wang

Abstract:The proliferation of high-resolution videos puts great storage and bandwidth pressure on cloud video services, driving the development of next-generation video codecs. Despite great progress made in neural video coding, existing approaches are still far from economical deployment considering the tradeoff between complexity and rate-distortion performance. To clear the roadblocks for neural video coding, in this paper we propose a new framework featuring standard compatibility, high performance, and low decoding complexity. We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions. The rate-distortion optimal downsampling ratio is signaled to the decoder at the per-sequence level for each target rate. We design a low-complexity neural post-processor architecture that can handle different upsampling ratios. The change of resolution exploits the spatial redundancy in high-resolution videos, while the neural wrapper further achieves rate-distortion performance improvement through end-to-end optimization with a codec proxy. Our lightweight post-processor architecture has a complexity of 516 MACs / pixel, and achieves 9.3% BD-Rate reduction over VVC on the UVG dataset, and 6.4% on AOM CTC Class A1. Our approach has the potential to further advance the performance of the latest video coding standards using neural processing with minimal added complexity.

* Accepted by IEEE ICIP 2024
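
The per-sequence choice of a rate-distortion optimal downsampling ratio mentioned in the abstract can be sketched as follows. This is only an illustration under assumptions: the Lagrangian cost D + λR is a common way to express such a tradeoff and is not confirmed as the paper's criterion, the function name is hypothetical, and the numbers in `toy` are made up purely to exercise the code.

```python
def pick_downsampling_ratio(candidates, lam):
    """Return the ratio whose (rate, distortion) pair minimizes D + lam * R.

    candidates maps a downsampling ratio to a (rate, distortion) tuple.
    """
    return min(candidates, key=lambda r: candidates[r][1] + lam * candidates[r][0])


if __name__ == "__main__":
    # Toy, made-up measurements: ratio -> (rate in kbps, distortion as MSE).
    toy = {1.0: (4000.0, 10.0), 1.5: (2500.0, 14.0), 2.0: (1500.0, 22.0)}
    for lam in (0.001, 0.004, 0.01):
        print(lam, pick_downsampling_ratio(toy, lam))
```

A larger λ penalizes rate more heavily, so in this toy setup it pushes the choice toward stronger downsampling; the selected ratio is the single value that would be signaled for the sequence.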

Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates

Jul 05, 2024

Ryotaro Okabe, Mouyang Cheng, Abhijatmedhi Chotrattanapituk, Nguyen Tuan Hung, Xiang Fu, Bowen Han, Yao Wang, Weiwei Xie, Robert J. Cava, Tommi S. Jaakkola(+2 more)

Abstract:Billions of organic molecules are known, but only a tiny fraction of the functional inorganic materials have been discovered, a particularly relevant problem to the community searching for new quantum materials. Recent advancements in machine-learning-based generative models, particularly diffusion models, show great promise for generating new, stable materials. However, integrating geometric patterns into materials generation remains a challenge. Here, we introduce Structural Constraint Integration in the GENerative model (SCIGEN). Our approach can modify any trained generative diffusion model by strategic masking of the denoised structure with a diffused constrained structure prior to each diffusion step to steer the generation toward constrained outputs. Furthermore, we mathematically prove that SCIGEN effectively performs conditional sampling from the original distribution, which is crucial for generating stable constrained materials. We generate eight million compounds using Archimedean lattices as prototype constraints, with over 10% surviving a multi-stage stability pre-screening. High-throughput density functional theory (DFT) calculations on 26,000 surviving compounds show that over 50% passed structural optimization at the DFT level. Since the properties of quantum materials are closely related to geometric patterns, our results indicate that SCIGEN provides a general framework for generating quantum materials candidates.

* 512 pages total, 4 main figures + 218 supplementary figures
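
The masking idea in the abstract, overwriting part of the current sample with a noised copy of the constraint before each denoising step, can be sketched in a few lines. This is a toy illustration, not SCIGEN itself: the flat list representation, the Gaussian `diffuse` stand-in, the `noise_scales` schedule, and every function name here are assumptions made for the example.

```python
import random


def masked_merge(denoised, constrained, mask):
    """Keep constrained values where mask is truthy, denoised values elsewhere."""
    return [c if m else d for d, c, m in zip(denoised, constrained, mask)]


def diffuse(values, noise_scale, rng):
    """Add Gaussian noise, a stand-in for forward diffusion of the constraint."""
    return [v + noise_scale * rng.gauss(0.0, 1.0) for v in values]


def constrained_sampling(denoise_step, x, constraint, mask, noise_scales, seed=0):
    """Run a toy reverse process, re-imposing the noised constraint at each step."""
    rng = random.Random(seed)
    for scale in noise_scales:  # ordered from high noise to low noise
        x = masked_merge(x, diffuse(constraint, scale, rng), mask)
        x = denoise_step(x, scale)  # hypothetical pretrained denoiser
    return x


if __name__ == "__main__":
    identity_denoiser = lambda x, scale: x  # placeholder for a trained model
    result = constrained_sampling(
        identity_denoiser,
        x=[0.3, 0.7, 0.1, 0.9],
        constraint=[0.0, 0.5, 0.0, 0.0],
        mask=[1, 1, 0, 0],
        noise_scales=[1.0, 0.5, 0.0],
    )
    print(result)
```

Because the constraint is noised to the same level as the sample at each step, the masked entries stay statistically consistent with what the denoiser expects, while the unmasked entries remain free to be generated.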

Bits-to-Photon: End-to-End Learned Scalable Point Cloud Compression for Direct Rendering

Jun 09, 2024

Yueyu Hu, Ran Gong, Yao Wang

Abstract:Point clouds are a promising 3D representation for volumetric streaming in emerging AR/VR applications. Despite recent advances in point cloud compression, decoding and rendering high-quality images from lossy compressed point clouds is still challenging in terms of quality and complexity, making it a major roadblock to achieving real-time 6-Degree-of-Freedom video streaming. In this paper, we address this problem by developing a point cloud compression scheme that generates a bit stream that can be directly decoded to renderable 3D Gaussians. The encoder and decoder are jointly optimized to consider both bit-rates and rendering quality. The scheme significantly improves the rendering quality while substantially reducing decoding and rendering time, compared to existing point cloud compression methods. Furthermore, the proposed scheme generates a scalable bit stream, allowing multiple levels of detail at different bit-rate ranges. Our method supports real-time color decoding and rendering of high-quality point clouds, thus paving the way for interactive 3D streaming applications with free viewpoints.

PlantTracing: Tracing Arabidopsis Thaliana Apex with CenterTrack

May 18, 2024

Yuanzhe Liu, Yixiang Mao, Yao Wang

Abstract:This work applies an encoder-decoder-based machine learning network to detect and track the motion and growth of the flowering stem apex of Arabidopsis Thaliana. Building on CenterTrack, a machine learning back-end network, we trained a model on ten labeled time-lapse videos and tested it on three videos.

* 4 pages, 13 figures

A Decoupling and Aggregating Framework for Joint Extraction of Entities and Relations

May 14, 2024

Yao Wang, Xin Liu, Weikun Kong, Hai-Tao Yu, Teeradaj Racharak, Kyoung-Sook Kim, Minh Le Nguyen

Abstract:Named Entity Recognition and Relation Extraction are two crucial and challenging subtasks in the field of Information Extraction. Despite the successes achieved by the traditional approaches, fundamental research questions remain open. First, most recent studies use parameter sharing for a single subtask or shared features for both subtasks, ignoring their semantic differences. Second, information interaction mainly focuses on the two subtasks, leaving the fine-grained information interaction among the subtask-specific features of encoding subjects, relations, and objects unexplored. Motivated by the aforementioned limitations, we propose a novel model to jointly extract entities and relations. The main novelties are as follows: (1) We propose to decouple the feature encoding process into three parts, namely encoding subjects, encoding objects, and encoding relations. Thanks to this, we are able to use fine-grained subtask-specific features. (2) We propose novel inter-aggregation and intra-aggregation strategies to enhance the information interaction and construct individual fine-grained subtask-specific features, respectively. The experimental results demonstrate that our model outperforms several previous state-of-the-art models. Extensive additional experiments further confirm the effectiveness of our model.

Socially Adaptive Path Planning Based on Generative Adversarial Network

Apr 29, 2024

Yao Wang, Yuqi Kong, Wenzheng Chi, Lining Sun

Abstract:The natural interaction between robots and pedestrians in the process of autonomous navigation is crucial for the intelligent development of mobile robots, which requires robots to fully consider social rules and guarantee the psychological comfort of pedestrians. Among the research results in the field of robotic path planning, the learning-based socially adaptive algorithms have performed well in some specific human-robot interaction environments. However, human-robot interaction scenarios are diverse and constantly changing in daily life, and the generalization of robot socially adaptive path planning remains to be further investigated. In order to address this issue, this work proposes a new socially adaptive path planning algorithm by combining the generative adversarial network (GAN) with the Optimal Rapidly-exploring Random Tree (RRT*) navigation algorithm. Firstly, a GAN model with strong generalization performance is proposed to adapt the navigation algorithm to more scenarios. Secondly, a GAN model based Optimal Rapidly-exploring Random Tree navigation algorithm (GAN-RRT*) is proposed to generate paths in human-robot interaction environments. Finally, we propose a socially adaptive path planning framework named GAN-RTIRL, which combines the GAN model with Rapidly-exploring random Trees Inverse Reinforcement Learning (RTIRL) to improve the homotopy rate between planned and demonstration paths. In the GAN-RTIRL framework, the GAN-RRT* path planner can update the GAN model from the demonstration path. In this way, the robot can generate more anthropomorphic paths in human-robot interaction environments and has stronger generalization in more complex environments. Experimental results reveal that our proposed method can effectively improve the anthropomorphic degree of robot motion planning and the homotopy rate between planned and demonstration paths.

Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems

Apr 23, 2024

Xiaoshuang Chen, Gengrui Zhang, Yao Wang, Yulin Wu, Shuo Su, Kaiqiao Zhan, Ben Wang

Abstract:Modern large-scale recommender systems are built upon computation-intensive infrastructure and usually suffer from a huge difference in traffic between peak and off-peak periods. In peak periods, it is challenging to perform real-time computation for each request due to the limited budget of computational resources. The recommendation with a cache is a solution to this problem, where a user-wise result cache is used to provide recommendations when the recommender system cannot afford a real-time computation. However, the cached recommendations are usually suboptimal compared to real-time computation, and it is challenging to determine the items in the cache for each user. In this paper, we provide a cache-aware reinforcement learning (CARL) method to jointly optimize the recommendation by real-time computation and by the cache. We formulate the problem as a Markov decision process with user states and a cache state, where the cache state represents whether the recommender system performs recommendations by real-time computation or by the cache. The computational load of the recommender system determines the cache state. We perform reinforcement learning based on such a model to improve user engagement over multiple requests. Moreover, we show that the cache will introduce a challenge called critic dependency, which deteriorates the performance of reinforcement learning. To tackle this challenge, we propose an eigenfunction learning (EL) method to learn independent critics for CARL. Experiments show that CARL can significantly improve the users' engagement when considering the result cache. CARL has been fully launched in Kwai app, serving over 100 million users.

* 8 pages, 8 figures

Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems

Apr 17, 2024

Long Cao, Liwei Ge, Daochi Zhang, Xiang Li, Yao Wang, Rui-Xue Xu, YiJing Yan, Xiao Zheng

Abstract:Simulating the dynamics of open quantum systems coupled to non-Markovian environments remains an outstanding challenge due to exponentially scaling computational costs. We present an artificial intelligence strategy to overcome this obstacle by integrating the neural quantum states approach into the dissipaton-embedded quantum master equation in second quantization (DQME-SQ). Our approach utilizes restricted Boltzmann machines (RBMs) to compactly represent the reduced density tensor, explicitly encoding the combined effects of system-environment correlations and non-Markovian memory. Applied to model systems exhibiting prominent effects of system-environment correlation and non-Markovian memory, our approach achieves comparable accuracy to conventional hierarchical equations of motion, while requiring significantly fewer dynamical variables. The novel RBM-based DQME-SQ approach paves the way for investigating non-Markovian open quantum dynamics in previously intractable regimes, with implications spanning various frontiers of modern science.

* 7 pages, 5 figures
