
Fuyang Li


Occ-BEV: Multi-Camera Unified Pre-training via 3D Scene Reconstruction

Jun 07, 2023
Chen Min, Xinli Xu, Fuyang Li, Shubin Si, Hanzhang Xue, Weizhong Jiang, Zhichao Zhang, Jimei Li, Dawei Zhao, Liang Xiao, Jiaolong Xu, Yiming Nie, Bin Dai

Multi-camera 3D perception has emerged as a prominent research field in autonomous driving, offering a viable and cost-effective alternative to LiDAR-based solutions. However, existing multi-camera algorithms primarily rely on monocular image pre-training, which overlooks the spatial and temporal correlations among different camera views. To address this limitation, we propose Occ-BEV, the first multi-camera unified pre-training framework, which first reconstructs the 3D scene as a foundational stage and then fine-tunes the model on downstream tasks. Specifically, a 3D decoder is designed to leverage Bird's Eye View (BEV) features from multi-view images to predict 3D geometric occupancy, enabling the model to capture a more comprehensive understanding of the 3D environment. A significant benefit of Occ-BEV is its ability to use a considerable volume of unlabeled image-LiDAR pairs for pre-training. The proposed multi-camera unified pre-training framework demonstrates promising results on key tasks such as multi-camera 3D object detection and surrounding semantic scene completion. Compared to monocular pre-training methods on the nuScenes dataset, Occ-BEV achieves a significant improvement of about 2.0% in mAP and 2.0% in NDS for multi-camera 3D object detection, as well as a 3% increase in mIoU for surrounding semantic scene completion. Code is publicly available at https://github.com/chaytonmin/Occ-BEV.
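The pre-training objective described above, predicting a 3D occupancy grid from BEV features and supervising it with unlabeled LiDAR sweeps, can be sketched in a few lines of numpy. This is an illustrative toy, not the authors' implementation: the function names (`voxelize_lidar`, `occupancy_decoder`, `pretrain_loss`), the grid resolution, and the linear "decoder" are all simplified stand-ins.

```python
import numpy as np

def voxelize_lidar(points, grid_shape=(8, 8, 4),
                   bounds=((-40, 40), (-40, 40), (-3, 5))):
    """Convert a LiDAR sweep into a binary 3D occupancy grid, the
    pre-training target described in the abstract."""
    occ = np.zeros(grid_shape, dtype=np.float32)
    idx = []
    for d, ((lo, hi), n) in enumerate(zip(bounds, grid_shape)):
        i = np.floor((points[:, d] - lo) / (hi - lo) * n).astype(int)
        idx.append(np.clip(i, 0, n - 1))
    occ[idx[0], idx[1], idx[2]] = 1.0
    return occ

def occupancy_decoder(bev_feat, weights, grid_shape):
    """Toy stand-in for the 3D decoder: a linear map from flattened BEV
    features to per-voxel occupancy logits, squashed by a sigmoid."""
    logits = bev_feat.reshape(-1) @ weights
    return (1.0 / (1.0 + np.exp(-logits))).reshape(grid_shape)

def pretrain_loss(pred, target, eps=1e-7):
    """Binary cross-entropy between predicted and LiDAR-derived occupancy."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

rng = np.random.default_rng(0)
grid = (8, 8, 4)
points = rng.uniform([-40, -40, -3], [40, 40, 5], size=(500, 3))  # fake LiDAR sweep
bev = rng.standard_normal((8, 8, 16))          # fake multi-camera BEV feature map
w = 0.01 * rng.standard_normal((bev.size, int(np.prod(grid))))
target = voxelize_lidar(points, grid)
pred = occupancy_decoder(bev, w, grid)
loss = pretrain_loss(pred, target)
```

The key point mirrored here is that the supervision signal (`target`) comes from raw image-LiDAR pairs and needs no human labels, which is what makes large-scale pre-training feasible.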

* 8 pages, 5 figures 

A Simple Hypergraph Kernel Convolution based on Discounted Markov Diffusion Process

Oct 30, 2022
Fuyang Li, Jiying Zhang, Xi Xiao, Bin Zhang, Dijun Luo

Kernels on discrete structures evaluate pairwise similarities between objects, capturing semantics and inherent topology information. Existing kernels on discrete structures are built only from topology information (such as the adjacency matrix of a graph), without considering the original attributes of the objects. This paper proposes a two-phase paradigm that aggregates comprehensive information on discrete structures, leading to a Discount Markov Diffusion Learnable Kernel (DMDLK). Specifically, based on the underlying projection of the DMDLK, we design a Simple Hypergraph Kernel Convolution (SHKC) for the hidden representation of vertices. SHKC can adjust the number of diffusion steps, rather than stacking convolution layers, to aggregate information from long-range neighborhoods, which prevents the over-smoothing issue of existing hypergraph convolutions. Moreover, we use the uniform stability bound theorem in transductive learning to analyze the critical factors behind the effectiveness and generalization ability of SHKC from a theoretical perspective. Experimental results on several benchmark datasets for node classification verify the superior performance of SHKC over state-of-the-art methods.
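The mechanism the abstract highlights, widening the receptive field by taking more diffusion steps of a discounted Markov walk instead of stacking layers, can be sketched as follows. This is a minimal illustration of a discounted diffusion kernel on a hypergraph random walk, not the paper's exact DMDLK construction; the names `hypergraph_transition`, `discounted_diffusion_kernel`, and `shkc_layer` are hypothetical.

```python
import numpy as np

def hypergraph_transition(H):
    """Random-walk transition matrix on a hypergraph with incidence matrix H
    (n_vertices x n_edges): step vertex -> incident edge -> vertex."""
    Dv = H.sum(axis=1)            # vertex degrees
    De = H.sum(axis=0)            # edge degrees
    return (H / Dv[:, None]) @ (H.T / De[:, None])

def discounted_diffusion_kernel(P, gamma=0.5, steps=3):
    """Discounted Markov diffusion kernel K = sum_{t=1..T} gamma^t P^t.
    Increasing `steps` reaches longer-range neighborhoods without adding
    layers, which is the anti-over-smoothing idea in the abstract."""
    K = np.zeros_like(P)
    Pt = np.eye(P.shape[0])
    for _ in range(steps):
        Pt = Pt @ P
        K += gamma * Pt
        gamma *= gamma / gamma    # keep gamma as the per-step discount base
        # (discount applied multiplicatively below)
    return K

def shkc_layer(X, H, W, gamma=0.5, steps=3):
    """One toy convolution: diffuse vertex features with K, then project."""
    K = discounted_diffusion_kernel(hypergraph_transition(H), gamma, steps)
    return K @ X @ W

# 5 vertices, 3 hyperedges
H = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 0],
              [0, 1, 1],
              [1, 0, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))   # vertex attributes
W = rng.standard_normal((4, 2))   # learnable projection
P = hypergraph_transition(H)
out = shkc_layer(X, H, W)
```

Note that `P` is row-stochastic by construction (each walk step is a proper probability distribution), so repeated diffusion mixes attribute information rather than blowing it up.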

* Accepted by NeurIPS 2022 New Frontiers in Graph Learning Workshop 

Hypergraph Convolutional Networks via Equivalency between Hypergraphs and Undirected Graphs

Apr 06, 2022
Jiying Zhang, Fuyang Li, Xi Xiao, Tingyang Xu, Yu Rong, Junzhou Huang, Yatao Bian

As a powerful tool for modeling complex relationships, hypergraphs are gaining popularity in the graph learning community. However, commonly used frameworks in deep hypergraph learning focus on hypergraphs with edge-independent vertex weights (EIVWs), without considering hypergraphs with edge-dependent vertex weights (EDVWs), which have more modeling power. To compensate for this, we present General Hypergraph Spectral Convolution (GHSC), a general learning framework that not only handles both EDVW and EIVW hypergraphs but, more importantly, enables explicitly reusing existing powerful Graph Convolutional Neural Networks (GCNNs) in a theoretically grounded way, largely easing the design of hypergraph neural networks. In this framework, the graph Laplacian of a given undirected GCNN is replaced with a unified hypergraph Laplacian that incorporates vertex weight information from a random-walk perspective, by equating our defined generalized hypergraphs with simple undirected graphs. Extensive experiments from various domains, including social network analysis, visual object classification, and protein learning, demonstrate that the proposed framework achieves state-of-the-art performance.
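The core move, building a Laplacian from an EDVW random walk and dropping it into a standard GCN propagation rule, can be illustrated with a small numpy sketch. This follows the general recipe for EDVW hypergraph random walks rather than the paper's exact Laplacian; `edvw_transition`, `unified_laplacian`, and `gcnn_layer` are illustrative names, and the symmetrization step is an assumption for this toy.

```python
import numpy as np

def edvw_transition(H, Q):
    """Random walk with edge-dependent vertex weights (EDVWs): from a vertex,
    pick an incident hyperedge uniformly (via incidence H), then land on a
    vertex of that edge with probability proportional to its weight Q[v, e]."""
    Dv = H.sum(axis=1)            # number of hyperedges incident to each vertex
    edge_w = Q.sum(axis=0)        # total vertex weight inside each hyperedge
    return (H / Dv[:, None]) @ (Q / edge_w).T

def unified_laplacian(P):
    """Symmetrized random-walk Laplacian of the hypergraph walk; in the spirit
    of the abstract, this operator stands in for the graph Laplacian of an
    off-the-shelf GCNN."""
    return np.eye(P.shape[0]) - 0.5 * (P + P.T)

def gcnn_layer(X, L, W):
    """One GCN-style propagation step reusing the hypergraph Laplacian:
    relu((I - L) X W)."""
    A_hat = np.eye(L.shape[0]) - L
    return np.maximum(A_hat @ X @ W, 0.0)

# 3 vertices, 2 hyperedges; Q puts per-edge vertex weights on H's support
H = np.array([[1, 0],
              [1, 1],
              [0, 1]], dtype=float)
Q = H * np.array([[0.7, 0.0],
                  [0.3, 0.4],
                  [0.0, 0.6]])
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))   # vertex features
W = rng.standard_normal((4, 2))   # learnable projection
P = edvw_transition(H, Q)
L = unified_laplacian(P)
out = gcnn_layer(X, L, W)
```

Setting `Q = H` (uniform weights inside each edge) recovers the EIVW case, which is why a single framework can cover both weighting schemes.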
