Junming Zhang

Design and Control of a Perching Drone Inspired by the Prey-Capturing Mechanism of Venus Flytrap

Sep 16, 2025

Ye Li, Daming Liu, Yanhe Zhu, Junming Zhang, Yongsheng Luo, Ziqi Wang, Chenyu Liu, Jie Zhao

Abstract:The endurance and energy efficiency of drones remain critical challenges in their design and operation. To extend mission duration, numerous studies have explored perching mechanisms that enable drones to conserve energy by temporarily suspending flight. This paper presents a new perching drone that uses an active flexible perching mechanism inspired by the rapid predation mechanism of the Venus flytrap, achieving perching in less than 100 ms. The proposed system is designed for high-speed adaptability to perching targets. The overall drone design is outlined, followed by the development and validation of the biomimetic perching structure. To enhance system stability, a cascaded extended high-gain observer (EHGO)-based control method is developed, which estimates and compensates for external disturbances in real time. The experimental results demonstrate the adaptability of the perching structure and the superiority of the cascaded EHGO in resisting wind and perching disturbances.
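
The disturbance-estimation idea behind an extended high-gain observer can be illustrated with a minimal sketch. It is not the paper's cascaded design: it assumes a hypothetical scalar plant x' = u + d with a constant unknown disturbance d, and the names `simulate_ehgo`, `eps`, `a1`, and `a2` are illustrative choices, not taken from the paper. The observer extends its state with an estimate of d and corrects both estimates with the output error scaled by high gains.

```python
def simulate_ehgo(d_true=0.5, u=0.0, eps=0.05, a1=2.0, a2=1.0,
                  dt=0.001, steps=2000):
    """Forward-Euler simulation of a scalar plant x' = u + d together with
    an extended high-gain observer that estimates both x and d."""
    x = 1.0        # true state (assumed initial condition)
    x_hat = 0.0    # observer estimate of x
    d_hat = 0.0    # observer estimate of the disturbance d
    for _ in range(steps):
        err = x - x_hat  # output estimation error (the output is y = x)
        # Observer: copy of the plant model plus high-gain corrections.
        x_hat_dot = u + d_hat + (a1 / eps) * err
        d_hat_dot = (a2 / eps ** 2) * err
        # True plant, which the observer only sees through y = x.
        x_dot = u + d_true
        x += dt * x_dot
        x_hat += dt * x_hat_dot
        d_hat += dt * d_hat_dot
    return x, x_hat, d_hat


if __name__ == "__main__":
    x, x_hat, d_hat = simulate_ehgo()
    print(f"state error: {x - x_hat:.2e}, disturbance estimate: {d_hat:.4f}")
```

In a disturbance-compensating controller, an estimate such as `d_hat` would typically be subtracted from the control input; a smaller `eps` speeds up convergence but amplifies measurement noise.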

FDC-Net: Rethinking the association between EEG artifact removal and multi-dimensional affective computing

Aug 07, 2025

Wenjia Dong, Xueyuan Xu, Tianze Yu, Junming Zhang, Li Zhuo

Abstract:Electroencephalogram (EEG)-based emotion recognition holds significant value in affective computing and brain-computer interfaces. However, in practical applications, EEG recordings are susceptible to various physiological artifacts. Current approaches typically treat denoising and emotion recognition as independent tasks using cascaded architectures, which not only leads to error accumulation but also fails to exploit potential synergies between these tasks. Moreover, conventional EEG-based emotion recognition models often rely on the idealized assumption of "perfectly denoised data" and lack a systematic design for noise robustness. To address these challenges, a novel framework that deeply couples the denoising and emotion recognition tasks is proposed for end-to-end noise-robust emotion recognition, termed Feedback-Driven Collaborative Network for Denoising-Classification Nexus (FDC-Net). Our primary innovation lies in establishing a dynamic collaborative mechanism between artifact removal and emotion recognition through: (1) bidirectional gradient propagation with joint optimization strategies; (2) a gated attention mechanism integrated with a frequency-adaptive Transformer using learnable band-position encoding. Two of the most popular EEG-based emotion datasets (DEAP and DREAMER) with multi-dimensional emotional labels were employed to compare the artifact removal and emotion recognition performance of FDC-Net against nine state-of-the-art methods. On the denoising task, FDC-Net obtains a maximum correlation coefficient (CC) value of 96.30% on DEAP and a maximum CC value of 90.31% on DREAMER. On the emotion recognition task under physiological artifact interference, FDC-Net achieves emotion recognition accuracies of 82.3±7.1% on DEAP and 88.1±0.8% on DREAMER.
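
The correlation coefficient (CC) used to score denoising is commonly computed as the Pearson correlation between the denoised signal and a clean reference. The sketch below assumes that definition, since the abstract does not spell out the formula; the function name `correlation_coefficient` and the sample values are hypothetical.

```python
import math


def correlation_coefficient(denoised, clean):
    """Pearson correlation between two equal-length signals."""
    if len(denoised) != len(clean) or len(clean) < 2:
        raise ValueError("signals must have equal length >= 2")
    n = len(clean)
    mean_d = sum(denoised) / n
    mean_c = sum(clean) / n
    # Covariance and variances (the common 1/n factors cancel in the ratio).
    cov = sum((d - mean_d) * (c - mean_c) for d, c in zip(denoised, clean))
    var_d = sum((d - mean_d) ** 2 for d in denoised)
    var_c = sum((c - mean_c) ** 2 for c in clean)
    if var_d == 0 or var_c == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_d * var_c)


if __name__ == "__main__":
    clean = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]       # made-up reference samples
    denoised = [0.1, 0.9, 0.0, -1.1, 0.1, 0.8]    # made-up denoised samples
    print(f"CC = {100 * correlation_coefficient(denoised, clean):.2f}%")
```

Because the measure is invariant to offset and positive scaling, it rewards preserving the waveform shape rather than the absolute amplitude.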

ADSEL: Adaptive dual self-expression learning for EEG feature selection via incomplete multi-dimensional emotional tagging

Aug 07, 2025

Tianze Yu, Junming Zhang, Wenjia Dong, Xueyuan Xu, Li Zhuo

Abstract:EEG-based multi-dimensional emotion recognition has attracted substantial research interest in human-computer interfaces. However, the high dimensionality of EEG features, coupled with limited sample sizes, frequently leads to classifier overfitting and high computational complexity. Feature selection constitutes a critical strategy for mitigating these challenges. Most existing EEG feature selection methods assume complete multi-dimensional emotion labels. In practice, open acquisition environments and the inherent subjectivity of emotion perception often result in incomplete label data, which can compromise model generalization. Additionally, existing feature selection methods for handling incomplete multi-dimensional labels primarily focus on correlations among various dimensions during label recovery, neglecting the correlation between samples in the label space and their interaction with various dimensions. To address these issues, we propose a novel incomplete multi-dimensional feature selection algorithm for EEG-based emotion recognition. The proposed method integrates adaptive dual self-expression learning (ADSEL) with least squares regression. ADSEL establishes a bidirectional pathway between sample-level and dimension-level self-expression learning processes within the label space. This facilitates the cross-sharing of learned information between these processes, enabling the simultaneous exploitation of effective information across both samples and dimensions for label reconstruction. Consequently, ADSEL enhances label recovery accuracy and effectively identifies the optimal EEG feature subset for multi-dimensional emotion recognition.

Hyperspherical Embedding for Point Cloud Completion

Jul 11, 2023

Junming Zhang, Haomeng Zhang, Ram Vasudevan, Matthew Johnson-Roberson

Abstract:Most real-world 3D measurements from depth sensors are incomplete, and to address this issue the point cloud completion task aims to predict the complete shapes of objects from partial observations. Previous works often adopt an encoder-decoder architecture, where the encoder is trained to extract embeddings that are used as inputs to generate predictions from the decoder. However, the learned embeddings have a sparse distribution in the feature space, which leads to worse generalization results during testing. To address these problems, this paper proposes a hyperspherical module, which transforms and normalizes embeddings from the encoder to be on a unit hypersphere. With the proposed module, the magnitude and direction of the output hyperspherical embedding are decoupled and only the directional information is optimized. We theoretically analyze the hyperspherical embedding and show that it enables more stable training with a wider range of learning rates and more compact embedding distributions. Experimental results show consistent improvement of point cloud completion in both single-task and multi-task learning, which demonstrates the effectiveness of the proposed method.
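
Placing an embedding on a unit hypersphere amounts to dividing it by its L2 norm, which separates magnitude from direction. The sketch below shows only that normalization step; any learned transformation inside the paper's module is omitted, and the name `to_unit_hypersphere` and the `eps` guard are assumptions made for illustration.

```python
import math


def to_unit_hypersphere(embedding, eps=1e-12):
    """Split an embedding into its L2 magnitude and unit-norm direction."""
    norm = math.sqrt(sum(v * v for v in embedding))
    # Guard against division by zero for an all-zero embedding.
    direction = [v / max(norm, eps) for v in embedding]
    return norm, direction


if __name__ == "__main__":
    # Two embeddings that differ only in scale map to the same direction.
    for emb in ([3.0, 4.0], [30.0, 40.0]):
        magnitude, direction = to_unit_hypersphere(emb)
        print(magnitude, direction)
```

Since only `direction` is passed on, rescaling the encoder output leaves the decoder input unchanged, which is one way to read the abstract's statement that only directional information is optimized.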

Learning Rotation-Invariant Representations of Point Clouds Using Aligned Edge Convolutional Neural Networks

Jan 02, 2021

Junming Zhang, Ming-Yuan Yu, Ram Vasudevan, Matthew Johnson-Roberson

Abstract:Point cloud analysis is an area of increasing interest due to the development of 3D sensors that are able to rapidly measure the depth of scenes accurately. Unfortunately, applying deep learning techniques to perform point cloud analysis is non-trivial due to the inability of these methods to generalize to unseen rotations. To address this limitation, one usually has to augment the training data, which can lead to extra computation and require larger model complexity. This paper proposes a new neural network called the Aligned Edge Convolutional Neural Network (AECNN) that learns a feature representation of point clouds relative to Local Reference Frames (LRFs) to ensure invariance to rotation. In particular, features are learned locally and aligned with respect to the LRF of an automatically computed reference point. The proposed approach is evaluated on point cloud classification and part segmentation tasks. This paper illustrates that the proposed technique outperforms a variety of state-of-the-art approaches (even those trained on augmented datasets) in terms of robustness to rotation without requiring any additional data augmentation.

* 3D Vision Conference 2020

Point Set Voting for Partial Point Cloud Analysis

Jul 09, 2020

Junming Zhang, Weijia Chen, Yuping Wang, Ram Vasudevan, Matthew Johnson-Roberson

Abstract:The continual improvement of 3D sensors has driven the development of algorithms to perform point cloud analysis. In fact, techniques for point cloud classification and segmentation have in recent years achieved incredible performance, driven in part by leveraging large synthetic datasets. Unfortunately, these same state-of-the-art approaches perform poorly when applied to incomplete point clouds. This limitation of existing algorithms is particularly concerning since point clouds generated by 3D sensors in the real world are usually incomplete due to perspective view or occlusion by other objects. This paper proposes a general model for partial point cloud analysis wherein the latent feature encoding a complete point cloud is inferred by applying a local point set voting strategy. In particular, each local point set constructs a vote that corresponds to a distribution in the latent space, and the optimal latent feature is the one with the highest probability. This approach ensures that any subsequent point cloud analysis is robust to partial observation while simultaneously guaranteeing that the proposed model is able to output multiple possible results. This paper illustrates that this proposed method achieves state-of-the-art performance on shape classification, part segmentation and point cloud completion.
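
One way to picture the voting step is to assume, purely for illustration, that every vote is an independent Gaussian with diagonal covariance and that votes are combined by multiplying their densities. Under those assumptions, which the abstract does not state, the most probable latent feature is the precision-weighted mean of the votes. The function name `fuse_gaussian_votes` and the sample numbers are hypothetical.

```python
def fuse_gaussian_votes(means, variances):
    """Return the mode of the product of diagonal-Gaussian votes.

    means, variances: lists of equal-length vectors, one pair per local
    point set. Each variance entry must be strictly positive.
    """
    dims = len(means[0])
    fused = []
    for j in range(dims):
        # Precision (inverse variance) acts as the weight of each vote.
        precision = sum(1.0 / var[j] for var in variances)
        weighted = sum(mu[j] / var[j] for mu, var in zip(means, variances))
        fused.append(weighted / precision)
    return fused


if __name__ == "__main__":
    votes_mean = [[0.0, 1.0], [2.0, 1.0]]   # made-up vote centres
    votes_var = [[1.0, 0.5], [3.0, 0.5]]    # made-up vote uncertainties
    print(fuse_gaussian_votes(votes_mean, votes_var))
```

Votes with small variance dominate the result, so a confident local point set can outweigh several uncertain ones.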

LiStereo: Generate Dense Depth Maps from LIDAR and Stereo Imagery

May 07, 2019

Junming Zhang, Manikandasriram Srinivasan Ramanagopalg, Ram Vasudevan, Matthew Johnson-Roberson

Abstract:An accurate depth map of the environment is critical to the safe operation of autonomous robots and vehicles. Currently, either light detection and ranging (LIDAR) or stereo matching algorithms are used to acquire such depth information. However, a high-resolution LIDAR is expensive and produces sparse depth maps at long range; stereo matching algorithms are able to generate denser depth maps but are typically less accurate than LIDAR at long range. This paper combines these approaches to generate high-quality dense depth maps. Unlike previous approaches that are trained using ground-truth labels, the proposed model adopts a self-supervised training process. Experiments show that the proposed method is able to generate high-quality dense depth maps and performs robustly even with low-resolution inputs. This shows the potential to reduce the cost by using LIDARs with lower resolution in concert with stereo systems while maintaining high resolution.

* 14 pages, 3 figures, 5 tables

DispSegNet: Leveraging Semantics for End-to-End Learning of Disparity Estimation from Stereo Imagery

Sep 13, 2018

Junming Zhang, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson

Abstract:Recent work has shown that convolutional neural networks (CNNs) can be applied successfully in disparity estimation, but these methods still suffer from errors in low-texture regions, occlusions, and reflections. Concurrently, deep learning for semantic segmentation has shown great progress in recent years. In this paper, we design a CNN architecture that combines these two tasks to improve the quality and accuracy of disparity estimation with the help of semantic segmentation. Specifically, we propose a network structure in which these two tasks are highly coupled. One key novelty of this approach is the two-stage refinement process. Initial disparity estimates are refined with an embedding learned from the semantic segmentation branch of the network. The proposed model is trained using an unsupervised approach, in which images from one half of the stereo pair are warped and compared against images from the other camera. Another key advantage of the proposed approach is that a single network is capable of outputting disparity estimates and semantic labels. These outputs are of great use in autonomous vehicle operation; with real-time constraints being key, such performance improvements increase the viability of driving applications. Experiments on the KITTI and Cityscapes datasets show that our model can achieve state-of-the-art results and that leveraging an embedding learned from semantic segmentation improves the performance of disparity estimation.
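
The warp-and-compare training signal can be sketched on a single scanline. The example assumes a rectified stereo pair, so a pixel at column x in the left image appears at column x - d in the right image, and it uses a plain L1 photometric difference; the paper's actual loss terms are not given in the abstract. The names `warp_scanline` and `photometric_l1` are hypothetical.

```python
def warp_scanline(right_row, disparity_row):
    """Reconstruct a left-image scanline by sampling the right image at
    x - d with linear interpolation (out-of-range samples are clamped)."""
    width = len(right_row)
    warped = []
    for x, d in enumerate(disparity_row):
        src = min(max(x - d, 0.0), width - 1.0)  # clamped source coordinate
        x0 = int(src)                            # left neighbour
        x1 = min(x0 + 1, width - 1)              # right neighbour
        w = src - x0                             # interpolation weight
        warped.append((1.0 - w) * right_row[x0] + w * right_row[x1])
    return warped


def photometric_l1(left_row, right_row, disparity_row):
    """Mean absolute difference between the left scanline and the right
    scanline warped into the left view."""
    warped = warp_scanline(right_row, disparity_row)
    return sum(abs(a - b) for a, b in zip(left_row, warped)) / len(left_row)


if __name__ == "__main__":
    right = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    left = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]   # right row shifted by one pixel
    print(photometric_l1(left, right, [1.0] * 6))  # correct disparity
    print(photometric_l1(left, right, [0.0] * 6))  # wrong disparity
```

The loss is lowest when the disparity explains the shift between the two views, which is why no ground-truth disparity labels are needed; linear interpolation keeps the sampling differentiable with respect to fractional disparities.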

* 8 pages, 4 figures, 4 tables
