Xudong Zhang

Scalable unsupervised feature selection via weight stability

Jun 06, 2025

Xudong Zhang, Renato Cordeiro de Amorim

Abstract: Unsupervised feature selection is critical for improving clustering performance in high-dimensional data, where irrelevant features can obscure meaningful structure. In this work, we introduce the Minkowski Weighted $k$-means++, a novel initialisation strategy for the Minkowski Weighted $k$-means. Our initialisation selects centroids probabilistically using feature relevance estimates derived from the data itself. Building on this, we propose two new feature selection algorithms: FS-MWK++, which aggregates feature weights across a range of Minkowski exponents to identify stable and informative features, and SFS-MWK++, a scalable variant based on subsampling. We support our approach with a theoretical guarantee under mild assumptions and extensive experiments showing that our methods consistently outperform existing alternatives.
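As a rough illustration of probabilistic, feature-weighted seeding, the sketch below combines a weighted Minkowski distance with $k$-means++-style sampling. It is not the paper's procedure: the abstract does not give the update rules, so the distance form, the single global weight vector `w`, and the helper names `weighted_minkowski` and `seed_centroids` are all assumptions made for this example.

```python
import random

def weighted_minkowski(x, c, w, p):
    """Assumed distance: sum over features of (w_v * |x_v - c_v|) ** p, with no final root."""
    return sum((wv * abs(xv - cv)) ** p for xv, cv, wv in zip(x, c, w))

def seed_centroids(data, k, w, p, rng):
    """k-means++-style seeding under the weighted distance (illustrative only).

    The first centroid is drawn uniformly; each later centroid is drawn with
    probability proportional to its distance to the nearest centroid so far.
    """
    centroids = [rng.choice(data)]
    while len(centroids) < k:
        dists = [min(weighted_minkowski(x, c, w, p) for c in centroids) for x in data]
        if sum(dists) == 0:
            # Every point coincides with a chosen centroid: fall back to uniform sampling.
            centroids.append(rng.choice(data))
        else:
            centroids.append(rng.choices(data, weights=dists, k=1)[0])
    return centroids

# Hypothetical data: feature 0 separates two groups, feature 1 is noise (weight 0).
data = [(0.0, 5.0), (0.0, -3.0), (10.0, 4.0), (10.0, -2.0)]
seeds = seed_centroids(data, k=2, w=[1.0, 0.0], p=2.0, rng=random.Random(0))
```

Because the noisy feature carries zero weight here, points in the same group are at distance zero from each other, so the second seed is always drawn from the other group. In the paper the relevance estimates are derived from the data rather than supplied by hand.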

Non-stationary BERT: Exploring Augmented IMU Data For Robust Human Activity Recognition

Sep 25, 2024

Ning Sun, Yufei Wang, Yuwei Zhang, Jixiang Wan, Shenyue Wang, Ping Liu, Xudong Zhang

Abstract: Human Activity Recognition (HAR) has attracted considerable attention from researchers due to the popularity of mobile devices and the need to observe users' daily activity data for better human-computer interaction. In this work, we collect a human activity recognition dataset called OPPOHAR, consisting of phone IMU data. To facilitate the deployment of HAR systems on mobile phones and to achieve user-specific activity recognition, we propose a novel lightweight network called Non-stationary BERT with a two-stage training method. We also propose a simple yet effective data augmentation method to explore the deeper relationship between the accelerometer and gyroscope data from the IMU. The network achieves state-of-the-art performance on various activity recognition datasets, and the data augmentation method demonstrates wide applicability.

Scale-Translation Equivariant Network for Oceanic Internal Solitary Wave Localization

Jun 18, 2024

Zhang Wan, Shuo Wang, Xudong Zhang

Abstract: Internal solitary waves (ISWs) are gravity waves that are often observed in the ocean interior rather than at the surface. They are important because they can carry substantial energy and thus influence pollutant transport, oil platform operations, submarine navigation, etc. Researchers have studied ISWs through optical images, synthetic aperture radar (SAR) images, and altimeter data from remote sensing instruments. However, cloud cover in optical remote sensing images variably obscures ground information, leading to blurred or missing surface observations. As such, this paper targets altimeter-based machine learning solutions that automatically locate ISWs. The challenges lie in the following two aspects: 1) the altimeter data has low resolution, which requires a strong machine learner; 2) labeling data is extremely labor-intensive, leading to very limited data for training. In recent years, the rapid progress of deep learning has demonstrated strong learning capacity given abundant data. In addition, more recent studies on efficient learning and self-supervised learning have laid solid foundations for tackling these challenges. In this paper, we propose to inject prior knowledge to achieve a strong and efficient learner. Specifically, intrinsic patterns in altimetry data are efficiently captured using a scale-translation equivariant convolutional neural network (ST-ECNN). By considering inherent symmetries in neural network design, ST-ECNN achieves higher efficiency and better performance than baseline models. Furthermore, we introduce prior knowledge from massive unsupervised data to enhance our solution, using the SimCLR framework for pre-training. Our final solution achieves better overall performance than the baselines on our handcrafted altimetry dataset. Data and code are available at https://github.com/ZhangWan-byte/Internal_Solitary_Wave_Localization.
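For readers unfamiliar with SimCLR pre-training, its contrastive objective is the normalized temperature-scaled cross-entropy (NT-Xent) loss. The sketch below is a minimal pure-Python version for small lists of embeddings; it is not taken from the authors' repository, and the function names, the temperature default, and the toy embeddings are assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(views_a, views_b, temperature=0.5):
    """Mean NT-Xent loss; views_a[i] and views_b[i] embed two augmentations of sample i."""
    z = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of i
        positive = cosine(z[i], z[j]) / temperature
        # The denominator runs over every other embedding, positive included.
        logits = [cosine(z[i], z[k]) / temperature for k in range(2 * n) if k != i]
        total += math.log(sum(math.exp(l) for l in logits)) - positive
    return total / (2 * n)

# Toy check: matching pairs should give a lower loss than mismatched pairs.
aligned = nt_xent([(1.0, 0.0), (0.0, 1.0)], [(1.0, 0.0), (0.0, 1.0)])
swapped = nt_xent([(1.0, 0.0), (0.0, 1.0)], [(0.0, 1.0), (1.0, 0.0)])
```

In practice the embeddings would come from an encoder applied to two augmented views of each unlabeled sample, and the loss would be computed with a tensor library rather than Python lists.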

* 29 pages, 5 figures

Monocular Localization with Semantics Map for Autonomous Vehicles

Jun 06, 2024

Jixiang Wan, Xudong Zhang, Shuzhou Dong, Yuwei Zhang, Yuchen Yang, Ruoxi Wu, Ye Jiang, Jijunnan Li, Jinquan Lin, Ming Yang

Abstract: Accurate and robust localization remains a significant challenge for autonomous vehicles. The cost of sensors and limitations in local computational efficiency make it difficult to scale to large commercial applications. Traditional vision-based approaches focus on texture features that are susceptible to changes in lighting, season, perspective, and appearance. Additionally, the large storage size of maps with descriptors and complex optimization processes hinder system performance. To balance efficiency and accuracy, we propose a novel lightweight visual semantic localization algorithm that employs stable semantic features instead of low-level texture features. First, semantic maps are constructed offline by detecting semantic objects, such as ground markers, lane lines, and poles, using cameras or LiDAR sensors. Then, online visual localization is performed through data association of semantic features and map objects. We evaluated our proposed localization framework on the publicly available KAIST Urban dataset and in scenarios we recorded ourselves. The experimental results demonstrate that our method is a reliable and practical localization solution for various autonomous driving localization tasks.

Samsung Research China-Beijing at SemEval-2024 Task 3: A multi-stage framework for Emotion-Cause Pair Extraction in Conversations

Apr 25, 2024

Shen Zhang, Haojie Zhang, Jing Zhang, Xudong Zhang, Yimeng Zhuang, Jinting Wu

Abstract: In human-computer interaction, it is crucial for agents to respond to humans by understanding their emotions. Unraveling the causes of emotions is more challenging. A new task named Multimodal Emotion-Cause Pair Extraction in Conversations involves recognizing emotions and identifying causal expressions. In this study, we propose a multi-stage framework to recognize emotions and extract emotion-cause pairs given the target emotion. In the first stage, Llama-2-based InstructERC is used to extract the emotion category of each utterance in a conversation. After emotion recognition, a two-stream attention model is employed to extract the emotion-cause pairs given the target emotion for subtask 2, while MuTEC is employed to extract the causal span for subtask 1. Our approach achieved first place in both subtasks of the competition.

Robust Communicative Multi-Agent Reinforcement Learning with Active Defense

Dec 16, 2023

Lebin Yu, Yunbo Qiu, Quanming Yao, Yuan Shen, Xudong Zhang, Jian Wang

Abstract: Communication in multi-agent reinforcement learning (MARL) has recently been shown to effectively promote cooperation among agents. Since communication in real-world scenarios is vulnerable to noise and adversarial attacks, it is crucial to develop robust communicative MARL techniques. However, existing research in this domain has predominantly focused on passive defense strategies, where agents receive all messages equally, making it hard to balance performance and robustness. We propose an active defense strategy, where agents automatically reduce the impact of potentially harmful messages on the final decision. Implementing this strategy poses two challenges: defining unreliable messages, and properly adjusting their impact on the final decision. To address them, we design an Active Defense Multi-Agent Communication framework (ADMAC), which estimates the reliability of received messages and adjusts their impact on the final decision accordingly with the help of a decomposable decision structure. The superiority of ADMAC over existing methods is validated by experiments in three communication-critical tasks under four types of attacks.

* Accepted by AAAI 2024

Data Augmentation of Bridging the Delay Gap for DL-based Massive MIMO CSI Feedback

Aug 01, 2023

Hengyu Zhang, Zhilin Lu, Xudong Zhang, Jintao Wang

Abstract: In massive multiple-input multiple-output (MIMO) systems under the frequency division duplexing (FDD) mode, the user equipment (UE) needs to feed channel state information (CSI) back to the base station (BS). Though deep learning approaches have performed well on the CSI feedback problem, whether they remain effective in real environments needs further investigation. In this letter, we point out that the real-time dataset in deployment often has a domain gap from the training dataset, caused by the time delay. To bridge the gap, we propose bubble-shift (B-S) data augmentation, which attempts to offset performance degradation by changing the delay while retaining as much of the channel information as possible. Moreover, random-generation (R-G) data augmentation is proposed specifically for outdoor scenarios because of the complex distribution of their channels. It generalizes the characteristics of the channel matrix and alleviates the over-fitting problem. Simulation results show that the proposed data augmentation boosts the robustness of networks in both indoor and outdoor environments. The open-source code is available at https://github.com/zhanghy23/CRNet-Aug.

SGL: Structure Guidance Learning for Camera Localization

Apr 12, 2023

Xudong Zhang, Shuang Gao, Xiaohu Nan, Haikuan Ning, Yuchen Yang, Yishan Ping, Jixiang Wan, Shuzhou Dong, Jijunnan Li, Yandong Guo

Abstract: Camera localization is a classical computer vision task that serves various Artificial Intelligence and Robotics applications. With the rapid development of Deep Neural Networks (DNNs), end-to-end visual localization methods have flourished in recent years. In this work, we focus on scene coordinate prediction methods and propose a network architecture named Structure Guidance Learning (SGL), which uses a receptive branch and a structure branch to extract both high-level and low-level features for estimating 3D coordinates. We design a confidence strategy to refine and filter the predicted 3D observations, which enables us to estimate camera poses using Perspective-n-Point (PnP) with RANSAC. For training, we design a Bundle Adjustment trainer to help the network fit the scenes better. Comparisons with several state-of-the-art (SOTA) methods and extensive ablation experiments confirm the validity of our proposed architecture.

Promoting Cooperation in Multi-Agent Reinforcement Learning via Mutual Help

Feb 18, 2023

Yunbo Qiu, Yue Jin, Lebin Yu, Jian Wang, Xudong Zhang

Abstract: Multi-agent reinforcement learning (MARL) has achieved great progress in cooperative tasks in recent years. However, in the local reward scheme, where only local rewards for each agent are given without global rewards shared by all the agents, traditional MARL algorithms do not sufficiently consider agents' mutual influence. In cooperative tasks, agents' mutual influence is especially important since agents are supposed to coordinate to achieve better performance. In this paper, we propose a novel algorithm, Mutual-Help-based MARL (MH-MARL), to instruct agents to help each other in order to promote cooperation. MH-MARL uses an expected action module to generate, for each agent, the actions it expects from the other agents. The expected actions are then delivered to the other agents for selective imitation during training. Experimental results show that MH-MARL improves the performance of MARL in both success rate and cumulative reward.

* Accepted by 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

Deep Learning for Hybrid Beamforming with Finite Feedback in GSM Aided mmWave MIMO Systems

Feb 15, 2023

Zhilin Lu, Xudong Zhang, Rui Zeng, Jintao Wang

Abstract: Hybrid beamforming is widely recognized as an important technique for millimeter wave (mmWave) multiple-input multiple-output (MIMO) systems. Generalized spatial modulation (GSM) is further introduced to improve the spectrum efficiency. However, most existing works on beamforming assume perfect channel state information (CSI), which is unrealistic in practical systems. In this paper, joint optimization of downlink pilot training, channel estimation, CSI feedback, and hybrid beamforming is considered in GSM-aided frequency division duplexing (FDD) mmWave MIMO systems. With the help of deep learning, the GSM hybrid beamformers are designed via unsupervised learning in an end-to-end way. Experiments show that the proposed multi-resolution network, named GsmEFBNet, reaches a better achievable rate with fewer feedback bits than the conventional algorithm.

* 4 pages, 4 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice
