Abstract:Video action detection (spatio-temporal action localization) is usually the starting point for human-centric intelligent analysis of videos. It has high practical impact for many applications across robotics, security, healthcare, etc. The two-stage paradigm of Faster R-CNN in object detection inspires a standard paradigm of video action detection, i.e., first generating person proposals and then classifying their actions. However, none of the existing solutions can provide fine-grained action detection at the "who-when-where-what" level. This paper presents a tracking-based solution that accurately and efficiently localizes predefined key actions spatially (by predicting the associated target IDs and locations) and temporally (by predicting the time in exact frame indices). This solution won first place in the UAV-Video Track of the 2021 Low-Power Computer Vision Challenge (LPCVC).
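The following is a minimal sketch, in plain Python, of the "who-when-where-what" output structure such a tracking-based detector could produce; the `ActionInstance` record and the `group_into_instances` helper are hypothetical illustrations, not the authors' actual interface.

```python
# Hypothetical sketch: grouping per-frame tracked detections into
# "who-when-where-what" action instances. Names are illustrative only.
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

@dataclass
class ActionInstance:
    track_id: int           # "who": persistent target ID from the tracker
    start_frame: int        # "when": first frame of the key action
    end_frame: int          # "when": last frame of the key action
    boxes: Dict[int, Box]   # "where": per-frame box, keyed by frame index
    action: str             # "what": predefined key-action label

def group_into_instances(dets: List[Tuple[int, int, Box, str]]) -> List[ActionInstance]:
    """Group per-frame (frame, track_id, box, action) detections into
    temporally contiguous action instances, one per (track, action) run."""
    instances: List[ActionInstance] = []
    open_runs: Dict[Tuple[int, str], ActionInstance] = {}
    for frame, tid, box, action in sorted(dets, key=lambda d: d[0]):
        run = open_runs.get((tid, action))
        if run is not None and frame == run.end_frame + 1:
            run.end_frame = frame            # extend the ongoing action
            run.boxes[frame] = box
        else:                                # a new action instance begins
            run = ActionInstance(tid, frame, frame, {frame: box}, action)
            open_runs[(tid, action)] = run
            instances.append(run)
    return instances
```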
Abstract:Relation reasoning in knowledge graphs (KGs) aims at predicting missing relations in incomplete triples. The dominant paradigm is to learn embeddings of relations and entities, which is limited to the transductive setting and cannot handle unseen entities in the inductive setting. Previous inductive methods, which utilize the structure of entities and triples in subgraphs to gain inductive ability, are scalable and consume fewer resources. However, to obtain better reasoning results, a model should acquire entity-independent relational semantics from latent rules and overcome the deficient supervision caused by the scarcity of rules in subgraphs. To address these issues, we propose a novel graph convolutional network (GCN)-based approach for interpretable inductive reasoning with relational path contrast, named RPC-IR. RPC-IR first extracts relational paths between two entities and learns their representations, and then innovatively introduces a contrastive strategy by constructing positive and negative relational paths. A joint training strategy considering both supervised and contrastive information is also proposed. Comprehensive experiments on three inductive datasets show that RPC-IR achieves outstanding performance compared with the latest inductive reasoning methods and can explicitly represent logical rules for interpretability.
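Below is a minimal sketch of the joint training objective described above, assuming an InfoNCE-style contrast between the embedding of a positive relational path and those of corrupted (negative) paths, combined with a supervised relation-classification loss; the cosine-similarity formulation and the `temperature` and `alpha` weights are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of a joint supervised + contrastive loss for
# relational-path representations (PyTorch).
import torch
import torch.nn.functional as F

def joint_loss(anchor, positive, negatives, logits, labels,
               temperature=0.5, alpha=1.0):
    """anchor/positive: (B, d) path embeddings for the same entity pair;
    negatives: (B, K, d) embeddings of corrupted relational paths;
    logits/labels: (B, R) and (B,) for supervised relation prediction."""
    sup = F.cross_entropy(logits, labels)                               # supervised term
    pos_sim = F.cosine_similarity(anchor, positive) / temperature       # (B,)
    neg_sim = F.cosine_similarity(
        anchor.unsqueeze(1), negatives, dim=-1) / temperature           # (B, K)
    all_sim = torch.cat([pos_sim.unsqueeze(1), neg_sim], dim=1)         # (B, 1+K)
    # the positive path is class 0 in the InfoNCE softmax
    con = F.cross_entropy(all_sim,
                          torch.zeros(anchor.size(0), dtype=torch.long))
    return sup + alpha * con
```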
Abstract:Emotion recognition plays a vital role in human-machine interaction and daily healthcare. EEG signals have been reported to be informative and reliable for emotion recognition in recent years. However, the inter-subject variability of emotion-related EEG signals poses a great challenge for the practical use of EEG-based emotion recognition. Inspired by recent neuroscience studies on inter-subject correlation, we proposed a Contrastive Learning method for Inter-Subject Alignment (CLISA) for reliable cross-subject emotion recognition. Contrastive learning was employed to minimize inter-subject differences by maximizing the similarity of EEG signals across subjects when they received the same stimuli, in contrast to different ones. Specifically, a convolutional neural network with depthwise spatial convolution and temporal convolution layers was applied to learn inter-subject-aligned spatiotemporal representations from raw EEG signals. The aligned representations were then used to extract differential entropy features for emotion classification. The performance of the proposed method was evaluated on our THU-EP dataset with 80 subjects and the publicly available SEED dataset with 15 subjects. It achieved cross-subject emotion recognition accuracy comparable to or better than state-of-the-art methods (i.e., 72.1% and 47.0% for binary and nine-class classification, respectively, on the THU-EP dataset, and 86.3% for three-class classification on the SEED dataset). The proposed method also generalized well to unseen emotional stimuli. CLISA is therefore expected to considerably increase the practicality of EEG-based emotion recognition by operating in a "plug-and-play" manner. Furthermore, the spatiotemporal representations learned by CLISA could provide insights into the neural mechanisms of human emotion processing.
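A minimal sketch of the encoder described above follows: a temporal convolution over raw EEG followed by a depthwise spatial convolution across electrodes, whose outputs can then be contrasted across subjects. The layer sizes and the 30-channel, 250 Hz input are illustrative assumptions, not the authors' exact architecture.

```python
# Hypothetical sketch of a temporal + depthwise-spatial EEG encoder (PyTorch).
import torch
import torch.nn as nn

class SpatioTemporalEncoder(nn.Module):
    def __init__(self, n_channels=30, n_temporal=16, depth=2):
        super().__init__()
        # temporal convolution: filters slide along time within each channel
        self.temporal = nn.Conv2d(1, n_temporal, kernel_size=(1, 64),
                                  padding=(0, 32), bias=False)
        # depthwise spatial convolution: spatial filters per temporal feature map
        self.spatial = nn.Conv2d(n_temporal, n_temporal * depth,
                                 kernel_size=(n_channels, 1),
                                 groups=n_temporal, bias=False)
        self.bn = nn.BatchNorm2d(n_temporal * depth)
        self.act = nn.ELU()

    def forward(self, x):            # x: (batch, 1, channels, time)
        h = self.temporal(x)
        h = self.act(self.bn(self.spatial(h)))
        return h.squeeze(2)          # (batch, features, time)

# Representations of two subjects watching the same stimulus would form a
# positive pair for the contrastive loss; different stimuli form negatives.
enc = SpatioTemporalEncoder()
z = enc(torch.randn(8, 1, 30, 250))  # 8 clips, 30 channels, 1 s at 250 Hz
```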
Abstract:Convolutional neural networks (CNNs) have been applied to learn spatial features for high-resolution (HR) synthetic aperture radar (SAR) image classification. However, there has been little work on integrating the unique statistical distributions of SAR images, which can reveal the physical properties of terrain objects, into CNNs in a supervised feature learning framework. To address this problem, a novel end-to-end supervised classification method is proposed for HR SAR images that considers both spatial context and statistical features. First, to extract more effective spatial features from SAR images, a new deep spatial context encoder network (DSCEN) is proposed, which is a lightweight structure and can be effectively trained with a small number of samples. Meanwhile, to enhance the diversity of statistics, the nonstationary joint statistical model (NS-JSM) is adopted to form the global statistical features. Specifically, SAR images are transformed into the Gabor wavelet domain, and the resulting multi-subband magnitudes and phases are modeled by log-normal and uniform distributions, respectively. The covariance matrix is further utilized to capture the inter-scale and intra-scale nonstationary correlations between the statistical subbands, making the joint statistical features more compact and distinguishable. Considering their complementary advantages, a feature fusion network (Fusion-Net) based on group compression and smooth normalization is constructed to embed the statistical features into the spatial features and optimize the fused feature representation. As a result, our model can learn discriminative features and improve the final classification performance. Experiments on four HR SAR images validate the superiority of the proposed method over other related algorithms.
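A minimal sketch of the fusion idea described above: a statistical feature vector is compressed and normalized, then concatenated with (embedded into) the spatial features before classification. The plain linear layer standing in for group compression, the L2 normalization standing in for smooth normalization, and all dimensions are assumptions for illustration, not the paper's Fusion-Net.

```python
# Hypothetical sketch of fusing spatial and statistical SAR features (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionNet(nn.Module):
    def __init__(self, spatial_dim=256, stat_dim=120, n_classes=6):
        super().__init__()
        self.compress = nn.Linear(stat_dim, 64)   # compress the statistical vector
        self.classify = nn.Linear(spatial_dim + 64, n_classes)

    def forward(self, spatial_feat, stat_feat):
        # normalize so the statistical branch cannot dominate the spatial one
        s = F.normalize(self.compress(stat_feat), dim=-1)
        fused = torch.cat([spatial_feat, s], dim=-1)
        return self.classify(fused)
```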
Abstract:Recent studies in computer vision mainly focus on natural images that depict real-world scenes, and they achieve outstanding performance on diverse tasks such as visual question answering. Diagrams are a special form of visual expression that frequently appears in education and are of great significance for learners to understand multimodal knowledge. Current research on diagrams focuses primarily on natural disciplines such as Biology and Geography, whose expressions are still similar to natural images. Another type of diagram, such as those from Computer Science, is composed of graphics containing complex topologies and relations, and research on this type of diagram is still blank. The main challenges of understanding graphic diagrams are the scarcity of data and the confusion of semantics, which are mainly reflected in the diversity of expressions. In this paper, we construct a novel dataset of graphic diagrams named Computer Science Diagrams (CSDia). It contains more than 1,200 diagrams with exhaustive annotations of objects and relations. Considering the visual noise caused by the varied expressions in diagrams, we introduce the topology of diagrams to parse their topological structure. We then propose Diagram Parsing Net (DPN), which represents a diagram through three branches: topology, visual features, and text, and apply the model to the diagram classification task to evaluate its ability to understand diagrams. The results show the effectiveness of the proposed DPN on diagram understanding.
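A minimal sketch of the three-branch design described above: topology, visual, and text features are encoded separately and concatenated for diagram classification. The branch input dimensions and the simple linear encoders are illustrative assumptions, not the paper's DPN architecture.

```python
# Hypothetical sketch of a three-branch diagram classifier (PyTorch).
import torch
import torch.nn as nn

class DiagramParsingNet(nn.Module):
    def __init__(self, topo_dim=64, vis_dim=512, txt_dim=300, n_classes=12):
        super().__init__()
        self.topo = nn.Linear(topo_dim, 128)   # e.g., graph-level topology vector
        self.vis = nn.Linear(vis_dim, 128)     # e.g., pooled CNN image feature
        self.txt = nn.Linear(txt_dim, 128)     # e.g., averaged word embeddings
        self.head = nn.Linear(3 * 128, n_classes)

    def forward(self, topo, vis, txt):
        h = torch.cat([self.topo(topo).relu(),
                       self.vis(vis).relu(),
                       self.txt(txt).relu()], dim=-1)
        return self.head(h)            # diagram-class logits
```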
Abstract:Classification is an important aspect of hyperspectral image processing and application. At present, researchers mostly use classic airborne hyperspectral imagery as benchmark datasets. However, existing datasets suffer from three bottlenecks: (1) low spatial resolution; (2) a low proportion of labeled pixels; and (3) a low degree of distinction between subclasses. In this paper, a new benchmark dataset named the Wuhan UAV-borne hyperspectral image (WHU-Hi) dataset was built for hyperspectral image classification. The WHU-Hi dataset has a high spectral resolution (nm level) and a very high spatial resolution (cm level), which we refer to here as H2 imagery. Besides, the WHU-Hi dataset has a higher pixel labeling ratio and finer subclasses. Several state-of-the-art hyperspectral image classification methods were benchmarked on the WHU-Hi dataset, and the experimental results show that WHU-Hi is a challenging dataset. We hope the WHU-Hi dataset can become a strong benchmark that accelerates future research.
Abstract:The inertial navigation system (INS) has been widely used to provide self-contained and continuous motion estimation in intelligent transportation systems. Recently, the emergence of chip-level inertial sensors has expanded the relevant applications from positioning, navigation, and mobile mapping to location-based services, unmanned systems, and transportation big data. Meanwhile, benefiting from the emergence of big data and improvements in algorithms and computing power, artificial intelligence (AI) has become a consensus tool that has been successfully applied in various fields. This article reviews the research on using AI technology to enhance inertial sensing from various aspects, including sensor design and selection, calibration and error modeling, navigation and motion-sensing algorithms, multi-sensor information fusion, system evaluation, and practical applications. Based on over 30 representative articles selected from nearly 300 related publications, this article summarizes the state of the art, advantages, and challenges in each aspect. Finally, it summarizes nine advantages and nine challenges of AI-enhanced inertial sensing and points out future research directions.
Abstract:Location is key to spatializing internet-of-things (IoT) data. However, it is challenging to use low-cost IoT devices for robust unsupervised localization (i.e., localization without training data that have known location labels). Thus, this paper proposes a deep reinforcement learning (DRL)-based unsupervised wireless-localization method. The main contributions are as follows. (1) This paper proposes an approach to model a continuous wireless-localization process as a Markov decision process (MDP) and process it within a DRL framework. (2) To alleviate the challenge of obtaining rewards when using unlabeled data (e.g., daily-life crowdsourced data), this paper presents a reward-setting mechanism that extracts robust landmark data from unlabeled wireless received signal strengths (RSS). (3) To ease the requirements for model retraining when using DRL for localization, this paper uses RSS measurements together with the agent location to construct DRL inputs. The proposed method was tested using field data from multiple Bluetooth 5 smart ear tags in a pasture. Meanwhile, the experimental verification process reflected the advantages of and the challenges in using DRL for wireless localization.
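A minimal sketch of the MDP formulation described above: the DRL state combines the current RSS measurement with the agent's location estimate, actions move the estimate on a grid, and rewards are issued only at extracted landmark fixes. The grid size, four-way action set, and rounded-RSS landmark lookup are illustrative assumptions, not the paper's exact design.

```python
# Hypothetical sketch of a wireless-localization MDP with landmark rewards.
import numpy as np

ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # move the estimate N/S/E/W

class LocalizationMDP:
    def __init__(self, landmarks, grid=(50, 50)):
        self.landmarks = landmarks    # {rounded RSS signature: true grid cell}
        self.grid = np.array(grid)
        self.pos = self.grid // 2     # initial location estimate

    def state(self, rss):
        # DRL input: RSS measurement concatenated with the agent location
        return np.concatenate([rss, self.pos])

    def step(self, action_idx, rss):
        self.pos = np.clip(self.pos + ACTIONS[action_idx], 0, self.grid - 1)
        key = tuple(np.round(rss).astype(int))
        if key in self.landmarks:     # reward only at robust landmark fixes
            reward = -np.linalg.norm(self.pos - self.landmarks[key])
        else:                         # unlabeled step: no reward signal
            reward = 0.0
        return self.state(rss), reward
```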