Abstract:Generalized zero-shot learning (GZSL) has achieved significant progress, with many efforts dedicated to overcoming the problems of the visual-semantic domain gap and seen-unseen bias. However, most existing methods directly use feature extraction models trained on ImageNet alone, ignoring the cross-dataset bias between ImageNet and GZSL benchmarks. Such a bias inevitably results in poor-quality visual features for GZSL tasks, which potentially limits recognition performance on both seen and unseen classes. In this paper, we propose a simple yet effective GZSL method, termed feature refinement for generalized zero-shot learning (FREE), to tackle this problem. FREE employs a feature refinement (FR) module that incorporates \textit{semantic$\rightarrow$visual} mapping into a unified generative model to refine the visual features of seen and unseen class samples. Furthermore, we propose a self-adaptive margin center loss (SAMC-loss) that cooperates with a semantic cycle-consistency loss to guide FR to learn class- and semantically-relevant representations, and we concatenate the features in FR to extract the fully refined features. Extensive experiments on five benchmark datasets demonstrate the significant performance gain of FREE over its baseline and current state-of-the-art methods. Our code is available at https://github.com/shiming-chen/FREE.
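As an illustration of the margin-based center loss idea behind the SAMC-loss, a minimal PyTorch sketch is given below; the learnable class centers, the hinge form, and the fixed margin are simplifying assumptions rather than the paper's exact self-adaptive formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MarginCenterLoss(nn.Module):
    """Pull features toward their own class center while keeping all other
    centers at least `margin` farther away (hinge form)."""
    def __init__(self, num_classes, feat_dim, margin=0.5):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.margin = margin  # fixed here; the paper's SAMC-loss adapts the margin

    def forward(self, feats, labels):
        dists = torch.cdist(feats, self.centers)            # (B, C) distances to all centers
        pos = dists.gather(1, labels.unsqueeze(1))          # distance to own class center
        mask = F.one_hot(labels, dists.size(1)).bool()
        neg = dists.masked_fill(mask, float('inf')).min(1, keepdim=True).values
        return F.relu(pos - neg + self.margin).mean()       # own center closer by a margin
```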
Abstract:It is essential but challenging to predict the future trajectories of various agents in complex scenes. Agents' internal personality factors, the interactive behavior of their neighborhood, and the influence of the surroundings all affect their future behavior styles, which means that even agents of the same physical type can have very different behavior preferences. Although recent works have made significant progress in studying agents' multi-modal planning, most of them still apply the same prediction strategy to all agents, which makes it difficult to fully capture the multiple styles of diverse agents. In this paper, we propose the Multi-Style Network (MSN) to address this problem by adaptively dividing agents' preference styles into several hidden behavior categories and training a separate prediction network for each category, thereby giving agents predictions in all styles simultaneously. Experiments demonstrate that our deterministic MSN-D and generative MSN-G outperform many recent state-of-the-art methods and show better multi-style characteristics in the visualized results.
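A minimal sketch of the multi-style idea follows: a shared trajectory encoder feeds several style-specific prediction heads, so every agent receives one candidate future per style. The GRU encoder, layer sizes, and number of styles are illustrative assumptions, not the exact MSN architecture.

```python
import torch
import torch.nn as nn

class MultiStylePredictor(nn.Module):
    def __init__(self, obs_len=8, pred_len=12, hidden=64, num_styles=4):
        super().__init__()
        self.encoder = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        # One lightweight decoder head per hidden behavior style.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, pred_len * 2) for _ in range(num_styles)]
        )
        self.pred_len = pred_len

    def forward(self, obs_traj):                  # obs_traj: (B, obs_len, 2) observed positions
        _, h = self.encoder(obs_traj)             # h: (1, B, hidden)
        h = h.squeeze(0)
        # Every style head predicts a full future trajectory for every agent.
        preds = [head(h).view(-1, self.pred_len, 2) for head in self.heads]
        return torch.stack(preds, dim=1)          # (B, num_styles, pred_len, 2)
```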
Abstract:The performance of data analysis and feature learning can be improved if a suitable pattern matching mechanism is available. One feasible solution is to estimate the importance of individual instances, and consequently kernel mean matching (KMM) has become an important method for knowledge discovery and novelty detection in kernel machines. However, existing KMM methods are tied to specific learning frameworks. In this work, we propose a novel approach to the adaptive matching of kernel means, in which selected data with high importance are used to achieve computational efficiency in the optimization. In addition, the proposed method supports scalable learning as a generalized solution for matching appended data. Experimental results on a wide variety of real-world data sets demonstrate that the proposed method achieves outstanding performance compared with several state-of-the-art methods while preserving computational efficiency.
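For reference, a minimal sketch of classical kernel mean matching is shown below; it estimates importance weights so that the weighted source kernel mean matches the target kernel mean. For brevity it uses a ridge-regularized solve with clipping instead of the full constrained quadratic program, and it does not implement the adaptive or scalable variants described above.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kmm_weights(X_src, X_tgt, gamma=1.0, ridge=1e-3, max_w=10.0):
    """Importance weights for source samples so that their (weighted) kernel mean
    approximates the target kernel mean."""
    n_src, n_tgt = len(X_src), len(X_tgt)
    K = rbf_kernel(X_src, X_src, gamma)                        # (n_src, n_src)
    kappa = rbf_kernel(X_src, X_tgt, gamma).sum(1) * n_src / n_tgt
    beta = np.linalg.solve(K + ridge * np.eye(n_src), kappa)   # ridge-regularized solve
    return np.clip(beta, 0.0, max_w)                           # keep weights non-negative and bounded

# Toy usage: source samples shifted away from the target distribution.
X_src = np.random.randn(200, 2) + 1.0
X_tgt = np.random.randn(100, 2)
w = kmm_weights(X_src, X_tgt)
```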
Abstract:Visual images usually contain informative context about the environment and thereby help to predict agents' behaviors. However, since their semantics are fixed, they can hardly capture the dynamic effects on agents' actual behaviors. To solve this problem, we propose a deterministic model named BGM that constructs a guidance map to represent the dynamic semantics, thereby avoiding the use of static visual images and reflecting, for each agent, the differences in activities across different periods. We first record all agents' activities in the scene within a recent period to construct a guidance map and then feed it to a Context CNN to obtain their context features. We adopt a Historical Trajectory Encoder to extract trajectory features and combine them with the context features as the input of a social-energy-based trajectory decoder, thus obtaining predictions that obey social rules. Experiments demonstrate that BGM achieves state-of-the-art prediction accuracy on the two widely used ETH and UCY datasets and handles more complex scenarios.
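A minimal sketch of constructing such a guidance map is given below: recent agent positions are rasterized into a 2-D grid so that frequently visited regions receive higher values, and the resulting map could then be fed to the Context CNN. The grid resolution and normalization are illustrative assumptions.

```python
import numpy as np

def build_guidance_map(recent_points, scene_size=(20.0, 20.0), grid=(100, 100)):
    """recent_points: (N, 2) array of (x, y) positions observed in the recent window."""
    H = np.zeros(grid, dtype=np.float32)
    sx, sy = scene_size
    for x, y in recent_points:
        i = int(np.clip(x / sx * (grid[0] - 1), 0, grid[0] - 1))
        j = int(np.clip(y / sy * (grid[1] - 1), 0, grid[1] - 1))
        H[i, j] += 1.0                      # accumulate activity counts per cell
    return H / max(H.max(), 1e-6)           # normalized map, later fed to the context CNN
```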
Abstract:Generative adversarial networks (GANs) have become a popular deep generative model for real-world applications. Despite many recent efforts on GANs, however, mode collapse and instability remain open problems caused by their adversarial optimization difficulties. In this paper, motivated by cooperative co-evolutionary algorithms, we propose a Cooperative Dual Evolution based Generative Adversarial Network (CDE-GAN) to circumvent these drawbacks. In essence, CDE-GAN incorporates dual evolution with respect to the generator(s) and discriminators into a unified evolutionary adversarial framework; it thus exploits their complementary properties and injects dual mutation diversity into training to steadily diversify the estimated density, capture multiple modes, and improve generative performance. Specifically, CDE-GAN decomposes the complex adversarial optimization problem into two subproblems (generation and discrimination), and each subproblem is solved by a separate subpopulation (E-Generators and E-Discriminators) evolved by its own evolutionary algorithm. Additionally, to keep the balance between E-Generators and E-Discriminators, we propose a Soft Mechanism that coordinates them to conduct effective adversarial training. Extensive experiments on one synthetic dataset and three real-world benchmark image datasets demonstrate that the proposed CDE-GAN achieves competitive and superior performance in generating high-quality and diverse samples over baselines. The code and more generated results are available at our project homepage https://shiming-chen.github.io/CDE-GAN-website/CDE-GAN.html.
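A skeleton of one cooperative-evolution step is sketched below: each subpopulation produces mutated offspring and the fittest individuals survive. The Gaussian-perturbation mutation and the generic fitness callback are placeholders; CDE-GAN's actual variation operators, fitness measures, and Soft Mechanism follow the paper.

```python
import copy
import torch

def mutate(model, sigma=1e-3):
    """Produce an offspring by perturbing a parent network's parameters."""
    child = copy.deepcopy(model)
    with torch.no_grad():
        for p in child.parameters():
            p.add_(sigma * torch.randn_like(p))
    return child

def evolve(population, fitness_fn, n_offspring=2):
    """One generation for a subpopulation (e.g., E-Generators or E-Discriminators):
    generate offspring, then keep the fittest individuals."""
    candidates = list(population)
    for parent in population:
        candidates += [mutate(parent) for _ in range(n_offspring)]
    scored = sorted(candidates, key=fitness_fn, reverse=True)
    return scored[:len(population)]          # survivors of this generation
```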
Abstract:Low-rank Multi-view Subspace Learning (LMvSL) has shown great potential for cross-view classification in recent years. Despite their empirical success, existing LMvSL based methods cannot handle view discrepancy and discriminancy simultaneously, which leads to performance degradation when there is a large discrepancy among multi-view data. To circumvent this drawback, motivated by block-diagonal representation learning, we propose Structured Low-rank Matrix Recovery (SLMR), a method that effectively removes view discrepancy and improves discriminancy through the recovery of a structured low-rank matrix. Furthermore, recent low-rank models address contaminated data under predefined assumptions about the noise distribution, such as Gaussian or Laplacian distributions. However, such models are impractical because complicated noise in practice may violate these assumptions and the distribution is generally unknown in advance. To alleviate this limitation, we elegantly incorporate modal regression into the SLMR framework (termed MR-SLMR). Different from previous LMvSL based methods, our MR-SLMR can handle any zero-mode noise variable, which covers a wide range of noise types such as Gaussian noise, random noise, and outliers. The alternating direction method of multipliers (ADMM) framework and half-quadratic theory are used to optimize MR-SLMR efficiently. Experimental results on four public databases demonstrate the superiority of MR-SLMR and its robustness to complicated noise.
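The nuclear-norm subproblem inside ADMM-based low-rank recovery is commonly solved by singular value thresholding; a minimal sketch of that standard building block is given below. It illustrates one step of such solvers, not the full MR-SLMR model with structured constraints and modal regression.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * ||X||_* at M,
    used as one sub-step inside ADMM-based low-rank matrix recovery."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - tau, 0.0)        # shrink singular values toward zero
    return (U * s) @ Vt
```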
Abstract:Automatically predicting multi-agent trajectories remains challenging due to multiple interactions, including agent-to-agent and scene-to-agent interactions. Although recent methods have achieved promising performance, most of them consider only the spatial influence of the interactions and ignore the fact that temporal influence always accompanies spatial influence. Moreover, methods based on scene information usually require extra segmented scene images to generate multiple socially acceptable trajectories. To address these limitations, we propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC). First, a spatial-temporal attention mechanism is presented to explore the most useful and important information. Second, we construct a joint feature sequence based on sequential and instantaneous state information to make the generated trajectories maintain spatial continuity. Experiments are performed on the two widely used ETH and UCY datasets and demonstrate that the proposed model achieves state-of-the-art prediction accuracy and handles more complex scenarios.
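A minimal sketch of a spatial-temporal attention block is shown below: attention is applied along the temporal axis for each agent and along the agent (spatial) axis at each time step. The use of standard multi-head attention and the additive combination are assumptions for illustration, not the exact STAN-SC design.

```python
import torch
import torch.nn as nn

class SpatialTemporalAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.temporal = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.spatial = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                      # x: (num_agents, seq_len, dim) agent features
        t_out, _ = self.temporal(x, x, x)      # attend over time steps for each agent
        xs = x.transpose(0, 1)                 # (seq_len, num_agents, dim)
        s_out, _ = self.spatial(xs, xs, xs)    # attend over agents at each time step
        return t_out + s_out.transpose(0, 1)   # simple additive fusion (assumed)
```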
Abstract:Dynamic texture (DT) exhibits statistical stationarity in the spatial domain and stochastic repetitiveness in the temporal dimension, indicating that different frames of a DT possess a high similarity correlation. However, existing DT synthesis methods do not exploit this similarity prior, which can explicitly capture the homogeneous and heterogeneous correlation between different frames of a DT. In this paper, we propose a novel DT synthesis method, named Similarity-DT, which embeds the similarity prior into the representation of DT. Specifically, we first raise two hypotheses: the content of texture video frames varies over time, and temporally closer frames should be more similar; and the frame-to-frame transition can be modeled as a linear or nonlinear function to capture the similarity correlation. Our Similarity-DT then integrates kernel learning and the extreme learning machine (ELM) into a unified synthesis model that learns a kernel similarity embedding to represent the spatial-temporal transitions between frames of DTs. Extensive experiments on DT videos collected from the Internet and two benchmark datasets, i.e., Gatech Graphcut Textures and Dyntex, demonstrate that the learned kernel similarity embedding provides a discriminative representation for DTs. Hence our method is capable of preserving the long-term temporal continuity of the synthesized DT sequences with excellent sustainability and generalization. We also show that our method generates realistic DT videos with faster speed and lower computation than state-of-the-art approaches.
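As background for the ELM component, a minimal extreme learning machine regressor is sketched below: hidden weights are random and fixed, and only the output weights are solved in closed form, e.g., to map each frame's features to the next frame's. The kernel similarity embedding itself is omitted from this sketch.

```python
import numpy as np

class ELMRegressor:
    """Extreme learning machine: random fixed hidden layer + ridge-regularized
    closed-form output weights."""
    def __init__(self, in_dim, hidden=256, ridge=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((in_dim, hidden))
        self.b = rng.standard_normal(hidden)
        self.ridge = ridge

    def _h(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):                       # e.g. X = frame features[:-1], Y = features[1:]
        H = self._h(X)
        A = H.T @ H + self.ridge * np.eye(H.shape[1])
        self.beta = np.linalg.solve(A, H.T @ Y)
        return self

    def predict(self, X):
        return self._h(X) @ self.beta
```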
Abstract:Weakly-supervised semantic segmentation aims to assign each pixel a semantic category under weak supervision, such as image-level tags. Most existing weakly-supervised semantic segmentation methods do not use any feedback from the segmentation output and can be considered open-loop systems. They are prone to accumulated errors because of static seeds and sensitive structure information. In this paper, we propose a generic self-adaptation mechanism for existing weakly-supervised semantic segmentation methods by introducing two feedback chains, thus constituting a closed-loop system. Specifically, the first chain iteratively produces dynamic seeds by incorporating cross-image structure information, whereas the second chain further expands seed regions by a customized random walk process to reconcile inner-image structure information characterized by superpixels. Experiments on PASCAL VOC 2012 suggest that our network outperforms state-of-the-art methods with significantly less computational and memory burden.
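A minimal sketch of expanding seed regions by a random walk over a superpixel affinity graph is given below: seed scores are propagated along row-normalized affinities for a few iterations. The affinity construction, restart weight, and stopping rule are illustrative assumptions rather than the customized process described above.

```python
import numpy as np

def random_walk_expand(affinity, seed_scores, alpha=0.8, iters=10):
    """affinity: (N, N) superpixel similarities; seed_scores: (N, C) initial class scores."""
    P = affinity / affinity.sum(axis=1, keepdims=True)    # row-normalized transition matrix
    scores = seed_scores.copy()
    for _ in range(iters):
        # Propagate along the graph while keeping a pull toward the original seeds.
        scores = alpha * P @ scores + (1 - alpha) * seed_scores
    return scores.argmax(axis=1)                           # expanded superpixel labels
```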
Abstract:High angular resolution diffusion imaging (HARDI) demands a larger amount of data measurements compared to diffusion tensor imaging, restricting its use in practice. In this work, we explore a learning-based approach to reconstruct HARDI from a smaller number of measurements in q-space. The approach aims to directly learn the mapping relationship between the measured and HARDI signals from previously collected HARDI acquisitions of other subjects. Specifically, the mapping is represented as a 1D encoder-decoder convolutional neural network, designed under the guidance of compressed sensing (CS) theory for HARDI reconstruction. The proposed network architecture mainly consists of two parts: an encoder network that produces sparse coefficients and a decoder network that yields the reconstruction result. Experimental results demonstrate that we can robustly reconstruct HARDI signals with accurate results and fast speed.
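A minimal sketch of such a 1D encoder-decoder network is given below, mapping a sparsely sampled q-space signal to a densely sampled HARDI signal. The channel sizes, depths, and numbers of input/output measurements are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class HARDINet1D(nn.Module):
    def __init__(self, in_len=30, out_len=90, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(                 # produces sparse-coefficient-like features
            nn.Conv1d(1, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(                 # maps features to the dense signal
            nn.Conv1d(hidden, 1, kernel_size=3, padding=1),
            nn.Linear(in_len, out_len),
        )

    def forward(self, x):                             # x: (B, 1, in_len) measured q-space signal
        return self.decoder(self.encoder(x))          # (B, 1, out_len) reconstructed HARDI signal
```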