Jiang Wang

Consistent Video Instance Segmentation with Inter-Frame Recurrent Attention

Jun 14, 2022

Quanzeng You, Jiang Wang, Peng Chu, Andre Abrantes, Zicheng Liu

Abstract:Video instance segmentation aims at predicting object segmentation masks for each frame, as well as associating the instances across multiple frames. Recent end-to-end video instance segmentation methods are capable of performing object segmentation and instance association together in a direct parallel sequence decoding/prediction framework. Although these methods generally predict higher quality object segmentation masks, they can fail to associate instances in challenging cases because they do not explicitly model the temporal instance consistency for adjacent frames. We propose a consistent end-to-end video instance segmentation framework with Inter-Frame Recurrent Attention to model both the temporal instance consistency for adjacent frames and the global temporal context. Our extensive experiments demonstrate that the Inter-Frame Recurrent Attention significantly improves temporal instance consistency while maintaining the quality of the object segmentation masks. Our model achieves state-of-the-art accuracy on both YouTubeVIS-2019 (62.1%) and YouTubeVIS-2021 (54.7%) datasets. In addition, quantitative and qualitative results show that the proposed methods predict more temporally consistent instance segmentation masks.

* 11 pages, 5 figures, 4 tables

Augmenting Knowledge Graphs for Better Link Prediction

Mar 26, 2022

Jiang Wang, Filip Ilievski, Pedro Szekely, Ke-Thia Yao

Abstract:Embedding methods have demonstrated robust performance on the task of link prediction in knowledge graphs, by mostly encoding entity relationships. Recent methods propose to enhance the loss function with a literal-aware term. In this paper, we propose KGA: a knowledge graph augmentation method that incorporates literals in an embedding model without modifying its loss function. KGA discretizes quantity and year values into bins, and chains these bins both horizontally, modeling neighboring values, and vertically, modeling multiple levels of granularity. KGA is scalable and can be used as a pre-processing step for any existing knowledge graph embedding model. Experiments on legacy benchmarks and a new large benchmark, DWD, show that augmenting the knowledge graph with quantities and years is beneficial for predicting both entities and numbers, as KGA outperforms the vanilla models and other relevant baselines. Our ablation studies confirm that both quantities and years contribute to KGA's performance, and that its performance depends on the discretization and binning settings. We make the code, models, and the DWD benchmark publicly available to facilitate reproducibility and future research.
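
The binning and chaining idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' KGA code: it assumes equal-frequency (quantile) bins and two granularity levels, and the relation names `next_bin` and `coarser_bin`, the node naming scheme, and the example entities are all hypothetical.

```python
from bisect import bisect_right


def quantile_edges(values, num_bins):
    """Interior bin edges chosen so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // num_bins] for i in range(1, num_bins)]


def augment(facts, attribute, coarse_bins=2, fine_bins=4):
    """Turn (entity, value) literal facts into entity-to-bin and bin-to-bin triples.

    Relation and node names are hypothetical, chosen only for illustration.
    """
    values = [v for _, v in facts]
    levels = (coarse_bins, fine_bins)
    edges = {n: quantile_edges(values, n) for n in levels}

    def node(level, value):
        # Name of the bin that `value` falls into at the given granularity level.
        return f"{attribute}/L{level}/bin{bisect_right(edges[level], value)}"

    triples = set()
    for entity, value in facts:
        # Link each entity to its bin at every granularity level.
        for level in levels:
            triples.add((entity, attribute, node(level, value)))
        # Vertical chain: the fine bin points to the coarse bin holding the same value.
        triples.add((node(fine_bins, value), "coarser_bin", node(coarse_bins, value)))
    for level in levels:
        # Horizontal chain: neighbouring bins at the same level are connected.
        for b in range(level - 1):
            triples.add((f"{attribute}/L{level}/bin{b}", "next_bin",
                         f"{attribute}/L{level}/bin{b + 1}"))
    return sorted(triples)


# Hypothetical literal facts: four entities with a year-valued attribute.
facts = [("Q1", 1950), ("Q2", 1975), ("Q3", 1990), ("Q4", 2010)]
triples = augment(facts, "birth_year")
```

Because the output is an ordinary list of entity-relation-entity triples, it could in principle be appended to a graph before training any embedding model, which matches the abstract's description of augmentation as a pre-processing step.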

Deep Frequency Filtering for Domain Generalization

Mar 23, 2022

Shiqi Lin, Zhizheng Zhang, Zhipeng Huang, Yan Lu, Cuiling Lan, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Amey Parulkar (+2 more)

Abstract:Improving the generalization capability of Deep Neural Networks (DNNs) is critical for their practical uses, which has been a longstanding challenge. Some theoretical studies have revealed that DNNs have preferences for different frequency components in the learning process and indicated that this may affect the robustness of learned features. In this paper, we propose Deep Frequency Filtering (DFF) for learning domain-generalizable features, which is the first endeavour to explicitly modulate frequency components of different transfer difficulties across domains during training. To achieve this, we perform Fast Fourier Transform (FFT) on feature maps at different layers, then adopt a light-weight module to learn the attention masks from frequency representations after FFT to enhance transferable frequency components while suppressing the components not conducive to generalization. Further, we empirically compare different types of attention for implementing our conceptualized DFF. Extensive experiments demonstrate the effectiveness of the proposed DFF and show that applying DFF on a plain baseline outperforms the state-of-the-art methods on different domain generalization tasks, including close-set classification and open-set retrieval.
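
The transform, mask, and inverse-transform pattern mentioned in this abstract can be sketched in a few lines. This is a simplified illustration, not the DFF module: it assumes a 1-D signal and a naive discrete Fourier transform instead of a 2-D FFT over feature maps, and the mask logits are set by hand here, whereas the abstract describes learning the mask with a light-weight module. All function names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform; keeps only the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def frequency_filter(signal, mask_logits):
    """Scale each frequency component by sigmoid(logit), then transform back."""
    spectrum = dft(signal)
    mask = [1.0 / (1.0 + math.exp(-z)) for z in mask_logits]
    return idft([m * s for m, s in zip(mask, spectrum)])


# A constant offset of 1 plus a component alternating at the highest frequency.
signal = [2.0, 0.0, 2.0, 0.0, 2.0, 0.0, 2.0, 0.0]
# Hand-set logits (an assumption): keep every component except index 4,
# which holds the alternating part for a length-8 signal.
logits = [20.0, 20.0, 20.0, 20.0, -20.0, 20.0, 20.0, 20.0]
filtered = frequency_filter(signal, logits)
```

With these logits the alternating component is almost entirely suppressed, so `filtered` is close to a constant sequence of ones.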

Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation

Dec 13, 2021

Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Zheng-jun Zha

Abstract:Unsupervised domain adaptive person re-identification (ReID) has been extensively investigated to mitigate the adverse effects of domain gaps. Those works assume the target domain data can be accessible all at once. However, for the real-world streaming data, this hinders the timely adaptation to changing data statistics and sufficient exploitation of increasing samples. In this paper, to address more practical scenarios, we propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID. This is challenging because it requires the model to continuously adapt to unlabeled data of the target environments while alleviating catastrophic forgetting for such a fine-grained person retrieval task. We design an effective scheme for this task, dubbed CLUDA-ReID, where the anti-forgetting is harmoniously coordinated with the adaptation. Specifically, a meta-based Coordinated Data Replay strategy is proposed to replay old data and update the network with a coordinated optimization direction for both adaptation and memorization. Moreover, we propose Relational Consistency Learning for old knowledge distillation/inheritance in line with the objective of retrieval-based tasks. We set up two evaluation settings to simulate the practical application scenarios. Extensive experiments demonstrate the effectiveness of our CLUDA-ReID for both scenarios with stationary target streams and scenarios with dynamic target streams.

MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark

Nov 30, 2021

Xiaotian Han, Quanzeng You, Chunyu Wang, Zhizheng Zhang, Peng Chu, Houdong Hu, Jiang Wang, Zicheng Liu

Abstract:Multi-camera tracking systems are gaining popularity in applications that demand high-quality tracking results, such as frictionless checkout, because monocular multi-object tracking (MOT) systems often fail in cluttered and crowded environments due to occlusion. Multiple highly overlapped cameras can significantly alleviate the problem by recovering partial 3D information. However, the cost of creating a high-quality multi-camera tracking dataset with diverse camera settings and backgrounds has limited the dataset scale in this domain. In this paper, we provide a large-scale densely-labeled multi-camera tracking dataset in five different environments with the help of an auto-annotation system. The system uses overlapped and calibrated depth and RGB cameras to build a high-performance 3D tracker that automatically generates the 3D tracking results. The 3D tracking results are projected to each RGB camera view using camera parameters to create 2D tracking results. Then, we manually check and correct the 3D tracking results to ensure the label quality, which is much cheaper than fully manual annotation. We have conducted extensive experiments using two real-time multi-camera trackers and a person re-identification (ReID) model with different settings. This dataset provides a more reliable benchmark of multi-camera, multi-object tracking systems in cluttered and crowded environments. Also, our results demonstrate that adapting the trackers and ReID models on this dataset significantly improves their performance. Our dataset will be publicly released upon the acceptance of this work.

TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking

Apr 03, 2021

Peng Chu, Jiang Wang, Quanzeng You, Haibin Ling, Zicheng Liu

Abstract:Tracking multiple objects in videos relies on modeling the spatial-temporal interactions of the objects. In this paper, we propose a solution named TransMOT, which leverages powerful graph transformers to efficiently model the spatial and temporal interactions among the objects. TransMOT effectively models the interactions of a large number of objects by arranging the trajectories of the tracked objects as a set of sparse weighted graphs, and constructing a spatial graph transformer encoder layer, a temporal transformer encoder layer, and a spatial graph transformer decoder layer based on the graphs. TransMOT is not only more computationally efficient than the traditional Transformer, but it also achieves better tracking accuracy. To further improve the tracking speed and accuracy, we propose a cascade association framework to handle low-score detections and long-term occlusions that require large computational resources to model in TransMOT. The proposed method is evaluated on multiple benchmark datasets including MOT15, MOT16, MOT17, and MOT20, and it achieves state-of-the-art performance on all the datasets.

Computational Impact Time Guidance: A Learning-Based Prediction-Correction Approach

Mar 09, 2021

Zichao Liu, Jiang Wang, Shaoming He, Hyo-Sang Shin, Antonios Tsourdos

Abstract:This paper investigates the problem of impact-time-control and proposes a learning-based computational guidance algorithm to solve this problem. The proposed guidance algorithm is developed based on a general prediction-correction concept: the exact time-to-go under proportional navigation guidance with realistic aerodynamic characteristics is estimated by a deep neural network and a biased command to nullify the impact time error is developed by utilizing the emerging reinforcement learning techniques. The deep neural network is augmented into the reinforcement learning block to resolve the issue of sparse reward that has been observed in typical reinforcement learning formulation. Extensive numerical simulations are conducted to support the proposed algorithm.

* Submitted to IEEE

Coarse Graining Molecular Dynamics with Graph Neural Networks

Aug 21, 2020

Brooke E. Husic, Nicholas E. Charron, Dominik Lemm, Jiang Wang, Adrià Pérez, Andreas Krämer, Yaoyi Chen, Simon Olsson, Gianni de Fabritiis, Frank Noé (+1 more)

Abstract:Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proven that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features upon which to machine learn the force field. In the present contribution, we build upon the advance of Wang et al. and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learns their own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems.

* 27 pages, 23 figures

Ensemble Learning of Coarse-Grained Molecular Dynamics Force Fields with a Kernel Approach

May 04, 2020

Jiang Wang, Stefan Chmiela, Klaus-Robert Müller, Frank Noé, Cecilia Clementi

Abstract:Gradient-domain machine learning (GDML) is an accurate and efficient approach to learn a molecular potential and associated force field based on the kernel ridge regression algorithm. Here, we demonstrate its application to learn an effective coarse-grained (CG) model from all-atom simulation data in a sample efficient manner. The coarse-grained force field is learned by following the thermodynamic consistency principle, here by minimizing the error between the predicted coarse-grained force and the all-atom mean force in the coarse-grained coordinates. Solving this problem by GDML directly is impossible because coarse-graining requires averaging over many training data points, resulting in impractical memory requirements for storing the kernel matrices. In this work, we propose a data-efficient and memory-saving alternative. Using ensemble learning and stratified sampling, we propose a 2-layer training scheme that enables GDML to learn an effective coarse-grained model. We illustrate our method on a simple biomolecular system, alanine dipeptide, by reconstructing the free energy landscape of a coarse-grained variant of this molecule. Our novel GDML training scheme yields a smaller free energy error than neural networks when the training set is small, and a comparably high accuracy when the training set is sufficiently large.

* 14 pages, 6 figures

RC-DARTS: Resource Constrained Differentiable Architecture Search

Dec 30, 2019

Xiaojie Jin, Jiang Wang, Joshua Slocum, Ming-Hsuan Yang, Shengyang Dai, Shuicheng Yan, Jiashi Feng

Abstract:Recent advances show that Neural Architectural Search (NAS) methods are able to find state-of-the-art image classification deep architectures. In this paper, we consider the one-shot NAS problem for resource constrained applications. This problem is of great interest because it is critical to choose different architectures according to task complexity when the resource is constrained. Previous techniques are either too slow for one-shot learning or do not take the resource constraint into consideration. In this paper, we propose the resource constrained differentiable architecture search (RC-DARTS) method to learn architectures that are significantly smaller and faster while achieving comparable accuracy. Specifically, we propose to formulate the RC-DARTS task as a constrained optimization problem by adding the resource constraint. An iterative projection method is proposed to solve the given constrained optimization problem. We also propose a multi-level search strategy to enable layers at different depths to adaptively learn different types of neural architectures. Through extensive experiments on the CIFAR-10 and ImageNet datasets, we show that the RC-DARTS method learns lightweight neural architectures which have smaller model size and lower computational complexity while achieving comparable or better performance than the state-of-the-art methods.

* Tech report
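
The RC-DARTS abstract above describes solving a constrained optimization problem with an iterative projection method. The sketch below shows the general pattern only, as generic projected gradient descent on a toy problem. It is not the RC-DARTS procedure: the quadratic objective, the single linear budget constraint, and every name and number are assumptions chosen for illustration.

```python
def project_onto_budget(x, cost, budget):
    """Euclidean projection of x onto the half-space {x : cost . x <= budget}."""
    usage = sum(c * xi for c, xi in zip(cost, x))
    if usage <= budget:
        return list(x)  # already feasible, nothing to do
    scale = (usage - budget) / sum(c * c for c in cost)
    return [xi - scale * c for c, xi in zip(cost, x)]


def projected_gradient_descent(target, cost, budget, lr=0.1, steps=200):
    """Minimize sum((x - target)^2) subject to cost . x <= budget."""
    x = [0.0] * len(target)
    for _ in range(steps):
        # Unconstrained gradient step on the objective.
        grad = [2.0 * (xi - ti) for xi, ti in zip(x, target)]
        x = [xi - lr * g for xi, g in zip(x, grad)]
        # Projection step: pull the iterate back inside the resource budget.
        x = project_onto_budget(x, cost, budget)
    return x


target = [1.0, 1.0]  # unconstrained optimum, which would use 4.0 units of resource
cost = [1.0, 3.0]    # hypothetical per-coordinate resource cost
budget = 2.0
x = projected_gradient_descent(target, cost, budget)
usage = sum(c * xi for c, xi in zip(cost, x))
```

For this toy problem the iterates settle near `[0.8, 0.4]`, the closest point to `target` that stays within the budget.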
