Tianjia Shao

Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack

Nov 21, 2022

Yunfeng Diao, He Wang, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg

Abstract: Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g. self-driving cars, where safety and lives are at stake. Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks, which is concerning given the scale of the implications. However, the proposed attacks require full knowledge of the attacked classifier, which is overly restrictive. In this paper, we show that such threats indeed exist even when the attacker only has access to the input/output of the model. To this end, we propose BASAR, the very first black-box adversarial attack approach in skeleton-based HAR. BASAR explores the interplay between the classification boundary and the natural motion manifold. To the best of our knowledge, this is the first time the data manifold is introduced in adversarial attacks on time series. Via BASAR, we find that on-manifold adversarial samples are extremely deceitful and rather common in skeletal motions, in contrast to the common belief that adversarial samples only exist off-manifold. Through exhaustive evaluation, we show that BASAR can deliver successful attacks across classifiers, datasets, and attack modes. Through these attacks, BASAR helps identify the potential causes of model vulnerability and provides insights on possible improvements. Finally, to mitigate the newly identified threat, we propose a new adversarial training approach, called mixed manifold-based adversarial training (MMAT), which leverages the sophisticated distributions of on/off-manifold adversarial samples. MMAT can successfully help defend against adversarial attacks without compromising classification accuracy.

* arXiv admin note: substantial text overlap with arXiv:2103.05266

Predicting Loose-Fitting Garment Deformations Using Bone-Driven Motion Networks

May 06, 2022

Xiaoyu Pan, Jiaming Mai, Xinwei Jiang, Dongxue Tang, Jingxiang Li, Tianjia Shao, Kun Zhou, Xiaogang Jin, Dinesh Manocha

Abstract: We present a learning algorithm that uses bone-driven motion networks to predict the deformation of loose-fitting garment meshes at interactive rates. Given a garment, we generate a simulation database and extract virtual bones from simulated mesh sequences using skin decomposition. At runtime, we compute low- and high-frequency deformations separately, in sequence. The low-frequency deformations are predicted by transferring body motions to the virtual bones' motions, and the high-frequency deformations are estimated by leveraging the global information of the virtual bones' motions and local information extracted from the low-frequency meshes. In addition, our method can estimate garment deformations caused by variations of the simulation parameters (e.g., the fabric's bending stiffness) by using an RBF kernel to ensemble networks trained for different sets of simulation parameters. Through extensive comparisons, we show that our method outperforms state-of-the-art methods in terms of prediction accuracy of mesh deformations by about 20% in RMSE and 10% in Hausdorff distance and STED. The code and data are available at https://github.com/non-void/VirtualBones.

* SIGGRAPH 22 Conference Paper
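
The abstract above mentions ensembling networks trained for different simulation parameters with an RBF kernel, but does not give the formula. The following is only a minimal sketch of one common way to do this, assuming normalized Gaussian RBF weights over the distance between the query parameters and each training parameter set. The function names, the `sigma` bandwidth, and the flat-list prediction format are all hypothetical and are not taken from the paper or its repository.

```python
import math
from typing import List, Sequence


def rbf_weights(query: Sequence[float],
                centers: Sequence[Sequence[float]],
                sigma: float = 1.0) -> List[float]:
    """Normalized Gaussian RBF weights of `query` w.r.t. each center.

    Assumption: each center is the simulation-parameter vector one
    network was trained on. Weights are non-negative and sum to 1.
    """
    raw = []
    for center in centers:
        sq_dist = sum((q - c) ** 2 for q, c in zip(query, center))
        raw.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(raw)
    if total == 0.0:
        # All weights underflowed (query far from every center).
        raise ValueError("query too far from all centers for this sigma")
    return [r / total for r in raw]


def blend_predictions(query: Sequence[float],
                      centers: Sequence[Sequence[float]],
                      predictions: Sequence[Sequence[float]],
                      sigma: float = 1.0) -> List[float]:
    """Weighted average of per-network predictions (flat lists of equal length)."""
    weights = rbf_weights(query, centers, sigma)
    length = len(predictions[0])
    return [sum(w * pred[i] for w, pred in zip(weights, predictions))
            for i in range(length)]
```

For example, with two networks trained at bending stiffness 0.0 and 1.0, a query at 0.5 would receive equal weight from both, while a query at 0.0 with a small `sigma` would reproduce the first network's prediction almost exactly.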

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

Feb 02, 2022

Jiawei Lu, He Wang, Tianjia Shao, Yin Yang, Kun Zhou

Abstract: Generating new images with desired properties (e.g. new views/poses) from source images has been enthusiastically pursued recently, due to its wide range of potential applications. One way to ensure high-quality generation is to use multiple sources with complementary information, such as different views of the same object. However, as source images are often misaligned due to large disparities among the camera settings, strong assumptions have been made in the past with respect to the camera(s) and/or the object of interest, limiting the application of such techniques. Therefore, we propose a new general approach which models multiple types of variations among sources, such as view angles, poses, and facial expressions, in a unified framework, so that it can be employed on datasets of vastly different nature. We verify our approach on a variety of data including human bodies, faces, city scenes and 3D objects. Both the qualitative and quantitative results demonstrate that our method performs better than the state of the art.

Unsupervised Image Generation with Infinite Generative Adversarial Networks

Aug 18, 2021

Hui Ying, He Wang, Tianjia Shao, Yin Yang, Kun Zhou

Abstract: Image generation has been heavily investigated in computer vision, where one core research challenge is to generate images from arbitrarily complex distributions with little supervision. Generative Adversarial Networks (GANs), as an implicit approach, have achieved great success in this direction and have therefore been employed widely. However, GANs are known to suffer from issues such as mode collapse, a non-structured latent space, and being unable to compute likelihoods. In this paper, we propose a new unsupervised non-parametric method named mixture of infinite conditional GANs, or MIC-GANs, to tackle several GAN issues together, aiming for image generation with parsimonious prior knowledge. Through comprehensive evaluations across different datasets, we show that MIC-GANs are effective in structuring the latent space and avoiding mode collapse, and outperform state-of-the-art methods. MIC-GANs are adaptive, versatile, and robust. They offer a promising solution to several well-known GAN issues. Code available: github.com/yinghdb/MICGANs.

* 18 pages, 11 figures

BASAR: Black-box Attack on Skeletal Action Recognition

Mar 19, 2021

Yunfeng Diao, Tianjia Shao, Yong-Liang Yang, Kun Zhou, He Wang

Abstract: Skeletal motion plays a vital role in human activity recognition as either an independent data source or a complement. The robustness of skeleton-based activity recognizers has been questioned recently, with work showing that they are vulnerable to adversarial attacks when full knowledge of the recognizer is accessible to the attacker. However, this white-box requirement is overly restrictive in most scenarios, so such attacks are not truly threatening. In this paper, we show that such threats do exist under black-box settings too. To this end, we propose the first black-box adversarial attack method, BASAR. Through BASAR, we show that adversarial attack is not only truly a threat but can also be extremely deceitful, because on-manifold adversarial samples are rather common in skeletal motions, in contrast to the common belief that adversarial samples only exist off-manifold. Through exhaustive evaluation and comparison, we show that BASAR can deliver successful attacks across models, data, and attack modes. Through harsh perceptual studies, we show that it achieves effective yet imperceptible attacks. By analyzing the attack on different activity recognizers, BASAR helps identify the potential causes of their vulnerability and provides insights on which classifiers are likely to be more robust against attack.

* Accepted in CVPR 2021
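
To make the black-box setting in the abstract above concrete, here is a minimal sketch of one building block that hard-label (decision-based) attacks commonly use: a binary search along the segment between a sample and a differently classified sample, using only the predicted labels. This is not BASAR itself; it has no manifold projection and no motion-specific constraints. The linear `toy_classifier` and all names are hypothetical stand-ins chosen only so the example runs.

```python
from typing import Callable, List, Sequence


def toy_classifier(x: Sequence[float]) -> int:
    """Hypothetical stand-in for a model that exposes only its label."""
    return 1 if 2.0 * x[0] - x[1] + 0.5 > 0 else 0


def boundary_search(predict: Callable[[Sequence[float]], int],
                    source: Sequence[float],
                    other: Sequence[float],
                    steps: int = 30) -> List[float]:
    """Find a point near the decision boundary on the segment source -> other.

    Only label queries are used. The returned point is guaranteed to be
    labelled differently from `source`, and lies within 2**-steps of the
    segment length from a point that still carries the source label.
    """
    source_label = predict(source)
    if predict(other) == source_label:
        raise ValueError("`other` must be classified differently from `source`")

    def interpolate(alpha: float) -> List[float]:
        return [s + alpha * (o - s) for s, o in zip(source, other)]

    low, high = 0.0, 1.0  # low keeps the source label, high does not
    for _ in range(steps):
        mid = (low + high) / 2.0
        if predict(interpolate(mid)) == source_label:
            low = mid
        else:
            high = mid
    return interpolate(high)
```

With the toy classifier, searching from `[1.0, 0.0]` toward `[-1.0, 0.0]` converges to roughly `x[0] = -0.25`, which is where the toy decision rule changes sign along that segment.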

Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack

Mar 18, 2021

He Wang, Feixiang He, Zhexi Peng, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg

Abstract: Action recognition has been heavily employed in many applications such as autonomous vehicles and surveillance, where its robustness is a primary concern. In this paper, we examine the robustness of state-of-the-art action recognizers against adversarial attack, which has rarely been investigated so far. To this end, we propose a new method to attack action recognizers that rely on 3D skeletal motion. Our method involves an innovative perceptual loss that ensures the imperceptibility of the attack. Empirical studies demonstrate that our method is effective in both white-box and black-box scenarios. Its generalizability is evidenced on a variety of action recognizers and datasets. Its versatility is shown in different attacking strategies. Its deceitfulness is demonstrated in extensive perceptual studies. Our method shows that adversarial attack on 3D skeletal motions, one type of time-series data, is significantly different from traditional adversarial attack problems. Its success raises serious concerns about the robustness of action recognizers and provides insights on potential improvements.

* Accepted in CVPR 2021. arXiv admin note: substantial text overlap with arXiv:1911.07107

In-game Residential Home Planning via Visual Context-aware Global Relation Learning

Feb 23, 2021

Lijuan Liu, Yin Yang, Yi Yuan, Tianjia Shao, He Wang, Kun Zhou

Abstract: In this paper, we propose an effective global relation learning algorithm to recommend an appropriate location for a building unit during in-game customization of a residential home complex. Given a construction layout, we propose a visual context-aware graph generation network that learns the implicit global relations among the scene components and infers the location of a new building unit. The proposed network takes as input the scene graph and the corresponding top-view depth image. It provides location recommendations for a newly added building unit by learning an auto-regressive edge distribution conditioned on existing scenes. We also introduce a global graph-image matching loss to enhance the awareness of the essential geometric semantics of the site. Qualitative and quantitative experiments demonstrate that the recommended locations reflect the implicit spatial rules of components in residential estates well, and that the method is instructive and practical for placing building units in the 3D scene of a complex construction.

One-shot Face Reenactment Using Appearance Adaptive Normalization

Feb 20, 2021

Guangming Yao, Yi Yuan, Tianjia Shao, Shuang Li, Shanqi Liu, Yong Liu, Mengmeng Wang, Kun Zhou

Abstract: The paper proposes a novel generative adversarial network for one-shot face reenactment, which can animate a single face image to a different pose and expression (provided by a driving image) while keeping its original appearance. The core of our network is a novel mechanism called appearance adaptive normalization, which effectively integrates the appearance information from the input image into our face generator by modulating the feature maps of the generator using learned adaptive parameters. Furthermore, we specially design a local net to reenact the local facial components (i.e., eyes, nose and mouth) first, which is a much easier task for the network to learn and can in turn provide explicit anchors to guide our face generator in learning the global appearance and pose and expression. Extensive quantitative and qualitative experiments demonstrate the significant efficacy of our model compared with prior one-shot methods.

* 9 pages, 8 figures, 3 tables. Accepted by AAAI 2021

High-order Differentiable Autoencoder for Nonlinear Model Reduction

Feb 19, 2021

Siyuan Shen, Yang Yin, Tianjia Shao, He Wang, Chenfanfu Jiang, Lei Lan, Kun Zhou

Abstract: This paper provides a new avenue for exploiting deep neural networks to improve physics-based simulation. Specifically, we integrate classic Lagrangian mechanics with a deep autoencoder to accelerate elastic simulation of deformable solids. Due to the inertia effect, the dynamic equilibrium cannot be established without evaluating the second-order derivatives of the deep autoencoder network. This is beyond the capability of off-the-shelf automatic differentiation packages and algorithms, which mainly focus on gradient evaluation. Solving the nonlinear force equilibrium is even more challenging if the standard Newton's method is to be used, because we need to compute a third-order derivative of the network to obtain the variational Hessian. We attack these difficulties by exploiting complex-step finite difference, coupled with reverse automatic differentiation. This strategy allows us to enjoy the convenience and accuracy of complex-step finite difference while deploying complex-valued perturbations as collectively as possible to avoid excessive network passes. With a GPU-based implementation, we are able to wield deep autoencoders (e.g., 10+ layers) with a relatively high-dimensional latent space in real time. Along this pipeline, we also design a sampling network and a weighting network to enable weight-varying Cubature integration in order to incorporate nonlinearity in the model reduction. We believe this work will inspire and benefit future research efforts in nonlinearly reduced physical simulation problems.

Structure-aware Person Image Generation with Pose Decomposition and Semantic Correlation

Feb 05, 2021

Jilin Tang, Yi Yuan, Tianjia Shao, Yong Liu, Mengmeng Wang, Kun Zhou

Abstract: In this paper, we tackle the problem of pose-guided person image generation, which aims to transfer a person image from the source pose to a novel target pose while maintaining the source appearance. Given the inefficiency of standard CNNs in handling large spatial transformations, we propose a structure-aware flow-based method for high-quality person image generation. Specifically, instead of learning the complex overall pose changes of the human body, we decompose the human body into different semantic parts (e.g., head, torso, and legs) and apply different networks to predict the flow fields for these parts separately. Moreover, we carefully design the network modules to effectively capture the local and global semantic correlations of features within and among the human parts, respectively. Extensive experimental results show that our method can generate high-quality results under large pose discrepancy and outperforms state-of-the-art methods in both qualitative and quantitative comparisons.

* 9 pages, 8 figures
