Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yu Zheng

DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

Jul 31, 2022

Guangyao Zhai, Yu Zheng, Ziwei Xu, Xin Kong, Yong Liu, Benjamin Busam, Yi Ren, Nassir Navab, Zhengyou Zhang

Figure 1 for DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

Figure 2 for DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

Figure 3 for DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

Figure 4 for DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

Abstract:In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for the generation of optimal bimanual grasping pairs for arbitrary large objects. The dataset contains about 9M pairs of parallel-jaw grasps, generated from more than 6000 objects and each labeled with various grasp dexterity measures. In addition, we propose an end-to-end dual-arm grasp evaluation model trained on the rendered scenes from this dataset. We utilize the evaluation model as our baseline to show the value of this novel and nontrivial dataset by both online analysis and real robot experiments. All data and related code will be open-sourced at https://sites.google.com/view/da2dataset.

* RAL+IROS'22

Via

Access Paper or Ask Questions

A Unified Model for Multi-class Anomaly Detection

Jun 08, 2022

Zhiyuan You, Lei Cui, Yujun Shen, Kai Yang, Xin Lu, Yu Zheng, Xinyi Le

Figure 1 for A Unified Model for Multi-class Anomaly Detection

Figure 2 for A Unified Model for Multi-class Anomaly Detection

Figure 3 for A Unified Model for Multi-class Anomaly Detection

Figure 4 for A Unified Model for Multi-class Anomaly Detection

Abstract:Despite the rapid advance of unsupervised anomaly detection, existing methods require to train separate models for different objects. In this work, we present UniAD that accomplishes anomaly detection for multiple classes with a unified framework. Under such a challenging setting, popular reconstruction networks may fall into an "identical shortcut", where both normal and anomalous samples can be well recovered, and hence fail to spot outliers. To tackle this obstacle, we make three improvements. First, we revisit the formulations of fully-connected layer, convolutional layer, as well as attention layer, and confirm the important role of query embedding (i.e., within attention layer) in preventing the network from learning the shortcut. We therefore come up with a layer-wise query decoder to help model the multi-class distribution. Second, we employ a neighbor masked attention module to further avoid the information leak from the input feature to the reconstructed output feature. Third, we propose a feature jittering strategy that urges the model to recover the correct message even with noisy inputs. We evaluate our algorithm on MVTec-AD and CIFAR-10 datasets, where we surpass the state-of-the-art alternatives by a sufficiently large margin. For example, when learning a unified model for 15 categories in MVTec-AD, we surpass the second competitor on the tasks of both anomaly detection (from 88.1% to 96.5%) and anomaly localization (from 89.5% to 96.8%). Code will be made publicly available.

Via

Access Paper or Ask Questions

Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination

Jun 03, 2022

Yizhen Zheng, Shirui Pan, Vincent Cs Lee, Yu Zheng, Philip S. Yu

Figure 1 for Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination

Figure 2 for Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination

Figure 3 for Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination

Figure 4 for Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination

Abstract:Graph contrastive learning (GCL) alleviates the heavy reliance on label information for graph representation learning (GRL) via self-supervised learning schemes. The core idea is to learn by maximising mutual information for similar instances, which requires similarity computation between two node instances. However, this operation can be computationally expensive. For example, the time complexity of two commonly adopted contrastive loss functions (i.e., InfoNCE and JSD estimator) for a node is $O(ND)$ and $O(D)$, respectively, where $N$ is the number of nodes, and $D$ is the embedding dimension. Additionally, GCL normally requires a large number of training epochs to be well-trained on large-scale datasets. Inspired by an observation of a technical defect (i.e., inappropriate usage of Sigmoid function) commonly used in two representative GCL works, DGI and MVGRL, we revisit GCL and introduce a new learning paradigm for self-supervised GRL, namely, Group Discrimination (GD), and propose a novel GD-based method called Graph Group Discrimination (GGD). Instead of similarity computation, GGD directly discriminates two groups of summarised node instances with a simple binary cross-entropy loss. As such, GGD only requires $O(1)$ for loss computation of a node. In addition, GGD requires much fewer training epochs to obtain competitive performance compared with GCL methods on large-scale datasets. These two advantages endow GGD with the very efficient property. Extensive experiments show that GGD outperforms state-of-the-art self-supervised methods on 8 datasets. In particular, GGD can be trained in 0.18 seconds (6.44 seconds including data preprocessing) on ogbn-arxiv, which is orders of magnitude (10,000+ faster than GCL baselines} while consuming much less memory. Trained with 9 hours on ogbn-papers100M with billion edges, GGD outperforms its GCL counterparts in both accuracy and efficiency.

Via

Access Paper or Ask Questions

A Cross-City Federated Transfer Learning Framework: A Case Study on Urban Region Profiling

May 31, 2022

Gaode Chen, Yijun Su, Xinghua Zhang, Anmin Hu, Guochun Chen, Siyuan Feng, Ji Xiang, Junbo Zhang, Yu Zheng

Figure 1 for A Cross-City Federated Transfer Learning Framework: A Case Study on Urban Region Profiling

Figure 2 for A Cross-City Federated Transfer Learning Framework: A Case Study on Urban Region Profiling

Figure 3 for A Cross-City Federated Transfer Learning Framework: A Case Study on Urban Region Profiling

Figure 4 for A Cross-City Federated Transfer Learning Framework: A Case Study on Urban Region Profiling

Abstract:Data insufficiency problem (i.e., data missing and label scarcity issues) caused by inadequate services and infrastructures or unbalanced development levels of cities has seriously affected the urban computing tasks in real scenarios. Prior transfer learning methods inspire an elegant solution to the data insufficiency, but are only concerned with one kind of insufficiency issue and fail to fully explore these two issues existing in the real world. In addition, cross-city transfer in existing methods overlooks the inter-city data privacy which is a public concern in practical application. To address the above challenging problems, we propose a novel Cross-city Federated Transfer Learning framework (CcFTL) to cope with the data insufficiency and privacy problems. Concretely, CcFTL transfers the relational knowledge from multiple rich-data source cities to the target city. Besides, the model parameters specific to the target task are firstly trained on the source data and then fine-tuned to the target city by parameter transfer. With our adaptation of federated training and homomorphic encryption settings, CcFTL can effectively deal with the data privacy problem among cities. We take the urban region profiling as an application of smart cities and evaluate the proposed method with a real-world study. The experiments demonstrate the notable superiority of our framework over several competitive state-of-the-art models.

Via

Access Paper or Ask Questions

Configuration-Aware Safe Control for Mobile Robotic Arm with Control Barrier Functions

Apr 18, 2022

Fan Ding, Jianping He, Yi Ren, Han Wang, Yu Zheng

Figure 1 for Configuration-Aware Safe Control for Mobile Robotic Arm with Control Barrier Functions

Figure 2 for Configuration-Aware Safe Control for Mobile Robotic Arm with Control Barrier Functions

Figure 3 for Configuration-Aware Safe Control for Mobile Robotic Arm with Control Barrier Functions

Figure 4 for Configuration-Aware Safe Control for Mobile Robotic Arm with Control Barrier Functions

Abstract:Collision avoidance is a widely investigated topic in robotic applications. When applying collision avoidance techniques to a mobile robot, how to deal with the spatial structure of the robot still remains a challenge. In this paper, we design a configuration-aware safe control law by solving a Quadratic Programming (QP) with designed Control Barrier Functions (CBFs) constraints, which can safely navigate a mobile robotic arm to a desired region while avoiding collision with environmental obstacles. The advantage of our approach is that it correctly and in an elegant way incorporates the spatial structure of the mobile robotic arm. This is achieved by merging geometric restrictions among mobile robotic arm links into CBFs constraints. Simulations on a rigid rod and the modeled mobile robotic arm are performed to verify the feasibility and time-efficiency of proposed method. Numerical results about the time consuming for different degrees of freedom illustrate that our method scales well with dimension.

* submitted to Conference of Decision and Control(CDC)

Via

Access Paper or Ask Questions

HyperDet3D: Learning a Scene-conditioned 3D Object Detector

Apr 12, 2022

Yu Zheng, Yueqi Duan, Jiwen Lu, Jie Zhou, Qi Tian

Figure 1 for HyperDet3D: Learning a Scene-conditioned 3D Object Detector

Figure 2 for HyperDet3D: Learning a Scene-conditioned 3D Object Detector

Figure 3 for HyperDet3D: Learning a Scene-conditioned 3D Object Detector

Figure 4 for HyperDet3D: Learning a Scene-conditioned 3D Object Detector

Abstract:A bathtub in a library, a sink in an office, a bed in a laundry room -- the counter-intuition suggests that scene provides important prior knowledge for 3D object detection, which instructs to eliminate the ambiguous detection of similar objects. In this paper, we propose HyperDet3D to explore scene-conditioned prior knowledge for 3D object detection. Existing methods strive for better representation of local elements and their relations without scene-conditioned knowledge, which may cause ambiguity merely based on the understanding of individual points and object candidates. Instead, HyperDet3D simultaneously learns scene-agnostic embeddings and scene-specific knowledge through scene-conditioned hypernetworks. More specifically, our HyperDet3D not only explores the sharable abstracts from various 3D scenes, but also adapts the detector to the given scene at test time. We propose a discriminative Multi-head Scene-specific Attention (MSA) module to dynamically control the layer parameters of the detector conditioned on the fusion of scene-conditioned knowledge. Our HyperDet3D achieves state-of-the-art results on the 3D object detection benchmark of the ScanNet and SUN RGB-D datasets. Moreover, through cross-dataset evaluation, we show the acquired scene-conditioned prior knowledge still takes effect when facing 3D scenes with domain gap.

* to be published on CVPR2022

Via

Access Paper or Ask Questions

Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

Mar 27, 2022

Xiuwei Xu, Yifan Wang, Yu Zheng, Yongming Rao, Jie Zhou, Jiwen Lu

Figure 1 for Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

Figure 2 for Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

Figure 3 for Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

Figure 4 for Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

Abstract:In this paper, we propose a weakly-supervised approach for 3D object detection, which makes it possible to train a strong 3D detector with position-level annotations (i.e. annotations of object centers). In order to remedy the information loss from box annotations to centers, our method, namely Back to Reality (BR), makes use of synthetic 3D shapes to convert the weak labels into fully-annotated virtual scenes as stronger supervision, and in turn utilizes the perfect virtual labels to complement and refine the real labels. Specifically, we first assemble 3D shapes into physically reasonable virtual scenes according to the coarse scene layout extracted from position-level annotations. Then we go back to reality by applying a virtual-to-real domain adaptation method, which refine the weak labels and additionally supervise the training of detector with the virtual scenes. Furthermore, we propose a more challenging benckmark for indoor 3D object detection with more diversity in object sizes to better show the potential of BR. With less than 5% of the labeling labor, we achieve comparable detection performance with some popular fully-supervised approaches on the widely used ScanNet dataset. Code is available at: https://github.com/wyf-ACCEPT/BackToReality

* Accepted to CVPR2022

Via

Access Paper or Ask Questions

Deep AutoAugment

Mar 15, 2022

Yu Zheng, Zhi Zhang, Shen Yan, Mi Zhang

Abstract:While recent automated data augmentation methods lead to state-of-the-art results, their design spaces and the derived data augmentation strategies still incorporate strong human priors. In this work, instead of fixing a set of hand-picked default augmentations alongside the searched data augmentations, we propose a fully automated approach for data augmentation search named Deep AutoAugment (DeepAA). DeepAA progressively builds a multi-layer data augmentation pipeline from scratch by stacking augmentation layers one at a time until reaching convergence. For each augmentation layer, the policy is optimized to maximize the cosine similarity between the gradients of the original and augmented data along the direction with low variance. Our experiments show that even without default augmentations, we can learn an augmentation policy that achieves strong performance with that of previous works. Extensive ablation studies show that the regularized gradient matching is an effective search method for data augmentation policies. Our code is available at: https://github.com/MSU-MLSys-Lab/DeepAA .

* ICLR 2022 camera-ready. Code: https://github.com/MSU-MLSys-Lab/DeepAA

Via

Access Paper or Ask Questions

Disentangling Long and Short-Term Interests for Recommendation

Feb 26, 2022

Yu Zheng, Chen Gao, Jianxin Chang, Yanan Niu, Yang Song, Depeng Jin, Yong Li

Figure 1 for Disentangling Long and Short-Term Interests for Recommendation

Figure 2 for Disentangling Long and Short-Term Interests for Recommendation

Figure 3 for Disentangling Long and Short-Term Interests for Recommendation

Figure 4 for Disentangling Long and Short-Term Interests for Recommendation

Abstract:Modeling user's long-term and short-term interests is crucial for accurate recommendation. However, since there is no manually annotated label for user interests, existing approaches always follow the paradigm of entangling these two aspects, which may lead to inferior recommendation accuracy and interpretability. In this paper, to address it, we propose a Contrastive learning framework to disentangle Long and Short-term interests for Recommendation (CLSR) with self-supervision. Specifically, we first propose two separate encoders to independently capture user interests of different time scales. We then extract long-term and short-term interests proxies from the interaction sequences, which serve as pseudo labels for user interests. Then pairwise contrastive tasks are designed to supervise the similarity between interest representations and their corresponding interest proxies. Finally, since the importance of long-term and short-term interests is dynamically changing, we propose to adaptively aggregate them through an attention-based network for prediction. We conduct experiments on two large-scale real-world datasets for e-commerce and short-video recommendation. Empirical results show that our CLSR consistently outperforms all state-of-the-art models with significant improvements: GAUC is improved by over 0.01, and NDCG is improved by over 4%. Further counterfactual evaluations demonstrate that stronger disentanglement of long and short-term interests is successfully achieved by CLSR. The code and data are available at https://github.com/tsinghua-fib-lab/CLSR.

* Accepted by WWW'22

Via

Access Paper or Ask Questions

Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs

Feb 17, 2022

Ming Jin, Yu Zheng, Yuan-Fang Li, Siheng Chen, Bin Yang, Shirui Pan

Figure 1 for Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs

Figure 2 for Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs

Figure 3 for Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs

Figure 4 for Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs

Abstract:Multivariate time series forecasting has long received significant attention in real-world applications, such as energy consumption and traffic prediction. While recent methods demonstrate good forecasting abilities, they suffer from three fundamental limitations. (i) Discrete neural architectures: Interlacing individually parameterized spatial and temporal blocks to encode rich underlying patterns leads to discontinuous latent state trajectories and higher forecasting numerical errors. (ii) High complexity: Discrete approaches complicate models with dedicated designs and redundant parameters, leading to higher computational and memory overheads. (iii) Reliance on graph priors: Relying on predefined static graph structures limits their effectiveness and practicability in real-world applications. In this paper, we address all the above limitations by proposing a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE). Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures. Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing, allowing deeper graph propagation and fine-grained temporal information aggregation to characterize stable and precise latent spatial-temporal dynamics. Our experiments demonstrate the superiorities of MTGODE from various perspectives on five time series benchmark datasets.

* 14 pages, 6 figures, 5 tables

Via

Access Paper or Ask Questions