Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mingli Song

Zhejiang University

A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision

Jul 07, 2023

Zhonghan Zhao, Wenhao Chai, Shengyu Hao, Wenhao Hu, Guanhong Wang, Shidong Cao, Mingli Song, Jenq-Neng Hwang, Gaoang Wang

Abstract:Deep learning has the potential to revolutionize sports performance, with applications ranging from perception and comprehension to decision. This paper presents a comprehensive survey of deep learning in sports performance, focusing on three main aspects: algorithms, datasets and virtual environments, and challenges. Firstly, we discuss the hierarchical structure of deep learning algorithms in sports performance which includes perception, comprehension and decision while comparing their strengths and weaknesses. Secondly, we list widely used existing datasets in sports and highlight their characteristics and limitations. Finally, we summarize current challenges and point out future trends of deep learning in sports. Our survey provides valuable reference material for researchers interested in deep learning in sports applications.

Via

Access Paper or Ask Questions

MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling

Jun 29, 2023

Zhenyu Zhang, Wenhao Chai, Zhongyu Jiang, Tian Ye, Mingli Song, Jenq-Neng Hwang, Gaoang Wang

Figure 1 for MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling

Figure 2 for MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling

Figure 3 for MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling

Figure 4 for MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling

Abstract:Estimating 3D human poses only from a 2D human pose sequence is thoroughly explored in recent years. Yet, prior to this, no such work has attempted to unify 2D and 3D pose representations in the shared feature space. In this paper, we propose MPM, a unified 2D-3D human pose representation framework via masked pose modeling. We treat 2D and 3D poses as two different modalities like vision and language and build a single-stream transformer-based architecture. We apply three pretext tasks, which are masked 2D pose modeling, masked 3D pose modeling, and masked 2D pose lifting to pre-train our network and use full-supervision to perform further fine-tuning. A high masking ratio of 72.5% in total with a spatio-temporal mask sampling strategy leading to better relation modeling both in spatial and temporal domains. MPM can handle multiple tasks including 3D human pose estimation, 3D pose estimation from occluded 2D pose, and 3D pose completion in a single framework. We conduct extensive experiments and ablation studies on several widely used human pose datasets and achieve state-of-the-art performance on Human3.6M and MPI-INF-3DHP. Codes and model checkpoints are available at https://github.com/vvirgooo2/MPM

* Codes and model checkpoints are available at https://github.com/vvirgooo2/MPM

Via

Access Paper or Ask Questions

Spatiotemporal-Augmented Graph Neural Networks for Human Mobility Simulation

Jun 19, 2023

Yu Wang, Tongya Zheng, Shunyu Liu, Kaixuan Chen, Zunlei Feng, Yunzhi Hao, Mingli Song

Figure 1 for Spatiotemporal-Augmented Graph Neural Networks for Human Mobility Simulation

Figure 2 for Spatiotemporal-Augmented Graph Neural Networks for Human Mobility Simulation

Figure 3 for Spatiotemporal-Augmented Graph Neural Networks for Human Mobility Simulation

Figure 4 for Spatiotemporal-Augmented Graph Neural Networks for Human Mobility Simulation

Abstract:Human mobility patterns have shown significant applications in policy-decision scenarios and economic behavior researches. The human mobility simulation task aims to generate human mobility trajectories given a small set of trajectory data, which have aroused much concern due to the scarcity and sparsity of human mobility data. Existing methods mostly rely on the static relationships of locations, while largely neglect the dynamic spatiotemporal effects of locations. On the one hand, spatiotemporal correspondences of visit distributions reveal the spatial proximity and the functionality similarity of locations. On the other hand, the varying durations in different locations hinder the iterative generation process of the mobility trajectory. Therefore, we propose a novel framework to model the dynamic spatiotemporal effects of locations, namely SpatioTemporal-Augmented gRaph neural networks (STAR). The STAR framework designs various spatiotemporal graphs to capture the spatiotemporal correspondences and builds a novel dwell branch to simulate the varying durations in locations, which is finally optimized in an adversarial manner. The comprehensive experiments over four real datasets for the human mobility simulation have verified the superiority of STAR to state-of-the-art methods. Our code will be made publicly available.

Via

Access Paper or Ask Questions

Curricular Subgoals for Inverse Reinforcement Learning

Jun 14, 2023

Shunyu Liu, Yunpeng Qing, Shuqi Xu, Hongyan Wu, Jiangtao Zhang, Jingyuan Cong, Tianhao Chen, Yunfu Liu, Mingli Song

Figure 1 for Curricular Subgoals for Inverse Reinforcement Learning

Figure 2 for Curricular Subgoals for Inverse Reinforcement Learning

Figure 3 for Curricular Subgoals for Inverse Reinforcement Learning

Figure 4 for Curricular Subgoals for Inverse Reinforcement Learning

Abstract:Inverse Reinforcement Learning (IRL) aims to reconstruct the reward function from expert demonstrations to facilitate policy learning, and has demonstrated its remarkable success in imitation learning. To promote expert-like behavior, existing IRL methods mainly focus on learning global reward functions to minimize the trajectory difference between the imitator and the expert. However, these global designs are still limited by the redundant noise and error propagation problems, leading to the unsuitable reward assignment and thus downgrading the agent capability in complex multi-stage tasks. In this paper, we propose a novel Curricular Subgoal-based Inverse Reinforcement Learning (CSIRL) framework, that explicitly disentangles one task with several local subgoals to guide agent imitation. Specifically, CSIRL firstly introduces decision uncertainty of the trained agent over expert trajectories to dynamically select subgoals, which directly determines the exploration boundary of different task stages. To further acquire local reward functions for each stage, we customize a meta-imitation objective based on these curricular subgoals to train an intrinsic reward generator. Experiments on the D4RL and autonomous driving benchmarks demonstrate that the proposed methods yields results superior to the state-of-the-art counterparts, as well as better interpretability. Our code is available at https://github.com/Plankson/CSIRL.

Via

Access Paper or Ask Questions

Lookaround Optimizer: $k$ steps around, 1 step average

Jun 13, 2023

Jiangtao Zhang, Shunyu Liu, Jie Song, Tongtian Zhu, Zhengqi Xu, Mingli Song

Abstract:Weight Average (WA) is an active research topic due to its simplicity in ensembling deep networks and the effectiveness in promoting generalization. Existing weight average approaches, however, are often carried out along only one training trajectory in a post-hoc manner (i.e., the weights are averaged after the entire training process is finished), which significantly degrades the diversity between networks and thus impairs the effectiveness in ensembling. In this paper, inspired by weight average, we propose Lookaround, a straightforward yet effective SGD-based optimizer leading to flatter minima with better generalization. Specifically, Lookaround iterates two steps during the whole training period: the around step and the average step. In each iteration, 1) the around step starts from a common point and trains multiple networks simultaneously, each on transformed data by a different data augmentation, and 2) the average step averages these trained networks to get the averaged network, which serves as the starting point for the next iteration. The around step improves the functionality diversity while the average step guarantees the weight locality of these networks during the whole training, which is essential for WA to work. We theoretically explain the superiority of Lookaround by convergence analysis, and make extensive experiments to evaluate Lookaround on popular benchmarks including CIFAR and ImageNet with both CNNs and ViTs, demonstrating clear superiority over state-of-the-arts. Our code is available at https://github.com/Ardcy/Lookaround.

* 18 pages, 9 figures

Via

Access Paper or Ask Questions

Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Jun 12, 2023

Tongtian Zhu, Fengxiang He, Kaixuan Chen, Mingli Song, Dacheng Tao

Figure 1 for Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Figure 2 for Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Figure 3 for Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Figure 4 for Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Abstract:Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on massive devices simultaneously without the control of a central server. However, existing theories claim that decentralization invariably undermines generalization. In this paper, we challenge the conventional belief and present a completely new perspective for understanding decentralized learning. We prove that D-SGD implicitly minimizes the loss function of an average-direction Sharpness-aware minimization (SAM) algorithm under general non-convex non-$\beta$-smooth settings. This surprising asymptotic equivalence reveals an intrinsic regularization-optimization trade-off and three advantages of decentralization: (1) there exists a free uncertainty evaluation mechanism in D-SGD to improve posterior estimation; (2) D-SGD exhibits a gradient smoothing effect; and (3) the sharpness regularization effect of D-SGD does not decrease as total batch size increases, which justifies the potential generalization benefit of D-SGD over centralized SGD (C-SGD) in large-batch scenarios.

* Accepted for publication in the 40th International Conference on Machine Learning (ICML 2023)

Via

Access Paper or Ask Questions

Improving Expressivity of GNNs with Subgraph-specific Factor Embedded Normalization

Jun 11, 2023

Kaixuan Chen, Shunyu Liu, Tongtian Zhu, Tongya Zheng, Haofei Zhang, Zunlei Feng, Jingwen Ye, Mingli Song

Figure 1 for Improving Expressivity of GNNs with Subgraph-specific Factor Embedded Normalization

Figure 2 for Improving Expressivity of GNNs with Subgraph-specific Factor Embedded Normalization

Figure 3 for Improving Expressivity of GNNs with Subgraph-specific Factor Embedded Normalization

Figure 4 for Improving Expressivity of GNNs with Subgraph-specific Factor Embedded Normalization

Abstract:Graph Neural Networks (GNNs) have emerged as a powerful category of learning architecture for handling graph-structured data. However, existing GNNs typically ignore crucial structural characteristics in node-induced subgraphs, which thus limits their expressiveness for various downstream tasks. In this paper, we strive to strengthen the representative capabilities of GNNs by devising a dedicated plug-and-play normalization scheme, termed as SUbgraph-sPEcific FactoR Embedded Normalization (SuperNorm), that explicitly considers the intra-connection information within each node-induced subgraph. To this end, we embed the subgraph-specific factor at the beginning and the end of the standard BatchNorm, as well as incorporate graph instance-specific statistics for improved distinguishable capabilities. In the meantime, we provide theoretical analysis to support that, with the elaborated SuperNorm, an arbitrary GNN is at least as powerful as the 1-WL test in distinguishing non-isomorphism graphs. Furthermore, the proposed SuperNorm scheme is also demonstrated to alleviate the over-smoothing phenomenon. Experimental results related to predictions of graph, node, and link properties on the eight popular datasets demonstrate the effectiveness of the proposed method. The code is available at https://github.com/chenchkx/SuperNorm.

* 13 pages, 7 figures

Via

Access Paper or Ask Questions

Message-passing selection: Towards interpretable GNNs for graph classification

Jun 08, 2023

Wenda Li, Kaixuan Chen, Shunyu Liu, Wenjie Huang, Haofei Zhang, Yingjie Tian, Yun Su, Mingli Song

Abstract:In this paper, we strive to develop an interpretable GNNs' inference paradigm, termed MSInterpreter, which can serve as a plug-and-play scheme readily applicable to various GNNs' baselines. Unlike the most existing explanation methods, MSInterpreter provides a Message-passing Selection scheme(MSScheme) to select the critical paths for GNNs' message aggregations, which aims at reaching the self-explaination instead of post-hoc explanations. In detail, the elaborate MSScheme is designed to calculate weight factors of message aggregation paths by considering the vanilla structure and node embedding components, where the structure base aims at weight factors among node-induced substructures; on the other hand, the node embedding base focuses on weight factors via node embeddings obtained by one-layer GNN.Finally, we demonstrate the effectiveness of our approach on graph classification benchmarks.

* 6 pages, 1 figures

Via

Access Paper or Ask Questions

Improved Projection-free Online Continuous Submodular Maximization

May 29, 2023

Yucheng Liao, Yuanyu Wan, Chang Yao, Mingli Song

Abstract:We investigate the problem of online learning with monotone and continuous DR-submodular reward functions, which has received great attention recently. To efficiently handle this problem, especially in the case with complicated decision sets, previous studies have proposed an efficient projection-free algorithm called Mono-Frank-Wolfe (Mono-FW) using $O(T)$ gradient evaluations and linear optimization steps in total. However, it only attains a $(1-1/e)$-regret bound of $O(T^{4/5})$. In this paper, we propose an improved projection-free algorithm, namely POBGA, which reduces the regret bound to $O(T^{3/4})$ while keeping the same computational complexity as Mono-FW. Instead of modifying Mono-FW, our key idea is to make a novel combination of a projection-based algorithm called online boosting gradient ascent, an infeasible projection technique, and a blocking technique. Furthermore, we consider the decentralized setting and develop a variant of POBGA, which not only reduces the current best regret bound of efficient projection-free algorithms for this setting from $O(T^{4/5})$ to $O(T^{3/4})$, but also reduces the total communication complexity from $O(T)$ to $O(\sqrt{T})$.

Via

Access Paper or Ask Questions

Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

May 27, 2023

Yihe Zhou, Shunyu Liu, Yunpeng Qing, Kaixuan Chen, Tongya Zheng, Yanhao Huang, Jie Song, Mingli Song

Figure 1 for Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

Figure 2 for Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

Figure 3 for Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

Figure 4 for Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

Abstract:Centralized Training with Decentralized Execution (CTDE) has recently emerged as a popular framework for cooperative Multi-Agent Reinforcement Learning (MARL), where agents can use additional global state information to guide training in a centralized way and make their own decisions only based on decentralized local policies. Despite the encouraging results achieved, CTDE makes an independence assumption on agent policies, which limits agents to adopt global cooperative information from each other during centralized training. Therefore, we argue that existing CTDE methods cannot fully utilize global information for training, leading to an inefficient joint-policy exploration and even suboptimal results. In this paper, we introduce a novel Centralized Advising and Decentralized Pruning (CADP) framework for multi-agent reinforcement learning, that not only enables an efficacious message exchange among agents during training but also guarantees the independent policies for execution. Firstly, CADP endows agents the explicit communication channel to seek and take advices from different agents for more centralized training. To further ensure the decentralized execution, we propose a smooth model pruning mechanism to progressively constraint the agent communication into a closed one without degradation in agent cooperation capability. Empirical evaluations on StarCraft II micromanagement and Google Research Football benchmarks demonstrate that the proposed framework achieves superior performance compared with the state-of-the-art counterparts. Our code will be made publicly available.

Via

Access Paper or Ask Questions