Tianpei Gu

Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion

Mar 25, 2022
Tianpei Gu, Guangyi Chen, Junlong Li, Chunze Lin, Yongming Rao, Jie Zhou, Jiwen Lu

Human behavior is inherently indeterminate, which requires a pedestrian trajectory prediction system to model the multi-modality of future motion states. Unlike existing stochastic trajectory prediction methods, which usually use a latent variable to represent multi-modality, we explicitly simulate the process of human motion variation from indeterminate to determinate. In this paper, we present a new framework that formulates the trajectory prediction task as a reverse process of motion indeterminacy diffusion (MID), in which we progressively discard indeterminacy from all the walkable areas until reaching the desired trajectory. This process is learned with a parameterized Markov chain conditioned on the observed trajectories. We can adjust the length of the chain to control the degree of indeterminacy and balance the diversity and determinacy of the predictions. Specifically, we encode the historical behavior information and the social interactions as a state embedding and devise a Transformer-based diffusion model to capture the temporal dependencies of trajectories. Extensive experiments on the human trajectory prediction benchmarks, including the Stanford Drone and ETH/UCY datasets, demonstrate the superiority of our method. Code is available at https://github.com/gutianpei/MID.
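For intuition, here is a minimal, self-contained sketch of the reverse-diffusion idea the abstract describes: a Transformer denoiser conditioned on a state embedding, and a sampler that starts from pure noise and progressively discards indeterminacy. This is not the authors' code; the module names, tensor shapes, and the linear noise schedule are illustrative assumptions (see the official MID repository for the real implementation).

```python
# Illustrative sketch of a trajectory diffusion model (assumed names/shapes,
# not the MID implementation).
import torch
import torch.nn as nn

class TrajectoryDenoiser(nn.Module):
    """Transformer that predicts the noise in a noisy future trajectory,
    conditioned on a state embedding of history + social context."""
    def __init__(self, d_model=128, max_steps=1000):
        super().__init__()
        self.in_proj = nn.Linear(2, d_model)           # (x, y) per future step
        self.cond_proj = nn.Linear(d_model, d_model)   # state embedding
        self.time_emb = nn.Embedding(max_steps, d_model)  # diffusion step index
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.out_proj = nn.Linear(d_model, 2)

    def forward(self, y_t, t, state):
        # y_t: (B, horizon, 2) noisy trajectory; t: (B,); state: (B, d_model)
        h = self.in_proj(y_t) + (self.cond_proj(state) + self.time_emb(t)).unsqueeze(1)
        return self.out_proj(self.encoder(h))          # predicted noise eps

@torch.no_grad()
def sample(model, state, steps=100, horizon=12):
    """Reverse the Markov chain: start from full indeterminacy (Gaussian noise)
    and progressively discard it until a trajectory remains."""
    betas = torch.linspace(1e-4, 0.05, steps)          # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    y = torch.randn(state.size(0), horizon, 2)         # fully indeterminate start
    for t in reversed(range(steps)):
        t_batch = torch.full((state.size(0),), t, dtype=torch.long)
        eps = model(y, t_batch, state)
        # standard DDPM posterior mean
        y = (y - betas[t] / torch.sqrt(1 - alpha_bar[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            y = y + torch.sqrt(betas[t]) * torch.randn_like(y)
    return y
```

Shortening `steps` in `sample` is the knob the abstract mentions: a shorter chain leaves less indeterminacy to discard, trading prediction diversity for determinacy.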

* Accepted to CVPR 2022

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

Mar 25, 2022
Li Siyao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian, Chen Change Loy, Ziwei Liu

Driving 3D characters to dance following a piece of music is highly challenging due to the spatial constraints that choreography norms impose on poses. In addition, the generated dance sequence needs to maintain temporal coherency across different music genres. To tackle these challenges, we propose a novel music-to-dance framework, Bailando, with two powerful components: 1) a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequences into a quantized codebook, and 2) an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent with the music. With the learned choreographic memory, dance generation is performed on quantized units that meet high choreography standards, so the generated dancing sequences are confined within the spatial constraints. To achieve synchronized alignment between diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a newly designed beat-align reward function. Extensive experiments on the standard benchmark demonstrate that our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively. Notably, the learned choreographic memory is shown to discover human-interpretable dancing-style poses in an unsupervised manner.
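As a rough illustration of the first component, the sketch below implements a VQ-VAE-style codebook lookup of the kind a "choreographic memory" suggests: encoded pose features are snapped to their nearest codebook entries, yielding discrete dancing-unit ids. The class and parameter names (`ChoreographicMemory`, `num_units`) are hypothetical, not the paper's API.

```python
# Illustrative VQ-style codebook for dancing units (assumed sizes/names,
# not the Bailando implementation).
import torch
import torch.nn as nn

class ChoreographicMemory(nn.Module):
    def __init__(self, num_units=512, dim=256):
        super().__init__()
        self.codebook = nn.Embedding(num_units, dim)  # one row per dancing unit

    def forward(self, z):
        # z: (B, T, D) encoded 3D pose-sequence features
        B, T, D = z.shape
        flat = z.reshape(-1, D)
        dist = torch.cdist(flat, self.codebook.weight)  # (B*T, num_units)
        idx = dist.argmin(dim=-1).view(B, T)            # discrete dancing-unit ids
        z_q = self.codebook(idx)                        # (B, T, D) quantized features
        # straight-through estimator: gradients bypass the argmin
        z_q = z + (z_q - z).detach()
        return z_q, idx
```

The actor-critic GPT would then model sequences of these unit ids conditioned on music features, and the beat-align reward would fine-tune it via reinforcement learning; that part is omitted here.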

* Accepted by CVPR 2022. Code and video link: https://github.com/lisiyao21/Bailando/ 

Person Re-identification via Attention Pyramid

Aug 11, 2021
Guangyi Chen, Tianpei Gu, Jiwen Lu, Jin-An Bao, Jie Zhou

In this paper, we propose an attention pyramid method for person re-identification. Unlike conventional attention-based methods, which only learn a global attention map, our attention pyramid exploits attention regions in a multi-scale manner because human attention varies with scale. Our attention pyramid imitates the process of human visual perception, which tends to notice the foreground person over the cluttered background and, upon closer observation, further focuses on details such as the specific color of a shirt. Specifically, we describe our attention pyramid by a "split-attend-merge-stack" principle: we first split the features into multiple local parts and learn the corresponding attentions; we then merge the local attentions and stack these merged attentions with a residual connection into an attention pyramid. The proposed attention pyramid is a lightweight plug-and-play module that can be applied to off-the-shelf models. We implement our attention pyramid method with two different attention mechanisms, channel-wise attention and spatial attention. We evaluate our method on four large-scale person re-identification benchmarks: Market-1501, DukeMTMC, CUHK03, and MSMT17. Experimental results demonstrate the superiority of our method, which outperforms the state-of-the-art methods by a large margin with limited computational cost.
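The "split-attend-merge-stack" principle can be sketched with channel-wise attention as below. This is an illustrative reading of the abstract, not the APNet implementation: the level count, the height-wise splitting, and sharing one attention module per level are all assumptions.

```python
# Illustrative "split-attend-merge-stack" pyramid with channel attention
# (assumed design details, not the APNet code).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention over one feature part."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        # x: (B, C, H, W) -> per-channel weights (B, C, 1, 1)
        return self.fc(x).view(x.size(0), -1, 1, 1)

class AttentionPyramid(nn.Module):
    def __init__(self, channels, num_levels=3):
        super().__init__()
        # one attention module per pyramid level; level k splits into 2**k parts
        self.levels = nn.ModuleList([ChannelAttention(channels) for _ in range(num_levels)])

    def forward(self, x):
        out = x
        for k, attend in enumerate(self.levels):
            parts = out.chunk(2 ** k, dim=2)                   # split along height
            attns = [attend(p) for p in parts]                 # attend per local part
            attn = torch.cat([a.expand(-1, -1, p.size(2), -1)  # merge part attentions
                              for a, p in zip(attns, parts)], dim=2)
            out = out * attn + out                             # stack with residual
        return out
```

Each level splits the feature map more finely, so the coarse levels capture global foreground-versus-background attention while the finer levels attend to local details, matching the multi-scale behavior described above.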

* Accepted by IEEE Transactions on Image Processing. Code available at https://github.com/CHENGY12/APNet 