Jin Bok Park

Autoregression-free video prediction using diffusion model for mitigating error propagation

May 28, 2025

Woonho Ko, Jin Bok Park, Il Yong Chun

Abstract: Existing long-term video prediction methods often rely on an autoregressive prediction mechanism. However, this approach suffers from error propagation, particularly in distant future frames. To address this limitation, this paper proposes the first AutoRegression-Free (ARFree) video prediction framework using diffusion models. Unlike an autoregressive mechanism, ARFree directly predicts any future frame tuple from the context frame tuple. The proposed ARFree consists of two key components: 1) a motion prediction module that predicts future motion using motion features extracted from the context frame tuple; 2) a training method that improves motion continuity and contextual consistency between adjacent future frame tuples. Our experiments with two benchmark datasets show that the proposed ARFree video prediction framework outperforms several state-of-the-art video prediction methods.

* 6 pages, 4 figures, 2 tables

End-to-End Driving via Self-Supervised Imitation Learning Using Camera and LiDAR Data

Aug 28, 2023

Jin Bok Park, Jinkyu Lee, Muhyun Back, Hyunmin Han, David T. Ma, Sang Min Won, Sung Soo Hwang, Il Yong Chun

Abstract: In autonomous driving, the end-to-end (E2E) driving approach that predicts vehicle control signals directly from sensor data is rapidly gaining attention. To learn a safe E2E driving system, one needs an extensive amount of driving data and human intervention. Vehicle control data is constructed from many hours of human driving, and it is challenging to construct large vehicle control datasets. Often, publicly available driving datasets are collected with limited driving scenes, and vehicle control data can be collected only by vehicle manufacturers. To address these challenges, this paper proposes the first self-supervised learning framework, self-supervised imitation learning (SSIL), that can learn E2E driving networks without using driving command data. To construct pseudo steering angle data, the proposed SSIL predicts a pseudo target from the vehicle's poses at the current and previous time points, which are estimated with light detection and ranging (LiDAR) sensors. Our numerical experiments demonstrate that the proposed SSIL framework achieves E2E driving accuracy comparable to that of its supervised learning counterpart. In addition, our qualitative analyses using a conventional visual explanation tool show that NNs trained by the proposed SSIL and by its supervised counterpart attend to similar objects when making predictions.
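
The abstract does not state how the pseudo steering target is computed from the two poses. As a rough illustration only, the sketch below assumes a kinematic bicycle model, which is not confirmed by the source. The function name `pseudo_steering_angle`, the `(x, y, yaw)` pose layout, and the `wheelbase_m` default are all hypothetical.

```python
import math


def pseudo_steering_angle(prev_pose, curr_pose, wheelbase_m=2.7):
    """Estimate a steering angle (radians) from two consecutive poses.

    Each pose is a hypothetical (x, y, yaw) tuple in metres and radians,
    e.g. as produced by a LiDAR odometry pipeline. The bicycle model and
    the wheelbase value are assumptions, not the paper's method.
    """
    x0, y0, yaw0 = prev_pose
    x1, y1, yaw1 = curr_pose

    # Distance travelled between the two time points.
    distance = math.hypot(x1 - x0, y1 - y0)
    if distance < 1e-6:
        # Vehicle is effectively stationary; no steering can be inferred.
        return 0.0

    # Heading change, wrapped to [-pi, pi) so crossing +/-pi is handled.
    dyaw = (yaw1 - yaw0 + math.pi) % (2 * math.pi) - math.pi

    # Bicycle model: curvature = dyaw / distance, steer = atan(L * curvature).
    curvature = dyaw / distance
    return math.atan(wheelbase_m * curvature)


if __name__ == "__main__":
    straight = pseudo_steering_angle((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
    left_turn = pseudo_steering_angle((0.0, 0.0, 0.0), (1.0, 0.0, 0.1))
    print(f"straight: {straight:.4f} rad, left turn: {left_turn:.4f} rad")
```

A target of this kind could stand in for recorded steering commands when training a driving network, which is the role the abstract assigns to the pseudo steering angle data.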

* 20 pages, 8 figures
