Colin Graber

STT: Stateful Tracking with Transformers for Autonomous Driving

Apr 30, 2024

Longlong Jing, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng (+12 more)

Abstract: Tracking objects in three-dimensional space is critical for autonomous driving. To ensure safety while driving, the tracker must be able to reliably track objects across frames and accurately estimate their states such as velocity and acceleration in the present. Existing works frequently focus on the association task while either neglecting the model performance on state estimation or deploying complex heuristics to predict the states. In this paper, we propose STT, a Stateful Tracking model built with Transformers, that can consistently track objects in the scenes while also predicting their states accurately. STT consumes rich appearance, geometry, and motion signals through long term history of detections and is jointly optimized for both data association and state estimation tasks. Since the standard tracking metrics like MOTA and MOTP do not capture the combined performance of the two tasks in the wider spectrum of object states, we extend them with new metrics called S-MOTA and MOTPS that address this limitation. STT achieves competitive real-time performance on the Waymo Open Dataset.

* ICRA 2024
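
The abstract above refers to the standard tracking metrics MOTA and MOTP. As a rough illustration of what those baseline metrics measure, here is a minimal sketch of the usual CLEAR MOT definitions, assuming per-frame error counts are already available. The function names and input layout are hypothetical, and the paper's extended metrics S-MOTA and MOTPS are not reproduced here because the abstract does not define them.

```python
def mota(frames):
    """Multiple Object Tracking Accuracy.

    `frames` is a sequence of tuples
    (num_ground_truth, false_negatives, false_positives, id_switches),
    one tuple per frame. This layout is an assumption made for the sketch.
    """
    frames = list(frames)
    total_gt = sum(f[0] for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    # Misses, false alarms, and identity switches all count as errors.
    total_errors = sum(f[1] + f[2] + f[3] for f in frames)
    return 1.0 - total_errors / total_gt


def motp(match_errors):
    """Multiple Object Tracking Precision: mean error over matched pairs.

    `match_errors` holds one localization error per matched
    (ground truth, prediction) pair. Whether that error is a distance
    or an overlap-based quantity depends on the benchmark.
    """
    match_errors = list(match_errors)
    if not match_errors:
        raise ValueError("MOTP is undefined without any matches")
    return sum(match_errors) / len(match_errors)


# Hypothetical counts for two frames with 10 ground-truth objects each.
example_frames = [(10, 1, 0, 0), (10, 0, 2, 1)]
example_mota = mota(example_frames)
example_motp = motp([0.5, 1.5, 1.0])
```

Neither quantity says anything about how well velocity or acceleration is estimated, which is the gap the abstract says the new metrics are meant to address.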

Joint Forecasting of Panoptic Segmentations with Difference Attention

Apr 14, 2022

Colin Graber, Cyril Jazra, Wenjie Luo, Liangyan Gui, Alexander Schwing

Abstract: Forecasting of a representation is important for safe and effective autonomy. For this, panoptic segmentations have been studied as a compelling representation in recent work. However, recent state-of-the-art on panoptic segmentation forecasting suffers from two issues: first, individual object instances are treated independently of each other; second, individual object instance forecasts are merged in a heuristic manner. To address both issues, we study a new panoptic segmentation forecasting model that jointly forecasts all object instances in a scene using a transformer model based on 'difference attention.' It further refines the predictions by taking depth estimates into account. We evaluate the proposed model on the Cityscapes and AIODrive datasets. We find difference attention to be particularly suitable for forecasting because the difference of quantities like locations enables a model to explicitly reason about velocities and acceleration. Because of this, we attain state-of-the-art on panoptic segmentation forecasting metrics.

* Accepted by CVPR 2022 (Oral)
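
The abstract above argues that differences of quantities such as locations let a model reason about velocities and acceleration. The sketch below illustrates only that underlying arithmetic with plain finite differences. It is not the paper's difference attention mechanism, and the helper name and sample values are hypothetical.

```python
def finite_differences(values, dt):
    """Return successive differences of `values`, divided by the time step `dt`."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


# Hypothetical 1D positions sampled once per time unit from x(t) = t^2.
positions = [0.0, 1.0, 4.0, 9.0]
dt = 1.0

# First differences of position approximate velocity.
velocities = finite_differences(positions, dt)

# Differences of those velocities approximate acceleration.
accelerations = finite_differences(velocities, dt)
```

For this constant-acceleration trajectory the second differences come out constant, which is the kind of signal that raw locations alone do not expose directly.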

Panoptic Segmentation Forecasting

Apr 08, 2021

Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing

Abstract: Our goal is to forecast the near future given a set of recent observations. We think this ability to forecast, i.e., to anticipate, is integral for the success of autonomous agents which need not only passively analyze an observation but also must react to it in real-time. Importantly, accurate forecasting hinges upon the chosen scene decomposition. We think that superior forecasting can be achieved by decomposing a dynamic scene into individual 'things' and background 'stuff'. Background 'stuff' largely moves because of camera motion, while foreground 'things' move because of both camera and individual object motion. Following this decomposition, we introduce panoptic segmentation forecasting. Panoptic segmentation forecasting opens up a middle-ground between existing extremes, which either forecast instance trajectories or predict the appearance of future image frames. To address this task we develop a two-component model: one component learns the dynamics of the background stuff by anticipating odometry, the other one anticipates the dynamics of detected things. We establish a leaderboard for this novel task, and validate a state-of-the-art model that outperforms available baselines.

* CVPR 2021

Graph Structured Prediction Energy Networks

Oct 31, 2019

Colin Graber, Alexander Schwing

Abstract: For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions. However, many classical approaches suffer from one of two primary drawbacks: they either lack the ability to model high-order correlations among variables while maintaining computationally tractable inference, or they do not allow to explicitly model known correlations. To address this shortcoming, we introduce 'Graph Structured Prediction Energy Networks,' for which we develop inference techniques that allow to both model explicit local and implicit higher-order correlations while maintaining tractability of inference. We apply the proposed method to tasks from the natural language processing and computer vision domain and demonstrate its general utility.

* Appearing in NeurIPS 2019

Deep Structured Prediction with Nonlinear Output Transformations

Nov 01, 2018

Colin Graber, Ofer Meshi, Alexander Schwing

Abstract: Deep structured models are widely used for tasks like semantic segmentation, where explicit correlations between variables provide important prior information which generally helps to reduce the data needs of deep nets. However, current deep structured models are restricted by oftentimes very local neighborhood structure, which cannot be increased for computational complexity reasons, and by the fact that the output configuration, or a representation thereof, cannot be transformed further. Very recent approaches which address those issues include graphical model inference inside deep nets so as to permit subsequent non-linear output space transformations. However, optimization of those formulations is challenging and not well understood. Here, we develop a novel model which generalizes existing approaches, such as structured prediction energy networks, and discuss a formulation which maintains applicability of existing inference techniques.

* Appearing in NIPS 2018
