Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Raquel Urtasun

TTIC

Real-Time Neural Rasterization for Large Scenes

Nov 09, 2023

Jeffrey Yunfan Liu, Yun Chen, Ze Yang, Jingkang Wang, Sivabalan Manivasagam, Raquel Urtasun

Abstract:We propose a new method for realistic real-time novel-view synthesis (NVS) of large scenes. Existing neural rendering methods generate realistic results, but primarily work for small scale scenes (<50 square meters) and have difficulty at large scale (>10000 square meters). Traditional graphics-based rasterization rendering is fast for large scenes but lacks realism and requires expensive manually created assets. Our approach combines the best of both worlds by taking a moderate-quality scaffold mesh as input and learning a neural texture field and shader to model view-dependant effects to enhance realism, while still using the standard graphics pipeline for real-time rendering. Our method outperforms existing neural rendering methods, providing at least 30x faster rendering with comparable or better realism for large self-driving and drone scenes. Our work is the first to enable real-time rendering of large real-world scenes.

* Published in ICCV 2023. webpage: https://waabi.ai/NeuRas/

Via

Access Paper or Ask Questions

Towards Unsupervised Object Detection From LiDAR Point Clouds

Nov 03, 2023

Lunjun Zhang, Anqi Joyce Yang, Yuwen Xiong, Sergio Casas, Bin Yang, Mengye Ren, Raquel Urtasun

Figure 1 for Towards Unsupervised Object Detection From LiDAR Point Clouds

Figure 2 for Towards Unsupervised Object Detection From LiDAR Point Clouds

Figure 3 for Towards Unsupervised Object Detection From LiDAR Point Clouds

Figure 4 for Towards Unsupervised Object Detection From LiDAR Point Clouds

Abstract:In this paper, we study the problem of unsupervised object detection from 3D point clouds in self-driving scenes. We present a simple yet effective method that exploits (i) point clustering in near-range areas where the point clouds are dense, (ii) temporal consistency to filter out noisy unsupervised detections, (iii) translation equivariance of CNNs to extend the auto-labels to long range, and (iv) self-supervision for improving on its own. Our approach, OYSTER (Object Discovery via Spatio-Temporal Refinement), does not impose constraints on data collection (such as repeated traversals of the same location), is able to detect objects in a zero-shot manner without supervised finetuning (even in sparse, distant regions), and continues to self-improve given more rounds of iterative self-training. To better measure model performance in self-driving scenarios, we propose a new planning-centric perception metric based on distance-to-collision. We demonstrate that our unsupervised object detector significantly outperforms unsupervised baselines on PandaSet and Argoverse 2 Sensor dataset, showing promise that self-supervision combined with object priors can enable object discovery in the wild. For more information, visit the project website: https://waabi.ai/research/oyster

* CVPR 2023

Via

Access Paper or Ask Questions

MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

Nov 02, 2023

Enxu Li, Sergio Casas, Raquel Urtasun

Figure 1 for MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

Figure 2 for MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

Figure 3 for MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

Figure 4 for MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

Abstract:Semantic segmentation of LiDAR point clouds has been widely studied in recent years, with most existing methods focusing on tackling this task using a single scan of the environment. However, leveraging the temporal stream of observations can provide very rich contextual information on regions of the scene with poor visibility (e.g., occlusions) or sparse observations (e.g., at long range), and can help reduce redundant computation frame after frame. In this paper, we tackle the challenge of exploiting the information from the past frames to improve the predictions of the current frame in an online fashion. To address this challenge, we propose a novel framework for semantic segmentation of a temporal sequence of LiDAR point clouds that utilizes a memory network to store, update and retrieve past information. Our framework also includes a regularizer that penalizes prediction variations in the neighborhood of the point cloud. Prior works have attempted to incorporate memory in range view representations for semantic segmentation, but these methods fail to handle occlusions and the range view representation of the scene changes drastically as agents nearby move. Our proposed framework overcomes these limitations by building a sparse 3D latent representation of the surroundings. We evaluate our method on SemanticKITTI, nuScenes, and PandaSet. Our experiments demonstrate the effectiveness of the proposed framework compared to the state-of-the-art.

* accepted at ICCV 2023

Via

Access Paper or Ask Questions

UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation

Nov 02, 2023

Yuwen Xiong, Wei-Chiu Ma, Jingkang Wang, Raquel Urtasun

Abstract:LiDAR provides accurate geometric measurements of the 3D world. Unfortunately, dense LiDARs are very expensive and the point clouds captured by low-beam LiDAR are often sparse. To address these issues, we present UltraLiDAR, a data-driven framework for scene-level LiDAR completion, LiDAR generation, and LiDAR manipulation. The crux of UltraLiDAR is a compact, discrete representation that encodes the point cloud's geometric structure, is robust to noise, and is easy to manipulate. We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds as if they were captured by a real high-density LiDAR, drastically reducing the cost. Furthermore, by learning a prior over the discrete codebook, we can generate diverse, realistic LiDAR point clouds for self-driving. We evaluate the effectiveness of UltraLiDAR on sparse-to-dense LiDAR completion and LiDAR generation. Experiments show that densifying real-world point clouds with our approach can significantly improve the performance of downstream perception systems. Compared to prior art on LiDAR generation, our approach generates much more realistic point clouds. According to A/B test, over 98.5\% of the time human participants prefer our results over those of previous methods.

* CVPR 2023. Project page: https://waabi.ai/ultralidar/

Via

Access Paper or Ask Questions

Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation

Nov 02, 2023

Jay Sarva, Jingkang Wang, James Tu, Yuwen Xiong, Sivabalan Manivasagam, Raquel Urtasun

Figure 1 for Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation

Figure 2 for Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation

Figure 3 for Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation

Figure 4 for Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation

Abstract:Self-driving vehicles (SDVs) must be rigorously tested on a wide range of scenarios to ensure safe deployment. The industry typically relies on closed-loop simulation to evaluate how the SDV interacts on a corpus of synthetic and real scenarios and verify it performs properly. However, they primarily only test the system's motion planning module, and only consider behavior variations. It is key to evaluate the full autonomy system in closed-loop, and to understand how variations in sensor data based on scene appearance, such as the shape of actors, affect system performance. In this paper, we propose a framework, Adv3D, that takes real world scenarios and performs closed-loop sensor simulation to evaluate autonomy performance, and finds vehicle shapes that make the scenario more challenging, resulting in autonomy failures and uncomfortable SDV maneuvers. Unlike prior works that add contrived adversarial shapes to vehicle roof-tops or roadside to harm perception only, we optimize a low-dimensional shape representation to modify the vehicle shape itself in a realistic manner to degrade autonomy performance (e.g., perception, prediction, and motion planning). Moreover, we find that the shape variations found with Adv3D optimized in closed-loop are much more effective than those in open-loop, demonstrating the importance of finding scene appearance variations that affect autonomy in the interactive setting.

* CoRL 2023. Project page: https://waabi.ai/adv3d/

Via

Access Paper or Ask Questions

LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

Nov 02, 2023

Anqi Joyce Yang, Sergio Casas, Nikita Dvornik, Sean Segal, Yuwen Xiong, Jordan Sir Kwang Hu, Carter Fang, Raquel Urtasun

Figure 1 for LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

Figure 2 for LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

Figure 3 for LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

Figure 4 for LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

Abstract:A major bottleneck to scaling-up training of self-driving perception systems are the human annotations required for supervision. A promising alternative is to leverage "auto-labelling" offboard perception models that are trained to automatically generate annotations from raw LiDAR point clouds at a fraction of the cost. Auto-labels are most commonly generated via a two-stage approach -- first objects are detected and tracked over time, and then each object trajectory is passed to a learned refinement model to improve accuracy. Since existing refinement models are overly complex and lack advanced temporal reasoning capabilities, in this work we propose LabelFormer, a simple, efficient, and effective trajectory-level refinement approach. Our approach first encodes each frame's observations separately, then exploits self-attention to reason about the trajectory with full temporal context, and finally decodes the refined object size and per-frame poses. Evaluation on both urban and highway datasets demonstrates that LabelFormer outperforms existing works by a large margin. Finally, we show that training on a dataset augmented with auto-labels generated by our method leads to improved downstream detection performance compared to existing methods. Please visit the project website for details https://waabi.ai/labelformer

* CoRL 2023
* 20 pages, 8 figures, 7 tables

Via

Access Paper or Ask Questions

Learning Realistic Traffic Agents in Closed-loop

Nov 02, 2023

Chris Zhang, James Tu, Lunjun Zhang, Kelvin Wong, Simon Suo, Raquel Urtasun

Abstract:Realistic traffic simulation is crucial for developing self-driving software in a safe and scalable manner prior to real-world deployment. Typically, imitation learning (IL) is used to learn human-like traffic agents directly from real-world observations collected offline, but without explicit specification of traffic rules, agents trained from IL alone frequently display unrealistic infractions like collisions and driving off the road. This problem is exacerbated in out-of-distribution and long-tail scenarios. On the other hand, reinforcement learning (RL) can train traffic agents to avoid infractions, but using RL alone results in unhuman-like driving behaviors. We propose Reinforcing Traffic Rules (RTR), a holistic closed-loop learning objective to match expert demonstrations under a traffic compliance constraint, which naturally gives rise to a joint IL + RL approach, obtaining the best of both worlds. Our method learns in closed-loop simulations of both nominal scenarios from real-world datasets as well as procedurally generated long-tail scenarios. Our experiments show that RTR learns more realistic and generalizable traffic simulation policies, achieving significantly better tradeoffs between human-like driving and traffic compliance in both nominal and long-tail scenarios. Moreover, when used as a data generation tool for training prediction models, our learned traffic policy leads to considerably improved downstream prediction metrics compared to baseline traffic agents. For more information, visit the project website: https://waabi.ai/rtr

* CORL 2023

Via

Access Paper or Ask Questions

CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation

Nov 02, 2023

Jingkang Wang, Sivabalan Manivasagam, Yun Chen, Ze Yang, Ioan Andrei Bârsan, Anqi Joyce Yang, Wei-Chiu Ma, Raquel Urtasun

Abstract:Realistic simulation is key to enabling safe and scalable development of % self-driving vehicles. A core component is simulating the sensors so that the entire autonomy system can be tested in simulation. Sensor simulation involves modeling traffic participants, such as vehicles, with high quality appearance and articulated geometry, and rendering them in real time. The self-driving industry has typically employed artists to build these assets. However, this is expensive, slow, and may not reflect reality. Instead, reconstructing assets automatically from sensor data collected in the wild would provide a better path to generating a diverse and large set with good real-world coverage. Nevertheless, current reconstruction approaches struggle on in-the-wild sensor data, due to its sparsity and noise. To tackle these issues, we present CADSim, which combines part-aware object-class priors via a small set of CAD models with differentiable rendering to automatically reconstruct vehicle geometry, including articulated wheels, with high-quality appearance. Our experiments show our method recovers more accurate shapes from sparse data compared to existing approaches. Importantly, it also trains and renders efficiently. We demonstrate our reconstructed vehicles in several applications, including accurate testing of autonomy perception systems.

* CoRL 2022. Project page: https://waabi.ai/cadsim/

Via

Access Paper or Ask Questions

Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion

Nov 02, 2023

Lunjun Zhang, Yuwen Xiong, Ze Yang, Sergio Casas, Rui Hu, Raquel Urtasun

Abstract:Learning world models can teach an agent how the world works in an unsupervised manner. Even though it can be viewed as a special case of sequence modeling, progress for scaling world models on robotic applications such as autonomous driving has been somewhat less rapid than scaling language models with Generative Pre-trained Transformers (GPT). We identify two reasons as major bottlenecks: dealing with complex and unstructured observation space, and having a scalable generative model. Consequently, we propose a novel world modeling approach that first tokenizes sensor observations with VQVAE, then predicts the future via discrete diffusion. To efficiently decode and denoise tokens in parallel, we recast Masked Generative Image Transformer into the discrete diffusion framework with a few simple changes, resulting in notable improvement. When applied to learning world models on point cloud observations, our model reduces prior SOTA Chamfer distance by more than 65% for 1s prediction, and more than 50% for 3s prediction, across NuScenes, KITTI Odometry, and Argoverse2 datasets. Our results demonstrate that discrete diffusion on tokenized agent experience can unlock the power of GPT-like unsupervised learning for robotic agents.

Via

Access Paper or Ask Questions

UniSim: A Neural Closed-Loop Sensor Simulator

Aug 03, 2023

Ze Yang, Yun Chen, Jingkang Wang, Sivabalan Manivasagam, Wei-Chiu Ma, Anqi Joyce Yang, Raquel Urtasun

Figure 1 for UniSim: A Neural Closed-Loop Sensor Simulator

Figure 2 for UniSim: A Neural Closed-Loop Sensor Simulator

Figure 3 for UniSim: A Neural Closed-Loop Sensor Simulator

Figure 4 for UniSim: A Neural Closed-Loop Sensor Simulator

Abstract:Rigorously testing autonomy systems is essential for making safe self-driving vehicles (SDV) a reality. It requires one to generate safety critical scenarios beyond what can be collected safely in the world, as many scenarios happen rarely on public roads. To accurately evaluate performance, we need to test the SDV on these scenarios in closed-loop, where the SDV and other actors interact with each other at each timestep. Previously recorded driving logs provide a rich resource to build these new scenarios from, but for closed loop evaluation, we need to modify the sensor data based on the new scene configuration and the SDV's decisions, as actors might be added or removed and the trajectories of existing actors and the SDV will differ from the original log. In this paper, we present UniSim, a neural sensor simulator that takes a single recorded log captured by a sensor-equipped vehicle and converts it into a realistic closed-loop multi-sensor simulation. UniSim builds neural feature grids to reconstruct both the static background and dynamic actors in the scene, and composites them together to simulate LiDAR and camera data at new viewpoints, with actors added or removed and at new placements. To better handle extrapolated views, we incorporate learnable priors for dynamic objects, and leverage a convolutional network to complete unseen regions. Our experiments show UniSim can simulate realistic sensor data with small domain gap on downstream tasks. With UniSim, we demonstrate closed-loop evaluation of an autonomy system on safety-critical scenarios as if it were in the real world.

* CVPR 2023 Highlight. Project page: https://waabi.ai/research/unisim/

Via

Access Paper or Ask Questions