Steven L. Waslander

University of Toronto

LMGenDrive: Bridging Multimodal Understanding and Generative World Modeling for End-to-End Driving

Apr 09, 2026
Via arXiv

DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning

Apr 02, 2026
Via arXiv

SCATR: Mitigating New Instance Suppression in LiDAR-based Tracking-by-Attention via Second Chance Assignment and Track Query Dropout

Mar 02, 2026
Via arXiv

CLIP Is Shortsighted: Paying Attention Beyond the First Sentence

Feb 25, 2026
Via arXiv

ToaSt: Token Channel Selection and Structured Pruning for Efficient ViT

Feb 17, 2026
Via arXiv

STaR: Scalable Task-Conditioned Retrieval for Long-Horizon Multimodal Robot Memory

Feb 12, 2026
Via arXiv

DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving

Jan 04, 2026
Via arXiv

Complete Gaussian Splats from a Single Image with Denoising Diffusion Models

Aug 29, 2025
Via arXiv

ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting

Aug 09, 2025
Via arXiv

OpenNav: Open-World Navigation with Multimodal Large Language Models

Jul 24, 2025
Via arXiv