The fusion of color and lidar data plays a critical role in object detection for autonomous vehicles, which base their decision making on these inputs. While existing methods exploit redundant and complimentary information under good imaging conditions, they fail to do this in adverse weather and imaging conditions where the sensory streams can be asymmetrically distorted. These rare "edge-case" scenarios are not represented in available data sets, and existing fusion architectures are not designed to handle severe asymmetric distortions. We present a deep fusion architecture that allows for robust fusion in fog and snow without having large labeled training data available for these scenarios. Departing from proposal-level fusion, we propose a real-time single-shot model that adaptively fuses features driven by temporal coherence of the distortions. We validate the proposed method, trained on clean data, in simulation and on unseen conditions of in-the-wild driving scenarios.