Sanja Fidler

NVIDIA, University of Toronto, Vector Institute

Personalized Federated Learning with First Order Model Optimization

Jan 28, 2021

Michael Zhang, Karan Sapra, Sanja Fidler, Serena Yeung, Jose M. Alvarez

Abstract: While federated learning traditionally aims to train a single global model across decentralized local datasets, one model may not always be ideal for all participating clients. Here we propose an alternative, where each client only federates with other relevant clients to obtain a stronger model per client-specific objectives. To achieve this personalization, rather than computing a single model average with constant weights for the entire federation as in traditional FL, we efficiently calculate optimal weighted model combinations for each client, based on estimating how much a client can benefit from another's model. We do not assume knowledge of any underlying data distributions or client similarities, and allow each client to optimize for arbitrary target distributions of interest, enabling greater flexibility for personalization. We evaluate and characterize our method on a variety of federated settings, datasets, and degrees of local data heterogeneity. Our method outperforms existing alternatives, while also enabling new features for personalized FL such as transfer outside of local data distributions.

* ICLR 2021
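
The idea of weighting peer models by how much they help a given client can be illustrated with a small sketch. This is a simplified illustration and not the paper's exact update rule: the function name `personalized_combination`, the validation-loss callback `val_loss`, and the choice of "loss improvement" as the benefit score are all assumptions made here for the example.

```python
from typing import Callable, List, Sequence


def personalized_combination(
    own_params: Sequence[float],
    peer_params: Sequence[Sequence[float]],
    val_loss: Callable[[Sequence[float]], float],
) -> List[float]:
    """Combine peer models with weights based on validation-loss improvement.

    Assumption (not from the abstract): a peer's benefit is how much lower its
    loss is than the client's own model on the client's validation data.
    """
    base_loss = val_loss(own_params)
    # Peers that do not improve on the client's own loss get zero weight.
    gains = [max(0.0, base_loss - val_loss(p)) for p in peer_params]
    total = sum(gains)
    if total == 0.0:
        # No peer helps, so the client keeps its current model.
        return list(own_params)
    weights = [g / total for g in gains]
    # Move the client's parameters toward the weighted mix of helpful peers.
    return [
        own + sum(w * (p[i] - own) for w, p in zip(weights, peer_params))
        for i, own in enumerate(own_params)
    ]
```

In this sketch each client computes its own weights from its own validation data, so two clients can end up with different combinations of the same set of peer models.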

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

Jan 26, 2021

Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler

Abstract: Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes. State-of-the-art methods typically encode the SDF with a large, fixed-size neural network to approximate complex shapes with implicit surfaces. Rendering with these large networks is, however, computationally expensive since it requires many forward passes through the network for every pixel, making these representations impractical for real-time graphics. We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs, while achieving state-of-the-art geometry reconstruction quality. We represent implicit surfaces using an octree-based feature volume which adaptively fits shapes with multiple discrete levels of detail (LODs), and enables continuous LOD with SDF interpolation. We further develop an efficient algorithm to directly render our novel neural SDF representation in real-time by querying only the necessary LODs with sparse octree traversal. We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works. Furthermore, it produces state-of-the-art reconstruction quality for complex shapes under both 3D geometric and 2D image-space metrics.
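
To see why rendering an SDF needs many evaluations per pixel, consider sphere tracing, one common way to ray-march a signed distance function. The abstract does not say which marching scheme is used, so this is only a generic sketch: the analytic `sphere_sdf` stands in for a neural network, and `sphere_trace` and its parameters are hypothetical names chosen for the example.

```python
import math
from typing import Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


def sphere_sdf(p: Vec3, radius: float = 1.0) -> float:
    """Signed distance from point p to a sphere centred at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def sphere_trace(
    origin: Vec3,
    direction: Vec3,
    sdf: Callable[[Vec3], float],
    max_steps: int = 64,
    eps: float = 1e-4,
    max_dist: float = 100.0,
) -> Tuple[Optional[float], int]:
    """March along a unit-length ray; return (hit distance or None, SDF evaluations)."""
    t = 0.0
    evals = 0
    for _ in range(max_steps):
        p = (
            origin[0] + t * direction[0],
            origin[1] + t * direction[1],
            origin[2] + t * direction[2],
        )
        d = sdf(p)  # with a neural SDF, each call is one forward pass
        evals += 1
        if d < eps:
            return t, evals
        t += d  # safe step: no surface is closer than d
        if t > max_dist:
            break
    return None, evals
```

Every pixel casts at least one such ray, and every step along the ray queries the SDF once, which is where the per-pixel cost of a large network accumulates.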

UniCon: Universal Neural Controller For Physics-based Character Motion

Nov 30, 2020

Tingwu Wang, Yunrong Guo, Maria Shugrina, Sanja Fidler

Abstract: The field of physics-based animation is gaining importance due to the increasing demand for realism in video games and films, and has recently seen wide adoption of data-driven techniques, such as deep reinforcement learning (RL), which learn control from (human) demonstrations. While RL has shown impressive results at reproducing individual motions and interactive locomotion, existing methods are limited in their ability to generalize to new motions and their ability to compose a complex motion sequence interactively. In this paper, we propose a physics-based universal neural controller (UniCon) that learns to master thousands of motions with different styles by learning on large-scale motion datasets. UniCon is a two-level framework that consists of a high-level motion scheduler and an RL-powered low-level motion executor, which is our key innovation. By systematically analyzing existing multi-motion RL frameworks, we introduce a novel objective function and training techniques which make a significant leap in performance. Once trained, our motion executor can be combined with different high-level schedulers without the need for retraining, enabling a variety of real-time interactive applications. We show that UniCon can support keyboard-driven control, compose motion sequences drawn from a large pool of locomotion and acrobatics skills, and teleport a person captured on video to a physics-based virtual avatar. Numerical and qualitative results demonstrate a significant improvement in efficiency, robustness and generalizability of UniCon over the prior state of the art, showcasing transferability to unseen motions, unseen humanoid models and unseen perturbations.

* 15 pages, 15 figures

Emergent Road Rules In Multi-Agent Driving Environments

Nov 21, 2020

Avik Pal, Jonah Philion, Yuan-Hong Liao, Sanja Fidler

Abstract: For autonomous vehicles to safely share the road with human drivers, autonomous vehicles must abide by specific "road rules" that human drivers have agreed to follow. "Road rules" include rules that drivers are required to follow by law -- such as the requirement that vehicles stop at red lights -- as well as more subtle social rules -- such as the implicit designation of fast lanes on the highway. In this paper, we provide empirical evidence that suggests that -- instead of hard-coding road rules into self-driving algorithms -- a scalable alternative may be to design multi-agent environments in which road rules emerge as optimal solutions to the problem of maximizing traffic flow. We analyze what ingredients in driving environments cause the emergence of these road rules and find that two crucial factors are noisy perception and agents' spatial density. We provide qualitative and quantitative evidence of the emergence of seven social driving behaviors, ranging from obeying traffic signals to following lanes, all of which emerge from training agents to drive quickly to destinations without colliding. Our results add empirical support for the social road rules that countries worldwide have agreed on for safe, efficient driving.

* Project Page: http://fidler-lab.github.io/social-driving

Learning Deformable Tetrahedral Meshes for 3D Reconstruction

Nov 03, 2020

Jun Gao, Wenzheng Chen, Tommy Xiang, Alec Jacobson, Morgan McGuire, Sanja Fidler

Abstract: 3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics. Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations. We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem. Unlike existing volumetric approaches, DefTet optimizes for both vertex placement and occupancy, and is differentiable with respect to standard 3D reconstruction loss functions. It is thus simultaneously high-precision, volumetric, and amenable to learning-based neural architectures. We show that it can represent arbitrary, complex topology, is both memory and computationally efficient, and can produce high-fidelity reconstructions with a significantly smaller grid size than alternative volumetric approaches. The predicted surfaces are also inherently defined as tetrahedral meshes, and thus do not require post-processing. We demonstrate that DefTet matches or exceeds both the quality of the previous best approaches and the performance of the fastest ones. Our approach obtains high-quality tetrahedral meshes computed directly from noisy point clouds, and is the first to showcase high-quality 3D tet-mesh results using only a single image as input.

* Accepted to NeurIPS 2020

The efficacy of Neural Planning Metrics: A meta-analysis of PKL on nuScenes

Oct 24, 2020

Yiluan Guo, Holger Caesar, Oscar Beijbom, Jonah Philion, Sanja Fidler

Abstract: A high-performing object detection system plays a crucial role in autonomous driving (AD). The performance, typically evaluated in terms of mean Average Precision (mAP), does not take into account the orientation and distance of the actors in the scene, which are important for safe AD. It also ignores environmental context. Recently, Philion et al. proposed a neural planning metric (PKL), based on the KL divergence of a planner's trajectory and the ground-truth route, to accommodate these requirements. In this paper, we use this neural planning metric to score all submissions of the nuScenes detection challenge and analyze the results. We find that while somewhat correlated with mAP, the PKL metric behaves differently under increased traffic density, ego velocity, road curvature and intersections. Finally, we propose ideas to extend the neural planning metric.

* IROS 2020 Workshop on Benchmarking Progress in Autonomous Driving
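
The abstract describes PKL as based on a KL divergence. As a reminder of that quantity only, the sketch below computes the KL divergence between two discrete distributions, D_KL(p || q) = sum_i p_i log(p_i / q_i). It does not reproduce the PKL metric itself: the function name `kl_divergence` and the small floor `eps` used to avoid dividing by zero are assumptions for this example, and any inputs passed to it here would be illustrative numbers rather than planner outputs.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL divergence D_KL(p || q) in nats for two discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps (an
    assumption of this sketch) so the logarithm stays finite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )
```

The divergence is zero when the two distributions agree and grows as they diverge, which is the property a planning-based detection score can build on.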

Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration

Oct 19, 2020

Xavier Puig, Tianmin Shu, Shuang Li, Zilin Wang, Joshua B. Tenenbaum, Sanja Fidler, Antonio Torralba

Abstract: In this paper, we introduce Watch-And-Help (WAH), a challenge for testing social intelligence in agents. In WAH, an AI agent needs to help a human-like agent perform a complex household task efficiently. To succeed, the AI agent needs to i) understand the underlying goal of the task by watching a single demonstration of the human-like agent performing the same task (social perception), and ii) coordinate with the human-like agent to solve the task in an unseen environment as fast as possible (human-AI collaboration). For this challenge, we build VirtualHome-Social, a multi-agent household environment, and provide a benchmark including both planning- and learning-based baselines. We evaluate the performance of AI agents with the human-like agent as well as with real humans using objective metrics and subjective user ratings. Experimental results demonstrate that the proposed challenge and virtual environment enable a systematic evaluation of important aspects of machine social intelligence at scale.

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering

Oct 18, 2020

Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler

Abstract: Differentiable rendering has paved the way to training neural networks to perform "inverse graphics" tasks such as predicting 3D geometry from monocular photographs. To train high-performing models, most of the current approaches rely on multi-view imagery, which is not readily available in practice. Recent Generative Adversarial Networks (GANs) that synthesize images, in contrast, seem to acquire 3D knowledge implicitly during training: object viewpoints can be manipulated by simply manipulating the latent codes. However, these latent codes often lack further physical interpretation and thus GANs cannot easily be inverted to perform explicit 3D reasoning. In this paper, we aim to extract and disentangle 3D knowledge learned by generative models by utilizing differentiable renderers. Key to our approach is to exploit GANs as a multi-view data generator to train an inverse graphics network using an off-the-shelf differentiable renderer, and the trained inverse graphics network as a teacher to disentangle the GAN's latent code into interpretable 3D properties. The entire architecture is trained iteratively using cycle consistency losses. We show that our approach significantly outperforms state-of-the-art inverse graphics networks trained on existing datasets, both quantitatively and via user studies. We further showcase the disentangled GAN as a controllable 3D "neural renderer", complementing traditional graphics renderers.

Fed-Sim: Federated Simulation for Medical Imaging

Sep 01, 2020

Daiqing Li, Amlan Kar, Nishant Ravikumar, Alejandro F Frangi, Sanja Fidler

Abstract: Labelling data is expensive and time-consuming, especially for domains such as medical imaging that contain volumetric imaging data and require expert knowledge. Exploiting a larger pool of labeled data available across multiple centers, such as in federated learning, has also seen limited success since current deep learning approaches do not generalize well to images acquired with scanners from different manufacturers. We aim to address these problems in a common, learning-based image simulation framework which we refer to as Federated Simulation. We introduce a physics-driven generative approach that consists of two learnable neural modules: 1) a module that synthesizes 3D cardiac shapes along with their materials, and 2) a CT simulator that renders these into realistic 3D CT volumes with annotations. Since the model of geometry and material is disentangled from the imaging sensor, it can effectively be trained across multiple medical centers. We show that our data synthesis framework improves the downstream segmentation performance on several datasets. Project Page: https://nv-tlabs.github.io/fed-sim/

* MICCAI 2020 (Early Accept)

Expressive Telepresence via Modular Codec Avatars

Aug 26, 2020

Hang Chu, Shugao Ma, Fernando De la Torre, Sanja Fidler, Yaser Sheikh

Abstract: VR telepresence consists of interacting with another human in a virtual space represented by an avatar. Today most avatars are cartoon-like, but soon the technology will allow video-realistic ones. This paper works in this direction and presents Modular Codec Avatars (MCA), a method to generate hyper-realistic faces driven by the cameras in the VR headset. MCA extends traditional Codec Avatars (CA) by replacing the holistic models with a learned modular representation. It is important to note that traditional person-specific CAs are learned from few training samples, and typically lack robustness and have limited expressiveness when transferring facial expressions. MCAs solve these issues by learning a modulated adaptive blending of different facial components as well as an exemplar-based latent alignment. We demonstrate that MCA achieves improved expressiveness and robustness w.r.t. CA on a variety of real-world datasets and practical scenarios. Finally, we showcase new applications in VR telepresence enabled by the proposed model.

* ECCV 2020
