Tatiana Tommasi

Politecnico di Torino, Italy, Italian Institute of Technology

Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning

Jun 03, 2024

Leonardo Iurada, Marco Ciccone, Tatiana Tommasi

Abstract:Recent advances in neural network pruning have shown how it is possible to reduce the computational costs and memory demands of deep learning models before training. We focus on this framework and propose a new pruning at initialization algorithm that leverages the Neural Tangent Kernel (NTK) theory to align the training dynamics of the sparse network with that of the dense one. Specifically, we show how the usually neglected data-dependent component in the NTK's spectrum can be taken into account by providing an analytical upper bound to the NTK's trace obtained by decomposing neural networks into individual paths. This leads to our Path eXclusion (PX), a foresight pruning method designed to preserve the parameters that mostly influence the NTK's trace. PX is able to find lottery tickets (i.e. good paths) even at high sparsity levels and largely reduces the need for additional training. When applied to pre-trained models it extracts subnetworks directly usable for several downstream tasks, resulting in performance comparable to those of the dense counterpart but with substantial cost and computational savings. Code available at: https://github.com/iurada/px-ntk-pruning

* Accepted CVPR 2024 - https://iurada.github.io/PX

Segmentation Re-thinking Uncertainty Estimation Metrics for Semantic Segmentation

Apr 08, 2024

Qitian Ma, Shyam Nanda Rai, Carlo Masone, Tatiana Tommasi

Abstract:In the domain of computer vision, semantic segmentation emerges as a fundamental application within machine learning, wherein individual pixels of an image are classified into distinct semantic categories. This task transcends traditional accuracy metrics by incorporating uncertainty quantification, a critical measure for assessing the reliability of each segmentation prediction. Such quantification is instrumental in facilitating informed decision-making, particularly in applications where precision is paramount. Within this nuanced framework, the metric known as PAvPU (Patch Accuracy versus Patch Uncertainty) has been developed as a specialized tool for evaluating entropy-based uncertainty in image segmentation tasks. However, our investigation identifies three core deficiencies within the PAvPU framework and proposes robust solutions aimed at refining the metric. By addressing these issues, we aim to enhance the reliability and applicability of uncertainty quantification, especially in scenarios that demand high levels of safety and accuracy, thus contributing to the advancement of semantic segmentation methodologies in critical applications.
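For reference, the baseline PAvPU metric that this abstract critiques is commonly defined over image patches as (n_ac + n_iu) / (n_ac + n_au + n_ic + n_iu), where a = accurate, i = inaccurate, c = certain and u = uncertain. The sketch below is a minimal illustration of that commonly cited form only; it does not implement the refinements the paper proposes. The function name, the per-patch inputs and both threshold defaults are assumptions made for this example.

```python
def pavpu(patch_accuracy, patch_uncertainty, acc_threshold=0.5, unc_threshold=0.4):
    """Baseline PAvPU over a list of patches (illustrative sketch).

    patch_accuracy:    per-patch pixel accuracy in [0, 1]
    patch_uncertainty: per-patch mean uncertainty (e.g. mean entropy)
    Both thresholds are hypothetical defaults, not values from the paper.
    """
    if len(patch_accuracy) != len(patch_uncertainty):
        raise ValueError("inputs must have the same number of patches")
    if len(patch_accuracy) == 0:
        raise ValueError("at least one patch is required")

    n_ac = n_au = n_ic = n_iu = 0
    for acc, unc in zip(patch_accuracy, patch_uncertainty):
        accurate = acc >= acc_threshold   # patch counts as accurate
        certain = unc < unc_threshold     # patch counts as certain
        if accurate and certain:
            n_ac += 1
        elif accurate:
            n_au += 1
        elif certain:
            n_ic += 1
        else:
            n_iu += 1

    # Reward the two "desirable" outcomes: accurate-and-certain,
    # inaccurate-and-uncertain.
    return (n_ac + n_iu) / (n_ac + n_au + n_ic + n_iu)


# One patch in each of the four cells: two of four are desirable.
score = pavpu([1.0, 0.9, 0.2, 0.1], [0.1, 0.8, 0.9, 0.2])
print(score)
```

Because both thresholds are free parameters, the score can shift noticeably with their choice, so comparisons between models should use the same thresholds.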

* Premature Submission: accidentally submitted before it was ready

PolyDiff: Generating 3D Polygonal Meshes with Diffusion Models

Dec 18, 2023

Antonio Alliegro, Yawar Siddiqui, Tatiana Tommasi, Matthias Nießner

Abstract:We introduce PolyDiff, the first diffusion-based approach capable of directly generating realistic and diverse 3D polygonal meshes. In contrast to methods that use alternate 3D shape representations (e.g. implicit representations), our approach is a discrete denoising diffusion probabilistic model that operates natively on the polygonal mesh data structure. This enables learning of both the geometric properties of vertices and the topological characteristics of faces. Specifically, we treat meshes as quantized triangle soups, progressively corrupted with categorical noise in the forward diffusion phase. In the reverse diffusion phase, a transformer-based denoising network is trained to revert the noising process, restoring the original mesh structure. At inference, new meshes can be generated by applying this denoising network iteratively, starting with a completely noisy triangle soup. Consequently, our model is capable of producing high-quality 3D polygonal meshes, ready for integration into downstream 3D workflows. Our extensive experimental analysis shows that PolyDiff achieves a significant advantage (avg. FID and JSD improvement of 18.2 and 5.8 respectively) over current state-of-the-art methods.

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers

Nov 27, 2023

Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, Matthias Nießner

Abstract:We introduce MeshGPT, a new approach for generating triangle meshes that reflects the compactness typical of artist-created meshes, in contrast to dense triangle meshes extracted by iso-surfacing methods from neural fields. Inspired by recent advances in powerful large language models, we adopt a sequence-based approach to autoregressively generate triangle meshes as sequences of triangles. We first learn a vocabulary of latent quantized embeddings, using graph convolutions, which inform these embeddings of the local mesh geometry and topology. These embeddings are sequenced and decoded into triangles by a decoder, ensuring that they can effectively reconstruct the mesh. A transformer is then trained on this learned vocabulary to predict the index of the next embedding given previous embeddings. Once trained, our model can be autoregressively sampled to generate new triangle meshes, directly generating compact meshes with sharp edges, more closely imitating the efficient triangulation patterns of human-crafted meshes. MeshGPT demonstrates a notable improvement over state of the art mesh generation methods, with a 9% increase in shape coverage and a 30-point enhancement in FID scores across various categories.

* Project Page: https://nihalsid.github.io/mesh-gpt/, Video: https://youtu.be/UV90O1_69_o

Domain Randomization via Entropy Maximization

Nov 03, 2023

Gabriele Tiboni, Pascal Klink, Jan Peters, Tatiana Tommasi, Carlo D'Eramo, Georgia Chalvatzaki

Abstract:Varying dynamics parameters in simulation is a popular Domain Randomization (DR) approach for overcoming the reality gap in Reinforcement Learning (RL). Nevertheless, DR heavily hinges on the choice of the sampling distribution of the dynamics parameters, since high variability is crucial to regularize the agent's behavior but notoriously leads to overly conservative policies when randomizing excessively. In this paper, we propose a novel approach to address sim-to-real transfer, which automatically shapes dynamics distributions during training in simulation without requiring real-world data. We introduce DOmain RAndomization via Entropy MaximizatiON (DORAEMON), a constrained optimization problem that directly maximizes the entropy of the training distribution while retaining generalization capabilities. In achieving this, DORAEMON gradually increases the diversity of sampled dynamics parameters as long as the probability of success of the current policy is sufficiently high. We empirically validate the consistent benefits of DORAEMON in obtaining highly adaptive and generalizable policies, i.e. solving the task at hand across the widest range of dynamics parameters, as opposed to representative baselines from the DR literature. Notably, we also demonstrate the Sim2Real applicability of DORAEMON through its successful zero-shot transfer in a robotic manipulation setup under unknown real-world parameters.

* Project website at https://gabrieletiboni.github.io/doraemon/

OpenPatch: a 3D patchwork for Out-Of-Distribution detection

Oct 06, 2023

Paolo Rabino, Antonio Alliegro, Francesco Cappio Borlino, Tatiana Tommasi

Abstract:Moving deep learning models from the laboratory setting to the open world entails preparing them to handle unforeseen conditions. In several applications the occurrence of novel classes during deployment poses a significant threat, thus it is crucial to effectively detect them. Ideally, this skill should be used when needed without requiring any further computational training effort at every new task. Out-of-distribution detection has attracted significant attention in recent years; however, the majority of studies deal with 2D images, ignoring the inherent 3D nature of the real world and often confusing domain and semantic novelty. In this work, we focus on the latter, considering the objects' geometric structure captured by 3D point clouds regardless of the specific domain. We advance the field by introducing OpenPatch, which builds on a large pre-trained model and simply extracts from its intermediate features a set of patch representations that describe each known class. For any new sample, we obtain a novelty score by evaluating whether it can be recomposed mainly by patches of a single known class or rather via the contribution of multiple classes. We present an extensive experimental evaluation of our approach for the task of semantic novelty detection on real-world point cloud samples when the reference known data are synthetic. We demonstrate that OpenPatch excels in both the full and few-shot known sample scenarios, showcasing its robustness across varying pre-training objectives and network backbones. The inherent training-free nature of our method allows for its immediate application to a wide array of real-world tasks, offering a compelling advantage over approaches that need expensive retraining efforts.

An Outlook into the Future of Egocentric Vision

Aug 14, 2023

Chiara Plizzari, Gabriele Goletto, Antonino Furnari, Siddhant Bansal, Francesco Ragusa, Giovanni Maria Farinella, Dima Damen, Tatiana Tommasi

Abstract:What will the future be? We wonder! In this survey, we explore the gap between current research in egocentric vision and the ever-anticipated future, where wearable computing, with outward facing cameras and digital overlays, is expected to be integrated into our everyday lives. To understand this gap, the article starts by envisaging the future through character-based stories, showcasing through examples the limitations of current technology. We then provide a mapping between this future and previously defined research tasks. For each task, we survey its seminal works, current state-of-the-art methodologies and available datasets, then reflect on shortcomings that limit its applicability to future research. Note that this survey focuses on software models for egocentric vision, independent of any specific hardware. The paper concludes with recommendations for areas of immediate exploration so as to unlock our path to the future always-on, personalised and life-enhancing egocentric vision.

* We invite comments, suggestions and corrections here: https://openreview.net/forum?id=V3974SUk1w

Large Class Separation is not what you need for Relational Reasoning-based OOD Detection

Jul 12, 2023

Lorenzo Li Lu, Giulia D'Ascenzi, Francesco Cappio Borlino, Tatiana Tommasi

Abstract:Standard recognition approaches are unable to deal with novel categories at test time. Their overconfidence on the known classes makes the predictions unreliable for safety-critical applications such as healthcare or autonomous driving. Out-Of-Distribution (OOD) detection methods provide a solution by identifying semantic novelty. Most of these methods leverage a learning stage on the known data, which means training (or fine-tuning) a model to capture the concept of normality. This process is clearly sensitive to the number of available samples and might be computationally expensive for on-board systems. A viable alternative is that of evaluating similarities in the embedding space produced by large pre-trained models without any further learning effort. We focus exactly on such a fine-tuning-free OOD detection setting. This work presents an in-depth analysis of the recently introduced relational reasoning pre-training and investigates the properties of the learned embedding, highlighting the existence of a correlation between the inter-class feature distance and the OOD detection accuracy. As the class separation depends on the chosen pre-training objective, we propose an alternative loss function to control the inter-class margin, and we show its advantage with thorough experiments.

* Accepted for publication at ICIAP 2023

Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics

Mar 25, 2023

Leonardo Iurada, Silvia Bucci, Timothy M. Hospedales, Tatiana Tommasi

Abstract:Deep learning-based recognition systems are deployed at scale for several real-world applications that inevitably involve our social life. Although being of great support when making complex decisions, they might capture spurious data correlations and leverage sensitive attributes (e.g. age, gender, ethnicity). How to factor out this information while keeping a high prediction performance is a task with still several open questions, many of which are shared with those of the domain adaptation and generalization literature which focuses on avoiding visual domain biases. In this work, we propose an in-depth study of the relationship between cross-domain learning (CD) and model fairness by introducing a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks. After having highlighted the limits of the current evaluation metrics, we introduce a new Harmonic Fairness (HF) score to assess jointly how fair and accurate every model is with respect to a reference baseline. Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter. Overall, our work paves the way for a more systematic analysis of fairness problems in computer vision. Code available at: https://github.com/iurada/fairness_crossdomain

Domain Randomization for Robust, Affordable and Effective Closed-loop Control of Soft Robots

Mar 07, 2023

Gabriele Tiboni, Andrea Protopapa, Tatiana Tommasi, Giuseppe Averta

Abstract:Soft robots are becoming extremely popular thanks to their intrinsic safety to contacts and adaptability. However, the potentially infinite number of Degrees of Freedom makes their modeling a daunting task, and in many cases only an approximated description is available. This challenge makes reinforcement learning (RL) based approaches inefficient when deployed on a realistic scenario, due to the large domain gap between models and the real platform. In this work, we demonstrate, for the first time, how Domain Randomization (DR) can solve this problem by enhancing RL policies with: i) a higher robustness w.r.t. environmental changes; ii) a higher affordability of learned policies when the target model differs significantly from the training model; iii) a higher effectiveness of the policy, which can even autonomously learn to exploit the environment to increase the robot capabilities (environmental constraints exploitation). Moreover, we introduce a novel algorithmic extension of previous adaptive domain randomization methods for the automatic inference of dynamics parameters for deformable objects. We provide results on four different tasks and two soft robot designs, opening interesting perspectives for future research on Reinforcement Learning for closed-loop soft robot control.

* Project website at https://andreaprotopapa.github.io/dr-soro/
