Abstract: Despite their central role in the success of foundation models and large-scale language modeling, the theoretical foundations governing the operation of Transformers remain only partially understood. Contemporary research has largely focused on their representational capacity for language comprehension and their prowess in in-context learning, frequently under idealized assumptions such as linearized attention mechanisms. Although Transformers were initially conceived to model sequence-to-sequence transformations, a fundamental and unresolved question is whether they can robustly perform functional regression over sequences of input tokens. This question assumes heightened importance given the inherently non-Euclidean geometry underlying real-world data distributions. In this work, we establish that Transformers equipped with softmax-based nonlinear attention are uniformly consistent when tasked with executing Ordinary Least Squares (OLS) regression, provided both the inputs and outputs are embedded in hyperbolic space. We derive deterministic upper bounds on the empirical error which, in the asymptotic regime, decay at a provable rate of $\mathcal{O}(t^{-1/2d})$, where $t$ denotes the number of input tokens and $d$ the embedding dimensionality. Notably, our analysis subsumes the Euclidean setting as a special case, recovering analogous convergence guarantees parameterized by the intrinsic dimensionality of the data manifold. These theoretical insights are corroborated through empirical evaluations on real-world datasets involving both continuous and categorical response variables.
Abstract: Clustering aims to group similar data points in an unsupervised manner. Yet clustering complex datasets that contain closely intertwined shapes poses significant challenges. Prevailing clustering algorithms depend largely on similarity measures based on Euclidean metrics. Exploiting topological characteristics offers a more promising route to clustering such datasets. Topological clustering algorithms predominantly view the point set through the lens of simplicial complexes and persistent homology. Even so, existing topological clustering algorithms do not fully exploit topological structure and perform inconsistently on some highly complicated datasets. This work aims to mitigate these limitations by identifying topologically similar neighbors through the Vietoris-Rips complex and Betti number filtration. In addition, we introduce the concept of Betti sequences to flexibly capture essential features of the topological structures. The proposed algorithm is adept at clustering complex, intertwined shapes contained in the datasets. We carried out experiments on several synthetic and real-world datasets, on which our algorithm performed favorably compared with some well-known topology-based clustering algorithms.
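To make the idea of a Betti number filtration concrete, the sketch below counts connected components (the zeroth Betti number) of a Vietoris-Rips complex at several scales and collects the counts into a list. It is only an illustration: the function names `betti0` and `betti0_sequence`, the convention that two points are joined when their distance is at most `eps`, and the toy point set are all assumptions, and this is not the algorithm or the Betti sequence definition proposed in the abstract.

```python
import math
from itertools import combinations


def betti0(points, eps):
    """Count connected components of the Vietoris-Rips complex at scale eps.

    Assumed convention: two points share an edge when their Euclidean
    distance is at most eps. The zeroth Betti number depends only on
    these edges, so a union-find over the edges is enough.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


def betti0_sequence(points, scales):
    """Zeroth Betti number at each scale of an increasing filtration."""
    return [betti0(points, eps) for eps in scales]


# Toy data (hypothetical): a tight triple of points and a distant pair.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
sequence = betti0_sequence(points, [0.5, 1.0, 1.5, 20.0])
print(sequence)  # [5, 2, 2, 1]
```

The component count drops from five isolated points to two groups and finally to one as the scale grows, which is the kind of scale-dependent signature a topological clustering method can compare across neighborhoods.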
Abstract: Deep Convolutional Neural Networks (DCNNs) have become a pervasive tool for a wide range of applications in computer vision. Despite their capability to capture intricate patterns in data, the underlying embedding space remains Euclidean and primarily pursues contractive convolution. Several instances illustrate the deteriorating performance of DCNNs. Recent advances in neural networks in hyperbolic spaces have gained traction, motivating the development of deep convolutional neural networks in hyperbolic space. In this work, we propose a Hyperbolic DCNN based on the Poincar\'{e} Disc. The work centers on analyzing the nature of expansive convolution in the non-Euclidean domain. We further offer extensive theoretical insights into the universal consistency of expansive convolution in hyperbolic space. We ran simulations on both synthetic and real-world datasets. The experimental results show that the hyperbolic convolutional architecture outperforms its Euclidean counterparts by a notable margin.
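For readers unfamiliar with the Poincar\'{e} Disc, the following sketch computes the standard geodesic distance between two points of the unit disc with curvature -1. The function name `poincare_distance` and the sample points are illustrative assumptions; the sketch shows only the geometry of the embedding space and is not the hyperbolic convolution proposed in the abstract.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance in the Poincare disc (curvature -1).

    Uses d(u, v) = arcosh(1 + 2 |u - v|^2 / ((1 - |u|^2) (1 - |v|^2))).
    Both points must lie strictly inside the unit disc.
    """
    def sq_norm(x):
        return sum(c * c for c in x)

    diff = [a - b for a, b in zip(u, v)]
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * sq_norm(diff) / denom)


# The same Euclidean step of 0.1 is much longer near the boundary.
near_origin = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.8, 0.0), (0.9, 0.0))
print(near_origin, near_boundary)
```

Distances grow rapidly toward the boundary of the disc, which is why hyperbolic embeddings are often considered for data with hierarchical or tree-like structure.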
Abstract: Super-Resolution (SR) is a long-standing image processing problem that aims to improve the quality of a Low-Resolution (LR) sample up to the standard of its High-Resolution (HR) counterpart. We address this problem by introducing Super-Resolution Generator (SuRGe), a fully-convolutional Generative Adversarial Network (GAN)-based architecture for SR. We show that distinct convolutional features obtained at increasing depths of a GAN generator can be optimally combined by a set of learnable convex weights to improve the quality of generated SR samples. We also employ the Jensen-Shannon loss between the SR and HR distributions and the Gromov-Wasserstein loss between the LR and SR distributions, helping the generator of SuRGe better exploit the available information to improve SR. Moreover, we train the discriminator of SuRGe with the Wasserstein loss with gradient penalty, primarily to prevent mode collapse. The proposed SuRGe, as an end-to-end GAN workflow tailor-made for super-resolution, offers improved performance while maintaining low inference time. The efficacy of SuRGe is substantiated by its superior performance compared to 18 state-of-the-art contenders on 10 benchmark datasets.
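One common way to obtain a set of learnable convex weights is to pass unconstrained parameters through a softmax, so that the weights stay non-negative and sum to one. The sketch below combines feature vectors in that way. The softmax parameterization, the names `softmax` and `convex_combine`, and the toy feature vectors are assumptions made for illustration; the abstract does not specify how SuRGe parameterizes its weights.

```python
import math


def softmax(logits):
    """Map unconstrained logits to non-negative weights that sum to one."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def convex_combine(features, logits):
    """Convex combination of equally sized feature vectors.

    `features` is a list of feature vectors, one per generator depth
    (hypothetical stand-ins for flattened feature maps); `logits` holds
    one trainable scalar per vector.
    """
    weights = softmax(logits)
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]


# Toy features from three depths (hypothetical values).
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
uniform = convex_combine(features, [0.0, 0.0, 0.0])
print(uniform)
```

With equal logits the result is the plain average of the features; training the logits would shift the emphasis toward whichever depths help most.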
Abstract: Graph Transformers (GTs) facilitate the comprehension of graph-structured data by computing self-attention over node pairs, but they do so without considering node position information. To address this limitation, we propose an efficient framework that incorporates Positional Encodings (PEs) into the Transformer by generating a set of learnable positional encodings in hyperbolic space, a non-Euclidean domain. This approach lets us explore diverse options for selecting the optimal PEs for specific downstream tasks, leveraging hyperbolic neural networks or hyperbolic graph convolutional networks. Additionally, we repurpose these positional encodings to mitigate the impact of over-smoothing in deep Graph Neural Networks (GNNs). Comprehensive experiments on molecular benchmark datasets and on co-author and co-purchase networks substantiate the effectiveness of hyperbolic positional encodings in enhancing the performance of deep GNNs.