
Prakash Ishwar


A principled approach to model validation in domain generalization

Apr 02, 2023
Boyang Lyu, Thuan Nguyen, Matthias Scheutz, Prakash Ishwar, Shuchin Aeron

(1 figure)

Domain generalization aims to learn a model with good generalization ability, that is, the learned model should not only perform well on several seen domains but also on unseen domains with different data distributions. State-of-the-art domain generalization methods typically train a representation function followed by a classifier jointly to minimize both the classification risk and the domain discrepancy. However, when it comes to model selection, most of these methods rely on traditional validation routines that select models solely based on the lowest classification risk on the validation set. In this paper, we theoretically demonstrate a trade-off between minimizing classification risk and mitigating domain discrepancy, i.e., it is impossible to achieve the minimum of these two objectives simultaneously. Motivated by this theoretical result, we propose a novel model selection method suggesting that the validation process should account for both the classification risk and the domain discrepancy. We validate the effectiveness of the proposed method by numerical results on several domain generalization datasets.
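As a rough illustration of the proposed idea (not the paper's actual criterion), model selection that scores each checkpoint by a weighted combination of validation classification risk and an estimated domain discrepancy might look like the sketch below; the weight `alpha`, the `domain_gap` estimates, and all numbers are hypothetical:

```python
# Hypothetical sketch: pick a checkpoint by a weighted sum of validation
# classification risk and an estimate of domain discrepancy, instead of
# validation risk alone. `alpha` and all numbers below are illustrative.

def select_model(checkpoints, alpha=0.5):
    """Each checkpoint is a dict with 'val_risk' and 'domain_gap' (lower is better)."""
    return min(checkpoints,
               key=lambda c: (1 - alpha) * c["val_risk"] + alpha * c["domain_gap"])

ckpts = [
    {"name": "epoch10", "val_risk": 0.30, "domain_gap": 0.50},
    {"name": "epoch20", "val_risk": 0.28, "domain_gap": 0.90},  # lowest risk alone
    {"name": "epoch30", "val_risk": 0.32, "domain_gap": 0.20},
]
best = select_model(ckpts)  # risk-only validation would pick "epoch20"
```

With `alpha = 0.5` the combined score prefers "epoch30", whereas selecting on classification risk alone would pick "epoch20", illustrating the trade-off the paper formalizes.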

* Accepted to ICASSP 2023 

Estimating Distances Between People using a Single Overhead Fisheye Camera with Application to Social-Distancing Oversight

Mar 21, 2023
Zhangchi Lu, Mertcan Cokbas, Prakash Ishwar, Janusz Konrad

(4 figures)

Unobtrusive monitoring of distances between people indoors is a useful tool in the fight against pandemics. A natural resource for accomplishing this is surveillance cameras. Unlike previous distance-estimation methods, we use a single overhead fisheye camera with wide-area coverage and propose two approaches. One method leverages a geometric model of the fisheye lens, whereas the other uses a neural network to predict the 3D-world distance from people's locations in a fisheye image. To evaluate our algorithms, we collected a first-of-its-kind dataset using a single fisheye camera, which comprises a wide range of distances between people (1-58 ft) and will be made publicly available. The algorithms achieve 1-2 ft distance error and over 95% accuracy in detecting social-distance violations.
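The geometric approach can be illustrated with an idealized equidistant fisheye model (r = f·θ); the focal length, camera height, and the reduction of each person to a point on the floor are simplifying assumptions for this sketch, not details taken from the paper:

```python
import math

def image_to_floor(r_px, phi, f_px, cam_height):
    """Map an image point (radius r_px from the image center, azimuth phi)
    to floor coordinates, assuming an ideal equidistant fisheye: r = f * theta."""
    theta = r_px / f_px                  # angle from the optical axis (nadir)
    d = cam_height * math.tan(theta)     # radial floor distance from the nadir
    return d * math.cos(phi), d * math.sin(phi)

def person_distance(p1, p2, f_px, cam_height):
    """Euclidean floor distance between two detections given as (r_px, phi)."""
    x1, y1 = image_to_floor(*p1, f_px, cam_height)
    x2, y2 = image_to_floor(*p2, f_px, cam_height)
    return math.hypot(x2 - x1, y2 - y1)
```

For example, two people standing 5 ft from the camera's nadir on opposite sides project to the same image radius with azimuths differing by pi, and the recovered distance is 10 ft.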

* In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP (2023), pages 528-535  

Spatio-Visual Fusion-Based Person Re-Identification for Overhead Fisheye Images

Dec 22, 2022
Mertcan Cokbas, Prakash Ishwar, Janusz Konrad

(4 figures)

Reliable and cost-effective counting of people in large indoor spaces is a significant challenge with many applications. An emerging approach is to deploy multiple fisheye cameras mounted overhead to monitor the whole space. However, due to the overlapping fields of view, person re-identification (PRID) is critical for the accuracy of counting. While PRID has been thoroughly researched for traditional rectilinear cameras, few methods have been proposed for fisheye cameras and their performance is comparatively lower. To close this performance gap, we propose a multi-feature framework for fisheye PRID where we combine deep-learning, color-based and location-based features by means of a novel feature fusion. We evaluate the performance of our framework for various feature combinations on FRIDA, a public fisheye PRID dataset. The results demonstrate that our multi-feature approach outperforms recent appearance-based deep-learning methods by almost 18 percentage points and location-based methods by almost 3 percentage points in accuracy.
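Score-level fusion of heterogeneous cues can be sketched as below; the min-max normalization, the cue weights, and the tiny similarity matrices are hypothetical choices for illustration, not the paper's actual fusion rule:

```python
import numpy as np

def fused_similarity(sims, weights):
    """Weighted sum of per-cue query-gallery similarity matrices
    (e.g. deep, color, location), each min-max normalized to [0, 1] first."""
    fused = np.zeros_like(np.asarray(next(iter(sims.values())), dtype=float))
    for name, s in sims.items():
        s = np.asarray(s, dtype=float)
        rng = s.max() - s.min()
        fused += weights[name] * ((s - s.min()) / rng if rng > 0 else np.zeros_like(s))
    return fused

# Toy example: two queries (rows), two gallery identities (columns).
sims = {"deep": [[0.9, 0.2], [0.1, 0.8]], "location": [[0.7, 0.6], [0.3, 0.9]]}
fused = fused_similarity(sims, {"deep": 0.6, "location": 0.4})
matches = fused.argmax(axis=1)  # each query's best-matching gallery identity
```

Normalizing each cue before summing keeps one cue's score range from dominating the others; the weights would in practice be tuned on validation data.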


Trade-off between reconstruction loss and feature alignment for domain generalization

Oct 26, 2022
Thuan Nguyen, Boyang Lyu, Prakash Ishwar, Matthias Scheutz, Shuchin Aeron

(2 figures)

Domain generalization (DG) is a branch of transfer learning that aims to train learning models on several seen domains and subsequently apply these pre-trained models to other unseen (unknown but related) domains. To deal with the challenging setting in DG where neither the data nor the labels of the unseen domain are available at training time, the most common approach is to design classifiers based on domain-invariant representation features, i.e., latent representations that are unchanged and transferable between domains. Contrary to popular belief, we show that designing classifiers based on invariant representation features alone is necessary but insufficient in DG. Our analysis indicates the necessity of imposing a constraint on the reconstruction loss induced by representation functions to preserve most of the relevant information about the label in the latent space. More importantly, we point out the trade-off between minimizing the reconstruction loss and achieving domain alignment in DG. Our theoretical results motivate a new DG framework that jointly optimizes the reconstruction loss and the domain discrepancy. Both theoretical and numerical results are provided to justify our approach.

* International Conference on Machine Learning and Applications (ICMLA-2022)  
* 13 pages, 2 tables 

FRIDA: Fisheye Re-Identification Dataset with Annotations

Oct 04, 2022
Mertcan Cokbas, John Bolognino, Janusz Konrad, Prakash Ishwar

(4 figures)

Person re-identification (PRID) from side-mounted rectilinear-lens cameras is a well-studied problem. On the other hand, PRID from overhead fisheye cameras is new and largely unstudied, primarily due to the lack of suitable image datasets. To fill this void, we introduce the "Fisheye Re-IDentification Dataset with Annotations" (FRIDA), with 240k+ bounding-box annotations of people, captured by 3 time-synchronized, ceiling-mounted fisheye cameras in a large indoor space. Due to the field-of-view overlap, PRID in this case differs from a typical PRID problem, which we discuss in depth. We also evaluate the performance of 10 state-of-the-art PRID algorithms on FRIDA. We show that for 6 CNN-based algorithms, training on FRIDA boosts the performance by up to 11.64 percentage points in mAP compared to training on a common rectilinear-camera PRID dataset.

* 8 pages 

Joint covariate-alignment and concept-alignment: a framework for domain generalization

Aug 01, 2022
Thuan Nguyen, Boyang Lyu, Prakash Ishwar, Matthias Scheutz, Shuchin Aeron

(3 figures)

In this paper, we propose a novel domain generalization (DG) framework based on a new upper bound to the risk on the unseen domain. In particular, our framework proposes to jointly minimize both the covariate-shift and the concept-shift between the seen domains for better performance on the unseen domain. While the proposed approach can be implemented via an arbitrary combination of covariate-alignment and concept-alignment modules, in this work we use well-established approaches for distributional alignment, namely Maximum Mean Discrepancy (MMD) and Correlation Alignment (CORAL), and use an Invariant Risk Minimization (IRM)-based approach for concept alignment. Our numerical results show that the proposed methods perform as well as or better than the state-of-the-art for domain generalization on several datasets.
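For concreteness, a biased sample estimate of squared MMD under an RBF kernel, one of the two covariate-alignment choices named above, can be written as follows; the kernel bandwidth `gamma` is an arbitrary choice here, not one taken from the paper:

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X (n, d) and Y (m, d)
    under the RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()
```

Identical samples give an estimate of zero, while samples from shifted distributions give a strictly positive value; a DG trainer would add such a term, suitably weighted, to the classification loss.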

* 8 pages, 2 figures, and 1 table. This paper is accepted at 32nd IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2022) 

Conditional entropy minimization principle for learning domain invariant representation features

Jan 25, 2022
Thuan Nguyen, Boyang Lyu, Prakash Ishwar, Matthias Scheutz, Shuchin Aeron

(1 figure)

Invariance-principle-based methods, for example, Invariant Risk Minimization (IRM), have recently emerged as promising approaches for Domain Generalization (DG). Despite the promising theory, invariance-principle-based approaches fail in common classification tasks due to the mixture of true invariant features and spurious invariant features. In this paper, we propose a framework based on the conditional entropy minimization principle to filter out the spurious invariant features, leading to a new algorithm with better generalization capability. We theoretically prove that, under certain assumptions, the representation function can precisely recover the true invariant features. In addition, we show that the proposed approach is closely related to the well-known Information Bottleneck framework. Both theoretical and numerical results are provided to justify our approach.
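The quantity being minimized can be illustrated on a discrete toy distribution; this helper is a generic conditional-entropy computation for illustration, not code from the paper:

```python
import math

def conditional_entropy(joint):
    """H(Y | Z) in bits for a joint distribution given as {(z, y): prob}.
    A feature Z with low H(Y | Z) retains most of the label information."""
    pz = {}
    for (z, _), p in joint.items():           # marginal p(z)
        pz[z] = pz.get(z, 0.0) + p
    h = 0.0
    for (z, y), p in joint.items():           # H(Y|Z) = -sum p(z,y) log p(y|z)
        if p > 0:
            h -= p * math.log2(p / pz[z])
    return h
```

A feature that determines the label gives H(Y | Z) = 0 bits, while an uninformative feature leaves the full label entropy; the principle above prefers representations closer to the former.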

* 7 pages 

Hard Negative Sampling via Regularized Optimal Transport for Contrastive Representation Learning

Nov 04, 2021
Ruijie Jiang, Prakash Ishwar, Shuchin Aeron

(4 figures)

We study the problem of designing hard negative sampling distributions for unsupervised contrastive representation learning. We analyze a novel min-max framework that seeks a representation which minimizes the maximum (worst-case) generalized contrastive learning loss over all couplings (joint distributions between positive and negative samples subject to marginal constraints) and prove that the resulting min-max optimum representation will be degenerate. This provides the first theoretical justification for incorporating additional regularization constraints on the couplings. We re-interpret the min-max problem through the lens of Optimal Transport theory and utilize regularized transport couplings to control the degree of hardness of negative examples. We demonstrate that the state-of-the-art hard negative sampling distributions that were recently proposed are a special case corresponding to entropic regularization of the coupling.
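The special case mentioned at the end, entropic regularization of the coupling, reduces for a fixed anchor to reweighting candidate negatives by a softmax of their similarities; this standalone sketch uses an assumed temperature `beta`:

```python
import numpy as np

def hard_negative_weights(sims, beta=1.0):
    """Softmax(sims / beta) over candidate negatives: more-similar (harder)
    negatives receive more sampling mass; beta -> infinity recovers uniform."""
    s = np.asarray(sims, dtype=float) / beta
    s -= s.max()            # subtract the max for numerical stability
    w = np.exp(s)
    return w / w.sum()
```

Small `beta` concentrates mass on the hardest negatives; large `beta` flattens the distribution toward uniform sampling, recovering the unregularized baseline.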


Ergodic Limits, Relaxations, and Geometric Properties of Random Walk Node Embeddings

Sep 09, 2021
Christy Lin, Daniel Sussman, Prakash Ishwar

(4 figures)

Random walk based node embedding algorithms learn vector representations of nodes by optimizing an objective function of node embedding vectors and skip-bigram statistics computed from random walks on the network. They have been applied to many supervised learning problems such as link prediction and node classification and have demonstrated state-of-the-art performance. Yet, their properties remain poorly understood. This paper studies properties of random walk based node embeddings in the unsupervised setting of discovering hidden block structure in the network, i.e., learning node representations whose cluster structure in Euclidean space reflects their adjacency structure within the network. We characterize the ergodic limits of the embedding objective, its generalization, and related convex relaxations to derive corresponding non-randomized versions of the node embedding objectives. We also characterize the optimal node embedding Grammians of the non-randomized objectives for the expected graph of a two-community Stochastic Block Model (SBM). We prove that the solution Grammian has rank $1$ for a suitable nuclear norm relaxation of the non-randomized objective. Comprehensive experimental results on SBM random networks reveal that our non-randomized ergodic objectives yield node embeddings whose distribution is Gaussian-like, centered at the node embeddings of the expected network within each community, and concentrate in the linear degree-scaling regime as the number of nodes increases.
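The skip-bigram statistics these objectives are built from can be sketched as plain random-walk co-occurrence counting; the walk length, number of walks, and window size below are arbitrary illustrative values:

```python
import random
from collections import Counter

def skip_bigram_counts(adj, walk_len=40, num_walks=5, window=2, seed=0):
    """Count (node, context) co-occurrences within `window` steps along
    uniform random walks on a graph given as an adjacency list {node: [nbrs]}."""
    rng = random.Random(seed)
    counts = Counter()
    for start in adj:
        for _ in range(num_walks):
            walk = [start]
            for _ in range(walk_len - 1):
                walk.append(rng.choice(adj[walk[-1]]))
            for i, u in enumerate(walk):
                lo, hi = max(0, i - window), min(len(walk), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[(u, walk[j])] += 1
    return counts
```

The ergodic limits studied in the paper replace such randomized counts with their deterministic expectations under the walk's stationary distribution; this sketch shows only the randomized statistic being taken to its limit.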


Barycentric distribution alignment and manifold-restricted invertibility for domain generalization

Sep 04, 2021
Boyang Lyu, Thuan Nguyen, Prakash Ishwar, Matthias Scheutz, Shuchin Aeron

(4 figures)

For the Domain Generalization (DG) problem, where the hypotheses are composed of a common representation function followed by a labeling function, we point out a shortcoming of existing approaches: they fail to explicitly optimize a representation-dependent term that appears in a well-known and widely adopted upper bound to the risk on the unseen domain. To this end, we first derive a novel upper bound to the prediction risk. We show that imposing a mild assumption on the representation to be learned, namely manifold-restricted invertibility, is sufficient to deal with this issue. Further, unlike existing approaches, our novel upper bound does not require the assumption of Lipschitzness of the loss function. In addition, the distributional discrepancy in the representation space is handled via the Wasserstein-2 barycenter cost. In this context, we creatively leverage old and recent transport inequalities, which link various optimal transport metrics, in particular the $L^1$ distance (also known as the total variation distance) and the Wasserstein-2 distance, with the Kullback-Leibler divergence. These analyses and insights motivate a new representation learning cost for DG that additively balances three competing objectives: 1) minimizing classification error across seen domains via cross-entropy, 2) enforcing domain invariance in the representation space via the Wasserstein-2 barycenter cost, and 3) promoting a non-degenerate, nearly-invertible representation via one of two mechanisms, viz., an autoencoder-based reconstruction loss or a mutual information loss. Notably, the proposed algorithms completely bypass the use of any adversarial training mechanism that is typical of many current domain generalization approaches. Simulation results on several standard datasets demonstrate superior performance compared to several well-known DG algorithms.
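To make the Wasserstein-2 quantities concrete: in the simplest one-dimensional Gaussian case the distance has a closed form, sqrt((m1 - m2)^2 + (s1 - s2)^2). This toy helper illustrates only the metric itself, not the barycenter computation used in the paper:

```python
import math

def w2_gaussian_1d(m1, s1, m2, s2):
    """Closed-form Wasserstein-2 distance between N(m1, s1^2) and N(m2, s2^2)."""
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)
```

Mean shift and scale mismatch contribute symmetrically, so aligning per-domain feature distributions under this metric penalizes both kinds of discrepancy.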
