



Abstract: Reliable data-driven estimation of Shannon entropy from small data sets, where the number of examples is potentially smaller than the number of possible outcomes, is a critical matter in several applications. In this paper, we introduce a discrete entropy estimator that uses the decomposability property of entropy in combination with estimates of the missing mass and of the number of unseen outcomes to compensate for the negative bias these unseen outcomes induce. Experimental results show that the proposed method outperforms some classical estimators in undersampled regimes, and performs comparably with some well-established state-of-the-art estimators.
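To fix ideas, a minimal sketch of the two classical ingredients the abstract alludes to, in bits: the Good-Turing estimate of the missing mass (the probability of unseen outcomes) and the Miller-Madow bias-corrected plug-in entropy. These are standard baselines, not the estimator proposed in the paper.

```python
from collections import Counter
from math import log2

def good_turing_missing_mass(samples):
    """Good-Turing estimate of the total probability of unseen outcomes:
    the fraction of observations that are singletons (seen exactly once)."""
    counts = Counter(samples)
    f1 = sum(1 for c in counts.values() if c == 1)
    return f1 / len(samples)

def miller_madow_entropy(samples):
    """Classical Miller-Madow bias-corrected plug-in entropy, in bits:
    plug-in entropy plus (observed support size - 1) / (2n)."""
    n = len(samples)
    counts = Counter(samples)
    h_plugin = -sum((c / n) * log2(c / n) for c in counts.values())
    return h_plugin + (len(counts) - 1) / (2 * n)
```

For a fair coin sampled 100 times, the plug-in estimate is 1 bit and the Miller-Madow correction adds 1/200 bit; for 10 samples that are all distinct, the Good-Turing missing mass is 1, signaling a severely undersampled regime.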




Abstract: An approach is proposed to quantify, in bits of information, the actual relevance of analogies in analogy tests. The main component of this approach is a soft-accuracy estimator that also yields entropy estimates with compensated biases. Experimental results obtained with pre-trained GloVe 300-D vectors and two public analogy test sets show that, from an information content perspective, proximity hints are much more relevant than analogies in analogy tests. Accordingly, a simple word embedding model is used to predict that analogies carry about one bit of information, which is experimentally corroborated.




Abstract: Intrinsic dimension and differential entropy estimators are studied in this paper, including their systematic bias. A pragmatic approach for joint estimation and bias correction of these two fundamental measures is proposed. Steps shared by both estimators are highlighted, along with their useful consequences for data analysis. It is shown that both estimators can be complementary parts of a single approach, and that the simultaneous estimation of differential entropy and intrinsic dimension gives meaning to each estimate, since estimates at different observation scales convey different perspectives of the underlying manifolds. Experiments with synthetic and real datasets are presented to illustrate how to extract meaning from visual inspections, and how to compensate for biases.
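An illustration of the shared machinery: k-nearest-neighbor distances underlie both classes of estimators. The sketch below implements the classical Levina-Bickel maximum-likelihood intrinsic dimension estimator (averaging inverse per-point estimates, as in the MacKay-Ghahramani variant); it is a standard baseline, not the paper's joint method.

```python
import numpy as np

def intrinsic_dim_mle(X, k=10):
    """Levina-Bickel MLE of intrinsic dimension from k-NN distances.
    X: (n, D) array of points. Brute-force distances; fine for small n.
    Per point x: m(x) = [ (1/(k-1)) * sum_{j<k} log(T_k(x)/T_j(x)) ]^-1,
    where T_j(x) is the distance from x to its j-th nearest neighbor.
    Returns the inverse of the averaged inverse estimates."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)                # row-wise: column 0 is the self-distance 0
    Tk = D[:, k][:, None]         # k-th nearest-neighbor distance
    Tj = D[:, 1:k]                # 1st through (k-1)-th neighbor distances
    inv_m = np.log(Tk / Tj).sum(axis=1) / (k - 1)
    return float(1.0 / inv_m.mean())
```

For points sampled along a straight line embedded in 3-D, the estimate comes out close to 1 regardless of the ambient dimension, which is the behavior that makes a scale-dependent reading of such estimators informative.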




Abstract: A method for offline signature verification is presented in this paper. It is based on the segmentation of the signature skeleton (obtained through standard image skeletonization) into unambiguous sequences of points, i.e., unambiguously connected skeleton segments corresponding to vectorial representations of signature portions. These segments are assumed to be the fundamental carriers of useful information for authenticity verification, and are compactly encoded as sets of 9 scalars (4 sampled 2-D coordinates and 1 length measure). Signature authenticity is then inferred through comparisons between pairs of such compact representations, based on Euclidean distance. The average performance of this method is evaluated through experiments with offline versions of signatures from the MCYT-100 database. For comparison purposes, three other approaches are applied to the same set of signatures, namely: (1) a straightforward approach based on Dynamic Time Warping (DTW) distances between segments, (2) a published method by [shanker2007], also based on DTW, and (3) the average human performance under an equivalent experimental protocol. Results suggest that if human performance is taken as a goal for automatic verification, then signature shape details should be discarded to approach this goal. Moreover, our best result, which is close to human performance, was obtained by the simplest strategy, in which equal weights were given to segment shape and length.
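A minimal sketch of the encoding-and-comparison idea, assuming the 9 scalars are 4 points resampled uniformly along the segment's arc length (8 coordinates) plus the total length; the paper's exact encoding and weighting may differ.

```python
import numpy as np

def encode_segment(points, n_samples=4):
    """Encode a skeleton segment (a polyline of (x, y) points) as a compact
    vector: n_samples points resampled uniformly along arc length, followed
    by the total length. With n_samples=4 this yields 9 scalars."""
    pts = np.asarray(points, dtype=float)
    steps = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(steps)])   # cumulative arc length
    length = s[-1]
    t = np.linspace(0.0, length, n_samples)         # uniform arc-length grid
    x = np.interp(t, s, pts[:, 0])
    y = np.interp(t, s, pts[:, 1])
    return np.concatenate([np.column_stack([x, y]).ravel(), [length]])

def segment_distance(u, v):
    """Euclidean distance between two encoded segments; with unscaled
    entries this gives equal weight to shape and length."""
    return float(np.linalg.norm(u - v))
```

A horizontal stroke from (0, 0) to (3, 0) encodes to [0, 0, 1, 0, 2, 0, 3, 0, 3]: four evenly spaced points and the length 3.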