Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jacek Mańdziuk

Advancing Generalization Across a Variety of Abstract Visual Reasoning Tasks

May 19, 2025

Mikołaj Małkiński, Jacek Mańdziuk

Abstract:The abstract visual reasoning (AVR) domain presents a diverse suite of analogy-based tasks devoted to studying model generalization. Recent years have brought dynamic progress in the field, particularly in i.i.d. scenarios, in which models are trained and evaluated on the same data distributions. Nevertheless, o.o.d. setups that assess model generalization to new test distributions remain challenging even for the most recent models. To advance generalization in AVR tasks, we present the Pathways of Normalized Group Convolution model (PoNG), a novel neural architecture that features group convolution, normalization, and a parallel design. We consider a wide set of AVR benchmarks, including Raven's Progressive Matrices and visual analogy problems with both synthetic and real-world images. The experiments demonstrate strong generalization capabilities of the proposed model, which in several settings outperforms the existing literature methods.

* Accepted to the 34th International Joint Conference on Artificial Intelligence (IJCAI 2025)

Via

Access Paper or Ask Questions

Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Feb 02, 2025

Szymon Sieradzki, Jacek Mańdziuk

Figure 1 for Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Figure 2 for Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Figure 3 for Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Figure 4 for Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Abstract:In this paper, we review hyperparameter optimization methods for machine learning models, with a particular focus on the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm. We propose several modifications to ATPE and assess their efficacy on a diverse set of standard benchmark functions. Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization on selected benchmarks, a finding that holds practical relevance for their application in real-world machine learning / optimization tasks.

* 21 pages, 10 figures

Via

Access Paper or Ask Questions

Constrained Hybrid Metaheuristic Algorithm for Probabilistic Neural Networks Learning

Jan 26, 2025

Piotr A. Kowalski, Szymon Kucharczyk, Jacek Mańdziuk

Abstract:This study investigates the potential of hybrid metaheuristic algorithms to enhance the training of Probabilistic Neural Networks (PNNs) by leveraging the complementary strengths of multiple optimisation strategies. Traditional learning methods, such as gradient-based approaches, often struggle to optimise high-dimensional and uncertain environments, while single-method metaheuristics may fail to exploit the solution space fully. To address these challenges, we propose the constrained Hybrid Metaheuristic (cHM) algorithm, a novel approach that combines multiple population-based optimisation techniques into a unified framework. The proposed procedure operates in two phases: an initial probing phase evaluates multiple metaheuristics to identify the best-performing one based on the error rate, followed by a fitting phase where the selected metaheuristic refines the PNN to achieve optimal smoothing parameters. This iterative process ensures efficient exploration and convergence, enhancing the network's generalisation and classification accuracy. cHM integrates several popular metaheuristics, such as BAT, Simulated Annealing, Flower Pollination Algorithm, Bacterial Foraging Optimization, and Particle Swarm Optimisation as internal optimisers. To evaluate cHM performance, experiments were conducted on 16 datasets with varying characteristics, including binary and multiclass classification tasks, balanced and imbalanced class distributions, and diverse feature dimensions. The results demonstrate that cHM effectively combines the strengths of individual metaheuristics, leading to faster convergence and more robust learning. By optimising the smoothing parameters of PNNs, the proposed method enhances classification performance across diverse datasets, proving its application flexibility and efficiency.

* 35 pages, 1 Algorithm flow, 11 tables, 4 figures

Via

Access Paper or Ask Questions

Cultivating Archipelago of Forests: Evolving Robust Decision Trees through Island Coevolution

Dec 18, 2024

Adam Żychowski, Andrew Perrault, Jacek Mańdziuk

Abstract:Decision trees are widely used in machine learning due to their simplicity and interpretability, but they often lack robustness to adversarial attacks and data perturbations. The paper proposes a novel island-based coevolutionary algorithm (ICoEvoRDF) for constructing robust decision tree ensembles. The algorithm operates on multiple islands, each containing populations of decision trees and adversarial perturbations. The populations on each island evolve independently, with periodic migration of top-performing decision trees between islands. This approach fosters diversity and enhances the exploration of the solution space, leading to more robust and accurate decision tree ensembles. ICoEvoRDF utilizes a popular game theory concept of mixed Nash equilibrium for ensemble weighting, which further leads to improvement in results. ICoEvoRDF is evaluated on 20 benchmark datasets, demonstrating its superior performance compared to state-of-the-art methods in optimizing both adversarial accuracy and minimax regret. The flexibility of ICoEvoRDF allows for the integration of decision trees from various existing methods, providing a unified framework for combining diverse solutions. Our approach offers a promising direction for developing robust and interpretable machine learning models

Via

Access Paper or Ask Questions

Reasoning Limitations of Multimodal Large Language Models. A case study of Bongard Problems

Nov 02, 2024

Mikołaj Małkiński, Szymon Pawlonka, Jacek Mańdziuk

Abstract:Abstract visual reasoning (AVR) encompasses a suite of tasks whose solving requires the ability to discover common concepts underlying the set of pictures through an analogy-making process, similarly to human IQ tests. Bongard Problems (BPs), proposed in 1968, constitute a fundamental challenge in this domain mainly due to their requirement to combine visual reasoning and verbal description. This work poses a question whether multimodal large language models (MLLMs) inherently designed to combine vision and language are capable of tackling BPs. To this end, we propose a set of diverse MLLM-suited strategies to tackle BPs and examine four popular proprietary MLLMs: GPT-4o, GPT-4 Turbo, Gemini 1.5 Pro, and Claude 3.5 Sonnet, and four open models: InternVL2-8B, LLaVa-1.6 Mistral-7B, Phi-3.5-Vision, and Pixtral 12B. The above MLLMs are compared on three BP datasets: a set of original BP instances relying on synthetic, geometry-based images and two recent datasets based on real-world images, i.e., Bongard-HOI and Bongard-OpenWorld. The experiments reveal significant limitations of MLLMs in solving BPs. In particular, the models struggle to solve the classical set of synthetic BPs, despite their visual simplicity. Though their performance ameliorates on real-world concepts expressed in Bongard-HOI and Bongard-OpenWorld, the models still have difficulty in utilizing new information to improve their predictions, as well as utilizing a dialog context window effectively. To capture the reasons of performance discrepancy between synthetic and real-world AVR domains, we propose Bongard-RWR, a new BP dataset consisting of real-world images that translates concepts from hand-crafted synthetic BPs to real-world concepts. The MLLMs' results on Bongard-RWR suggest that their poor performance on classical BPs is not due to domain specificity but rather reflects their general AVR limitations.

Via

Access Paper or Ask Questions

Bootstrap Sampling Rate Greater than 1.0 May Improve Random Forest Performance

Oct 05, 2024

Stanisław Kaźmierczak, Jacek Mańdziuk

Abstract:Random forests utilize bootstrap sampling to create an individual training set for each component tree. This involves sampling with replacement, with the number of instances equal to the size of the original training set ($N$). Research literature indicates that drawing fewer than $N$ observations can also yield satisfactory results. The ratio of the number of observations in each bootstrap sample to the total number of training instances is called the bootstrap rate (BR). Sampling more than $N$ observations (BR $>$ 1) has been explored in the literature only to a limited extent and has generally proven ineffective. In this paper, we re-examine this approach using 36 diverse datasets and consider BR values ranging from 1.2 to 5.0. Contrary to previous findings, we show that such parameterization can result in statistically significant improvements in classification accuracy compared to standard settings (BR $\leq$ 1). Furthermore, we investigate what the optimal BR depends on and conclude that it is more a property of the dataset than a dependence on the random forest hyperparameters. Finally, we develop a binary classifier to predict whether the optimal BR is $\leq$ 1 or $>$ 1 for a given dataset, achieving between 81.88\% and 88.81\% accuracy, depending on the experiment configuration.

Via

Access Paper or Ask Questions

Generalization and Knowledge Transfer in Abstract Visual Reasoning Models

Jun 16, 2024

Mikołaj Małkiński, Jacek Mańdziuk

Figure 1 for Generalization and Knowledge Transfer in Abstract Visual Reasoning Models

Figure 2 for Generalization and Knowledge Transfer in Abstract Visual Reasoning Models

Figure 3 for Generalization and Knowledge Transfer in Abstract Visual Reasoning Models

Figure 4 for Generalization and Knowledge Transfer in Abstract Visual Reasoning Models

Abstract:We study generalization and knowledge reuse capabilities of deep neural networks in the domain of abstract visual reasoning (AVR), employing Raven's Progressive Matrices (RPMs), a recognized benchmark task for assessing AVR abilities. Two knowledge transfer scenarios referring to the I-RAVEN dataset are investigated. Firstly, inspired by generalization assessment capabilities of the PGM dataset and popularity of I-RAVEN, we introduce Attributeless-I-RAVEN, a benchmark with four generalization regimes that allow to test generalization of abstract rules applied to held-out attributes. Secondly, we construct I-RAVEN-Mesh, a dataset that enriches RPMs with a novel component structure comprising line-based patterns, facilitating assessment of progressive knowledge acquisition in transfer learning setting. The developed benchmarks reveal shortcomings of the contemporary deep learning models, which we partly address with Pathways of Normalized Group Convolution (PoNG) model, a novel neural architecture for solving AVR tasks. PoNG excels in both presented challenges, as well as the standard I-RAVEN and PGM setups.

Via

Access Paper or Ask Questions

A Unified View of Abstract Visual Reasoning Problems

Jun 16, 2024

Mikołaj Małkiński, Jacek Mańdziuk

Abstract:The field of Abstract Visual Reasoning (AVR) encompasses a wide range of problems, many of which are inspired by human IQ tests. The variety of AVR tasks has resulted in state-of-the-art AVR methods being task-specific approaches. Furthermore, contemporary methods consider each AVR problem instance not as a whole, but in the form of a set of individual panels with particular locations and roles (context vs. answer panels) pre-assigned according to the task-specific arrangements. While these highly specialized approaches have recently led to significant progress in solving particular AVR tasks, considering each task in isolation hinders the development of universal learning systems in this domain. In this paper, we introduce a unified view of AVR tasks, where each problem instance is rendered as a single image, with no a priori assumptions about the number of panels, their location, or role. The main advantage of the proposed unified view is the ability to develop universal learning models applicable to various AVR tasks. What is more, the proposed approach inherently facilitates transfer learning in the AVR domain, as various types of problems share a common representation. The experiments conducted on four AVR datasets with Raven's Progressive Matrices and Visual Analogy Problems, and one real-world visual analogy dataset show that the proposed unified representation of AVR tasks poses a challenge to state-of-the-art Deep Learning (DL) AVR models and, more broadly, contemporary DL image recognition methods. In order to address this challenge, we introduce the Unified Model for Abstract Visual Reasoning (UMAVR) capable of dealing with various types of AVR problems in a unified manner. UMAVR outperforms existing AVR methods in selected single-task learning experiments, and demonstrates effective knowledge reuse in transfer learning and curriculum learning setups.

Via

Access Paper or Ask Questions

ReModels: Quantile Regression Averaging models

May 18, 2024

Grzegorz Zakrzewski, Kacper Skonieczka, Mikołaj Małkiński, Jacek Mańdziuk

Abstract:Electricity price forecasts play a crucial role in making key business decisions within the electricity markets. A focal point in this domain are probabilistic predictions, which delineate future price values in a more comprehensive manner than simple point forecasts. The golden standard in probabilistic approaches to predict energy prices is the Quantile Regression Averaging (QRA) method. In this paper, we present a Python package that encompasses the implementation of QRA, along with modifications of this approach that have appeared in the literature over the past few years. The proposed package also facilitates the acquisition and preparation of data related to electricity markets, as well as the evaluation of model predictions.

Via

Access Paper or Ask Questions

Interpretable Multi-task Learning with Shared Variable Embeddings

May 10, 2024

Maciej Żelaszczyk, Jacek Mańdziuk

Figure 1 for Interpretable Multi-task Learning with Shared Variable Embeddings

Figure 2 for Interpretable Multi-task Learning with Shared Variable Embeddings

Figure 3 for Interpretable Multi-task Learning with Shared Variable Embeddings

Figure 4 for Interpretable Multi-task Learning with Shared Variable Embeddings

Abstract:This paper proposes a general interpretable predictive system with shared information. The system is able to perform predictions in a multi-task setting where distinct tasks are not bound to have the same input/output structure. Embeddings of input and output variables in a common space are obtained, where the input embeddings are produced through attending to a set of shared embeddings, reused across tasks. All the embeddings are treated as model parameters and learned. Specific restrictions on the space of shared embedings and the sparsity of the attention mechanism are considered. Experiments show that the introduction of shared embeddings does not deteriorate the results obtained from a vanilla variable embeddings method. We run a number of further ablations. Inducing sparsity in the attention mechanism leads to both an increase in accuracy and a significant decrease in the number of training steps required. Shared embeddings provide a measure of interpretability in terms of both a qualitative assessment and the ability to map specific shared embeddings to pre-defined concepts that are not tailored to the considered model. There seems to be a trade-off between accuracy and interpretability. The basic shared embeddings method favors interpretability, whereas the sparse attention method promotes accuracy. The results lead to the conclusion that variable embedding methods may be extended with shared information to provide increased interpretability and accuracy.

Via

Access Paper or Ask Questions