"Information": models, code, and papers

Can Information Behaviour Inform Machine Learning?

May 01, 2022
Michael Ridley

The objective of this paper is to explore the opportunities for human information behaviour research to inform and influence the field of machine learning and the resulting machine information behaviour. Using the development of foundation models in machine learning as an example, the paper illustrates how human information behaviour research can bring to machine learning a more nuanced view of information and informing, a better understanding of information need and how that affects the communication among people and systems, guidance on the nature of context and how to operationalize that in models and systems, and insights into bias, misinformation, and marginalization. Despite their clear differences, the fields of information behaviour and machine learning share many common objectives, paradigms, and key research questions. The example of foundation models illustrates that human information behaviour research has much to offer in addressing some of the challenges emerging in the nascent area of machine information behaviour.

* 19 pages

I2V: Towards Texture-Aware Self-Supervised Blind Denoising using Self-Residual Learning for Real-World Images

Feb 21, 2023
Kanggeun Lee, Kyungryun Lee, Won-Ki Jeong

Although self-supervised blind denoising has advanced well beyond conventional approaches without clean supervision in synthetic noise scenarios, it performs poorly on real-world images because of spatially correlated noise. Recently, pixel-shuffle downsampling (PD) has been proposed to eliminate the spatial correlation of noise. A study combining a blind spot network (BSN) and asymmetric PD (AP) successfully demonstrated that self-supervised blind denoising is applicable to real-world noisy images. However, PD-based inference may degrade texture details in the testing phase because high-frequency details (e.g., edges) are destroyed in the downsampled images. To avoid this issue, we propose self-residual learning without the PD process to maintain texture information. We also propose an order-variant PD constraint, a noise prior loss, and an efficient inference scheme (progressive random-replacing refinement ($\text{PR}^3$)) to boost overall performance. The results of extensive experiments show that the proposed method outperforms state-of-the-art self-supervised blind denoising approaches, as well as several supervised learning methods, in terms of PSNR, SSIM, LPIPS, and DISTS on real-world sRGB images.

* 23 pages, 17 figures, 7 tables
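
The pixel-shuffle downsampling (PD) step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the function names `pd_downsample` and `pd_upsample` are hypothetical, and the image is assumed to be a single-channel list of rows whose height and width are divisible by the stride. Each sub-image keeps every `stride`-th pixel, so neighbouring pixels in a sub-image were `stride` pixels apart in the original, which is what weakens the spatial correlation of the noise and also what discards fine detail such as edges.

```python
def pd_downsample(img, stride):
    """Split a 2D image (list of rows) into stride*stride sub-images.

    Sub-image k = i*stride + j holds the pixels img[i::stride][j::stride].
    """
    height, width = len(img), len(img[0])
    if height % stride or width % stride:
        raise ValueError("image size must be divisible by the stride")
    return [
        [row[j::stride] for row in img[i::stride]]
        for i in range(stride)
        for j in range(stride)
    ]


def pd_upsample(subs, stride):
    """Inverse of pd_downsample: interleave the sub-images back together."""
    sub_h, sub_w = len(subs[0]), len(subs[0][0])
    out = [[None] * (sub_w * stride) for _ in range(sub_h * stride)]
    for k, sub in enumerate(subs):
        i, j = divmod(k, stride)  # row and column offset of this sub-image
        for r, row in enumerate(sub):
            for c, value in enumerate(row):
                out[r * stride + i][c * stride + j] = value
    return out


# Usage: a 4x4 image with stride 2 gives four 2x2 sub-images.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
sub_images = pd_downsample(image, 2)
restored = pd_upsample(sub_images, 2)
```

In a PD-based pipeline a denoiser would be applied to each sub-image before re-interleaving; that step is omitted here because the abstract does not specify it.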

Managing multi-facet bias in collaborative filtering recommender systems

Feb 21, 2023
Samira Vaez Barenji, Saeed Farzi

Due to the extensive growth of information available online, recommender systems play an increasingly significant role in serving people's interests. Traditional recommender systems mostly use an accuracy-focused approach to produce recommendations. Recent research suggests that this single-dimension approach can bias the system against items with certain attributes. Biased recommendations across groups of items can harm the interests of item providers and cause user dissatisfaction with the system. This study aims to manage a new type of intersectional bias regarding the geographical origin and popularity of items in the output of state-of-the-art collaborative filtering recommender algorithms. We introduce MFAIR, a multi-facet post-processing bias mitigation algorithm, to alleviate these biases. Extensive experiments on two real-world datasets of movies and books, enriched with the items' continents of production, show that the proposed algorithm strikes a reasonable balance between accuracy and both types of bias. According to the results, our proposed approach outperforms a well-known competitor with no or only a slight loss of efficiency.

* 23 pages, 6 figures, 6 tables

Context-Aware Timewise VAEs for Real-Time Vehicle Trajectory Prediction

Feb 21, 2023
Pei Xu, Jean-Bernard Hayet, Ioannis Karamouzas

Real-time, accurate prediction of human steering behaviors has wide applications, from developing intelligent traffic systems to deploying autonomous driving systems in both real and simulated worlds. In this paper, we present ContextVAE, a context-aware approach for multi-modal vehicle trajectory prediction. Built upon the backbone architecture of a timewise variational autoencoder, ContextVAE employs a dual attention mechanism for observation encoding that accounts for the environmental context information and the dynamic agents' states in a unified way. By utilizing features extracted from semantic maps during agent state encoding, our approach takes into account both the social features exhibited by agents on the scene and the physical environment constraints to generate map-compliant and socially-aware trajectories. We perform extensive testing on the nuScenes prediction challenge, Lyft Level 5 dataset and Waymo Open Motion Dataset to show the effectiveness of our approach and its state-of-the-art performance. In all tested datasets, ContextVAE models are fast to train and provide high-quality multi-modal predictions in real-time.

MonoPGC: Monocular 3D Object Detection with Pixel Geometry Contexts

Feb 21, 2023
Zizhang Wu, Yuanzhu Gan, Lei Wang, Guilian Chen, Jian Pu

Monocular 3D object detection is an economical but challenging task in autonomous driving. Recently, center-based monocular methods have developed rapidly, offering a good trade-off between speed and accuracy; they usually depend on estimating the depth of the object center from 2D features. However, visual semantic features that lack sufficient pixel geometry information may weaken the cues available for spatial 3D detection tasks. To alleviate this, we propose MonoPGC, a novel end-to-end Monocular 3D object detection framework with rich Pixel Geometry Contexts. We introduce pixel depth estimation as an auxiliary task and design a depth cross-attention pyramid module (DCPM) to inject local and global depth geometry knowledge into visual features. In addition, we present the depth-space-aware transformer (DSAT) to integrate 3D space position and depth-aware features efficiently. We also design a novel depth-gradient positional encoding (DGPE) to bring more distinct pixel geometry contexts into the transformer for better object detection. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the KITTI dataset.

* Accepted by ICRA 2023

CTE: A Dataset for Contextualized Table Extraction

Feb 02, 2023
Andrea Gemelli, Emanuele Vivoli, Simone Marinai

Relevant information in documents is often summarized in tables, helping the reader to identify useful facts. Most benchmark datasets support either document layout analysis or table understanding, but do not provide data for applying both tasks in a unified way. We define the task of Contextualized Table Extraction (CTE), which aims to extract and define the structure of tables considering the textual context of the document. The dataset comprises 75k fully annotated pages of scientific papers, including more than 35k tables. Data are gathered from PubMed Central, merging the information provided by annotations in the PubTables-1M and PubLayNet datasets. The dataset can support CTE and adds new classes to the original ones. The generated annotations can be used to develop end-to-end pipelines for various tasks, including document layout analysis, table detection, structure recognition, and functional analysis. We formally define CTE and its evaluation metrics, show which subtasks can be tackled, and describe the advantages, limitations, and future work of this collection of data. Annotations and code will be accessible at https://github.com/AILab-UniFI/cte-dataset.

An information-theoretic perspective on intrinsic motivation in reinforcement learning: a survey

Sep 19, 2022
Arthur Aubret, Laetitia Matignon, Salima Hassas

The reinforcement learning (RL) research area is very active, with a large number of new contributions, especially in the emergent field of deep RL (DRL). However, a number of scientific and technical challenges still need to be resolved, among them the ability to abstract actions and the difficulty of exploring the environment in sparse-reward settings, which can be addressed by intrinsic motivation (IM). We propose to survey these research works through a new taxonomy based on information theory: we computationally revisit the notions of surprise, novelty and skill learning. This allows us to identify the advantages and disadvantages of methods and to present current research outlooks. Our analysis suggests that novelty and surprise can assist the building of a hierarchy of transferable skills that further abstracts the environment and makes the exploration process more robust.

Statistical-Computational Tradeoffs in Mixed Sparse Linear Regression

Mar 03, 2023
Gabriel Arpino, Ramji Venkataramanan

We consider the problem of mixed sparse linear regression with two components, where two real $k$-sparse signals $\beta_1, \beta_2$ are to be recovered from $n$ unlabelled noisy linear measurements. The sparsity is allowed to be sublinear in the dimension, and additive noise is assumed to be independent Gaussian with variance $\sigma^2$. Prior work has shown that the problem suffers from a $\frac{k}{SNR^2}$-to-$\frac{k^2}{SNR^2}$ statistical-to-computational gap, resembling other computationally challenging high-dimensional inference problems such as Sparse PCA and Robust Sparse Mean Estimation; here $SNR$ is the signal-to-noise ratio. We establish the existence of a more extensive computational barrier for this problem through the method of low-degree polynomials, but show that the problem is computationally hard only in a very narrow symmetric parameter regime. We identify a smooth information-computation tradeoff between the sample complexity $n$ and runtime for any randomized algorithm in this hard regime. Via a simple reduction, this provides novel rigorous evidence for the existence of a computational barrier to solving exact support recovery in sparse phase retrieval with sample complexity $n = \tilde{o}(k^2)$. Our second contribution is to analyze a simple thresholding algorithm which, outside of the narrow regime where the problem is hard, solves the associated mixed regression detection problem in $O(np)$ time with square-root the number of samples and matches the sample complexity required for (non-mixed) sparse linear regression; this allows the recovery problem to be subsequently solved by state-of-the-art techniques from the dense case. As a special case of our results, we show that this simple algorithm is order-optimal among a large family of algorithms in solving exact signed support recovery in sparse linear regression.

* 60 pages

Multi-frequency PolSAR Image Fusion Classification Based on Semantic Interactive Information and Topological Structure

Sep 05, 2022
Yice Cao, Yan Wu, Ming Li, Mingjie Zheng, Peng Zhang, Jili Wang

Compared with the rapid development of single-frequency multi-polarization SAR image classification technology, there is less research on the land cover classification of multifrequency polarimetric SAR (MF-PolSAR) images. In addition, current deep learning methods for MF-PolSAR classification are mainly based on convolutional neural networks (CNNs), which consider only local spatial information and ignore nonlocal relationships. Therefore, based on semantic interaction and nonlocal topological structure, this paper proposes the MF semantics and topology fusion network (MF-STFnet) to improve MF-PolSAR classification performance. In MF-STFnet, two kinds of classification are implemented for each band: semantic information-based classification (SIC) and topological property-based classification (TPC). They work collaboratively during MF-STFnet training, which not only fully leverages the complementarity of bands but also combines local and nonlocal spatial information to improve the discrimination between different categories. For SIC, the designed crossband interactive feature extraction module (CIFEM) is embedded to explicitly model the deep semantic correlation among bands, thereby leveraging the complementarity of bands to make ground objects more separable. For TPC, the graph sample and aggregate network (GraphSAGE) is employed to dynamically capture the representation of nonlocal topological relations between land cover categories. In this way, the robustness of classification can be further improved by combining nonlocal spatial information. Finally, an adaptive weighting fusion (AWF) strategy is proposed to merge inference from different bands, so as to make the MF joint classification decisions of SIC and TPC. The comparative experiments show that MF-STFnet achieves more competitive classification performance than some state-of-the-art methods.

Last-Iterate Convergence with Full- and Noisy-Information Feedback in Two-Player Zero-Sum Games

Aug 21, 2022
Kenshi Abe, Kaito Ariu, Mitsuki Sakamoto, Kentaro Toyoshima, Atsushi Iwasaki

The theory of learning in games is prominent in the AI community, motivated by several rising applications such as multi-agent reinforcement learning and Generative Adversarial Networks. We propose Mutation-driven Multiplicative Weights Update (M2WU) for learning an equilibrium in two-player zero-sum normal-form games and prove that it exhibits the last-iterate convergence property in both full- and noisy-information feedback settings. In the full-information feedback setting, the players observe their exact gradient vectors of the utility functions. On the other hand, in the noisy-information feedback setting, they can only observe the noisy gradient vectors. Existing algorithms, including the well-known Multiplicative Weights Update (MWU) and Optimistic MWU (OMWU) algorithms, fail to converge to a Nash equilibrium with noisy-information feedback. In contrast, M2WU exhibits the last-iterate convergence to a stationary point near a Nash equilibrium in both of the feedback settings. We then prove that it converges to an exact Nash equilibrium by adapting the mutation term iteratively. We empirically confirm that M2WU outperforms MWU and OMWU in exploitability and convergence rates.
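
For orientation, the baseline Multiplicative Weights Update (MWU) that the abstract compares against can be sketched as follows. This is the standard textbook MWU with full-information feedback, not the proposed M2WU (the mutation term is not described in the abstract and is not reproduced here). The function names, the learning rate `eta`, and the rock-paper-scissors payoff matrix are illustrative assumptions.

```python
import math


def payoff_gradients(A, x, y):
    """Exact utility gradients in a zero-sum game with payoff matrix A.

    The row player (strategy x) receives x^T A y; the column player
    (strategy y) receives -x^T A y.
    """
    grad_x = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    grad_y = [-sum(A[i][j] * x[i] for i in range(len(x))) for j in range(len(y))]
    return grad_x, grad_y


def mwu_step(strategy, gradient, eta):
    """One MWU update: reweight each action by exp(eta * gradient), renormalize."""
    weights = [p * math.exp(eta * g) for p, g in zip(strategy, gradient)]
    total = sum(weights)
    return [w / total for w in weights]


def exploitability(A, x, y):
    """Sum of both players' best-response gains; zero at a Nash equilibrium."""
    grad_x, grad_y = payoff_gradients(A, x, y)
    return max(grad_x) + max(grad_y)


def run_mwu(A, x, y, eta, steps):
    """Run simultaneous MWU updates for both players and return the last iterate."""
    for _ in range(steps):
        grad_x, grad_y = payoff_gradients(A, x, y)
        x, y = mwu_step(x, grad_x, eta), mwu_step(y, grad_y, eta)
    return x, y


# Illustrative game: rock-paper-scissors, started away from the uniform equilibrium.
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
x_last, y_last = run_mwu(RPS, [0.5, 0.3, 0.2], [0.2, 0.3, 0.5], eta=0.1, steps=100)
gap = exploitability(RPS, x_last, y_last)
```

Tracking `gap` over the iterations is one way to observe last-iterate behaviour; under the noisy-information setting described above, the gradients passed to `mwu_step` would be perturbed observations rather than the exact values.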
