We consider a system consisting of a server, which receives updates for $N$ files according to independent Poisson processes. The goal of the server is to deliver the latest version of the files to the user through a parallel network of $K$ caches. We consider an update received by the user successful if the user receives the same file version that currently prevails at the server. We derive an analytical expression for information freshness at the user. We observe that freshness for a file increases with greater consolidation of rates across caches. To solve the multi-cache problem, we first solve the auxiliary problem of a single-cache system. We then adapt this auxiliary solution to our parallel-cache network by consolidating rates to single routes as much as possible. This yields an approximate (sub-optimal) solution for the original problem. We provide an upper bound on the gap between the sub-optimal solution and the optimal solution. Numerical results show that the sub-optimal policy closely approximates the optimal policy.
Existing research on action recognition treats activities as monolithic events occurring in videos. Recently, formulating actions as a combination of atomic actions has shown promise in improving action understanding, aided by the emergence of datasets containing such annotations, which allow us to learn representations capturing this information. However, there remains a lack of studies that extend action composition and leverage multiple viewpoints and multiple modalities of data for representation learning. To promote research in this direction, we introduce Home Action Genome (HOMAGE): a multi-view action dataset with multiple modalities and viewpoints supplemented with hierarchical activity and atomic action labels together with dense scene composition labels. Leveraging rich multi-modal and multi-view settings, we propose Cooperative Compositional Action Understanding (CCAU), a cooperative learning framework for hierarchical action recognition that is aware of compositional action elements. CCAU shows consistent performance improvements across all modalities. Furthermore, we demonstrate the utility of co-learning compositions in few-shot action recognition by achieving 28.6% mAP with just a single sample.
Colonoscopy is a procedure to detect colorectal polyps, which are the primary cause of colorectal cancer. However, polyp segmentation is a challenging task due to the diverse shape, size, color, and texture of polyps, the subtle difference between a polyp and its background, and the low contrast of colonoscopic images. To address these challenges, we propose a feature enhancement network for accurate polyp segmentation in colonoscopy images. Specifically, the proposed network enhances the semantic information using the novel Semantic Feature Enhance Module (SFEM). Furthermore, instead of directly adding encoder features to the respective decoder layer, we introduce an Adaptive Global Context Module (AGCM), which focuses only on the encoder's significant and hard fine-grained features. The integration of these two modules improves the quality of features layer by layer, which in turn enhances the final feature representation. The proposed approach is evaluated on five colonoscopy datasets and demonstrates superior performance compared to other state-of-the-art models.
Attention control is a key cognitive ability for humans to select information relevant to the current task. This paper develops a computational model of attention and an algorithm for attention-based probabilistic planning in Markov decision processes. In attention-based planning, the robot decides which attention mode to be in. An attention mode corresponds to a subset of state variables monitored by the robot. By switching between different attention modes, the robot actively perceives task-relevant information to reduce the cost of information acquisition and processing, while achieving near-optimal task performance. Although planning with attention-based active perception inevitably introduces partial observations, a partially observable MDP formulation makes the problem computationally expensive to solve. Instead, our proposed method employs a hierarchical planning framework in which the robot determines what to pay attention to and for how long the attention should be sustained before shifting to other information sources. During the attention-sustaining phase, the robot carries out a sub-policy, computed from an abstraction of the original MDP given the current attention. We use an example where a robot is tasked to capture a set of intruders in a stochastic gridworld. The experimental results show that the proposed method enables information- and computation-efficient optimal planning in stochastic environments.
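As a minimal illustration of the statement that an attention mode corresponds to a subset of monitored state variables, the sketch below projects a full state onto the variables in the current mode. The variable names (`robot`, `intruder_1`, `intruder_2`) and the function `observe` are hypothetical and are not taken from the paper; the hierarchical planner itself is not shown.

```python
def observe(state, attention_mode):
    """Return only the state variables monitored in the given attention mode.

    state: dict mapping variable names to values (hypothetical layout).
    attention_mode: set of variable names the robot currently monitors.
    """
    return {name: value for name, value in state.items() if name in attention_mode}


# Hypothetical gridworld state: positions of the robot and two intruders.
full_state = {"robot": (0, 0), "intruder_1": (2, 3), "intruder_2": (4, 1)}

# In this mode the robot tracks itself and the first intruder only.
partial = observe(full_state, {"robot", "intruder_1"})
```

Variables outside the mode are simply absent from the observation, which is where the partial observability mentioned in the abstract comes from.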
Information retrieval (IR) systems traditionally aim to maximize metrics built on rankings, such as precision or NDCG. However, the non-differentiability of the ranking operation prevents direct optimization of such metrics in state-of-the-art neural IR models, which rely entirely on the ability to compute meaningful gradients. To address this shortcoming, we propose SmoothI, a smooth approximation of rank indicators that serves as a basic building block to devise differentiable approximations of IR metrics. We further provide theoretical guarantees on SmoothI and derived approximations, showing in particular that the approximation errors decrease exponentially with an inverse temperature-like hyperparameter that controls the quality of the approximations. Extensive experiments conducted on four standard learning-to-rank datasets validate the efficacy of the listwise losses based on SmoothI, in comparison to previously proposed ones. Additional experiments with a vanilla BERT ranking model on a text-based IR task also confirm the benefits of our listwise approach.
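To make the idea of a smooth rank indicator concrete, the sketch below uses a generic temperature-scaled softmax as a differentiable stand-in for the top-1 indicator. This is an illustrative assumption, not necessarily SmoothI's exact definition; the names `smooth_top1_indicator`, `smooth_precision_at_1`, and `alpha` (the inverse temperature) are hypothetical.

```python
import math


def smooth_top1_indicator(scores, alpha):
    """Softmax with inverse temperature alpha.

    As alpha grows, the output approaches a one-hot vector on the
    highest-scoring item; at alpha = 0 it is uniform.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(alpha * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_precision_at_1(scores, relevance, alpha):
    """Smooth surrogate of precision@1: expected relevance under the soft indicator."""
    indicator = smooth_top1_indicator(scores, alpha)
    return sum(i * r for i, r in zip(indicator, relevance))
```

With a large `alpha` the surrogate is close to the hard metric, while a small `alpha` gives smoother gradients at the price of a looser approximation.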
This paper presents the Multilingual COVID-19 Analysis Method (CMTA) for detecting and observing the spread of misinformation about this disease within texts. CMTA proposes a data science (DS) pipeline that applies machine learning models for processing, classifying (Dense-CNN) and analyzing (MBERT) multilingual (micro)-texts. The data preparation tasks of the DS pipeline extract features from multilingual textual data and categorize it into specific information classes (i.e., 'false', 'partly false', 'misleading'). The CMTA pipeline has been evaluated on multilingual micro-texts (tweets), showing misinformation spread across different languages. To assess the performance of CMTA and put it in perspective, we performed a comparative analysis of CMTA with eight monolingual models used for detecting misinformation. The comparison shows that CMTA surpasses various monolingual models and suggests that it can be used as a general method for detecting misinformation in multilingual micro-texts. CMTA experimental results show misinformation trends about COVID-19 in different languages during the first months of the pandemic.
Feedforward networks (FFN) are ubiquitous structures in neural systems and have been studied to understand mechanisms of reliable signal and information transmission. In many FFNs, neurons in one layer have intrinsic properties that are distinct from those in their pre-/postsynaptic layers, but how this affects network-level information processing remains unexplored. Here we show that layer-to-layer heterogeneity arising from lamina-specific cellular properties facilitates signal and information transmission in FFNs. Specifically, we found that signal transformations, made by each layer of neurons on an input-driven spike signal, demodulate signal distortions introduced by preceding layers. This mechanism boosts information transfer carried by a propagating spike signal and thereby supports reliable spike signal and information transmission in a deep FFN. Our study suggests that distinct cell types in neural circuits, performing different computational functions, facilitate information processing on the whole.
Classification and identification of amino acids in aqueous solutions is important in the study of biomacromolecules. Laser Induced Breakdown Spectroscopy (LIBS) uses high-energy laser pulses for ablation of chemical compounds whose radiated spectra are captured and recorded to reveal molecular structure. Spectral peaks and noise from LIBS are impacted by experimental protocols. Current methods for LIBS spectral analysis achieve promising results using PCA, a linear method. It is well known that the underlying physical processes behind LIBS are highly nonlinear. Our work set out to understand the impact of LIBS spectra on the suitable neighborhood size over which to consider pattern phenomena, whether nonlinear methods capture pattern phenomena with increased efficacy, and how they improve classification and identification of compounds. We analyzed four amino acids, a polysaccharide, and a control group, water. We developed an information-theoretic method for measurement of LIBS energy spectra, implemented manifold methods for nonlinear dimensionality reduction, and found that, while clustering results were not statistically significantly different, nonlinear methods lead to increased classification accuracy. Moreover, our approach uncovered the contribution of micro-wells (experimental protocol) in LIBS spectra. To the best of our knowledge, ours is the first application of manifold methods to LIBS amino-acid analysis in the research literature.
Long-range and short-range temporal modeling are two complementary and crucial aspects of video recognition. Most state-of-the-art methods focus on short-range spatio-temporal modeling and then average multiple snippet-level predictions to yield the final video-level prediction. Thus, their video-level prediction does not consider spatio-temporal features of how the video evolves along the temporal dimension. In this paper, we introduce a novel Dynamic Segment Aggregation (DSA) module to capture relationships among snippets. To be more specific, we attempt to generate a dynamic kernel for a convolutional operation to aggregate long-range temporal information among adjacent snippets adaptively. The DSA module is an efficient plug-and-play module and can be combined with off-the-shelf clip-based models (e.g., TSM, I3D) to perform powerful long-range modeling with minimal overhead. We coin the final video architecture DSANet. We conduct extensive experiments on several video recognition benchmarks (i.e., Mini-Kinetics-200, Kinetics-400, Something-Something V1 and ActivityNet) to show its superiority. Our proposed DSA module is shown to benefit various video recognition models significantly. For example, equipped with DSA modules, the top-1 accuracy of I3D ResNet-50 is improved from 74.9% to 78.2% on Kinetics-400. Code will be made available.
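The idea of an input-dependent kernel that aggregates features across adjacent snippets can be sketched as follows. This is a simplified illustration under stated assumptions, not the DSA module itself: the kernel is produced by average-pooling the snippet features, applying a hypothetical learned projection `proj`, and normalizing with a softmax; the same kernel is shared across channels, its size is assumed odd, and borders are zero-padded.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_kernel(snippets, proj):
    """Generate a normalized kernel from the input itself.

    snippets: U lists of C floats (one feature vector per snippet).
    proj: k lists of C floats, hypothetical learned weights (k assumed odd).
    """
    num_snippets, channels = len(snippets), len(snippets[0])
    pooled = [sum(s[c] for s in snippets) / num_snippets for c in range(channels)]
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in proj]
    return softmax(logits)


def aggregate(snippets, kernel):
    """1D convolution along the snippet axis with zero padding at the borders."""
    num_snippets, channels = len(snippets), len(snippets[0])
    half = len(kernel) // 2
    out = []
    for u in range(num_snippets):
        acc = [0.0] * channels
        for j, w in enumerate(kernel):
            v = u + j - half  # index of the neighbouring snippet
            if 0 <= v < num_snippets:
                for c in range(channels):
                    acc[c] += w * snippets[v][c]
        out.append(acc)
    return out
```

Because the kernel depends on the pooled input, two videos with different content are aggregated with different weights, in contrast to a fixed average over snippet-level predictions.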
With the development of medical computer-aided diagnostic systems, pulmonary artery-vein (A/V) reconstruction plays a crucial role in assisting doctors in preoperative planning for lung cancer surgery. However, distinguishing arterial from venous irrigation in chest CT images remains a challenge due to the similarity and complex structure of the arteries and veins. We propose a novel method for automatic separation of pulmonary arteries and veins from chest CT images. The method consists of three parts. First, global connection information and local feature information are used to construct a complete topological tree and ensure the continuity of vessel reconstruction. Second, the proposed multitask classification network automatically learns the differences between arteries and veins at different scales to reduce classification errors caused by changes in terminal vessel characteristics. Finally, the topology optimizer considers interbranch and intrabranch topological relationships to maintain spatial consistency and avoid the misclassification of A/V irrigations. We validate the performance of the method on chest CT images. Compared with manual classification, the proposed method achieves an average accuracy of 96.2% on noncontrast chest CT. In addition, the method generalizes well, obtaining accuracies of 93.8% and 94.8% on CT scans from other devices and other modes, respectively. The pulmonary artery-vein reconstruction obtained by the proposed method can provide better assistance for preoperative planning of lung cancer surgery.