"Information": models, code, and papers

A Review on Methods and Applications in Multimodal Deep Learning

Feb 18, 2022
Jabeen Summaira, Xi Li, Amin Muhammad Shoib, Jabbar Abdul

Deep learning has been applied to a wide range of applications and has become increasingly popular in recent years. The goal of multimodal deep learning (MMDL) is to create models that can process and link information using various modalities. Despite the extensive development of unimodal learning, it still cannot cover all aspects of human learning. Multimodal learning supports better understanding and analysis because various senses are engaged in the processing of information. This paper focuses on multiple types of modalities, i.e., image, video, text, audio, body gestures, facial expressions, and physiological signals. A detailed analysis of the baseline approaches and an in-depth study of recent advancements in multimodal deep learning applications during the last five years (2017 to 2021) are provided. A fine-grained taxonomy of various multimodal deep learning methods is proposed, elaborating on different applications in more depth. Lastly, the main issues are highlighted separately for each domain, along with possible future research directions.

* ACM Transactions on Multimedia Computing, Communications, and Applications 2022 
* 29 pages. arXiv admin note: substantial text overlap with arXiv:2105.11087 

Temporal Walk Centrality: Ranking Nodes in Evolving Networks

Feb 08, 2022
Lutz Oettershagen, Petra Mutzel, Nils M. Kriege

We propose the Temporal Walk Centrality, which quantifies the importance of a node by measuring its ability to obtain and distribute information in a temporal network. In contrast to the widely-used betweenness centrality, we assume that information does not necessarily spread on shortest paths but on temporal random walks that satisfy the time constraints of the network. We show that temporal walk centrality can identify nodes playing central roles in dissemination processes that might not be detected by related betweenness concepts and other common static and temporal centrality measures. We propose exact and approximation algorithms with different running times depending on the properties of the temporal network and parameters of our new centrality measure. A technical contribution is a general approach to lift existing algebraic methods for counting walks in static networks to temporal networks. Our experiments on real-world temporal networks show the efficiency and accuracy of our algorithms. Finally, we demonstrate that the rankings by temporal walk centrality often differ significantly from those of other state-of-the-art temporal centralities.

* Accepted at the ACM Web Conference (WWW) 2022 
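
The abstract above builds on walks that respect the time constraints of a temporal network. As a rough illustration only, the sketch below counts, for every node, the time-respecting walks that end there. It is not the authors' centrality or algorithm: the `(source, target, timestamp)` edge format, the strictly-increasing-timestamp rule, and the function name `count_incoming_temporal_walks` are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import groupby


def count_incoming_temporal_walks(edges):
    """Count time-respecting walks ending at each node.

    `edges` is an iterable of (source, target, timestamp) tuples.
    Assumption for this sketch: consecutive edges in a walk must have
    strictly increasing timestamps.
    """
    walks_into = defaultdict(int)
    by_time = sorted(edges, key=lambda e: e[2])
    for _, group in groupby(by_time, key=lambda e: e[2]):
        # Edges sharing a timestamp cannot extend each other, so their
        # contributions are collected first and applied afterwards.
        pending = defaultdict(int)
        for u, v, _t in group:
            # The edge alone is one walk; it also extends every walk
            # that already ended at u at an earlier time.
            pending[v] += 1 + walks_into[u]
        for v, count in pending.items():
            walks_into[v] += count
    return dict(walks_into)


example_edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 2), ("c", "d", 3)]
walk_counts = count_incoming_temporal_walks(example_edges)
```

In this toy network, node `d` is reached by four time-respecting walks (`c-d`, `b-c-d`, `a-c-d`, and `a-b-c-d`), which is the kind of quantity a walk-based measure can aggregate.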

Spectral and Energy Efficiency of DCO-OFDM in Visible Light Communication Systems with Finite-Alphabet Inputs

Feb 02, 2022
Ruixin Yang, Shuai Ma, Zihan Xu, Hang Li, Xiaodong Liu, Xintong Ling, Xiong Deng, Xun Zhang, Shiyin Li

The bound on the information transmission rate of direct current biased optical orthogonal frequency division multiplexing (DCO-OFDM) for visible light communication (VLC) with finite-alphabet inputs is still unknown, which leaves the corresponding spectral efficiency (SE) and energy efficiency (EE) as open research problems. In this paper, we derive the exact achievable rate of the DCO-OFDM system with finite-alphabet inputs for the first time. Furthermore, we investigate SE maximization problems of the DCO-OFDM system subject to both electrical and optical power constraints. By exploiting the relationship between the mutual information and the minimum mean-squared error, we propose a multi-level mercury-water-filling power allocation scheme to achieve the maximum SE. Moreover, the EE maximization problems of the DCO-OFDM system are studied, and a Dinkelbach-type power allocation scheme is developed for the maximum EE. Numerical results verify the effectiveness of the proposed theories and power allocation schemes.

* IEEE Transactions on Wireless Communications 
* 14 pages, 14 figures, accepted by IEEE Transactions on Wireless Communications 
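
The abstract above mentions a Dinkelbach-type scheme for maximizing energy efficiency. The sketch below shows only the generic Dinkelbach iteration for a ratio objective, on a deliberately simplified single-channel problem. It assumes a Gaussian-input rate `log2(1 + gain * p)` rather than the finite-alphabet rate derived in the paper, and the names `gain`, `p_circuit`, and `p_max` are hypothetical parameters chosen for this example.

```python
import math


def dinkelbach_ee(gain, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Maximize log2(1 + gain*p) / (p + p_circuit) over 0 <= p <= p_max.

    Simplifying assumption: Gaussian-input rate on one channel, which is
    not the finite-alphabet DCO-OFDM model of the paper.
    """
    lam = 0.0  # current estimate of the optimal ratio
    p = p_max
    for _ in range(max_iter):
        # Inner problem: maximize rate(p) - lam * (p + p_circuit).
        # Setting the derivative to zero gives a water-filling-like form.
        if lam <= 0.0:
            p = p_max
        else:
            p = min(max(1.0 / (lam * math.log(2)) - 1.0 / gain, 0.0), p_max)
        rate = math.log2(1.0 + gain * p)
        total_power = p + p_circuit
        # Dinkelbach stops when the parametric objective reaches zero.
        if abs(rate - lam * total_power) < tol:
            break
        lam = rate / total_power
    return p, lam


p_star, ee_star = dinkelbach_ee(gain=10.0, p_circuit=1.0, p_max=5.0)
```

Each iteration replaces the ratio with a subtractive problem that has a closed-form solution here; in a multi-carrier setting the inner step would be a full power allocation problem instead.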

DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population

Jan 24, 2022
Ningyu Zhang, Xin Xu, Liankuan Tao, Haiyang Yu, Hongbin Ye, Xin Xie, Xiang Chen, Zhoubo Li, Lei Li, Xiaozhuan Liang, Yunzhi Yao, Shumin Deng, Wen Zhang, Zhenru Zhang, Chuanqi Tan, Fei Huang, Guozhou Zheng, Huajun Chen

We present a new open-source and extensible knowledge extraction toolkit, called DeepKE (Deep learning based Knowledge Extraction), supporting standard fully supervised, low-resource few-shot and document-level scenarios. DeepKE implements various information extraction tasks, including named entity recognition, relation extraction and attribute extraction. With a unified framework, DeepKE allows developers and researchers to customize datasets and models to extract information from unstructured texts according to their requirements. Specifically, DeepKE not only provides various functional modules and model implementation for different tasks and scenarios but also organizes all components by consistent frameworks to maintain sufficient modularity and extensibility. Besides, we present an online platform in for real-time extraction of various tasks. DeepKE has been equipped with Google Colab tutorials and comprehensive documents for beginners. We release the source code at, with a demo video.

* work in progress 

Graph Neural Networks for Double-Strand DNA Breaks Prediction

Jan 04, 2022
XU Wang, Huan Zhao, Weiwei TU, Hao Li, Yu Sun, Xiaochen Bo

Double-strand DNA breaks (DSBs) are a form of DNA damage that can cause abnormal chromosomal rearrangements. Recent technologies based on high-throughput experiments incur high costs and pose technical challenges. Therefore, we design a graph neural network based method to predict DSBs (GraphDSB), using DNA sequence features and chromosome structure information. To improve the expressive power of the model, we introduce the Jumping Knowledge architecture and several effective structural encoding methods. The contribution of structural information to the prediction of DSBs is verified by experiments on datasets from normal human epidermal keratinocytes (NHEK) and a chronic myeloid leukemia cell line (K562), and the ablation studies further demonstrate the effectiveness of the designed components in the proposed GraphDSB framework. Finally, we use GNNExplainer to analyze the contribution of node features and topology to DSB prediction, and show the high contribution of 5-mer DNA sequence features and two chromatin interaction modes.
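
The abstract above highlights 5-mer DNA sequence features. As a minimal sketch of what such a feature vector can look like, the code below computes normalized k-mer frequencies over the `ACGT` alphabet. The abstract does not specify the paper's actual encoding, so the normalization, the skipping of windows with other symbols, and the function name `kmer_frequencies` are assumptions for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=5):
    """Return a length-4**k vector of k-mer frequencies for a DNA string.

    Windows containing symbols other than A, C, G, T (for example N)
    are skipped; this handling is an assumption of the sketch.
    """
    seq = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    # Fixed lexicographic ordering so every sequence maps to the same layout.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]


features = kmer_frequencies("ACGTACGTAC", k=5)
```

With `k=5` each sequence maps to a 1024-dimensional vector, which could then serve as a node feature in a graph model.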

LoSAC: An Efficient Local Stochastic Average Control Method for Federated Optimization

Dec 20, 2021
Huiming Chen, Huandong Wang, Quanming Yao, Yong Li, Depeng Jin, Qiang Yang

Federated optimization (FedOpt), which aims to collaboratively train a learning model across a large number of distributed clients, is vital for federated learning. The primary concerns in FedOpt are model divergence and communication efficiency, both of which significantly affect performance. In this paper, we propose a new method, LoSAC, to learn from heterogeneous distributed data more efficiently. Its key algorithmic insight is to locally update the estimate of the global full gradient after each regular local model update. Thus, LoSAC can keep clients' information refreshed in a more compact way. In particular, we study the convergence of LoSAC. A further benefit of LoSAC is its ability to defend against information leakage from the recent technique Deep Leakage from Gradients (DLG). Finally, experiments verify the superiority of LoSAC compared with state-of-the-art FedOpt algorithms. Specifically, LoSAC improves communication efficiency by more than 100% on average, mitigates the model divergence problem, and provides a defense against DLG.

CCasGNN: Collaborative Cascade Prediction Based on Graph Neural Networks

Dec 07, 2021
Yansong Wang, Xiaomeng Wang, Tao Jia

Cascade prediction aims at modeling information diffusion in the network. Most previous methods concentrate on mining either structural or sequential features from the network and the propagation path. Recent efforts have been devoted to combining network structure and sequence features by graph neural networks and recurrent neural networks. Nevertheless, the limitations of spectral or spatial methods restrict the improvement of prediction performance. Moreover, recurrent neural networks are time-consuming and computationally expensive, which makes prediction inefficient. Here, we propose a novel method, CCasGNN, that considers the individual profile, structural features, and sequence information. The method benefits from using a collaborative framework of GAT and GCN and stacking positional encoding into the layers of graph neural networks, which is different from all existing ones and demonstrates good performance. Experiments conducted on two real-world datasets confirm that our method significantly improves the prediction accuracy compared to state-of-the-art approaches. Moreover, the ablation study investigates the contribution of each component in our method.

Hierarchical Graph Networks for 3D Human Pose Estimation

Nov 23, 2021
Han Li, Bowen Shi, Wenrui Dai, Yabo Chen, Botao Wang, Yu Sun, Min Guo, Chenlin Li, Junni Zou, Hongkai Xiong

Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton. However, we argue that this skeletal topology is too sparse to reflect the body structure and suffers from a serious 2D-to-3D ambiguity problem. To overcome these weaknesses, we propose a novel graph convolution network architecture, Hierarchical Graph Networks (HGN). It is based on a denser graph topology generated by our multi-scale graph structure building strategy, thus providing more delicate geometric information. The proposed architecture contains three sparse-to-fine representation subnetworks organized in parallel, in which multi-scale graph-structured features are processed and exchange information through a novel feature fusion strategy, leading to rich hierarchical representations. We also introduce a 3D coarse mesh constraint to further boost detail-related feature learning. Extensive experiments demonstrate that our HGN achieves state-of-the-art performance with fewer network parameters.

* accepted by BMVC 2021 

Utilizing Textual Reviews in Latent Factor Models for Recommender Systems

Nov 16, 2021
Tatev Karen Aslanyan, Flavius Frasincar

Most existing recommender systems are based only on rating data and ignore other sources of information that might increase the quality of recommendations, such as textual reviews or user and item characteristics. Moreover, the majority of those systems are applicable only to small datasets (with thousands of observations) and are unable to handle large datasets (with millions of observations). We propose a recommender algorithm that combines a rating modelling technique (i.e., Latent Factor Model) with a topic modelling method based on textual reviews (i.e., Latent Dirichlet Allocation), and we extend the algorithm such that it allows adding extra user- and item-specific information to the system. We evaluate the performance of the algorithm using datasets of different sizes, corresponding to 23 product categories. After comparing the built model to four other models, we found that combining textual reviews with ratings leads to better recommendations. Moreover, we found that adding extra user and item features to the model increases its prediction accuracy, especially for medium and large datasets.

* The 36th ACM/SIGAPP Symposium on Applied Computing (SAC '21), March 22-26, 2021, Virtual Event, Republic of Korea 
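
The rating component named in the abstract above, a latent factor model, can be sketched as a biased matrix factorization trained with stochastic gradient descent. This is a generic illustration, not the authors' algorithm: it omits the Latent Dirichlet Allocation review component and the extra user and item features, and the hyperparameters, toy ratings, and function names are assumptions for this example.

```python
import random


def train_latent_factor_model(ratings, n_factors=2, lr=0.01, reg=0.02,
                              epochs=200, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i with SGD on (user, item, rating)."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)
    b_u = {u: 0.0 for u in users}
    b_i = {i: 0.0 for i in items}
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        return mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(p[u], q[i]))

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            # Gradient steps with L2 regularization on biases and factors.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict, mu


def rmse(predict, ratings):
    """Root-mean-squared error of a predictor on (user, item, rating) triples."""
    return (sum((r - predict(u, i)) ** 2 for u, i, r in ratings)
            / len(ratings)) ** 0.5


toy_ratings = [("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4),
               ("u2", "i3", 1), ("u3", "i2", 2), ("u3", "i3", 1)]
model, global_mean = train_latent_factor_model(toy_ratings)
model_rmse = rmse(model, toy_ratings)
baseline_rmse = rmse(lambda u, i: global_mean, toy_ratings)
```

Comparing `model_rmse` with `baseline_rmse` on held-out ratings, rather than on the training triples used here, would be the meaningful check in practice.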

A Survey of NLP-Related Crowdsourcing HITs: what works and what does not

Nov 09, 2021
Jessica Huynh, Jeffrey Bigham, Maxine Eskenazi

Crowdsourcing requesters on Amazon Mechanical Turk (AMT) have raised questions about the reliability of the workers. The AMT workforce is very diverse and it is not possible to make blanket assumptions about them as a group. Some requesters now reject work en masse when they do not get the results they expect. This has the effect of giving each worker (good or bad) a lower Human Intelligence Task (HIT) approval score, which is unfair to the good workers. It also has the effect of giving the requester a bad reputation on the workers' forums. Some of the issues causing the mass rejections stem from the requesters not taking the time to create a well-formed task with complete instructions and/or not paying a fair wage. To explore this assumption, this paper describes a study that looks at the crowdsourcing HITs on AMT that were available over a given span of time and records information about those HITs. This study also records information from a crowdsourcing forum on the worker perspective on both those HITs and their corresponding requesters. Results reveal issues in worker payment and in presentation, such as missing instructions or HITs that are not doable.