"Topic": models, code, and papers

Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network

Oct 12, 2020
Jialu Huang, Jing Liao, Sam Kwong

Image-to-Image (I2I) translation is an active research topic in academia, and it has also been applied in industry for tasks such as image synthesis, super-resolution, and colorization. However, traditional I2I translation methods train on data from two or more domains together, which requires substantial computational resources. Moreover, the results are of lower quality and contain many more artifacts. Training can be unstable when the data in different domains are not balanced, and mode collapse is more likely to happen. We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations on a pre-trained StyleGAN2 model in the source domain. We then propose an inversion method to convert between an image and its latent vector. By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and the target domain. Both qualitative and quantitative evaluations show that the proposed method achieves outstanding performance in terms of image quality, diversity, and semantic similarity to the input and reference images compared to state-of-the-art works.

* 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works 

We Don't Speak the Same Language: Interpreting Polarization through Machine Translation

Oct 05, 2020
Ashiqur R. KhudaBukhsh, Rupak Sarkar, Mark S. Kamlet, Tom M. Mitchell

Polarization among US political parties, media and elites is a widely studied topic. Prominent lines of prior research across multiple disciplines have observed and analyzed growing polarization in social media. In this paper, we present a new methodology that offers a fresh perspective on interpreting polarization through the lens of machine translation. With a novel proposition that two sub-communities are speaking in two different "languages", we demonstrate that modern machine translation methods can provide a simple yet powerful and interpretable framework to understand the differences between two (or more) large-scale social media discussion data sets at the granularity of words. Via a substantial corpus of 86.6 million comments by 6.5 million users on over 200,000 news videos hosted by YouTube channels of four prominent US news networks, we demonstrate that simple word-level and phrase-level translation pairs can reveal deep insights into the current political divide: what is "black lives matter" to one can be "all lives matter" to the other.

Teaching Tech to Talk: K-12 Conversational Artificial Intelligence Literacy Curriculum and Development Tools

Sep 11, 2020
Jessica Van Brummelen, Tommy Heng, Viktoriya Tabunshchyk

With children talking to smart-speakers, smart-phones and even smart-microwaves daily, it is increasingly important to educate students on how these agents work, from underlying mechanisms to societal implications. Researchers are developing tools and curricula to teach K-12 students broadly about artificial intelligence (AI); however, few studies have evaluated these tools with respect to AI-specific learning outcomes, and even fewer have addressed student learning about AI-based conversational agents. We evaluate our Conversational Agent Interface for MIT App Inventor and workshop curriculum with respect to eight AI competencies from the literature. Furthermore, we analyze teacher (n=9) and student (n=47) feedback from workshops with the interface and recommend that future work leverage design considerations from the literature to optimize engagement, collaborate with teachers, and address a range of student abilities through pacing and opportunities for extension. We found that students struggled most with the concepts of AI ethics and learning, and we recommend emphasizing these topics when teaching. The appendix, including a demo video, can be found here:

* 8 pages, 4 figures, for associated video:, for appendix: 

A review of deep learning in medical imaging: Image traits, technology trends, case studies with progress highlights, and future promises

Aug 02, 2020
S. Kevin Zhou, Hayit Greenspan, Christos Davatzikos, James S. Duncan, Bram van Ginneken, Anant Madabhushi, Jerry L. Prince, Daniel Rueckert, Ronald M. Summers

Since its renaissance, deep learning has been widely used in various medical imaging tasks and has achieved remarkable success in many medical imaging applications, thereby propelling us into the so-called artificial intelligence (AI) era. It is known that the success of AI is mostly attributed to the availability of big data with annotations for a single task and the advances in high performance computing. However, medical imaging presents unique challenges that confront deep learning approaches. In this survey paper, we first highlight both clinical needs and technical challenges in medical imaging and describe how emerging trends in deep learning are addressing these issues. We cover the topics of network architecture, sparse and noisy labels, federated learning, interpretability, uncertainty quantification, etc. Then, we present several case studies that are commonly found in clinical practice, including digital pathology and chest, brain, cardiovascular, and abdominal imaging. Rather than presenting an exhaustive literature survey, we instead describe some prominent research highlights related to these case study applications. We conclude with a discussion and presentation of promising future directions.

* 19 pages, 7 figures 

Visualizing the Finer Cluster Structure of Large-Scale and High-Dimensional Data

Jul 17, 2020
Yu Liang, Arin Chaudhuri, Haoyu Wang

Dimension reduction and visualization of high-dimensional data have become very important research topics because of the rapid growth of large databases in data science. In this paper, we propose using a generalized sigmoid function to model the distance similarity in both high- and low-dimensional spaces. In particular, the parameter b is introduced to the generalized sigmoid function in low-dimensional space, so that we can adjust the heaviness of the function tail by changing the value of b. Using both simulated and real-world data sets, we show that our proposed method can generate visualization results comparable to those of uniform manifold approximation and projection (UMAP), which is a newly developed manifold learning technique with fast running speed, better global structure, and scalability to massive data sets. In addition, according to the purpose of the study and the data structure, we can decrease or increase the value of b to either reveal the finer cluster structure of the data or maintain the neighborhood continuity of the embedding for better visualization. Finally, we use domain knowledge to demonstrate that the finer subclusters revealed with small values of b are meaningful.
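
A minimal sketch of how a single exponent can change tail heaviness in a low-dimensional similarity function. The abstract does not give the authors' generalized sigmoid, so this uses the UMAP-style family 1 / (1 + a * d^(2b)) as a stand-in; the function name and the parameters `a` and `b` here are illustrative assumptions, not the paper's definition.

```python
def low_dim_similarity(d, a=1.0, b=1.0):
    """Similarity of two embedded points at distance d >= 0.

    Assumed stand-in family (UMAP-style), NOT the paper's exact function:
        1 / (1 + a * d**(2*b))
    Smaller b makes the similarity decay more slowly at large d (heavier tail).
    """
    if d < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + a * d ** (2.0 * b))


if __name__ == "__main__":
    # Compare the tails for a few values of b at increasing distances.
    for b in (0.5, 1.0, 2.0):
        row = [round(low_dim_similarity(d, b=b), 4) for d in (0.5, 1.0, 2.0, 5.0)]
        print(f"b={b}: {row}")
```

In this stand-in family, all curves pass through the same value at d = 1, and lowering b raises the similarity assigned to distant pairs, which is one way to see what "adjusting the heaviness of the tail" means.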

Variational Mutual Information Maximization Framework for VAE Latent Codes with Continuous and Discrete Priors

Jun 02, 2020
Andriy Serdega, Dae-Shik Kim

Learning interpretable and disentangled representations of data is a key topic in machine learning research. Variational Autoencoder (VAE) is a scalable method for learning directed latent variable models of complex data. It employs a clear and interpretable objective that can be easily optimized. However, this objective does not provide an explicit measure of the quality of latent variable representations, which may result in poor representations. We propose the Variational Mutual Information Maximization Framework for VAE to address this issue. In comparison to other methods, it provides an explicit objective that maximizes a lower bound on the mutual information between latent codes and observations. The objective acts as a regularizer that forces the VAE not to ignore the latent variable and allows one to select particular components of it to be most informative with respect to the observations. On top of that, the proposed framework provides a way to evaluate mutual information between latent codes and observations for a fixed VAE model. We conducted experiments on VAE models with Gaussian latent variables and with joint Gaussian and discrete latent variables. Our results illustrate that the proposed approach strengthens relationships between latent codes and observations and improves learned representations.

* arXiv admin note: text overlap with arXiv:2005.13953 
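
The abstract does not state the objective itself. As a hedged illustration, a standard variational lower bound on mutual information of the kind it describes can be written as follows, where the auxiliary distribution $Q_\phi$ and the notation $p_\theta(x \mid z)$ for the decoder are assumptions introduced here, not symbols taken from the paper:

```latex
\begin{aligned}
I(z; x) &= H(z) - H(z \mid x) \\
        &= H(z) + \mathbb{E}_{z \sim p(z),\, x \sim p_\theta(x \mid z)}
           \big[ \log p(z \mid x) \big] \\
        &\ge H(z) + \mathbb{E}_{z \sim p(z),\, x \sim p_\theta(x \mid z)}
           \big[ \log Q_\phi(z \mid x) \big]
\end{aligned}
```

The inequality holds because the gap between the last two lines is $\mathbb{E}_{x}\big[\mathrm{KL}\big(p(z \mid x)\,\|\,Q_\phi(z \mid x)\big)\big] \ge 0$. Adding a term of this form to a training loss would reward decoders whose outputs keep the latent code recoverable.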

Compose Like Humans: Jointly Improving the Coherence and Novelty for Modern Chinese Poetry Generation

May 04, 2020
Lei Shen, Xiaoyu Guo, Meng Chen

Chinese poetry is an important part of worldwide culture, and its classical and modern sub-branches are quite different. The former is a unique genre with strict constraints, while the latter is very flexible in length, may or may not rhyme, and is similar to modern poetry in other languages. Thus, controlling coherence and improving novelty are more demanding for modern poetry. In this paper, we propose a generate-retrieve-then-refine paradigm to jointly improve coherence and novelty. In the first stage, a draft is generated given keywords (i.e., topics) only. The second stage produces a "refining vector" from retrieved lines. Finally, we take into consideration both the draft and the "refining vector" to generate a new poem. The draft provides future sentence-level information for a line to be generated. Meanwhile, the "refining vector" points out the direction of refinement based on an impressive words detection mechanism, which can learn good patterns from references and then create new ones via an insertion operation. Experimental results on a collected large-scale modern Chinese poetry dataset show that our proposed approach not only generates more coherent poems but also improves diversity and novelty.

* To appear at IJCNN 2020 (long paper) 

WeatherBench: A benchmark dataset for data-driven weather forecasting

Feb 12, 2020
Stephan Rasp, Peter D. Dueben, Sebastian Scher, Jonathan A. Weyn, Soukayna Mouatadid, Nils Thuerey

Data-driven approaches, most prominently deep learning, have become powerful tools for prediction in many domains. A natural question to ask is whether data-driven methods could also be used for numerical weather prediction. First studies show promise, but the lack of a common dataset and evaluation metrics makes inter-comparison between studies difficult. Here we present a benchmark dataset for data-driven medium-range weather forecasting, a topic of high scientific interest for atmospheric and computer scientists alike. We provide data derived from the ERA5 archive that has been processed to facilitate the use in machine learning models. We propose a simple and clear evaluation metric which will enable a direct comparison between different methods. Further, we provide baseline scores from simple linear regression techniques, deep learning models as well as purely physical forecasting models. All data is publicly available at and the companion code is reproducible with tutorials for getting started. We hope that this dataset will accelerate research in data-driven weather forecasting.

* Github repository:; Data download: 
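
The abstract does not name the evaluation metric. As a rough sketch of what a simple metric for gridded global forecasts can look like, the example below assumes a latitude-weighted RMSE, in which each grid row is weighted by the cosine of its latitude so that the small grid cells near the poles do not dominate. The function name, the nested-list grid layout, and the choice of metric are assumptions made for illustration.

```python
import math


def lat_weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE over one 2-D field (assumed metric, see lead-in).

    forecast, truth: nested lists indexed [lat][lon] with identical shapes.
    lats_deg: latitude in degrees of each row.
    Weights are cos(latitude), normalized so that their mean is 1.
    """
    if not (len(forecast) == len(truth) == len(lats_deg)):
        raise ValueError("row counts must match")
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_w = sum(weights) / len(weights)
    weights = [w / mean_w for w in weights]

    total, count = 0.0, 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        if len(f_row) != len(t_row):
            raise ValueError("column counts must match")
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)


if __name__ == "__main__":
    lats = [-60.0, 0.0, 60.0]
    truth = [[280.0, 281.0], [300.0, 299.0], [270.0, 272.0]]
    forecast = [[282.0, 283.0], [300.5, 299.5], [268.0, 270.0]]
    print(lat_weighted_rmse(forecast, truth, lats))
```

Because the weights are normalized to mean 1, a uniform error of the same size everywhere yields the same value as an unweighted RMSE, while errors concentrated at high latitudes count for less.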

Multi-View Multiple Clusterings using Deep Matrix Factorization

Nov 26, 2019
Shaowei Wei, Jun Wang, Guoxian Yu, Carlotta, Xiangliang Zhang

Multi-view clustering aims at integrating complementary information from multiple heterogeneous views to improve clustering results. Existing multi-view clustering solutions can only output a single clustering of the data. Due to their multiplicity, multi-view data can have different groupings that are reasonable and interesting from different perspectives. However, how to find multiple, meaningful, and diverse clustering results from multi-view data is still a rarely studied and challenging topic in multi-view clustering and multiple clusterings. In this paper, we introduce a deep matrix factorization based solution (DMClusts) to discover multiple clusterings. DMClusts gradually factorizes multi-view data matrices into representational subspaces layer-by-layer and generates one clustering in each layer. To enforce diversity between the generated clusterings, it minimizes a new redundancy quantification term derived from the proximity between samples in these subspaces. We further introduce an iterative optimization procedure to simultaneously seek multiple clusterings with quality and diversity. Experimental results on benchmark datasets confirm that DMClusts outperforms state-of-the-art multiple clustering solutions.

A Probabilistic Approach for Discovering Daily Human Mobility Patterns with Mobile Data

Nov 21, 2019
Weizhu Qian, Fabrice Lauri, Franck Gechter

Discovering human mobility patterns with geo-location data collected from smartphone users has been a hot research topic in recent years. In this paper, we attempt to discover daily mobility patterns based on GPS data. We view this problem from a probabilistic perspective in order to extract more information from the original GPS data than conventional methods do. A nonparametric Bayesian modeling method, the Infinite Gaussian Mixture Model (IGMM), is used to estimate the probability density of daily mobility. Then, we use Kullback-Leibler divergence as the metric to measure the similarity of different probability distributions. By combining the Infinite Gaussian Mixture Model and Kullback-Leibler divergence, we derive an automatic clustering algorithm to discover mobility patterns for each individual user without setting the number of clusters in advance. In the experiments, the effectiveness of our method is validated on real data collected from different users. The results show that the IGMM-based algorithm outperforms the GMM-based algorithm. We also test our method on datasets of different lengths to find the minimum data length needed for discovering mobility patterns.

* 10 pages, 14 figures, journal paper 
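
As an illustration of using Kullback-Leibler divergence to compare daily mobility distributions, here is a minimal sketch. It works on discretized distributions (for example, the fraction of a day spent in each of a few hypothetical location bins) rather than on the continuous mixture densities the abstract describes, and the symmetrized variant is an assumption added here so that the score does not depend on argument order.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Terms with p[i] == 0 contribute nothing; q[i] is floored at eps to
    avoid division by zero (an illustrative smoothing choice).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); an assumed symmetrization."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


if __name__ == "__main__":
    # Hypothetical days: share of time at [home, work, gym, other].
    weekday_a = [0.55, 0.35, 0.05, 0.05]
    weekday_b = [0.50, 0.40, 0.05, 0.05]
    weekend = [0.70, 0.00, 0.10, 0.20]
    print("weekday vs weekday:", symmetric_kl(weekday_a, weekday_b))
    print("weekday vs weekend:", symmetric_kl(weekday_a, weekend))
```

Days whose pairwise divergence is small could then be grouped into the same mobility pattern; how the grouping threshold is chosen is left open here.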
