The task of blood vessel segmentation in microscopy images is crucial for many diagnostic and research applications. However, vessels can look vastly different depending on the transient imaging conditions, and collecting data for supervised training is laborious. We present a novel deep learning method for unsupervised segmentation of blood vessels. The method is inspired by the field of active contours, and we introduce a new loss term based on the morphological Active Contours Without Edges (ACWE) optimization method. The role of the morphological operators is played by novel pooling layers that are incorporated into the network's architecture. We demonstrate the challenges faced by previous supervised learning solutions when the imaging conditions shift. Our unsupervised method outperforms such previous methods both on the labeled dataset and when applied to similar but different datasets. Our code, as well as efficient PyTorch reimplementations of the baseline methods VesselNN and DeepVess, is available on GitHub: https://github.com/shirgur/UMIS.
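As a rough illustration of how pooling layers can stand in for the morphological operators of ACWE, the following PyTorch sketch (our own simplification; the function names, the 3x3 structuring element, and the weighting `lam` are assumptions, not the paper's exact formulation) builds an ACWE-style region term and a morphological smoothness term from a soft segmentation map:

```python
import torch
import torch.nn.functional as F

def soft_dilation(x, k=3):
    # Grayscale dilation approximated by stride-1 max pooling (assumed 3x3 window).
    return F.max_pool2d(x, kernel_size=k, stride=1, padding=k // 2)

def soft_erosion(x, k=3):
    # Grayscale erosion is the dilation of the negated map.
    return -F.max_pool2d(-x, kernel_size=k, stride=1, padding=k // 2)

def acwe_style_loss(prob, image, lam=1.0):
    # prob: soft vessel mask in [0, 1]; image: grayscale input; both shaped (B, 1, H, W).
    c_in = (prob * image).sum() / (prob.sum() + 1e-8)                 # mean intensity inside
    c_out = ((1 - prob) * image).sum() / ((1 - prob).sum() + 1e-8)    # mean intensity outside
    region = (prob * (image - c_in) ** 2
              + (1 - prob) * (image - c_out) ** 2).mean()             # ACWE region term
    smooth = (soft_dilation(prob) - soft_erosion(prob)).mean()        # morphological gradient
    return region + lam * smooth
```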
We consider the problem of translating, in an unsupervised manner, between two domains where one contains some additional information compared to the other. The proposed method disentangles the common and separate parts of these domains and, through the generation of a mask, focuses the attention of the underlying network on the desired augmentation alone, without wastefully reconstructing the entire target. This enables state-of-the-art quality and variety of content translation, as shown through extensive quantitative and qualitative evaluation. Furthermore, the novel mask-based formulation and regularization are accurate enough to achieve state-of-the-art performance in the realm of weakly supervised segmentation, where only class labels are given. To our knowledge, this is the first report that bridges the problems of domain disentanglement and weakly supervised segmentation. Our code is publicly available at https://github.com/rmokady/mbu-content-tansfer.
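A minimal sketch of how such a mask can focus the translation on the added content alone (our own reading of the idea; the module and argument names are hypothetical placeholders, not the released code):

```python
import torch

def masked_transfer(x, content_code, mask_net, content_net):
    # x: image from the domain without the extra content; content_code: encoding of the
    # separate (domain-specific) part; mask_net and content_net stand in for the model.
    mask = torch.sigmoid(mask_net(x, content_code))   # soft attention mask in [0, 1]
    added = content_net(x, content_code)              # synthesize only the added content
    # Blend: outside the mask the input is copied unchanged, so the network never has
    # to reconstruct the parts of the target that are common to both domains.
    return mask * added + (1 - mask) * x
```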
Recent sparse MRI reconstruction models have used Deep Neural Networks (DNNs) to reconstruct relatively high-quality images from highly undersampled k-space data, enabling much faster MRI scanning. However, these techniques sometimes struggle to reconstruct sharp images that preserve fine detail while maintaining a natural appearance. In this work, we enhance the image quality by using a Conditional Wasserstein Generative Adversarial Network combined with a novel Adaptive Gradient Balancing technique, which stabilizes training and reduces artifacts while maintaining high reconstruction quality and producing sharper images than other techniques.
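The following is a generic gradient-balancing step in the spirit of the described technique (a simplified sketch under our own assumptions; the smoothing constant and update rule are illustrative and not necessarily the paper's exact algorithm):

```python
import torch

def balanced_generator_loss(pixel_loss, adv_loss, generator, scale, alpha=0.99):
    # Rescale the adversarial term so that its gradient norm w.r.t. the generator
    # stays comparable to that of the pixel-wise reconstruction term.
    params = [p for p in generator.parameters() if p.requires_grad]
    g_pix = torch.autograd.grad(pixel_loss, params, retain_graph=True)
    g_adv = torch.autograd.grad(adv_loss, params, retain_graph=True)
    n_pix = torch.sqrt(sum((g ** 2).sum() for g in g_pix))
    n_adv = torch.sqrt(sum((g ** 2).sum() for g in g_adv)) + 1e-8
    scale = alpha * scale + (1 - alpha) * (n_pix / n_adv).item()  # smoothed balance factor
    return pixel_loss + scale * adv_loss, scale
```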
We present a fully convolutional wav-to-wav network for converting between speakers' voices, without relying on text. Our network is based on an encoder-decoder architecture, where the encoder is pre-trained for the task of Automatic Speech Recognition (ASR), and a multi-speaker waveform decoder is trained to reconstruct the original signal in an autoregressive manner. We train the network on narrated audiobooks and demonstrate the ability to perform multi-voice TTS in those voices by converting the voice of a TTS robot. We observe no degradation in the quality of the generated voices in comparison to the reference TTS voice. The modularity of our approach, which separates the target voice generation from the TTS module, enables client-side personalized TTS in a privacy-aware manner.
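Schematically, the conversion path can be sketched as follows (a bare-bones illustration; the function names and the freezing of the encoder are our assumptions):

```python
import torch

def convert(asr_encoder, wav_decoder, source_wav, target_speaker_emb):
    # The ASR-pretrained encoder extracts text-free, largely speaker-independent
    # content features; the multi-speaker decoder resynthesizes them autoregressively
    # in the voice selected by the speaker embedding.
    with torch.no_grad():
        content = asr_encoder(source_wav)
    return wav_decoder(content, target_speaker_emb)
```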
We are given a video of a person performing a certain activity, from which we extract a controllable model. The model generates novel image sequences of that person, according to arbitrary user-defined control signals, typically marking the displacement of the moving body. The generated video can have an arbitrary background and effectively captures both the dynamics and the appearance of the person. The method is based on two networks. The first network maps the current pose and a single-instance control signal to the next pose. The second network maps the current pose, the new pose, and a given background to an output frame. Both networks include multiple novelties that enable high-quality performance. This is demonstrated on multiple characters extracted from various videos of dancers and athletes.
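The two-stage inference loop described above can be sketched as follows (the module names are our placeholders for the two trained networks):

```python
import torch

def generate_sequence(pose2pose, pose2frame, init_pose, controls, background):
    # controls: a sequence of user-defined displacement signals, one per frame.
    frames, pose = [], init_pose
    for ctrl in controls:
        next_pose = pose2pose(pose, ctrl)                        # first network: advance the pose
        frames.append(pose2frame(pose, next_pose, background))   # second network: render the frame
        pose = next_pose
    return torch.stack(frames)
```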
We present a method for audio denoising that combines processing done in both the time domain and the time-frequency domain. Given a noisy audio clip, the method trains a deep neural network to fit this signal. Since the fitting is only partially successful, and captures the underlying clean signal better than the noise, the output of the network helps disentangle the clean audio from the rest of the signal. The method is completely unsupervised and only trains on the specific audio clip that is being denoised. Our experiments demonstrate favorable performance in comparison to methods from the literature, and our code and audio samples are available at https://github.com/mosheman5/DNP. Index Terms: Audio denoising; Unsupervised learning
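A bare-bones time-domain sketch of this clip-specific fitting idea (the actual method also exploits the time-frequency domain; the optimizer, loss, and step count here are our assumptions):

```python
import torch
import torch.nn.functional as F

def denoise_clip(noisy, net, steps=2000, lr=1e-3):
    # Fit a randomly initialized network to the single noisy clip; because the fit
    # captures the clean structure better than the noise, the partially fitted
    # output serves as the denoised estimate.
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(net(noisy), noisy)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return net(noisy)
```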
We present a deep learning method for singing voice conversion. The proposed network is not conditioned on the text or on the notes, and it directly converts the audio of one singer to the voice of another. Training is performed without any form of supervision: no lyrics or any kind of phonetic features, no notes, and no matching samples between singers. The proposed network employs a single CNN encoder for all singers, a single WaveNet decoder, and a classifier that enforces the latent representation to be singer-agnostic. Each singer is represented by one embedding vector, which the decoder is conditioned on. In order to deal with relatively small datasets, we propose a new data augmentation scheme, as well as new training losses and protocols that are based on backtranslation. Our evaluation presents evidence that the conversion produces natural singing voices that are highly recognizable as the target singer.
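A condensed sketch of the training signals described above (our simplification: the weighting, the L1 reconstruction, and the non-autoregressive decoder call are assumptions, and the backtranslation and augmentation terms are omitted):

```python
import torch.nn.functional as F

def conversion_losses(encoder, decoder, classifier, audio, singer_id, embeddings, lam=0.01):
    # Single shared encoder, a decoder conditioned on a per-singer embedding vector,
    # and a classifier trained to recognize the singer from the latent code.
    z = encoder(audio)
    recon = decoder(z, embeddings[singer_id])              # reconstruct in the source voice
    rec_loss = F.l1_loss(recon, audio)
    clf_loss = F.cross_entropy(classifier(z), singer_id)   # used to train the classifier
    adv_loss = -clf_loss                                   # the encoder is trained to fool it,
    return rec_loss + lam * adv_loss, clf_loss             # making the latent singer-agnostic
```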
We present a TTS neural network that is able to produce speech in multiple languages. The proposed network is able to transfer a voice, which was presented as a sample in a source language, into one of several target languages. Training is done without using matching or parallel data, i.e., without samples of the same speaker in multiple languages, making the method much more applicable. The conversion is based on learning a polyglot network that has multiple per-language sub-networks and adding loss terms that preserve the speaker's identity in multiple languages. We evaluate the proposed polyglot neural network for three languages with a total of more than 400 speakers and demonstrate convincing conversion capabilities.
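A rough sketch of the kind of identity-preserving term mentioned above (our own illustration; the speaker encoder and the distance used are assumptions):

```python
import torch.nn.functional as F

def speaker_preservation_loss(speaker_encoder, synthesized_wav, reference_wav):
    # Speech synthesized in a target language should keep the reference speaker's
    # voice, as judged by the distance between speaker embeddings.
    return F.l1_loss(speaker_encoder(synthesized_wav),
                     speaker_encoder(reference_wav).detach())
```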
We study the problem of semi-supervised singing voice separation, in which the training data contains a set of samples of mixed music (singing and instrumental) and an unmatched set of instrumental music. Our solution employs a single mapping function g, which, applied to a mixed sample, recovers the underlying instrumental music, and, applied to an instrumental sample, returns the same sample. The network g is trained on purely instrumental samples, as well as on synthetic mixed samples that are created by mixing reconstructed singing voices with random instrumental samples. Our results indicate that we are on a par with or better than fully supervised methods, which are also provided with training samples of unmixed singing voices, and are better than other recent semi-supervised methods.
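The two training signals can be sketched as follows (a simplified waveform-level illustration with our own loss names; the actual pipeline may operate on spectrograms):

```python
import torch.nn.functional as F

def semi_supervised_losses(g, instrumental, mixed):
    # (1) Identity: g should leave a purely instrumental sample unchanged.
    id_loss = F.l1_loss(g(instrumental), instrumental)
    # (2) Synthetic mixtures: estimate the singing voice from a real mixture,
    #     remix it with a random instrumental clip, and require g to recover
    #     that instrumental clip.
    voice_est = (mixed - g(mixed)).detach()    # reconstructed singing component
    synthetic = instrumental + voice_est       # synthetic mixed sample
    mix_loss = F.l1_loss(g(synthetic), instrumental)
    return id_loss + mix_loss
```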