I show that a one-dimensional (1D) conditional generative adversarial network (cGAN) is capable of unpaired signal-to-signal ("sig2sig") translation. Using a simplified CycleGAN model with 1D layers and wider convolutional kernels, mirroring WaveGAN's reframing of two-dimensional (2D) image generation as 1D audio generation, I show that the 2D image-to-image translation task can be recast as a 1D signal-to-signal translation task with deep convolutional GANs, without substantial modification to the conventional U-Net generator and adversarial architecture developed as CycleGAN. On a small tunable dataset, I show that noisy test signals unseen by the 1D CycleGAN model, trained without paired examples, are transformed from the source domain into signals similar to paired test signals in the translated domain, especially in terms of frequency content, and I quantify these differences in terms of correlation and error.
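The kernel-widening idea can be sketched with a minimal strided 1D convolution in NumPy. This is an illustrative toy, not the thesis model: a 2D DCGAN layer with 5×5 kernels and stride 2 covers 25 taps and downsamples by a factor of 4 per layer, and WaveGAN's published 1D analogue uses length-25 kernels with stride 4; the signal and kernel values here are arbitrary.

```python
import numpy as np

def conv1d(x, w, stride=1):
    """Valid-mode strided 1D cross-correlation: the core op of a 1D conv layer."""
    k = len(w)
    out_len = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + k], w) for i in range(out_len)])

# Wider 1D kernels stand in for flattened 2D kernels (5x5 -> length 25),
# and stride 4 matches the 2D layer's total downsampling factor.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)   # toy 1D signal
w = rng.standard_normal(25)     # wide 1D kernel, as in WaveGAN
y = conv1d(x, w, stride=4)
print(len(y))                   # (1024 - 25) // 4 + 1 = 250
```

Stacking such layers (with transposed counterparts in the generator) is what lets the CycleGAN U-Net operate on raw 1D signals instead of images.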
I present "SnakeSynth," a lightweight web-based audio synthesizer that combines audio generated by a deep generative model with real-time continuous two-dimensional (2D) input to create and control variable-length generative sounds through 2D interaction gestures. Interaction gestures are touch- and mobile-compatible, with analogies to strummed, bowed, and plucked musical instrument controls. Point-and-click and drag-and-drop gestures directly control audio playback length, and I show that sound length and intensity are modulated by interactions with a programmable 2D coordinate grid. Leveraging the speed and ubiquity of browser-based audio and hardware acceleration in Google's TensorFlow.js, I generate time-varying high-fidelity sounds with real-time interactivity. SnakeSynth adaptively reproduces and interpolates between sounds encountered during model training, notably without long training times, and I briefly discuss possible futures for deep generative models as an interactive paradigm for musical expression.
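The grid interaction can be sketched as a mapping from normalized gesture coordinates to a grid cell (which generated sound to trigger) and to playback parameters (length and intensity). The function name, grid size, and duration range below are illustrative assumptions, not SnakeSynth's actual API.

```python
def map_gesture(x_norm, y_norm, cols=8, rows=8, min_dur=0.1, max_dur=2.0):
    """Map a normalized (x, y) gesture position on a 2D coordinate grid to a
    grid cell plus playback duration (seconds) and gain. Ranges are illustrative."""
    x_norm = min(max(x_norm, 0.0), 1.0)   # clamp to the unit square
    y_norm = min(max(y_norm, 0.0), 1.0)
    cell = (min(int(x_norm * cols), cols - 1), min(int(y_norm * rows), rows - 1))
    duration = min_dur + x_norm * (max_dur - min_dur)   # drag extent -> sound length
    gain = y_norm                                       # vertical position -> intensity
    return cell, duration, gain

cell, dur, gain = map_gesture(0.5, 0.75)
print(cell, round(dur, 3), gain)   # (4, 6) 1.05 0.75
```

In the browser the same mapping would run per pointer event, with the selected cell indexing a model-generated sound buffer for playback.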
I outline a signal resampling strategy for aligning event times between time-series trials in contexts where significant event times, such as onsets and offsets, vary between trials. These variations prevent direct comparison of trials in practical contexts because comparison requires equal-length time series (Salari et al., 2019). Algorithms like dynamic time warping help quantify these variations locally but do not apply well to continuous transformations of time-series signals without interpolating or downsampling to add or remove samples (Jamid, 2004; Eckner, 2014). I show that, with consideration for padding and sampling frequency, sinc interpolation is sufficient to resample parts of trial intervals to produce equal-length, time-locked trials that correlate with and strongly approximate their unwarped counterparts with minimal interpolation effects. Specifically, I show that interpolation effects can be minimized by oversampling, selectively interpolating misaligned parts of trials with respect to mean misaligned event lengths, and interpolating misaligned events with sufficient zero-padding. The interpolated signals then have a bandlimit well below the Nyquist frequency and satisfy the Nyquist–Shannon sampling theorem, ensuring perfect reconstruction, and I find that resampling effects on signal energy quantities can be tracked and potentially counteracted.
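The core resampling step can be sketched with Whittaker-Shannon (sinc) interpolation in NumPy. This is a minimal illustration of stretching one bandlimited segment to a new length, not the full oversampling-and-padding pipeline described above; the segment is a toy tone whose frequency sits well below the Nyquist frequency, and samples outside the segment are treated as zero, so some truncation error remains near the edges.

```python
import numpy as np

def sinc_resample(x, num):
    """Resample x to `num` samples by Whittaker-Shannon interpolation:
    y(t) = sum_n x[n] * sinc(t - n), evaluated at `num` evenly spaced times
    spanning the original segment. np.sinc is the normalized sinc."""
    n = np.arange(len(x))
    t = np.linspace(0.0, len(x) - 1.0, num)
    return np.sinc(t[:, None] - n[None, :]) @ x

# Stretch a 100-sample segment of a low-frequency tone to 200 samples;
# a well-bandlimited signal resamples almost exactly (edge effects aside).
n = np.arange(100)
x = np.sin(2 * np.pi * 2 * n / 100)                         # 2 cycles per segment
y = sinc_resample(x, 200)
ref = np.sin(2 * np.pi * 2 * np.linspace(0, 99, 200) / 100)  # analytic target
print(np.corrcoef(y, ref)[0, 1])
```

Applying this only to the misaligned sub-intervals of each trial, rather than whole trials, is what produces equal-length, time-locked trials while leaving aligned samples untouched.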