Sam Kirkham

Nosey: Open-source hardware for acoustic nasalance

May 29, 2025

Maya Dewhurst, Jack Collins, Justin J. H. Lo, Roy Alderton, Sam Kirkham

Abstract: We introduce Nosey (Nasalance Open Source Estimation sYstem), a low-cost, customizable, 3D-printed system for recording acoustic nasalance data that we have made available as open-source hardware (http://github.com/phoneticslab/nosey). We first outline the motivations and design principles behind our hardware nasalance system, and then present a comparison between Nosey and a commercial nasalance device. Nosey shows consistently higher nasalance scores than the commercial device, but the magnitude of contrast between phonological environments is comparable between systems. We also review ways of customizing the hardware to facilitate testing, such as comparison of microphones and different construction materials. We conclude that Nosey is a flexible and cost-effective alternative to commercial nasometry devices and propose some methodological considerations for its use in data collection.

* Accepted to Interspeech 2025
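
For readers unfamiliar with the measure, acoustic nasalance is conventionally the nasal channel's share of the combined nasal and oral signal, expressed as a percentage. The sketch below is not taken from the Nosey repository; it assumes RMS amplitude as the per-channel measure, and the function names and toy frames are hypothetical. Real systems typically band-pass filter each channel first, which is omitted here.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def nasalance_percent(nasal, oral):
    """Nasalance (%) = 100 * nasal / (nasal + oral).

    Assumption: RMS amplitude is used as the per-channel measure.
    """
    a_nasal = rms(nasal)
    a_oral = rms(oral)
    total = a_nasal + a_oral
    if total == 0.0:
        raise ValueError("both channels are silent")
    return 100.0 * a_nasal / total


# Hypothetical toy frames from the nasal and oral microphones.
nasal_frame = [0.3, -0.3, 0.3, -0.3]
oral_frame = [0.1, -0.1, 0.1, -0.1]
print(round(nasalance_percent(nasal_frame, oral_frame), 1))
```

Because the score is a ratio, a constant gain difference between the two microphones shifts every score in the same direction, which is one plausible reason two devices can disagree on absolute values while agreeing on contrasts.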

Articulatory strategy in vowel production as a basis for speaker discrimination

May 27, 2025

Justin J. H. Lo, Patrycja Strycharczuk, Sam Kirkham

Abstract: The way speakers articulate is well known to be variable across individuals while at the same time subject to anatomical and biomechanical constraints. In this study, we ask whether articulatory strategy in vowel production can be sufficiently speaker-specific to form the basis for speaker discrimination. We conducted Generalised Procrustes Analyses of tongue shape data from 40 English speakers from the North West of England, and assessed the speaker-discriminatory potential of orthogonal tongue shape features within the framework of likelihood ratios. Tongue size emerged as the individual dimension with the strongest discriminatory power, while tongue shape variation in the more anterior part of the tongue generally outperformed tongue shape variation in the posterior part. When considered in combination, shape-only information may offer comparable levels of speaker specificity to size-and-shape information, but only when features do not exhibit speaker-level co-variation.

* Accepted to Interspeech 2025
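
The separation of size from shape mentioned in the abstract is what Generalised Procrustes Analysis provides: each configuration is centred, its centroid size is recorded and divided out, and the unit-size shapes are rotated onto an iteratively updated mean. The following is a minimal sketch rather than the authors' pipeline. It assumes 2D landmarks encoded as complex numbers (x + iy), allows rotation but not reflection, and uses hypothetical function names and toy data.

```python
import cmath
import math


def centre(shape):
    """Translate landmarks so that their centroid is at the origin."""
    mean = sum(shape) / len(shape)
    return [z - mean for z in shape]


def centroid_size(shape):
    """Square root of summed squared distances from the origin (shape must be centred)."""
    return math.sqrt(sum(abs(z) ** 2 for z in shape))


def normalise(shape):
    """Return the centred, unit-size shape together with its original centroid size."""
    centred = centre(shape)
    size = centroid_size(centred)
    if size == 0.0:
        raise ValueError("all landmarks coincide")
    return [z / size for z in centred], size


def rotate_onto(shape, reference):
    """Rotate shape by the angle that minimises squared distance to reference."""
    s = sum(r * z.conjugate() for r, z in zip(reference, shape))
    if s == 0:
        return list(shape)
    rotation = s / abs(s)  # unit complex number, i.e. a pure rotation
    return [rotation * z for z in shape]


def generalised_procrustes(shapes, max_iter=50, tol=1e-10):
    """Align shapes to their iteratively re-estimated mean; sizes are returned separately."""
    normalised, sizes = zip(*(normalise(s) for s in shapes))
    aligned = [list(s) for s in normalised]
    mean = aligned[0]  # initial reference: the first shape
    for _ in range(max_iter):
        aligned = [rotate_onto(s, mean) for s in aligned]
        average = [sum(pts) / len(aligned) for pts in zip(*aligned)]
        new_mean, _ = normalise(average)
        change = sum(abs(a - b) ** 2 for a, b in zip(new_mean, mean))
        mean = new_mean
        if change < tol:
            break
    return aligned, list(sizes), mean


# Hypothetical toy contour, plus a scaled, rotated and shifted copy of it.
base = [0 + 0j, 1 + 0.5j, 2 + 0.8j, 3 + 0.5j, 4 + 0j]
moved = [2.0 * cmath.exp(0.3j) * z + (5 + 2j) for z in base]
aligned, sizes, mean_shape = generalised_procrustes([base, moved])
print(sizes[1] / sizes[0])
```

In this toy case the two aligned shapes coincide and the only surviving difference is the size ratio, which illustrates how size and shape can be treated as separate features.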

Discovering dynamical laws for speech gestures

Apr 07, 2025

Sam Kirkham

Abstract: A fundamental challenge in the cognitive sciences is discovering the dynamics that govern behaviour. Take the example of spoken language, which is characterised by a highly variable and complex set of physical movements that map onto the small set of cognitive units that comprise language. What are the fundamental dynamical principles behind the movements that structure speech production? In this study, we discover models in the form of symbolic equations that govern articulatory gestures during speech. A sparse symbolic regression algorithm is used to discover models from kinematic data on the tongue and lips. We explore these candidate models using analytical techniques and numerical simulations, and find that a second-order linear model achieves high levels of accuracy, but a nonlinear force is required to properly model articulatory dynamics in approximately one third of cases. This supports the proposal that an autonomous, nonlinear, second-order differential equation is a viable dynamical law for articulatory gestures in speech. We conclude by identifying future opportunities and obstacles in data-driven model discovery and outline prospects for discovering the dynamical principles that govern language, brain and behaviour.

* Accepted for publication in 'Cognitive Science'
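
To make the contrast between a second-order linear model and one with an added nonlinear force concrete, the sketch below simulates a point attractor toward a target, with and without a cubic term. It is an illustration under stated assumptions, not the equations discovered in the paper: unit mass, critical damping, the sign and size of the cubic coefficient `d`, the parameter values, and the function name are all hypothetical choices.

```python
import math


def simulate_gesture(x0, target, k, d=0.0, dt=0.001, duration=1.0):
    """Integrate x'' = -k*y - b*x' + d*y**3 with y = x - target, using RK4.

    Assumptions: unit mass, critical damping (b = 2*sqrt(k)), zero initial
    velocity. With d = 0 the model is a linear damped spring; d > 0 weakens
    the restoring force at large displacements (a softening spring).
    """
    b = 2.0 * math.sqrt(k)

    def accel(x, v):
        y = x - target
        return -k * y - b * v + d * y ** 3

    x, v = x0, 0.0
    trajectory = [x]
    for _ in range(int(round(duration / dt))):
        # Classical fourth-order Runge-Kutta step for the (x, v) system.
        k1x, k1v = v, accel(x, v)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        trajectory.append(x)
    return trajectory


# Hypothetical parameters: move from 0 to a target of 1 in normalised units.
linear = simulate_gesture(0.0, 1.0, k=400.0)
nonlinear = simulate_gesture(0.0, 1.0, k=400.0, d=200.0)
print(linear[-1], nonlinear[-1])
```

Both trajectories settle on the target; what differs is the time course of the approach, which is the kind of difference a model comparison against kinematic data would pick up.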

Modelling change in neural dynamics during phonetic accommodation

Feb 03, 2025

Sam Kirkham, Patrycja Strycharczuk, Rob Davies, Danielle Welburn

Abstract: Short-term phonetic accommodation is a fundamental driver behind accent change, but how does real-time input from another speaker's voice shape the speech planning representations of an interlocutor? We advance a computational model of change in phonetic representations during phonetic accommodation, grounded in dynamic neural field equations for movement planning and memory dynamics. We test the model's ability to capture empirical patterns from an experimental study where speakers shadowed a model talker with a different accent from their own. The experimental data shows vowel-specific degrees of convergence during shadowing, followed by return to baseline (or minor divergence) post-shadowing. The model can reproduce these phenomena by modulating the magnitude of inhibitory memory dynamics, which may reflect resistance to accommodation due to phonological and/or sociolinguistic pressures. We discuss the implications of these results for the relation between short-term phonetic accommodation and longer-term patterns of sound change.
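
As background, dynamic neural field models are usually written in the generic Amari form shown below. This is the textbook equation, not necessarily the exact formulation or parameterisation used in the paper, and the memory terms described in the abstract are not reproduced here. The symbols are assumptions of this sketch: u(x,t) is activation over a phonetic dimension x, h is a negative resting level, s(x,t) is external input (for example, the model talker's speech), w is an interaction kernel with local excitation and broader inhibition, and f is a sigmoidal output function.

```latex
\tau \, \dot{u}(x,t) = -u(x,t) + h + s(x,t)
  + \int w(x - x') \, f\big(u(x',t)\big) \, dx'
```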

Scaling laws for nonlinear dynamical models of speech

Nov 19, 2024

Sam Kirkham

Abstract: The addition of a nonlinear restoring force to dynamical models of the speech gesture significantly improves the empirical accuracy of model predictions, but nonlinearity introduces challenges in selecting appropriate parameters and numerical stability, especially when modelling variation in empirical data. We address this issue by introducing simple numerical methods for parameterization of nonlinear task dynamic models. We first illustrate the problem and then outline solutions in the form of power laws that scale nonlinear stiffness terms. We apply the scaling laws to a cubic model and show how they facilitate interpretable simulations of the nonlinear gestural dynamics underpinning speech production.
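
A short derivation shows why a cubic stiffness term needs rescaling at all. It assumes a generic cubic model with displacement y from the target T, damping b, linear stiffness k and cubic coefficient d; the sign convention, the symbol d_0, and the resulting exponent are assumptions of this sketch and may not match the power laws proposed in the paper. Writing y = A u, where A is the initial distance to the target, the cubic coefficient picks up a factor of A squared, so the same d behaves differently for small and large movements. Scaling d by the inverse square of A removes that dependence.

```latex
\ddot{y} + b\,\dot{y} + k\,y - d\,y^{3} = 0, \qquad y = x - T
% substitute y = A u, with A = |x_0 - T|, and divide through by A
\ddot{u} + b\,\dot{u} + k\,u - d\,A^{2}\,u^{3} = 0
% choose d = d_0 / A^{2} so that the normalised dynamics no longer depend on A
\ddot{u} + b\,\dot{u} + k\,u - d_0\,u^{3} = 0
```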

Towards a dynamical model of English vowels. Evidence from diphthongisation

Aug 30, 2024

Patrycja Strycharczuk, Sam Kirkham, Emily Gorman, Takayuki Nagamine

Abstract: Diphthong vowels exhibit a degree of inherent dynamic change, the extent of which can vary synchronically and diachronically, such that diphthong vowels can become monophthongs and vice versa. Modelling this type of change requires defining diphthongs in opposition to monophthongs. However, formulating an explicit definition has proven elusive in acoustics and articulation, as diphthongisation is often gradient in these domains. In this study, we consider whether diphthong vowels form a coherent phonetic category from the articulatory point of view. We present articulometry and acoustic data from six speakers of Northern Anglo-English producing a full set of phonologically long vowels. We analyse several measures of diphthongisation, all of which suggest that diphthongs are not categorically distinct from long monophthongs. We account for this observation with an Articulatory Phonology/Task Dynamic model in which diphthongs and long monophthongs have a common gestural representation, comprising two articulatory targets in each case, but they differ according to gestural constriction and location of the component gestures. We argue that a two-target representation for all long vowels is independently supported by phonological weight, as well as by the nature of historical diphthongisation and present-day dynamic vowel variation in British English.
