Automatic plosive detection and VOT extraction using the Plosion index

Project title: Automatic plosive detection and VOT extraction using the Plosion index


Collaborators: ??

Topic: Language

Project description: Do you wish you didn’t have to spend hours manually combing through your audio data to identify plosives and extract their voice onset times (VOT)? Introducing the Plosion index! The Plosion index is a measure from the speech recognition literature (Ananthapadmanabha, Prathosh, & Ramakrishnan, 2014) that facilitates the detection of plosives in natural language data. It is based on an algorithm that calculates, at each time point, the ratio between the instantaneous amplitude of an acoustic signal and the mean amplitude of the signal during the preceding few milliseconds. It is designed to detect the abrupt changes in amplitude typically linked to the puff of air expelled from the mouth during the release phase of a plosive. This means that the Plosion index is able not only to detect plosives in speech data, but also to determine the exact moment of the plosive release, making it possible to automate the calculation of VOT, that is, the time interval between the release of the plosive and the moment when the vocal cords start to vibrate to produce the following vowel.
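
To make the ratio described above concrete, here is a minimal Python sketch using NumPy. It is an illustration under stated assumptions, not the project's actual implementation: the function names (`plosion_index`, `first_release_sample`), the window length, the short gap between the current sample and the averaging window, and the detection threshold are all hypothetical choices that would need tuning on real recordings.

```python
import numpy as np


def plosion_index(signal, sample_rate, window_ms=16.0, gap_ms=6.0, eps=1e-10):
    """Ratio of |x[n]| to the mean of |x| over a window preceding sample n.

    window_ms and gap_ms are assumed defaults, not values from the project.
    The averaging window ends gap_ms before the current sample.
    """
    x = np.abs(np.asarray(signal, dtype=float))
    window = max(1, int(round(window_ms * sample_rate / 1000)))
    gap = int(round(gap_ms * sample_rate / 1000))

    # csum[k] holds sum(x[:k]), so any window sum is a difference of two entries.
    csum = np.concatenate(([0.0], np.cumsum(x)))

    pi = np.zeros_like(x)
    start = window + gap  # first sample that has a full preceding window
    n = np.arange(start, len(x))
    hi = n - gap          # averaging window covers x[lo:hi]
    lo = hi - window
    mean_prev = (csum[hi] - csum[lo]) / window

    # eps avoids division by zero when the preceding window is digital silence.
    pi[start:] = x[start:] / (mean_prev + eps)
    return pi


def first_release_sample(pi, threshold):
    """Index of the first sample whose index exceeds threshold, or None."""
    above = np.flatnonzero(pi > threshold)
    return int(above[0]) if above.size else None
```

A candidate release time in seconds is then `first_release_sample(pi, threshold) / sample_rate`. Computing VOT additionally requires an estimate of the voicing onset, which this sketch does not cover.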

Data to use: Some of my own (Florent) data is available to start working on the project, but it consists overwhelmingly of Spanish and French words and pseudowords starting with the voiced plosive /b/. It would be good to have more data to generalize this technique to other phonemes, especially voiceless and aspirated plosives, and to plosives in non-initial position.

Link to project repository:

Goals for Brainhack Donostia 2021: I have already implemented this algorithm and used it to automate the extraction of VOT for word-initial prevoiced plosives (such as /b/). The success rate is around 80 to 90%, so some manual verification is still needed. The goal of this project is to build a roadmap for extending the algorithm to non-prevoiced and aspirated plosives, both in word-initial position and inside words, and for improving its efficiency.

First tasks:

Communication channels:

Video channel: Zoom

Number of collaborators: 4

Credit to collaborators: All contributors will get credit for their work on the README file.

Type of project: Pipeline development

Development status: Basic structure

Programming languages: Python, Praat

Necessary Programming skills level for the project: Familiar

Necessary git skills level for the project: None

Modality: Behavioral

Software suites: Jupyter


What will participants learn: We will learn how to:
