WinPitch Corpus, a Tool for Alignment and Analysis of Large Corpora
|Presented by:||Philippe Martin, Université Paris 7|
|Project / Software Title:||WinPitch Corpus|
|Project / Software URL:||http://www.winpitch.com|
|Access / Availability:||This software is available at www.winpitch.com|
The description of endangered languages normally
starts with the collection of speech data, which are then segmented into
various phonological, prosodic, morphological and syntactic units. In this
process, the (phonetic) transcription is the most critical part, and
user-friendly tools are essential to tackle any sizeable work in a reasonable
amount of time.
The software program WinPitch Corpus addresses these concerns directly, offering two modes of operation to handle the data. In the first mode, no text is available, and the user generates it one speech segment at a time (as was the case when only analog tape recorders were available). In the second mode, the speech has already been transcribed into text, but the text units are not aligned, i.e. a bi-univocal (one-to-one) relationship between units of text and units of speech has not been established.
Although some existing software programs operate in the first mode, implicitly establishing text and speech alignment in the process, few can operate in commonly found (difficult) recording conditions such as overlapping voices or the presence of noise. This paper briefly introduces some of the important features that make WinPitch Corpus an efficient tool for transcription and analysis of speech data: slower speech rate for easier transcription, dynamic adjustment of segments with simultaneous display of spectrograms for precise alignment, etc.
Numerous speech analysis tools (fundamental frequency tracker, spectrogram, LPC formant analysis, etc.) are available, with a quasi-instantaneous display of the results. Support for the simultaneous acoustical analysis of both channels of stereo recordings is also provided.
The program has already been used extensively for the analysis of large corpora of spontaneous speech in Romance languages (more than 1,200,000 words, C-ORAL-ROM, 2003), as well as for the phonetic and phonological description of Parkatêjê, an endangered language of the Amazon spoken by about 300 people (Araújo and Martin, 2003). WinPitch Corpus is available from the www.winpitch.com web site, under the name WinPitchPro.
Text-to-speech alignment can be done in two modes. In the first mode, text
does not exist: the user selects blocks of speech (which can be slowed
down for playback) and enters the corresponding text (any Unicode font can
be used directly). In this process, a database is automatically built, which
can later be saved in XML or Excel® formats.
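As an illustration of what such a segment database might contain, the following minimal Python sketch stores each transcribed block as a time-stamped record and exports the records to XML and to CSV (which spreadsheet programs can open). The `Segment` class, the element names `alignment` and `unit`, and the column names are hypothetical choices made for this example; they are not the file format that WinPitch Corpus actually writes.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed block of speech (hypothetical record layout)."""
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str     # transcription entered by the user (any Unicode text)


def segments_to_xml(segments):
    """Serialize segments to an XML string (element names are assumptions)."""
    root = ET.Element("alignment")
    for seg in segments:
        unit = ET.SubElement(
            root, "unit", start=f"{seg.start:.3f}", end=f"{seg.end:.3f}"
        )
        unit.text = seg.text
    return ET.tostring(root, encoding="unicode")


def segments_to_csv(segments):
    """Serialize segments to CSV text that a spreadsheet program can import."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["start", "end", "text"])
    for seg in segments:
        writer.writerow([f"{seg.start:.3f}", f"{seg.end:.3f}", seg.text])
    return buf.getvalue()
```

Each time the user validates a block, a new `Segment` would be appended to the list, so the alignment between text and speech is a by-product of the transcription itself.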
The second mode of text-to-speech alignment assumes a preexisting text. The speech sound is played back at a reduced speed (dynamically programmable) while the user clicks on the part of the text corresponding to the perceived sound unit. A database of the dynamically defined segments is automatically built (table in the dialog box on the left).
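The click-based procedure can be sketched as follows. This is only a simplified model under stated assumptions, not the program's actual algorithm: the playback speed is taken as constant (whereas WinPitch Corpus allows it to be changed dynamically), each click is assumed to mark the onset of a text unit, and each unit is assumed to end where the next one begins. The function name `align_from_clicks` and its parameters are hypothetical.

```python
def align_from_clicks(units, click_times, speed, duration):
    """Turn clicks made during slowed playback into aligned segments.

    units       -- text units in the order they are spoken
    click_times -- click instants in seconds of playback time, one per unit
    speed       -- playback speed as a fraction of normal speed (0 < speed <= 1)
    duration    -- length of the original recording in seconds

    Returns a list of (unit, start, end) tuples in original recording time.
    """
    if not 0 < speed <= 1:
        raise ValueError("speed must be in the interval (0, 1]")
    if len(units) != len(click_times):
        raise ValueError("one click is required per text unit")

    # A click at playback time t corresponds to t * speed in the original
    # recording, because slowed playback stretches time by 1 / speed.
    starts = [t * speed for t in click_times]

    if any(later < earlier for earlier, later in zip(starts, starts[1:])):
        raise ValueError("clicks must be in chronological order")
    if starts and starts[-1] > duration:
        raise ValueError("a click falls after the end of the recording")

    # Assumption: each unit ends where the next one starts.
    ends = starts[1:] + [duration]
    return list(zip(units, starts, ends))
```

For example, with playback at half speed, clicks at 1.0 s and 3.0 s of playback time map to 0.5 s and 1.5 s in the original recording. Boundaries obtained this way are approximate and would normally be adjusted afterwards against the spectrogram display.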