Published on ISCA student website (http://www.isca-students.org)

Looking for Best Practices for Collecting Speech for a Free GPL Speech Corpus

By kmaclean
Created 2007-02-13 04:16

This is a cross-post from comp.speech.research. My apologies if you have already seen this.

Hi,

I am the admin for the VoxForge project. We are collecting user
submitted speech for incorporation into a GPL Acoustic Model ('AM').
Currently a Julius/HTK AM is rebuilt nightly to incorporate newly
submitted audio.

I am unsure which approach to take in the creation of the
VoxForge speech corpora. Up until now, we have been asking users to
submit 'clean' speech, i.e. to record their submissions so that all
non-speech noise (echo, hiss, ...) is kept to an absolute minimum.
One guy (very ingeniously, I thought) records his submissions in his
closet or in his car!

But some people, whose opinions I respect, say that I should not be
collecting clean speech, but collecting speech in its 'natural
environment', warts and all, with echo and hiss and all that (but
avoiding other background noise such as people talking or radios or
TVs, ...). On some submissions, the hiss is very noticeable.

What confuses me is that some speech recognition microphones are sold
with built-in echo and noise cancellation, and the marketing says that
this improves a (commercial) speech recognizer's performance. This
suggests to me that I should be collecting clean speech, and then using
a noise reduction and echo cancellation front-end on the speech
recognizer, because that is what commercial speech recognition engines
seem to be doing.

Further, if clean speech is required, should I be using noise
reduction software on the submitted audio (such as the submissions with
very pronounced hiss)? My attempts at noise reduction have not been
successful: the resulting 'musical noise' (the low-level sound
that replaces the removed noise) gives me very poor recognition
results.
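
One way to make the "very pronounced hiss" judgment less subjective is to
estimate a rough signal-to-noise ratio for each submission before deciding
whether to clean it, keep it as-is, or set it aside. The sketch below is only
an illustration and not part of the VoxForge tooling: the function names, the
assumed 16 kHz sample rate, the 25 ms frame length, and the "quietest 10% of
frames is the noise floor" heuristic are all assumptions, and the synthetic
tone stands in for real speech.

```python
import math
import random


def frame_rms_db(samples, frame_len=400):
    """Return the RMS level (in dB) of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / frame_len
        # A small floor avoids log10(0) on digitally silent frames.
        levels.append(10.0 * math.log10(max(mean_square, 1e-12)))
    return levels


def estimate_snr_db(samples, frame_len=400, fraction=0.1):
    """Rough SNR: mean level of the loudest frames minus the quietest ones.

    Assumes the recording contains some pauses, so that the quietest
    frames hold only background noise (hiss).
    """
    levels = sorted(frame_rms_db(samples, frame_len))
    if not levels:
        raise ValueError("recording is shorter than one frame")
    k = max(1, int(len(levels) * fraction))
    noise_db = sum(levels[:k]) / k
    speech_db = sum(levels[-k:]) / k
    return speech_db - noise_db


def synth_recording(hiss_sigma, rate=16000, seed=0):
    """Hypothetical test signal: 1 s of hiss, then 1 s of tone plus hiss."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, hiss_sigma) for _ in range(2 * rate)]
    for n in range(rate, 2 * rate):
        samples[n] += 0.5 * math.sin(2.0 * math.pi * 200.0 * n / rate)
    return samples


if __name__ == "__main__":
    for sigma in (0.01, 0.1):
        snr = estimate_snr_db(synth_recording(sigma))
        print(f"hiss sigma {sigma}: estimated SNR {snr:.1f} dB")
```

A figure like this only ranks submissions by how noisy they are; it says
nothing about whether noisy audio should be cleaned or kept, which is the
question above.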

I would appreciate your thoughts on this.

Thanks for your time,

Ken MacLean

--
http://www.voxforge.org

Source URL:
http://www.isca-students.org/looking_for_best_practices_for_collecting_speech_for_a_free_gpl_speech_corpus