Turkish Dictation System for Radiology and Broadcast News Applications
| Author: |
Arisoy, Ebru |
| Advisor: |
Prof. Dr. Levent Arslan |
| URL: |
f...
|
| Completion Date: |
July 2004 |
| Degree: |
M.Sc./M.A. |
| Institution: |
Bogazici University |
| Abstract: |
In this thesis, we design a Turkish dictation system for radiology and broadcast news applications. Turkish is an agglutinative language with free word order. These characteristics lead to vocabulary explosion and make N-gram language models for speech recognition more complex. To alleviate this problem, we propose a task-specific dictation system for radiology. Using classical word-based language models, we achieve 87.06 per cent recognition performance with a small vocabulary in a speaker-independent radiology speech recognition system. However, the same system yields a recognition rate of only 46.29 per cent for broadcast news dictation because of the large number of out-of-vocabulary (OOV) words. Therefore, we parse some of the words into smaller recognition units such as stems, endings and morphemes, and introduce these smaller units, together with the unparsed words, to the speech recognizer as lexicon entries. With this approach, we overcome the problem of the large number of OOV words with a moderate vocabulary size and obtain better estimates for the N-gram language models. However, the best recognition result is obtained with the word-based language model. |
|
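The effect of sub-word lexicon entries on the OOV rate described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy word lists, the hand-picked stem set, the "+" marker on endings, and the longest-stem splitting rule are hypothetical and are not the parser or the data used in the thesis.

```python
def oov_rate(tokens, lexicon):
    """Return the fraction of tokens that are missing from the lexicon."""
    if not tokens:
        return 0.0
    missing = sum(1 for t in tokens if t not in lexicon)
    return missing / len(tokens)


def split_stem_ending(word, stems):
    """Split a word into [stem, '+ending'] using the longest known stem.

    Illustrative rule only: a word with no known stem prefix stays unparsed.
    """
    for cut in range(len(word), 0, -1):
        stem = word[:cut]
        if stem in stems:
            ending = word[cut:]
            return [stem, "+" + ending] if ending else [stem]
    return [word]


def to_units(words, stems):
    """Map a word sequence to a sequence of stems, endings and unparsed words."""
    units = []
    for w in words:
        units.extend(split_stem_ending(w, stems))
    return units


# Hypothetical toy data, not taken from the thesis corpora.
train = ["evler", "evde", "okulda", "okullar"]
test = ["evlerde", "okullarda", "evde"]
stems = {"ev", "okul"}

word_lexicon = set(train)                    # word-based lexicon
unit_lexicon = set(to_units(train, stems))   # stem + ending lexicon

word_oov = oov_rate(test, word_lexicon)
unit_oov = oov_rate(to_units(test, stems), unit_lexicon)
print(f"word-based OOV rate: {word_oov:.2f}")
print(f"stem+ending OOV rate: {unit_oov:.2f}")
```

On this toy data the stem+ending lexicon leaves fewer unseen tokens than the word lexicon, because unseen inflected forms still share their stems with the training words. A lower OOV rate does not by itself imply better recognition, which is consistent with the abstract's observation that the word-based model gave the best result.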