speech

1 September, 1995 - 31 August, 2000

COST249

Continuous Speech Recognition Over the Telephone

1 September, 2001 - 31 August, 2004

COST278

Spoken Language Interaction in Telecommunication

The Broadcast News Interest Group is a consortium of research groups collaborating on Broadcast News Transcription (BNT). The group operates within the European COST action on Spoken Language Interaction in Telecommunication. Its project website can be found at http://cost278.elis.ugent.be.

1 September, 1998 - 31 August, 2003

CGN

Corpus gesproken Nederlands (The Spoken Dutch Corpus)

The Spoken Dutch Corpus (CGN = Corpus gesproken Nederlands) project has resulted in a large corpus of contemporary Dutch as spoken in Flanders and in the Netherlands. The corpus contains about 1000 hours of speech, all of which has been annotated at several levels. The basic annotations, available for the entire corpus, are an orthographic transcription, part-of-speech tagging, and lemmatization.
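
As a rough illustration of how these three basic annotation levels line up per word, here is a minimal sketch. The `Token` class, the example utterance, and the coarse tags are assumptions made for this example; they are not taken from the corpus or its actual file formats, and the real tag set is more detailed.

```python
from dataclasses import dataclass

@dataclass
class Token:
    orthographic: str  # word as it appears in the orthographic transcription
    pos: str           # part-of-speech tag (coarse placeholder, not the real tag set)
    lemma: str         # dictionary form of the word

# Hypothetical annotated utterance: "de katten slapen" ("the cats sleep").
utterance = [
    Token("de", "LID", "de"),
    Token("katten", "N", "kat"),
    Token("slapen", "WW", "slapen"),
]

def lemmas(tokens):
    """Return the lemma layer of an annotated utterance."""
    return [t.lemma for t in tokens]
```

Keeping the levels together per token makes it easy to query one layer (here the lemmas) without losing alignment with the others.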

1 January, 2001 - 31 December, 2004

ACCENT - Pronunciation Modeling

This research aimed at modeling pronunciation variation for Automatic Speech Recognition (ASR) at the level of the lexicon (as opposed to modeling it at the level of the acoustic models). We developed a data-driven technique for upgrading a lexicon of reference pronunciations to one with multiple pronunciation variants per entry of the reference lexicon. The approach is based on the following basic principles:

1 September, 2001 - 31 August, 2004

ATRANOS - Audio segmentation

An important application of speech technology is the automatic transcription of audio files. Such transcription could greatly facilitate user access to the growing number of audio files in audio archives (e.g. the VRT archive) and on the internet. It can also speed up the creation of TV program captions for the deaf. A problem with most audio files is that they contain not only speech but also other sounds such as music, background noise, clips, etc.

1 September, 1999 - 31 August, 2001

PROMOTEX - Intonation modeling

For some time the group has been involved in projects on data-driven prosodic modeling for text-to-speech synthesis. Our prosodic modeling approach starts with the prediction of word prominence values (PRM) for each word of the text and prosodic boundary strength values (PBS) between successive words of that text. The next step is the derivation of good intonation and phoneme duration patterns from that information. Two different methods of intonation modeling were developed and compared:

1 September, 1995 - 31 August, 1998

Segmental acoustic models for ASR

To circumvent some of the problems arising from the within-state independence assumption made in Hidden Markov Models (HMMs), an alternative Automatic Speech Recognition (ASR) approach using segmental models has been developed. Such models do not score individual frames, but variable-length sequences of frame vectors. The ELIS approach is distinctive in several ways. First of all, it uses an auditory model feature extractor as its acoustic front-end.
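
To make the contrast between frame-level and segment-level scoring concrete, the sketch below scores the same short feature track in two ways: frame by frame against one static mean, and as a whole segment against a trajectory stretched to the segment's length. This is an illustration only and not the ELIS model; the one-dimensional Gaussian scoring, the linear trajectory, and the names `frame_score` and `segment_score` are all assumptions made for the example.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def frame_score(frames, mean, var):
    """HMM-style state score: every frame is scored independently
    against the same static mean."""
    return sum(log_gauss(x, mean, var) for x in frames)

def segment_score(frames, start, end, var):
    """Segment-level score: the whole variable-length sequence is compared
    with a linear trajectory from `start` to `end`, stretched to the
    number of frames in the segment."""
    n = len(frames)
    total = 0.0
    for i, x in enumerate(frames):
        pos = i / (n - 1) if n > 1 else 0.0  # relative position in the segment
        total += log_gauss(x, start + pos * (end - start), var)
    return total

# A rising feature track, as might occur during a transition between sounds.
frames = [0.0, 0.5, 1.0, 1.5, 2.0]
static = frame_score(frames, mean=1.0, var=0.25)
trajectory = segment_score(frames, start=0.0, end=2.0, var=0.25)
```

Because the trajectory depends on the position of each frame within the segment, the segment score rewards the rising shape of the track, which the static per-frame score cannot capture.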

1 June, 2005 - 31 May, 2007

AUTONOMATA

In many modern applications such as directory assistance, name dialing, car navigation, etc., one needs a speech recognizer and/or a speech synthesizer: the former to recognize spoken user commands and the latter to pronounce information found in a database. Both components make use of phonetic transcriptions of the words to be recognized or pronounced. In order to develop an application, the developer needs a tool that accepts words or sentences and returns their phonetic transcriptions.

1 April, 2009 - 1 April, 2012

ORGANIC

Self-organized recurrent neural learning for language processing

This EU-FP7 project investigates whether it is possible to incorporate principles of human brain processing (such as self-organization, deep hierarchical processes, fast adaptation, supervised and unsupervised learning) into a new type of automatic speech recognizer. We will attempt to reach this goal by adopting the paradigm of reservoir computing, because it has recently been demonstrated that this paradigm allows one to build accurate and noise-robust isolated digit recognizers.
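
The core idea of reservoir computing can be sketched in a few lines: a recurrent network with fixed random weights is driven by the input, and only a linear readout on its states is trained. The toy below is a generic echo-state-style example on a memory task, not the recognizer developed in the project; the network size, weight ranges, ridge value, task, and all function names are assumptions chosen for illustration.

```python
import math
import random

def make_reservoir(n_units, scale=0.1, seed=0):
    """Fixed random input and recurrent weights; these are never trained.
    Small recurrent weights keep the dynamics stable."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1, 1) for _ in range(n_units)]
    w_rec = [[rng.uniform(-scale, scale) for _ in range(n_units)]
             for _ in range(n_units)]
    return w_in, w_rec

def run_reservoir(w_in, w_rec, inputs):
    """Drive the reservoir with a scalar input sequence and collect states."""
    state = [0.0] * len(w_in)
    states = []
    for u in inputs:
        state = [math.tanh(w_in[i] * u +
                           sum(w * s for w, s in zip(w_rec[i], state)))
                 for i in range(len(w_in))]
        states.append(state)
    return states

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def train_readout(states, targets, ridge=1e-3):
    """Ridge regression on the states: the only trained part of the system."""
    n = len(states[0])
    a = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(s[i] * t for s, t in zip(states, targets)) for i in range(n)]
    return solve(a, b)

def mse(states, targets, w_out):
    """Mean squared error of the linear readout on the given states."""
    preds = [sum(w * s for w, s in zip(w_out, st)) for st in states]
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# Toy memory task: the readout must recall the previous input value.
rng = random.Random(1)
inputs = [rng.uniform(-1, 1) for _ in range(200)]
targets = [0.0] + inputs[:-1]
w_in, w_rec = make_reservoir(n_units=10)
states = run_reservoir(w_in, w_rec, inputs)
w_out = train_readout(states, targets)
```

Training touches only `w_out`, which is why this kind of system is cheap to train compared with a fully trained recurrent network.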

1 January, 2007 - 31 December, 2010

TELEX

Combining acoustic TEmplates and LEXical modeling

This FWO-funded project aims at combining bottom-up phonetic recognition and long-span example-based recognition into a single speech recognition architecture that beats mainstream state-of-the-art HMM systems in terms of accuracy, albeit at a higher computational cost. The project runs in collaboration with KULeuven.