Dutch

1 September, 1998 - 31 August, 2003

CGN

Corpus gesproken Nederlands (The Spoken Dutch Corpus)

The Spoken Dutch Corpus (CGN = Corpus gesproken Nederlands) project has resulted in a large corpus of contemporary Dutch as spoken in Flanders and in the Netherlands. The corpus contains about 1000 hours of speech, all of which has been annotated at several levels. The basic annotations, made for the entire corpus, are an orthographic transcription, part-of-speech tagging and lemmatization.

1 May, 2006 - 30 September, 2008

N-BEST

Dutch Benchmark Evaluation of Speech Recognition Technology

This STEVIN project, coordinated by TNO Utrecht, sets up a large-scale benchmark for speech recognition in Dutch, with the participation of research groups in the Netherlands (Nijmegen, Twente, Delft), Flanders (Gent, Leuven) and abroad (TU-Brno and LIMSI).