The EMK database consists of interviews with speakers of various Estonian dialects. Most of the recordings were made between 1960 and 1980; the oldest dates back to 1938, and the latest was made in 1990. The database includes 229 recordings in total. As of 2008, the corpus consisted of ca. 1 million words.
The dialect corpus is compiled by the University of Tartu in cooperation with the Institute of the Estonian Language. Most of the materials used in the corpus were compiled by the Institute of the Estonian Language.
Each recording consists of an interview with a dialect speaker and contains free speech. The interviewer asked the informant to talk about the past; no questionnaire was used. The interviews took place in the dialect speakers' homes.
The informants lived in the country. Their parents were born and raised in the same area. Most of the informants were more than eighty years of age.
The recordings were first transcribed in Finno-Ugric phonetic transcription and later converted into plain text (simplified transcription). All texts are morphologically tagged in XML. The EMK database can be reached at http://www.murre.ut.ee/triip/murdekorpus/