Lectori salutem

The Old and Middle Hungarian corpus of informal language use contains texts from two genres that are considered to best represent informal language use: private letters and witness testimonies from trials. All the sources predate 1772, the symbolic end of the Middle Hungarian period. Because its normalized and morphologically annotated texts also carry sociolinguistic metadata, the corpus is a practical resource for research on historical morphology and sociolinguistics, and it can also be used for studies of historical syntax, pragmatics, and lexicology. As of March 2020, the corpus comprises approximately 8.1 M characters (1.05 M analyzed word tokens), 50% of which are letters and 50% court records.

The creation of the corpus was funded by grants Nr. 81189 and 116217 of the Hungarian Scientific Research Fund.

If you use the corpus, please cite the following articles (click on the titles to access the papers and further bibliographical data):
1. Attila Novák, Katalin Gugán, Mónika Varga, Adrienne Dömötör: Creation of an annotated corpus of Old and Middle Hungarian court records and private correspondence. Language Resources and Evaluation (2018): pp. 1–28
2. Dömötör Adrienne, Gugán Katalin, Novák Attila, Varga Mónika: Kiútkeresés a morfológiai labirintusból – korpuszépítés ó- és középmagyar kori magánéleti szövegekből. [Finding the way out of the morphological maze: Building a corpus of Old and Middle Hungarian informal texts.] NyK. 113 (2017): pp. 85–110

We kindly ask you to notify us if you publish results obtained using this corpus, and to cite the articles listed above in that case.

