A game-based approach to transcribing images of text



Dahab, Khalil and Belz, Anja (2010) A game-based approach to transcribing images of text. In: 7th International Conference on Language Resources and Evaluation, 19-21 May 2010, Valletta, Malta.

Full text not available from this repository.


We present a methodology that takes scanned documents of typed or handwritten text as input and produces transcriptions of the text as output. Rather than relying on OCR technology, the methodology is game-based and produces these transcriptions as a by-product. The approach is intended particularly for languages for which language technology and resources are scarce and for which reliable OCR technology may not exist. It can be used in place of OCR for transcribing individual documents, or to create the corpora of paired images and transcriptions required to train OCR tools. We present Minefield, a prototype implementation of the approach, which is currently collecting Arabic transcriptions.

Item Type: Contribution to conference proceedings in the public domain (Full Paper)
Subjects: Q000 Languages and Literature - Linguistics and related subjects > Q100 Linguistics
Faculties: Faculty of Science and Engineering > School of Computing, Engineering and Mathematics > Natural Language Technology
Depositing User: Converis
Date Deposited: 21 Feb 2012 11:59
Last Modified: 25 Feb 2015 14:44
URI: http://eprints.brighton.ac.uk/id/eprint/9911
