Statistical generation: three methods compared and evaluated

Belz, Anja (2005) Statistical generation: three methods compared and evaluated. In: Proceedings of the 10th European Workshop on Natural Language Generation, 8-10 Aug 2005, Aberdeen, Scotland.

Text: belz-enlg05.pdf - Accepted Version (103kB)

Abstract

Statistical NLG has largely meant n-gram modelling, which has the considerable advantages of lending robustness to NLG systems and of making automatic adaptation to new domains from raw corpora possible. On the downside, n-gram models are expensive to use as selection mechanisms and have a built-in bias towards shorter realisations. This paper looks at treebank-training of generators, an alternative method for building statistical models for NLG from raw corpora, and two different ways of using treebank-trained models during generation. Results show that the treebank-trained generators achieve improvements similar to a 2-gram generator over a baseline of random selection. However, the treebank-trained generators achieve this at a much lower cost than the 2-gram generator, and without its strong preference for shorter realisations.
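The length bias mentioned in the abstract follows directly from how n-gram models score candidate realisations: the log-probability of a string is a sum of per-token conditional log-probabilities, each of which is negative, so every extra token lowers the total score. A minimal sketch (the uniform per-token log-probability is a hypothetical value, for illustration only, not from the paper):

```python
def ngram_logprob(tokens, per_token_logprob=-1.0):
    # Toy n-gram scorer: assume every token contributes the same
    # conditional log-probability (a hypothetical constant).
    # Real models vary per token, but each term is still negative.
    return per_token_logprob * len(tokens)

short = "the door is open".split()
longer = "the door to the room is currently open".split()

# The longer realisation accumulates more negative terms, so a
# likelihood-based selection mechanism prefers the shorter string
# even when both are equally fluent.
assert ngram_logprob(short) > ngram_logprob(longer)
```

This is why selecting the maximum-likelihood realisation under an n-gram model systematically favours shorter outputs unless the scores are length-normalised.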

Item Type: Contribution to conference proceedings in the public domain (Full Paper)
Uncontrolled Keywords: Natural language generation
Subjects: Q000 Languages and Literature - Linguistics and related subjects > Q100 Linguistics
Faculties: Faculty of Science and Engineering > School of Computing, Engineering and Mathematics > Natural Language Technology
Depositing User: Converis
Date Deposited: 18 Nov 2007
Last Modified: 21 May 2014 11:01
URI: http://eprints.brighton.ac.uk/id/eprint/3203
