Ireland's High-Performance Computing Centre | ICHEC


Title: Parser-Based Retraining for Domain Adaptation of Probabilistic Generators
Authors: Deirdre Hogan, Jennifer Foster, Joachim Wagner and Josef van Genabith, 2008
Abstract: While the effect of domain variation on Penn-Treebank-trained probabilistic parsers has been investigated in previous work, we study its effect on a Penn-Treebank-trained probabilistic generator. We show that applying the generator to data from the British National Corpus results in a performance drop (from a BLEU score of 0.66 on the standard WSJ test set to a BLEU score of 0.54 on our BNC test set). We develop a generator retraining method where the domain-specific training data is automatically produced using state-of-the-art parser output. The retraining method recovers a substantial portion of the performance drop, resulting in a generator which achieves a BLEU score of 0.61 on our BNC test data.
ICHEC Project: TREC-2008 Blog Track - Parsing for Sentiment Analysis
Publication: In Proceedings of the 5th International Natural Language Generation Conference (INLG08), Salt Fork Park, Ohio
URL: http://rian.ie/en/item/view/30441.html
Keywords: Machine translating; Penn-Treebank-trained probabilistic generator
Status: Published
