Ireland's High-Performance Computing Centre | ICHEC


Title: Automatic Generation of Parallel Treebanks
Authors: Zhechev V. and Way A., 2008
Abstract: The need for syntactically annotated data for use in natural language processing has increased dramatically in recent years. This is true especially for parallel treebanks, of which very few exist. The ones that exist are mainly hand-crafted and too small for reliable use in data-oriented applications. In this paper we introduce a novel platform for fast and robust automatic generation of parallel treebanks. The software we have developed based on this platform has been shown to handle large data sets. We also present evaluation results demonstrating the quality of the derived treebanks and discuss some possible modifications and improvements that can lead to even better results. We expect the presented platform to help boost research in the field of data-oriented machine translation and lead to advancements in other fields where parallel treebanks can be employed.
ICHEC Project: Data-Oriented Models of Parsing and Translation
Publication: Proceedings of the 22nd International Conference on Computational Linguistics, Manchester, UK (CoLing 2008), pp. 1105–1112
URL: http://www.aclweb.org/anthology/C08-1139
Status: Published
