Cultural heritage digital resources: from extraction to querying

Genereux, M. (2007) Cultural heritage digital resources: from extraction to querying. In: ACL 2007 Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007), 28 Jun 2007, Prague, Czech Republic.




This article presents a method to extract and query Cultural Heritage (CH) textual digital resources. The extraction and querying phases are linked by a common ontological representation (CIDOC-CRM). A transport format (RDF) allows the ontology to be queried in a suitable query language (SPARQL), on top of which an interface makes it possible to formulate queries in Natural Language (NL). The extraction phase exploits the propositional nature of the ontology. The query interface is based on the Generate and Select principle: potentially suitable queries are generated to match the user input, and the most semantically similar candidate is then selected. In the process, we evaluate and transform data extracted from the description of a medieval city (Wolfenbüttel), and develop two methods of computing similarity between sentences based on WordNet. Experiments are described that evaluate the two similarity measures and compare their strengths and weaknesses.
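
The Generate and Select principle described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the candidate paraphrases, the `ex:` property names and the function names are hypothetical, and a simple token-overlap (Jaccard) score stands in for the WordNet-based sentence similarity measures that the paper develops.

```python
import re

# Hypothetical candidates: each natural-language paraphrase is paired with
# the SPARQL query it stands for. The ex: terms are placeholders, not
# CIDOC-CRM identifiers.
CANDIDATES = {
    "who built the castle":
        "SELECT ?who WHERE { ex:castle ex:builtBy ?who }",
    "when was the castle built":
        "SELECT ?when WHERE { ex:castle ex:builtIn ?when }",
    "where is the library located":
        "SELECT ?place WHERE { ex:library ex:locatedIn ?place }",
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; an assumed stand-in for a
    WordNet-based sentence similarity."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def generate_and_select(user_input, candidates=CANDIDATES, similarity=jaccard):
    """Score every generated candidate against the user input and return
    (paraphrase, query, score) for the most similar one."""
    scored = [
        (paraphrase, query, similarity(user_input, paraphrase))
        for paraphrase, query in candidates.items()
    ]
    return max(scored, key=lambda item: item[2])


if __name__ == "__main__":
    paraphrase, query, score = generate_and_select(
        "When was the castle constructed?"
    )
    print(paraphrase, round(score, 3))
    print(query)
```

Passing the similarity measure as a parameter keeps the selection step independent of how similarity is computed, so a WordNet-based measure could be substituted without changing the rest of the sketch.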

Item Type: Contribution to conference proceedings in the public domain (Full Paper)
Uncontrolled Keywords: Cultural heritage; Natural language generation
Subjects: Q000 Languages and Literature - Linguistics and related subjects > Q100 Linguistics
V000 Historical and Philosophical studies
G000 Computing and Mathematical Sciences > G700 Artificial Intelligence
Faculties: Faculty of Science and Engineering > School of Computing, Engineering and Mathematics > Natural Language Technology
Depositing User: Helen Webb
Date Deposited: 18 Nov 2007
Last Modified: 21 May 2014 11:01
