Cultural heritage digital resources: from extraction to querying
Genereux, M. (2007) Cultural heritage digital resources: from extraction to querying. In: ACL 2007 Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007), 28 Jun 2007, Prague, Czech Republic.
Official URL: http://ilk.uvt.nl/latech07/
This article presents a method to extract and query Cultural Heritage (CH) textual digital resources. The extraction and querying phases are linked by a common ontological representation (CIDOC-CRM). A transport format (RDF) allows the ontology to be queried in a suitable query language (SPARQL), on top of which an interface makes it possible to formulate queries in Natural Language (NL). The extraction phase exploits the propositional nature of the ontology. The query interface is based on the Generate and Select principle: potentially suitable queries are generated to match the user input, and the candidate most semantically similar to that input is then selected. In the process we evaluate data extracted from the description of a medieval city (Wolfenbüttel), and transform and develop two methods of computing similarity between sentences based on WordNet. Experiments are described that evaluate the similarity measures and compare their pros and cons.
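The Generate and Select principle mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the candidate questions, the SPARQL strings (which use CIDOC-CRM style property names only as placeholders), the function names, and above all the similarity measure, which is a plain token-overlap (Jaccard) score standing in for the WordNet-based measures the article develops.

```python
import re

def tokens(text):
    """Lower-case a sentence and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a stand-in, NOT a WordNet measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# "Generate" step (hypothetical): each candidate pairs a natural-language
# paraphrase with the SPARQL query it stands for. The queries are
# illustrative placeholders only.
CANDIDATES = [
    ("Who built the castle?",
     "SELECT ?actor WHERE { ?event crm:P14_carried_out_by ?actor }"),
    ("When was the castle built?",
     "SELECT ?span WHERE { ?event crm:P4_has_time-span ?span }"),
    ("Where is the church located?",
     "SELECT ?place WHERE { ?event crm:P7_took_place_at ?place }"),
]

def select_query(user_input, candidates, similarity=jaccard):
    """'Select' step: return the candidate most similar to the user input."""
    return max(candidates, key=lambda c: similarity(user_input, c[0]))

if __name__ == "__main__":
    paraphrase, sparql = select_query("When was the castle constructed?", CANDIDATES)
    print(paraphrase)
    print(sparql)
```

Passing the similarity function as a parameter keeps the selection step independent of the measure, so a WordNet-based sentence similarity could be substituted without changing `select_query`.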