Page History

Wiki Markup

h3. Contents
{toc:maxLevel=2}
This page describes the way a _data model ontology_ is configured into the repository. This ontology is like (but not really equivalent to) a schema for RDF data, it predicts what you are likely to find in the data and describes how you should react to it. For example, the ontology describes which properties of a resource instance should not be exposed to the public, and this configuration tells the repository where to look in the ontology for those rules.

h1. Introduction

The repository is designed to be independent of any data model ontology. All of its interactions with the ontology are externally configured, through a properties file. This should allow other projects to use the eagle-i repository as a standalone component, incorporating their own data model.

The data model configuration is contained in a properties file _separate_ from the main repository configuration. Since the data model configuration is:
# uniform among all repository instances for an application
# unlikely to be changed
# contains very complex and specialized information

It makes more sense to keep it separate from an instance's configuration, reserving that for instance-specific variables.

h1. Locating the Configuration

The data model configuration file is expected to be packaged as a resource in the repository's web application (webapp). Since it is so crucial to the operation of the repository, this ensures it is always available and is not subject to an administrator's mistakes.&nbsp;

To allow alternate data model configurations to be selected, there is a configuration property in the main repository configuration which names the resource path of the data model configuration: *eaglei.repository.datamodel.source*. Its default value is the path "eaglei-datamodel.properties", which describes the eagle-i data model.&nbsp; You do not need this property in the system configuration if this default is acceptable.

h2. Format

The data model configuration file is read by [Apache Commons Configuration|http://www.google.com/url?q=http%3A%2F%2Fcommons.apache.org%2Fconfiguration%2Findex.html&sa=D&sntz=1&usg=AFrqEzdYKqJPqb5zeZ8iHrUZ9afjBCSBJg], which recognizes interpolated property and system property values.&nbsp; See the Apache&nbsp; documentation for more information about features in the configuration file.

You _can_ treat it simply as a Java properties file, but be aware that the comma (,) character has special significance in property values.&nbsp; It is used to specify a list of values, and if the value is *{_}not{_}* expected to be a list, everything after the first comma is ignored.

h2. About Versions

Although your data model ontology may not contain any RDF statements at all, it is *{_}required{_}* to have at least one statement describing its _version_, so the loaded version can be compared with the one available in the webapp when considering an update and generating reports. (See the */model* service in the REST API for details.)
The version statement consists of an arbitrary URI as subject, the predicate [http://www.w3.org/2002/07/owl#versionInfo], and a literal string value that is expected to start with a Maven-style version number (a sequence of integers separated by dot (.). For example: {noformat}<http://purl.org/eagle-i/app-ext/> <http://www.w3.org/2002/07/owl#versionInfo> "0.7.6 8274 2011-03-28 15:06:29"{noformat}
As you will see in the next section, the version is also key to describing the location of the data model as well.

h2. Loading the Data Model

See the description of the property {datamodel.versionInfo.source} for the gory details of how the repository actually loads the data model. Although it seems somewhat convoluted, it lets you avoid specifying redundant information, such as the JAR file containing the ontology, since it can be derived from what was already given by making some safe assumptions.

h1. Properties in the Data Model Configuration

h2. Ontology Metadata

These properties describe the data model ontology itself, and how it is managed by the repository.

* *datamodel.graph.uri*
The value is the URI naming the named graph were the data model ontology statements are to be stored. The repository adds statements to this graph when it loads the data model.
* *datamodel.graph.label*
The human-readable label to be set on the data model's named graph. This is purely decorative and will be seen in the Admin GUI and *listGraphs* service.
* *datamodel.versionInfo.source*
This property names the file in the data model containing the relevant {{owl:versionInfo statement}}. It has two distinct purposes:
** First, indicate the resource file containing the statement describing the actual released version of this data model ontology.
** Second, as a fortunate side-effect, identify the container or archive that can be expected to contain the data model ontology files.
The current algorithm for loading the data model ontology is to locate this container (which is assumed to be a JAR archive file) and _load every file in it a name ending in_ ".owl". This is a crude assumption that works for ontologies maintained in Protoge.

* *datamodel.versionInfo.uri*
Identifies the subject URI of the statement expected to contain the version of the data model ontology. The predicate is, as has been noted, {{owl:versionInfo}} (see section on Versions above).

h1. Data Model Description

These properties direct where the repository looks in the ontology to control some of its actions, e.g. to decide what properties of a resource instance are to be hidden from public view.

h2. Label Properties

When the repository prepares a human-readable description of a resource instance, it attempts to display the text _label_ for all URIs in the subject, predicate, and object of the instance's graph. Also, when computing the graph to return for an RDF dissemination request, it includes the labels of all of these objects.

Since some ontologies use other properties for labels instead of or in addition to the conventional {{rdfs:label}}, the repository can be configured with an ordered list of label properties. Its influence is as follows:

# In a human-readable dissemination of the resource instance, the _first_ label predicate with a value is shown in place of the URI. If none have values, then the URIs local name is displayed.
# In a dissemination of serialized RDF, values for _any and all of the label predicates_ are included, in no particular order.

The configuration property is:

* *datamodel.label.predicate*
Value is a comma-and-or-space separated list of URIs to be used as label predicates. For example:
{noformat}datamodel.label.predicate = http://eagle-i.org/ont/app/1.0/preferredLabel, \
http://purl.obolibrary.org/obo/IAO_0000111, \
http://www.w3.org/2000/01/rdf-schema#label
{noformat}

h2. Embedded Instances

* *datamodel.embedded.predicate*
This defines the _predicate_ of a statement _in the data model ontology_ whose subject URI is identified as&nbsp; the name of an embedded instance class.
* *datamodel.embedded.object*
This defines the _object_ of a statement _in the data model ontology_ whose subject URI is identified as the name of an embedded instance class.

Any class URIs in the ontology which are the subject of statements with the given predicate and object are considered _embedded object classes_.&nbsp; This is significant in the implementation of the repository and will not be covered here.&nbsp; For example, here is the configuration, and a statement in the ontology describing an affected class:

{noformat}
datamodel.embedded.predicate = http://eagle-i.org/ont/app/1.0/inClassGroup
datamodel.embedded.object = http://eagle-i.org/ont/app/1.0/ClassGroup_embedded_class
{noformat}

{noformat}
# some embedded instance classes
@prefix ero: <http://eagle-i.org/ont/app/1.0/> .
<http://purl.obolibrary.org/obo/OBI_1110023> ero:inClassGroup ero:ClassGroup_embedded_class .
<http://purl.obolibrary.org/obo/ERO_0000404> ero:inClassGroup ero:ClassGroup_embedded_class . ...
{noformat}

h2. Hidden Properties

These properties tell the repository how to identify properties that should be hidden from public view. {color:#ff0000}The exact rules about how they are to be exposed are not designed yet{color}.

In practice, the predicate and object configuration lines work exactly the same as for embedded instances above, only they name the URIs of _properties_ that are to be hidden.

* *datamodel.hideProperty.predicate*
This defines the _predicate_ of a statement _in the data model ontology_ whose subject URI is identified as the name of a _hidden property_.
* *datamodel.hideProperty.object*
This defines the _object_ of a statement _in the data model ontology_ _in the data model ontology_ whose subject URI is identified as the name of a _hidden property_.

h2. Contact Properties

These configuration keys define how the repository detects _contact properties_, i.e. properties that contain information about real-world contacts such as e-mail addresses, phone numbers, lat/long, etc. There are some situations where this information must be kept confidential, so this is how it is identified. {color:#ff0000}The exact rules about how they are to be exposed are not designed yet{color}.

The predicate and object configuration lines work exactly the same as for hidden properties above.

* *datamodel.contactProperty.predicate*
This defines the _predicate_ of a statement _in the data model ontology_ whose subject URI is identified as&nbsp; the name of a _contact property_.
* *datamodel.contactProperty.object*
This defines the _object_ of a statement _in the data model ontology_ _in the data model ontology_ whose subject URI is identified as&nbsp; the name of a _contact property_.

h2. Contact Email Address

These configuration properties describe how the repository locates a _contact email address_ for a given resource instance. This address is where email generated by the "contact the owner" form (see the Repository API for details).

* *datamodel.contact.email.query*
The text of the SPARQL query to find the email addresses. It is expected to be a SELECT query that binds the variables mentioned in the bindings list below. You can use a backslash at the end of the line to escape newlines in the query.
* *datamodel.contact.email.bindings*
An ordered list of the query variables that can be bound by the email query. Each variable is identified only by its name, _without_ the leading "?". They are separated by comma and/or space. The _first_ one of these with a value is used as the email address. If there are multiple results to the query, any additional values of *{_}that{_}* variable (and no others) are added as destination addresses.

Here is an example:

{noformat}
datamodel.contact.email.bindings = email1,email2,email3, email4
datamodel.contact.email.query = \
SELECT ?email1 ?email2 ?email3 ?email4 WHERE { \
OPTIONAL { \
?subject <http://purl.obolibrary.org/obo/ERO_0000021> ?person . \
?person <http://purl.obolibrary.org/obo/ERO_0000041> ?email1 . } \OPTIONAL { \
?subject ?any_predicate ?organization . \
?any_predicate <http://eagle-i.org/ont/app/1.0/inPropertyGroup> \
<http://eagle-i.org/ont/app/1.0/PropertyGroup_RelatedResourceProvider> . \
OPTIONAL { \
?organization <http://purl.obolibrary.org/obo/ERO_0000021>
?person . \ ?person <http://purl.obolibrary.org/obo/ERO_0000041> ?email2 . \
filter(!bound(?email1)) } \ OPTIONAL { \
?organization <http://purl.obolibrary.org/obo/ERO_0000041> ?email3 . \
filter(!bound(?email1) && !bound(?email2)) } \
OPTIONAL { \
?organization <http://purl.obolibrary.org/obo/ERO_0000022>
?pi . \ ?pi <http://purl.obolibrary.org/obo/ERO_0000041> ?email4 . \
filter(!bound(?email1) && !bound(?email2) && !bound(?email3)) \ }}}
{noformat}

Page tree

Versions Compared

Old Version 30

New Version 31

Key