
...

  1. System requirements. The current eagle-i network deployment is a reference configuration. In this deployment, eagle-i institutional servers are VMs. System requirements for these VMs are available here.
  2. Unix-like operating system. This procedure is only valid for Unix variants such as Linux, Solaris, and Mac OS X. To run some of the scripts you will need to have these commands installed:
    • bash
    • perl
    • curl
    • awk (surely anything that calls itself unix must have awk)
    • tr (seriously, is tr missing? if you are running Gentoo, install an operating system)
  3. Sun's Java JDK 1.6.0.18 (though any 1.6 version ought to work just as well).
  4. Apache Tomcat web servlet container, version 6.0 (version 7.0 also works, but this guide refers to version 6.0), configured to run with the Java JDK in #3.
    • Make sure you follow the Tomcat installation and configuration instructions for the Tomcat version and Linux distribution you are using; before installing the eagle-i repository, Tomcat must be fully functional. You may want to test this by using Tomcat's manager app, which should be available at http://localhost/manager/html/. You will need to edit the file conf/tomcat-users.xml to define a user and a role; see this guide: Apache Tomcat 6.0 Manager App HOW-TO
    • Tomcat may be configured as a standalone web server, or be fronted by an Apache httpd server. In this guide we assume the former configuration. The latter should also work, but describing it is out of our scope.
    • Tomcat must be configured to use SSL, see the quickstart section here: Apache Tomcat 6.0 SSL Configuration HOW-TO. Note that a production server will require a valid SSL certificate.
    • Network configuration for Tomcat to respond on standard ports 80 and 443 is required. The section Run Tomcat on Port 80 (and 443) under #Procedures details our preferred method. Other methods (e.g. using Apache httpd) are possible but out of scope for this guide.
    • See the #Procedures section if using Ubuntu's tomcat6 package.
      • It may be necessary to download Tomcat directly and install it manually if the version supplied by the host OS's package system is not usable. Don't hesitate to do this if it is expedient; Tomcat can run as a pure Java application in a single file hierarchy, so a manual download can work just as well (if not better) than the packaged version.
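As a quick check of the command prerequisites listed above, something like the following sketch could be run before starting. The function name missing_commands is made up for this illustration and is not part of the eagle-i kit; it only tests whether each command is on the PATH, not whether its version is suitable.

```shell
#!/usr/bin/env bash
# Sketch: report which of the commands this guide relies on are missing.
# missing_commands is a hypothetical helper name, not an eagle-i script.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        # "command -v" prints nothing and fails when cmd is not on the PATH
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Prints one line per missing tool, or nothing if all are present.
missing_commands bash perl curl awk tr
```

An empty result means all five commands were found.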

...

You need to determine the repository's home directory. It may be anywhere on the system so long as it satisfies these criteria:

  1. It must be owned by the same user-id under which the servlet container (Tomcat) is executing. The repository will automatically create files and subdirectories as needed.
  2. The host filesystem must have adequate space for your anticipated RDF database, logfiles, and backup snapshots.
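The two criteria above can be checked by hand, or with a small sketch like this one. The helper name check_repo_home is hypothetical, and the user name you pass (for example tomcat6 on Ubuntu) depends on how your Tomcat is installed. How much free space counts as "adequate" is left to you; the function only reports what is available.

```shell
#!/usr/bin/env bash
# check_repo_home DIR USER
# Hypothetical helper: verifies that DIR exists and is owned by USER, then
# prints the free space on the filesystem that holds it.
check_repo_home() {
    local dir="$1"
    local user="$2"
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    # find prints the directory itself only if it is owned by USER;
    # -prune keeps find from descending into it.
    if [ -z "$(find "$dir" -prune -user "$user" -print)" ]; then
        echo "$dir is not owned by $user" >&2
        return 1
    fi
    # POSIX df output: the fourth column of the second line is available KiB.
    df -Pk "$dir" | awk 'NR == 2 { print $4 " KiB available" }'
}

# Example (paths and user are placeholders for your own values):
# check_repo_home /opt/eaglei/repo tomcat6
```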

...

Follow this step-by-step procedure. Before you start, make sure the Tomcat server is not running.

...

  1. Navigate to Tomcat's webapps directory. If a directory named ROOT exists, move it aside. The eagle-i repository must be the ROOT application.
    No Format
    cd ${CATALINA_HOME}/webapps
    mv ROOT ROOT.original
    
  2. Copy the repository webapp to the Tomcat webapps directory:
    No Format
    cp ${REPO_HOME}/webapps/ROOT.war ${CATALINA_HOME}/webapps/.
    
  3. Create your initial administrative user login. Think of a USERNAME and PASSWORD and substitute them for the upper case words in this command:
    No Format
    bash ${REPO_HOME}/etc/prepare-install.sh USERNAME PASSWORD ${REPO_HOME}
    
  4. Start up Tomcat.
  5. Run the finish-install script, which loads the data model ontology among other things. Note that you can also give it additional options to specify a personal name and email box for the initial admin user.
    No Format
    bash ${REPO_HOME}/etc/finish-install.sh USERNAME PASSWORD https://localhost:8443
    ...or, with username metadata included:
    No Format
    bash ${REPO_HOME}/etc/finish-install.sh \
    -f firstname \
    -l lastname \
    -m admin@ei.edu \
    USERNAME PASSWORD https://localhost:8443
  6. Run the upgrade.sh script, which performs additional configuration.
    No Format
    bash ${REPO_HOME}/etc/upgrade.sh USERNAME PASSWORD https://localhost:8443
    
  7. Copy the file default.configuration.properties located in ${REPO_HOME} into a file named configuration.properties and edit the latter to reflect your installation. See the #Configuration section below for details on the property definitions and expected values.
  8. Restart Tomcat to pick up these configuration changes. Confirm that the eagle-i repository is running by visiting the admin page (login with USERNAME and PASSWORD):
    No Format
    https://localhost:8443/repository/admin
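The copy in step 7 is easy to get wrong on a reinstall, since a plain cp would overwrite a configuration you have already edited. A cautious sketch follows; init_config is a hypothetical helper name, and it assumes configuration.properties lives in the same directory as default.configuration.properties, which you should confirm for your release.

```shell
#!/usr/bin/env bash
# init_config REPO_HOME
# Hypothetical helper: copies default.configuration.properties to
# configuration.properties, but never overwrites an existing file.
# Assumption: both files live directly under REPO_HOME.
init_config() {
    local home="$1"
    local src="$home/default.configuration.properties"
    local dst="$home/configuration.properties"
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    if [ -e "$dst" ]; then
        echo "keeping existing $dst" >&2
    else
        cp "$src" "$dst"
    fi
}

# Example:
# init_config "${REPO_HOME}"
```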
    

...

You can set the following properties in the configuration.properties file. Most of the repository's "configuration" comes from administrative metadata in its RDF database and from the ontologies loaded into it, so the configuration settings here are very minimal and mostly serve to bootstrap the RDF repository.

The properties in red are required; those in orange are important and can be considered required for a production system, although they can be elided for a test or development system at the cost of some ugliness in the UI.

Any properties not present (or commented out) in the configuration properties file will revert to the default values documented here. In most cases this is just fine. Each property is provided only so that the application's behavior can be customized and adjusted to suit the requirements of a particular installation site. For example, your site may have a convention of writing all log files to a filesystem separate from the applications.

  • eaglei.repository.namespace - The namespace URI prefix for Eagle-I resource instances created in the repository.
    • Every administrator should set this to a reasonable value for their site, because the default is NOT desirable.
    • The value must be a fully qualified, resolvable, HTTP URL.
    • For example, http://foo.bar.edu/i/
    • Use the http scheme, NOT https, since the container will redirect to https if necessary, but it is not possible to redirect back if it becomes preferable to use http later.
    • The system-generated default is the hostname followed by /i/ -- but this is often wrong, since Java's determination of hostnames in a servlet container environment is not reliable.
  • eaglei.repository.title - the decorative title for UI pages, should be set for cosmetic reasons.
    • Set this to the name of your site, e.g. "Miskatonic University School of Medicine".
  • eaglei.repository.logo - URL of the logo image for your site, may be either a relative URL (to refer to an image embedded in the webapp) or an absolute URL to use an image hosted elsewhere. It should be about 50 pixels high and a suitable width given the proportions.
  • eaglei.repository.index.url - Set this to the URL to which you want the site's "root" (top-level index) page redirected. Although the repository is installed as the root webapp to have control over resolving Semantic Web URIs, it does not need the root page so this allows you to configure your site as you like.
  • eaglei.repository.admin.backgroundColor - Lets you change the background color for admin web UI pages, to give admins an obvious cue when they are operating on e.g. the production vs. test repos. The value is a CSS color expression, e.g. a crayon name like "bisque" or hex #CCFFCC (Added in Release 1.2MS2 or 3)
  • eaglei.repository.instance.xslt - path to XSL stylesheet used to transform the HTML output of the instance dissemination service. A value for this key is required to produce XHTML in the dissemination service; without it, the service returns the internal XML document describing the instance.
    • If it is a relative path then it must be located relative to the root of the web application, if absolute then it is in the filesystem at large.
    • The advantage of keeping your stylesheets external to the webapp is that you can change them easily, and don't have to modify the webapp from its default installation.
    • An example is provided at repository/styles/example.xsl which creates very simple HTML, as a demonstration of how to write an XSL stylesheet.
  • eaglei.repository.instance.css - URI of the CSS stylesheet resource to be used to style instance dissemination pages. It must be an absolute path or absolute URL. The default is:
    No Format
    eaglei.repository.instance.css = /repository/styles/i.css
  • eaglei.repository.tbox.graphs - a comma-separated list of graph URIs making up the "TBox".
    You should never have to set this! It is configurable "just in case", and for testing/experimenting. For more information, see the section on inferencing in the API Manual.
    By default, the TBox consists of:
    • The repository's internal ontology, http://eagle-i.org/ont/repo/1.0/
    • The eagle-i data model ontology, http://purl.obolibrary.org/obo/ero.owl
  • eaglei.repository.datamodel.source - the full name of a resource within the webapp which is itself a property file describing the RDF data model ontology. You should not need to set this; the default is adequate for the eagle-i application. Default is eaglei-datamodel.properties which is a built-in resource file.
    For a description of the contents of this properties file, see the separate document Guide to Data Model Configuration Properties
  • eaglei.repository.sesame.dir - directory where Sesame RDF database files are created.
    • Defaults to sesame subdirectory of home dir.
  • eaglei.repository.log.dir - Directory where log files are created.
    • Defaults to logs subdirectory of the home dir. 
    • You can also configure log4j explicitly by adding log4j properties to this file.
  • eaglei.repository.sesame.indexes - index configuration for Sesame triple store. Must be a comma-separated list of index specifiers, see Sesame NativeStore configuration documentation for details. Use this to change the internal indexes Sesame maintains to process queries. It takes effect on next servlet container (tomcat) restart.
    WARNING: If you have a configured value and wish to go back to the default, do NOT just delete this configuration property. If you do, Sesame will simply keep the existing indexes. You must change it to the original default value, which is documented in the default configuration file.

  • eaglei.repository.slow.query - Value in seconds of time after which a SPARQL query should be considered "slow" and logged as such. Only affects the SPARQL Protocol endpoint service. Default is 0, which never logs. Use this to check for performance problems, since it logs the full text of the query and time of occurrence in the regular log at INFO level.
  • eaglei.repository.sparqlprotocol.max.time - Time limit, in seconds, of the maximum time allowed for a query invoked by the SPARQL Protocol endpoint. Note that this does not affect any internally-generated SPARQL queries.
    • Any user can override this setting to impose a shorter timeout by giving a value for the nonstandard time argument.
    • Only the Administrator can override with a longer timeout.
    • The built-in default is 600 seconds (10 min) if nothing is configured.
    • If a SPARQL Protocol request cannot be completed within the timeout, it returns an HTTP 413 status (result too large - it was the standard response code that comes closest to the concept).
  • eaglei.repository.anonymous.user - This is a hack, only intended for testing the Anonymous role. Its value is a username, e.g. "nobody". If configured, when the designated user logs in, their session is downgraded to the Anonymous role; this allows explicit testing of Anonymous (vs. Authenticated) access even when the webapp configuration does not allow unauthenticated access. ONLY TESTERS SHOULD EVER NEED TO SET THIS.
  • Configuring Contact Hiding: The following properties control the contact hiding extension, which restricts the display of "contact location" properties of instances and instead offers an anonymous email option. Red properties are required only if you enable contact hiding:
    • eaglei.repository.hideContacts - true|false, enables the contact hiding function. When it is false, none of the other properties are used.
    • eaglei.repository.postmaster - email address of repository administrator(s). User-generated messages about resources without a contact email address get sent here, as well as diagnostic messages. We recommend using an email list or alias so it can be changed or directed to multiple people.
    • eaglei.repository.mail.host - hostname of SMTP server for outgoing mail, defaults to localhost.
    • eaglei.repository.mail.port - TCP port number of SMTP server for outgoing mail, only necessary if using a non-default port for your chosen type of service.
    • eaglei.repository.mail.ssl - Use SSL for connection to SMTP server for outgoing mail, value is true or false.
    • eaglei.repository.mail.username - Username with which to authenticate to SMTP server for outgoing mail, default is unauthenticated.
    • eaglei.repository.mail.password - password with which to authenticate to SMTP server for outgoing mail, default is none.
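The guidance for eaglei.repository.namespace above (set it explicitly, use http rather than https) can be sanity-checked before restarting Tomcat. This is only a sketch: get_property and check_namespace are hypothetical helper names, and the parser handles simple "key = value" lines only, not the ":" separator or line continuations that Java properties files also permit. It does not test whether the URL actually resolves.

```shell
#!/usr/bin/env bash
# get_property FILE KEY
# Hypothetical helper: prints the value of KEY from a simple properties file.
# If KEY appears more than once, the last definition wins.
get_property() {
    awk -v key="$2" '
        /^[ \t]*[#!]/ { next }                  # skip comment lines
        {
            pos = index($0, "=")
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)      # trim the value
            if (k == key) value = v
        }
        END { if (value != "") print value }
    ' "$1"
}

# check_namespace FILE
# Hypothetical helper: applies the namespace rules described in this section.
check_namespace() {
    local ns
    ns="$(get_property "$1" eaglei.repository.namespace)"
    case "$ns" in
        "")        echo "namespace not set; the default will be used" >&2; return 1 ;;
        http://*)  echo "ok: $ns" ;;
        https://*) echo "use http, not https: $ns" >&2; return 1 ;;
        *)         echo "not an HTTP URL: $ns" >&2; return 1 ;;
    esac
}

# Example:
# check_namespace "${REPO_HOME}/configuration.properties"
```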

...

  1. Poor Semantic Web Hygiene: Existing semantic web standards and technologies have ways to show that multiple URIs really describe the same object. This method does not use them, in the interest of making a precisely accurate copy of the data and not changing the source.
  2. Previous Content of Target Repo Is Lost: We can't emphasize this enough because someone is sure to make a tragic mistake after not reading these instructions carefully enough. All contents of the repository you are moving resources into will be replaced by the copy of the source repository. All previous contents are lost irretrievably. You made a backup, right?
  3. All State is Preserved: All metadata for the state of e.g. data tools is copied faithfully, so claimed instances will have claims on the new site. This is not intuitive.
  4. User Accounts are added to Destination: All of the user login accounts from the source are added to the target repo. Previously existing accounts are still available there too, but in the event of a duplicate, the target's account is replaced by the user URI and password from the source. You must be aware of any security issues this may create.
  5. Abuse of auto-generated URIs: Since the script is importing a bunch of URIs which were not generated by the native /new service, there is some chance of overlap. This should not happen so long as all repositories use the same time-based UUID generator for the suffix of URIs, but there is always a chance of conflicts.

...

  1. Shut down tomcat. This is major surgery, and tomcats don't like to be vivisected no matter how much more satisfying you may find it.
  2. Disable Java Security -- alternately, you could try to configure all the authorization grants to give the repository webapp access to the filesystem and property resources it needs, but I found it much easier to just disable java security. DO NOT RUN THE TOMCAT PROCESS AS ROOT if you do this, but you should not be running it as root in any case. That's just insane.
    1. Edit the file /etc/init.d/tomcat6 and change the following variable to look like this:
      No Format
      TOMCAT6_SECURITY=no
  3. Install Derby jars: ONLY IF DERBY IS NOT ALREADY INSTALLED IN THE COMMON AREA OF YOUR TOMCAT. If another webapp is already using Derby, they should share that version.
    1. Find the Derby jars in the lib/ subdirectory under where you installed the create-user.sh script.
    2. Copy them to the Tomcat common library directory:
      No Format
      cp ${REPO-ZIP-DIR}/lib/derby* /usr/share/tomcat6/lib/
  4. Install the webapp: First, get rid of any existing root webapp, then copy in the webapp (ROOT.war file from your installation kit) and be sure it is readable by the tomcat6 user:
    No Format
    rm /var/lib/tomcat6/webapps/ROOT*
    cp ROOT.war /var/lib/tomcat6/webapps/ROOT.war
  5. Install cached webapp context: This is VERY IMPORTANT, and the Tomcat docs do not even mention it, but without it your server will be mysteriously broken. The file /etc/tomcat6/Catalina/localhost/ROOT.xml must be a copy of your app's context.xml. Redo this command after installing every new ROOT.war:
    No Format
    mkdir -p /etc/tomcat6/Catalina/localhost
    unzip -p /var/lib/tomcat6/webapps/ROOT.war META-INF/context.xml > /etc/tomcat6/Catalina/localhost/ROOT.xml
  6. Add System Properties: Be sure you have added system properties to the file /etc/tomcat6/catalina.properties e.g.
    No Format
    org.eaglei.repository.home = /opt/eaglei/repo
    derby.system.home = /opt/eaglei/repo
    ...of course, the value of these properties will be your Repository Home Directory path.
  7. Start up Tomcat:
    No Format
    sudo /etc/init.d/tomcat6 start
  8. Troubleshooting: If there are problems, check the following places for logs (because packaged apps make everything so much easier):
    • /var/log/daemon.log - really dire tomcat problems and stdout/stderr go to syslog
    • /var/log/tomcat6/* - normal catalina logging
    • ${REPOSITORY_HOME}/logs/repository.log - default repo log file in release 1.1; under 1.0 the filename was default.log.
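Since step 6 is likely to be repeated on every reinstall, it may help to add the system properties in a way that does not create duplicate entries. The following is a sketch under that assumption; ensure_property is a hypothetical helper name, and it treats the file as simple "key = value" lines. If the file does not exist yet, it is created.

```shell
#!/usr/bin/env bash
# ensure_property FILE KEY VALUE
# Hypothetical helper: appends "KEY = VALUE" to FILE unless some
# (uncommented) line already defines KEY. An existing value is left alone.
ensure_property() {
    local file="$1"
    local key="$2"
    local value="$3"
    if [ -f "$file" ] && awk -v key="$key" '
            {
                pos = index($0, "=")
                if (pos == 0) next
                k = substr($0, 1, pos - 1)
                gsub(/[ \t]/, "", k)        # drop whitespace around the key
                if (k == key) found = 1
            }
            END { exit(found ? 0 : 1) }
        ' "$file"; then
        echo "already set: $key" >&2
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example, using the paths from step 6 (run with sufficient privileges):
# ensure_property /etc/tomcat6/catalina.properties org.eaglei.repository.home /opt/eaglei/repo
# ensure_property /etc/tomcat6/catalina.properties derby.system.home /opt/eaglei/repo
```

Note that the helper never changes a value that is already present, so if your Repository Home Directory moves you still need to edit the file by hand.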

...