Introduction and Main Data Objects

This is a developers' guide to the eagle-i data tools. It will focus primarily on the SWEET (Semantic Web Entry and Editing Tool, formerly, and confusingly, referred to as "the datatools webapp" or "datatools"). Several of the other user-level data manipulation tools (bulk data management, aka datamanagement; bulk data import, aka ETL; and the extraction of resources from published research articles and the like, aka nlp) will come up in passing. They each should have a guide, and this developers' guide will not attempt to cover them in any detail.

This guide begins with the back end of SWEET, most of which can be found in org.eagle-i.datatools (the eagle-i-datatools-common module). The backend components are held in common by all of the datatools; they are not specific to the SWEET. The SweetServlet (org.eaglei.ui.gwt.sweet; in the datatools.sweet.gwt package), in this context, is the user-facing endpoint for the data tools backend.

EIInstance

All of the data tools rely on two key abstractions: the EIInstance and the EIInstanceMinimal. (In reality, the EIBasicInstance is also important, but datatools only sees the EIBasicInstance through the EIInstance, so their attributes are conflated for the purposes of this discussion.) An EIInstance is the Java-side representation of the collection of RDF statements (from the repository and from the ontology) about a particular subject (resource) in the repository. The EIInstance contains representations of only those RDF statements that are relevant to eagle-i users. EIInstances are used only for the resources captured in eagle-i.

...

Ontology properties are always displayed in a resource page, either through search or through the SWEET webapps. Non-ontology properties are extra, either because of ontology changes or because they are managed by the repository. Only the SWEET datatools applications bother to load the non-ontology properties.

  • Datatype properties are the properties with values that are complete in themselves. They are represented as a Map<EIEntity, Set<String>>.

...

  • Boolean properties, date properties, and of course text properties fall into this category.

...

  • Object properties are the properties with values that refer to other subjects in the repository or ontology. They are represented as Map<EIEntity, Set<EIEntity>>. In practice, the SWEET applications need to be able to distinguish between object properties that refer to terms from the ontology and ones that refer to other instances in the repository. Doing so requires a separate call to the server.

...

  • Non-ontology datatype properties are datatype properties that don't appear in the eagle-i ontology. Many of the so-called "provenance metadata" properties added by the repository itself (creation date, last modified date, contributor) are non-ontology datatype properties. The is_stub property is another, as are the standard note and curator note. In addition, any datatype properties that are associated with an instance but are no longer relevant to the instance's type (either because of a change to the ontology or because the user changed its type) will appear here.

...

  • Non-ontology object properties are the object properties that don't appear in the eagle-i ontology. The remaining "provenance metadata" properties, like workflow state and workflow owner, are non-ontology object properties.
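
The four property groups described above can be pictured as four maps held by one object. The following is a minimal sketch, not the real EIInstance class: the class name EIInstanceSketch, the simplified EIEntity record (a URI plus a label), the field names, and the add methods are all assumptions made for illustration. Only the two map types come from the description above.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; not the real EIInstance API.
class EIInstanceSketch {
    // Hypothetical stand-in for EIEntity: a URI plus a display label.
    record EIEntity(String uri, String label) {}

    // Ontology properties, keyed by the entity that names the property.
    final Map<EIEntity, Set<String>> datatypeProperties = new HashMap<>();
    final Map<EIEntity, Set<EIEntity>> objectProperties = new HashMap<>();
    // Properties that are not (or are no longer) in the ontology for this type.
    final Map<EIEntity, Set<String>> nonOntologyDatatypeProperties = new HashMap<>();
    final Map<EIEntity, Set<EIEntity>> nonOntologyObjectProperties = new HashMap<>();

    // Adds a literal value; a duplicate is ignored because values form a set.
    void addDatatypeValue(EIEntity property, String value) {
        datatypeProperties.computeIfAbsent(property, p -> new LinkedHashSet<>()).add(value);
    }

    // Adds a reference to another subject (an ontology term or another instance).
    void addObjectValue(EIEntity property, EIEntity value) {
        objectProperties.computeIfAbsent(property, p -> new LinkedHashSet<>()).add(value);
    }

    public static void main(String[] args) {
        EIInstanceSketch instance = new EIInstanceSketch();
        EIEntity name = new EIEntity("http://example.org/prop/name", "name");
        instance.addDatatypeValue(name, "Centrifuge");
        instance.addDatatypeValue(name, "Centrifuge"); // no effect: already present
        System.out.println(instance.datatypeProperties.get(name).size()); // prints 1
    }
}
```

Note that nothing in the object-property map says whether a value is an ontology term or a repository instance, which is consistent with the remark above that telling them apart takes a separate server call.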

EIInstanceMinimal

The EIInstanceMinimal is the core representation for listing resources in the webapps (both SWEET and search frontends). As the name suggests, it contains only the minimal information required to list the relevant instances. This includes:

  • The label and URI
  • The type
  • The resource-providing organization (lab, center, ...) that contributed this resource
  • All the supertypes up to the eagle-i base type (for filtering)
  • Workflow state and owner
  • Creation and modification dates
  • Whether or not the resource is a "stub"
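
To make the role of the supertype list concrete, here is a rough sketch of a minimal listing record and a type filter. The record name, field names, string-typed URIs and dates, and the sample values are assumptions for illustration; the real EIInstanceMinimal is not reproduced here.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only; not the real EIInstanceMinimal API.
class MinimalListSketch {
    // Hypothetical flattened view of the fields listed above; URIs are plain strings.
    record Minimal(String uri, String label, String typeUri, Set<String> supertypeUris,
                   String providerUri, String workflowState, String workflowOwnerUri,
                   String created, String modified, boolean stub) {

        // True when the instance's own type or any of its supertypes matches.
        boolean isA(String candidateTypeUri) {
            return typeUri.equals(candidateTypeUri) || supertypeUris.contains(candidateTypeUri);
        }
    }

    // Keeps only the rows whose type hierarchy includes the given base type.
    static List<Minimal> filterByType(List<Minimal> rows, String baseTypeUri) {
        return rows.stream().filter(r -> r.isA(baseTypeUri)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Minimal scope = new Minimal("ex:1", "Confocal microscope", "ex:Microscope",
                Set.of("ex:Instrument"), "ex:lab1", "Draft", null, "2011-01-01", "2011-02-01", false);
        Minimal protocol = new Minimal("ex:2", "Staining protocol", "ex:Protocol",
                Set.of(), "ex:lab1", "Draft", null, "2011-01-01", "2011-01-01", false);
        System.out.println(filterByType(List.of(scope, protocol), "ex:Instrument").size()); // prints 1
    }
}
```

Carrying the supertypes on each row means a filter like this can run without another trip to the ontology model.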

Datatools back end

Configuration

Datatools backends (particularly for SWEET and etl) need to know the URL of the repository to point to. Because eagle-i applications use https, it's not possible to point to localhost. Instead, the repository location is specified in a configuration file. An example file is found in eagle-i/examples.

...

Datatools web applications will look in the loader's classpath to find the relevant configuration file.

Workflow and access control

The repository provides a mechanism for controlling which users can edit which resources. Details can be found in the Workflow Design Guide. It has four core notions: ownership, workflow state, transitions between states, and roles assigned to users (or types of users), which determine which transitions are legal for each user type. All workflow privileges are based on the URI of the user in the repository; therefore, all datatools operations (listing resources, editing them, making workflow transitions, etc.) require a repository username and password.

...

Transitions are also configurable, and are specified by a URI and label. Each has a precondition: the workflow state the resource must be in for this transition to succeed, and a postcondition: the workflow state this transition will put the resource in when it succeeds. As a result, there are 3 separate "Return to Draft" transitions:

[Image: the "Return to Draft" transitions]

A list of transitions can be found at [url of a machine with an eagle-i repository]/repository/workflow/transitions.
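
Because each transition carries exactly one precondition state, a single label such as "Return to Draft" needs one transition per starting state. The sketch below illustrates that idea only. The record shape, the example.org URIs, and the three starting-state names are assumptions; only the label "Return to Draft" and the resulting Draft state come from the text above.

```java
import java.util.List;
import java.util.Optional;

// Illustrative sketch only; not the repository's workflow implementation.
class TransitionSketch {
    // A transition has a URI and label, one required starting state (precondition)
    // and one resulting state (postcondition).
    record Transition(String uri, String label, String fromState, String toState) {
        // Returns the resulting state, or empty if the precondition is not met.
        Optional<String> apply(String currentState) {
            return fromState.equals(currentState) ? Optional.of(toState) : Optional.empty();
        }
    }

    // Hypothetical URIs and starting states: three transitions sharing one label.
    static final List<Transition> RETURN_TO_DRAFT = List.of(
            new Transition("http://example.org/wf/1", "Return to Draft", "Curation", "Draft"),
            new Transition("http://example.org/wf/2", "Return to Draft", "Published", "Draft"),
            new Transition("http://example.org/wf/3", "Return to Draft", "Withdrawn", "Draft"));

    public static void main(String[] args) {
        // Only the transition whose precondition is "Published" applies here.
        for (Transition t : RETURN_TO_DRAFT) {
            System.out.println(t.fromState() + " -> " + t.apply("Published").orElse("(not applicable)"));
        }
    }
}
```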

...

In order to prevent conflicting edits, each resource can only be edited by a single "owner". In order to edit a resource, a user must first claim the resource. The user is then set as the resource's workflow owner in the repository. No one else can claim the resource until it has been shared, either explicitly via the "share" button, or by a workflow transition. All workflow transitions clear existing ownership.
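
The claim and share rules can be summarized in a few lines of code. This is a simplified sketch under stated assumptions: the class and method names are hypothetical, users are identified by plain URI strings, and the check that only the current owner may share is an assumption of the sketch rather than something stated above.

```java
// Illustrative sketch only; the repository enforces these rules itself.
class OwnershipSketch {
    private String workflowState;
    private String ownerUri; // null means nobody has claimed the resource

    OwnershipSketch(String initialState) {
        this.workflowState = initialState;
    }

    // Succeeds only when nobody currently owns the resource.
    boolean claim(String userUri) {
        if (ownerUri != null) {
            return false;
        }
        ownerUri = userUri;
        return true;
    }

    // Explicit release; in this sketch only the current owner may share.
    boolean share(String userUri) {
        if (!userUri.equals(ownerUri)) {
            return false;
        }
        ownerUri = null;
        return true;
    }

    // Any workflow transition changes the state and clears ownership.
    void transitionTo(String newState) {
        workflowState = newState;
        ownerUri = null;
    }

    boolean canEdit(String userUri) {
        return userUri.equals(ownerUri);
    }

    String workflowState() {
        return workflowState;
    }

    public static void main(String[] args) {
        OwnershipSketch resource = new OwnershipSketch("Draft");
        System.out.println(resource.claim("ex:alice")); // prints true
        System.out.println(resource.claim("ex:bob"));   // prints false
        resource.share("ex:alice");
        System.out.println(resource.claim("ex:bob"));   // prints true
    }
}
```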

Security and logging in

Obviously, then, datatools requires a user-specific login to the repository, while search can make do with a single generic user with no privileges (and access only to the Published graph). Furthermore, datatools needs to retrieve the user's valid transitions in order to be able to present their options correctly. Both applications, though, need to have logged into the repository in some fashion, and both need to keep track of a user's activity and get rid of connections when a user has been inactive for too long. The AuthenticationManager (and AuthenticationProviders) in org.eaglei.services.authentication have the job of handling logging in and stale sessions. Because the datatools operations all go to the eagle-i repository, all SWEET (and other datatools) connections use the StandardAuthenticationProvider and the Apache4xHttpConnectionProvider.

...

A login request goes to the DatatoolsSecurityProvider, which logs in through the AuthenticationManager, which has been configured to use a RepositoryAuthenticationProvider (and therefore present credentials to the eagle-i repository specified by the DatatoolsConfiguration). The DatatoolsSecurityProvider then requests a User from the PrivilegesInfoProvider, using the sessionId from the AuthenticationManager login. The PrivilegesInfoProvider requests information about the current user, including the workflow transitions (if any) this user is authorized to perform. It populates a User object with that information (which is also parceled out into a map from workflow state to the list of allowed transitions, to facilitate determining what if any actions a user is allowed to perform).
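
The map from workflow state to allowed transitions mentioned above might look roughly like the following. This is a sketch, not the real User class: the class name, the Transition record, and the state and transition names used in main are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the real User or PrivilegesInfoProvider API.
class UserPrivilegesSketch {
    // Hypothetical, simplified transition: label plus starting and resulting state.
    record Transition(String label, String fromState, String toState) {}

    private final String userUri;
    private final Map<String, List<Transition>> allowedByState = new HashMap<>();

    // Groups the user's permitted transitions by the state they start from.
    UserPrivilegesSketch(String userUri, List<Transition> allowedTransitions) {
        this.userUri = userUri;
        for (Transition t : allowedTransitions) {
            allowedByState.computeIfAbsent(t.fromState(), s -> new ArrayList<>()).add(t);
        }
    }

    // The actions this user may take on a resource in the given state (possibly none).
    List<Transition> actionsFor(String resourceState) {
        return allowedByState.getOrDefault(resourceState, List.of());
    }

    String userUri() {
        return userUri;
    }

    public static void main(String[] args) {
        UserPrivilegesSketch user = new UserPrivilegesSketch("ex:alice", List.of(
                new Transition("Send to Curation", "Draft", "Curation"),
                new Transition("Return to Draft", "Curation", "Draft")));
        System.out.println(user.actionsFor("Draft").size());     // prints 1
        System.out.println(user.actionsFor("Published").size()); // prints 0
    }
}
```

Grouping once at login means the UI can decide which actions to offer for a row with a single lookup on the row's workflow state.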

Getting ontology information

Many operations require information from the application and domain ontologies. Making that information available is the role of the JenaEIOntModel (eagle-i-model-jena: org.eaglei.model.jena) at the back end. The front-end equivalent is the ModelServlet (eagle-i-common-ui-model-gwt: org.eaglei.model.gwt.server). The details of the servlet and the JenaEIOntModel are beyond the scope of this document. For the purposes of datatools and SWEET, all information about the ontologies is encapsulated in these classes.

...

The datatools backend shares instance retrieval functionality with the search applications. The RepositoryInstanceProvider (org.eaglei.services.repository) performs this function.

SWEET Front End

The SWEET is built in GWT. Documentation for GWT can be found at http://code.google.com/webtoolkit/. This guide assumes some familiarity with GWT.

Servlet

The SweetServlet is a thin wrapper around a collection of AbstractRepositoryProviders as described above. Each call to the servlet checks for a valid sessionId, then dispatches the call to the appropriate provider.
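
The "check the session, then dispatch" pattern can be sketched as below. The class name, the session registry, and the generic dispatch method are assumptions for illustration; the real SweetServlet exposes specific GWT RPC methods and validates sessions through the authentication classes described earlier.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative sketch only; not the real SweetServlet.
class ServletDispatchSketch {
    // Hypothetical registry of session ids that are currently valid.
    private final Set<String> validSessionIds = ConcurrentHashMap.newKeySet();

    void registerSession(String sessionId) {
        validSessionIds.add(sessionId);
    }

    // Checks the session first, then hands the request to the given provider call.
    <Q, R> R dispatch(String sessionId, Q request, Function<Q, R> providerCall) {
        if (sessionId == null || !validSessionIds.contains(sessionId)) {
            throw new IllegalStateException("No valid session");
        }
        return providerCall.apply(request);
    }

    public static void main(String[] args) {
        ServletDispatchSketch servlet = new ServletDispatchSketch();
        servlet.registerSession("session-1");
        // The lambda stands in for a provider method such as "fetch this instance".
        String result = servlet.dispatch("session-1", "ex:resource1", (String uri) -> "instance for " + uri);
        System.out.println(result); // prints instance for ex:resource1
    }
}
```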

RPC

As usual for a GWT application, much of the org.eaglei.ui.gwt.sweet.rpc package is taken up by definitions of the services and their asynchronous counterparts. ClientSecurityProxy and ClientSweetProxy are different, and important. The ClientSecurityProxy encapsulates authentication and session-related behavior for the ClientSweetProxy. A number of front-end classes register as listeners for changes to sessions; the ClientSecurityProxy is responsible for notifying them when a session becomes valid or invalid. Similarly, the ClientSecurityProxy detects when a user is not authorized to access the SWEET webapp (currently, if that user is not permitted to create resources).
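
The listener arrangement described above is a conventional observer pattern. A minimal sketch follows; the interface and method names are hypothetical and are not taken from the ClientSecurityProxy source.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the real ClientSecurityProxy.
class SessionNotifierSketch {
    // Hypothetical listener interface that front-end classes would implement.
    interface SessionListener {
        void onSessionValid(String userName);

        void onSessionInvalid();
    }

    private final List<SessionListener> listeners = new ArrayList<>();

    void addListener(SessionListener listener) {
        listeners.add(listener);
    }

    // Called after a successful login or session check.
    void sessionBecameValid(String userName) {
        for (SessionListener listener : listeners) {
            listener.onSessionValid(userName);
        }
    }

    // Called on logout or when the server reports a stale session.
    void sessionBecameInvalid() {
        for (SessionListener listener : listeners) {
            listener.onSessionInvalid();
        }
    }

    public static void main(String[] args) {
        SessionNotifierSketch proxy = new SessionNotifierSketch();
        proxy.addListener(new SessionListener() {
            @Override
            public void onSessionValid(String userName) {
                System.out.println("show UI for " + userName);
            }

            @Override
            public void onSessionInvalid() {
                System.out.println("clear UI");
            }
        });
        proxy.sessionBecameValid("alice");
        proxy.sessionBecameInvalid();
    }
}
```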

ClientSweetProxy is a single point for all of the UI classes to talk to the backend as needed.  In addition to the SweetServlet, the ClientSweetProxy talks to a ModelService in order to fetch ontology information that certain front-end operations require. In a few cases, the ClientSweetProxy makes multiple server calls for a single user operation, in order to be sure to have the most up-to-date data. Examples include claiming (where first the proxy verifies that the resource is not out of date), and sharing, which re-fetches the instance after a successful share. For now, creating a new instance forces the workflow state to Draft, so that the instance has a valid workflow state; the alternative is to re-fetch the instance (or instances, when creating a resource also creates stub resources).

ApplicationState and how the front end redraws

The ApplicationState object is a central (singleton) location for various general bits of information the SWEET webapp needs. It tracks several selections a user has (or has not) made, and allows the UI to fetch and draw the correct information. It holds a list of resource-providing organizations fetched from the repository, as well as a cache of the EIClasses that are known to be embedded, those that are associated with labs (and other resource-creating organizations), and those visible for overall browsing. It's also where the client caches class definitions for use in tooltips.

...

When the user has selected a lab (above left), the selection is stored in the ApplicationState. The LeftListPanel then retrieves the selected lab from the ApplicationState, and displays the top-level types from the ApplicationState's resourceTypesForProvider list. When there's no lab selected (above right), the LeftListPanel instead uses the ApplicationState's resourceTypesForBrowse list. In either case, if the ApplicationState's typeEntity is populated, the LeftListPanel highlights the selected resource type (Protocol in above left); otherwise, it selects the "All Resource Types" entry (above right).

MainController

The MainController uses the Mode (an enum from QueryTokenObject) from the ApplicationState to determine what belongs in the main area of the window. Whenever the application state updates, the MainController checks first for a valid user (if there's none, it clears everything), and then checks the mode.

...

  • EDIT: edit a single resource; show the edit form (including all possible properties as supplied by the domain ontology)
  • VIEW: show all the properties of a single resource, but only those that have been annotated by a user.
  • DUPLICATE: placeholder mode; application state change should *not* be invoked for this mode. Show an edit form with a new instance, populated from the values of an existing instance. Clears the label field, to force users to add a new label.

...

  • LIST: list all resources meeting the criteria (including the filter criteria) from the ApplicationState. If a typeEntity exists, show only resources with that as a base type (otherwise, show all resources). If a providerEntity exists, show only resources belonging to that lab (or other resource provider). The ApplicationState is set to this mode by the LeftListGridRowWidget, or after a resource has been deleted.
  • FILTER: the ApplicationState mode used when the user clicks the "go" button in the FilterPanel. Shows only resources meeting the description from the ApplicationState, further refined by the FilterPanel options (resources belonging to a subclass, only resources in Draft mode, etc).
  • RESOURCES: a (hackish) way to get an empty resources grid shown. In highly-populated repositories, trying to show all resources of all types from all labs is prohibitively slow. Resources mode short-circuits the process.
  • REFERENCES: List all the resources that refer to the instanceEntity from the ApplicationState. Resource A refers to resource B if resource B is the value of some property on resource A. (In RDF-speak, we're looking for all the subjects A where B is the object of some statement about A, and A is an instance of one of the types we consider as eagle-i resources.)
  • STUBS: List all the resources that were created through a "create new" mechanism in the edit form.
  • PROVIDERS: Shows a list of all organizations to which resources can or should be added. [Note: currently broken; shows all organizations]
  • MYRESOURCES: Show exactly the resources for which the current user is the editor/has claimed the resource.

For any of the modes that involve listing resources, the MainController clears its panel, fetches any relevant resources in the form of a list of EIInstanceMinimal, and draws a ResourcesGrid.  For the edit, view, and duplicate modes, it clears its panel and invokes the FormsPanelFactory to fetch the appropriate resource (and possibly ontology model information) and draw the page.
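
The split between listing modes and single-resource modes amounts to a two-way dispatch on the mode, guarded by the user check. The sketch below lists only the modes named in this section and uses hypothetical class, enum, and method names; treating every mode other than EDIT, VIEW, and DUPLICATE as a listing mode is an assumption of the sketch.

```java
// Illustrative sketch only; not the real MainController.
class ModeDispatchSketch {
    // Only the modes named in this section; the real enum may contain others.
    enum Mode { EDIT, VIEW, DUPLICATE, LIST, FILTER, RESOURCES, REFERENCES, STUBS, PROVIDERS, MYRESOURCES }

    // What ends up in the main area of the window.
    enum Panel { NONE, FORM, GRID }

    static Panel panelFor(boolean hasValidUser, Mode mode) {
        if (!hasValidUser) {
            return Panel.NONE; // no valid user: clear everything
        }
        switch (mode) {
            case EDIT:
            case VIEW:
            case DUPLICATE:
                return Panel.FORM; // single resource, drawn via the forms factory
            default:
                return Panel.GRID; // assumed: every remaining mode lists resources
        }
    }

    public static void main(String[] args) {
        System.out.println(panelFor(true, Mode.EDIT));        // prints FORM
        System.out.println(panelFor(true, Mode.MYRESOURCES)); // prints GRID
        System.out.println(panelFor(false, Mode.LIST));       // prints NONE
    }
}
```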

ResourcesGrid

For any mode in the listing resources group, the MainController retrieves a list of EIInstanceMinimals, and then draws a ResourcesGrid. The ResourcesGrid is responsible for drawing lists of resources, of whatever sort. The ResourcesGrid has a FilterPanel at the top, which allows filtering the list by any combination of subtype, current owner (claimed by user, or all), and workflow state. It also has two PaginationWidgets (one at the top and one at the bottom), which determine how many resources to show per page and which page to view. Note that no total counts are available (and page sizes may be slightly off) because a single resource may be returned multiple times from the repository, but is filtered out at the front end.
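
The page-size caveat is easier to see with a small example. The sketch below assumes, for illustration only, that a page arrives as a list of resource URIs and that duplicates are dropped by URI; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Illustrative sketch only; not the real ResourcesGrid code.
class PageDedupeSketch {
    // Drops repeated URIs while keeping the order in which they first appeared.
    static List<String> distinctInOrder(List<String> urisFromRepository) {
        return new ArrayList<>(new LinkedHashSet<>(urisFromRepository));
    }

    public static void main(String[] args) {
        // A requested page of five rows in which two resources appear twice.
        List<String> page = List.of("ex:a", "ex:b", "ex:a", "ex:c", "ex:b");
        System.out.println(distinctInOrder(page)); // prints [ex:a, ex:b, ex:c]
    }
}
```

Here a page of five rows shrinks to three displayed resources, which is why the displayed page size can differ from the requested one and why a reliable total is not available on the client.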

The ResourcesGrid displays a list of GridRowWidgets, each of which wraps an EIInstanceMinimal. The GridRowWidget displays relevant information from the EIInstanceMinimal, and provides links to allow the user to claim or share the resource (if it's in a workflow state that the user can change), and to edit or delete it once it's been claimed. The GridRowWidget also has a checkbox which interacts with the ResourcesGrid's actions drop-down to allow bulk workflow transitions.

Displaying and Editing Single Resources

Panels for editing and viewing single resources are constructed by the FormsPanelFactory, which is responsible for fetching the instance (or getting a new, empty instance) and constructing a DatatoolsInstancePanel to display it.

...

The OntologyPropEditRenderer is the ontology renderer for editing resources in a datatools context.  Since it needs to present all possible properties, it loads the properties from the EIOntModel as well as the values that currently exist on the instance (if any). The OntologyPropEditRenderer draws widgets that are subclasses of EditWidget (usually wrapped in EditWidgetCollections).

EditWidget and EditWidgetCollection

As previously mentioned, the value of a given property of an EIInstance is (generally) a set of values (an unordered collection in which each value must be unique). As a result, when the user changes a value using an edit form, there needs to be a mechanism for tracking which value is being replaced. The EditWidget hierarchy performs that function.
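
One way to picture the tracking problem: a set has no positions, so a widget has to remember which value it currently stands for in order to replace it. The following is a minimal sketch of that idea with hypothetical names and plain string values; it is not the actual EditWidget code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch only; not the real EditWidget hierarchy.
class EditWidgetSketch {
    private final Set<String> propertyValues; // the property's value set on the instance
    private String currentValue;              // value this widget represents; null if empty

    EditWidgetSketch(Set<String> propertyValues, String currentValue) {
        this.propertyValues = propertyValues;
        this.currentValue = currentValue;
    }

    // Replaces the remembered value with the new one; an empty input just removes it.
    void onUserChange(String newValue) {
        if (currentValue != null) {
            propertyValues.remove(currentValue);
        }
        if (newValue == null || newValue.isEmpty()) {
            currentValue = null;
            return;
        }
        propertyValues.add(newValue);
        currentValue = newValue;
    }

    public static void main(String[] args) {
        Set<String> values = new LinkedHashSet<>(Set.of("red"));
        values.add("blue");
        EditWidgetSketch redWidget = new EditWidgetSketch(values, "red");
        redWidget.onUserChange("green");
        System.out.println(values); // prints [blue, green]
    }
}
```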

...