...

This guide provides an overview of ETL (extract, transform, load) tasks and of how to use the SWIFT toolkit. The ETL workflow requires two roles: a person with domain knowledge and an understanding of the eagle-i resource ontology, who prepares the input files for optimal upload, and a person with basic Unix knowledge, who runs the commands and troubleshoots potential errors. A detailed description is provided in: Data preparation and ETL Workflow

The SWIFT toolkit consists of:

...

  • You may obtain the type URI from the eagle-i ontology browser. Use the left bar to find the most specific type you need, select it, and copy its URI, e.g. http://purl.obolibrary.org/obo/ERO_0000229 for Monoclonal Antibodies.
     

...

Warning

The ETLer expects data to be entered into one of the generated templates and a few conventions to be respected (see Data preparation and ETL Workflow). A data curator usually verifies that the template is filled in correctly. In particular, the location of the resources to be ETL'd (e.g. Lab or Core facility name) must be provided in every row of data.
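
The location requirement can be checked before running the ETLer. The following is a minimal sketch only, not part of the SWIFT toolkit: it assumes the template has been exported to CSV text and that the location is held in a column named "Location". Both the export step and the column name are assumptions made for illustration; the actual template layout is described in Data preparation and ETL Workflow.

```python
import csv
import io

# Hypothetical column name; the real template may label this column differently.
LOCATION_COLUMN = "Location"


def rows_missing_location(csv_text, location_column=LOCATION_COLUMN):
    """Return the 1-based data-row numbers whose location cell is empty or absent."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row_number, row in enumerate(reader, start=1):
        # A missing column yields None; treat it the same as an empty cell.
        value = row.get(location_column) or ""
        if not value.strip():
            missing.append(row_number)
    return missing


# Illustrative data only: the second data row has no location.
sample = (
    "Name,Location\n"
    "Anti-X antibody,Example Lab\n"
    "Anti-Y antibody,\n"
)
print(rows_missing_location(sample))  # prints [2]
```

Any row number reported by such a check should be corrected in the template before the file is handed to the ETLer.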

...

  • Use this command if the input file represents resources that have been previously uploaded or created in eagle-i.

  • The value of the -eid parameter (external identifier) is the URI of a property that uniquely identifies the resource outside eagle-i. This property is used to match each input row to a resource in the eagle-i repository. Copy the property URI from the eagle-i ontology browser (expand the property name to see all information about a property). Example properties are:

    • Catalog number, -eid http://purl.obolibrary.org/obo/ERO_0001528
    • Inventory number, -eid http://purl.obolibrary.org/obo/ERO_0000044
    • RDFS label, use the shorthand syntax -eid label
  • If the ETL process finds a matching resource, it will replace all its properties with the values from the input file; the URI of the matched resource will be preserved.
  • If the ETL process does not find a matching resource, a new resource will be created.
  • The value of the -p parameter indicates the desired workflow state for newly created resources. Existing resources will retain their workflow state.
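
The matching rules above can be summarized in a short sketch. This is an in-memory illustration of the described behavior, not the toolkit's implementation: the repository is modeled as a plain list of dictionaries, the function name `upsert`, the URI scheme for new resources, and the state labels "draft" and "published" are all placeholders chosen for the example. Only the catalog number property URI is taken from the list above.

```python
from itertools import count

# Property URI for "Catalog number", as listed above.
CATALOG_NUMBER = "http://purl.obolibrary.org/obo/ERO_0001528"

# Hypothetical URI generator for new resources (illustration only).
_new_ids = count(1)


def upsert(repository, row, eid, new_state):
    """Match `row` to a resource by the external identifier property `eid`.

    repository: list of dicts with keys "uri", "state", "properties"
    row:        dict mapping property URIs to values from the input file
    new_state:  workflow state given to newly created resources only
    """
    key = row[eid]
    for resource in repository:
        if resource["properties"].get(eid) == key:
            # Match found: replace all properties, keep URI and workflow state.
            resource["properties"] = dict(row)
            return resource
    # No match: create a new resource in the requested workflow state.
    created = {
        "uri": f"http://example.org/resource/{next(_new_ids)}",
        "state": new_state,
        "properties": dict(row),
    }
    repository.append(created)
    return created


repo = [{
    "uri": "http://example.org/resource/existing",
    "state": "published",
    "properties": {CATALOG_NUMBER: "AB-1", "label": "old label"},
}]

updated = upsert(repo, {CATALOG_NUMBER: "AB-1", "label": "new label"},
                 CATALOG_NUMBER, "draft")
created = upsert(repo, {CATALOG_NUMBER: "AB-2", "label": "another"},
                 CATALOG_NUMBER, "draft")
```

In this sketch, `updated` keeps its original URI and its "published" state while its properties are replaced, and `created` is a new entry in the "draft" state, mirroring the rules for matched and unmatched resources.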

...