...

Consult the tooltips in the header rows for more information about what is expected.

Tip

It is highly recommended you delete any columns from the template that are not needed. This eliminates potential confusion and makes the sheets much less unwieldy.

Rules and guidelines

  • You must always enter the Organization with which the resource is associated, either by name or by URI.

    Tip
    It is best to use the Sweet to add the Organization (e.g. lab) to which these resources are associated, and then reference this name or URI in the files.
  • References in the primary file, either to resources represented in a secondary file or to resources in the repository, must use the exact label (ignoring case) for the linkage to be made correctly.
  • If a column has more than one value, enter the values separated by ; (semicolon). Conversely, check your input file for ; in values that are not meant to be split, and replace it with a different character.
  • Every resource (primary or secondary) needs to have a name and a type at a minimum. For simplification, the type column is omitted if there are no possible subclasses (e.g. Person, Human Subject). If the template has a type column, you must enter a value.
  • Use URIs instead of labels with caution, because all checks are bypassed when you do so. In particular, the ETL process does not verify that the resource exists in the repository or that the term exists in the ontology, so you may end up with non-conforming data.

  • Generate ETL templates for the most specific type needed, i.e. Antibody instead of parent type Reagent, Core Facility instead of parent type Organization, Biological process phenotype instead of parent type Phenotype, Journal Article instead of parent type Document, etc. This is because the properties generated for each of these more specific sub-types will differ from those generated for the root class.

    Tip

    We have found that using URIs instead of labels is most useful when all values in the column are the same; otherwise it makes visual inspection of the data more difficult.
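
The rules above can be checked before running the ETL. The following is a minimal sketch only, not part of the ETL tooling: it assumes the sheet has been exported to CSV, and the column headers "Name", "Type" and "Related Instrument", the sample rows, and the helper names split_values and check_rows are all hypothetical. It splits multi-valued cells on semicolons, confirms that every row has a name and a type, and compares references against a list of known labels while ignoring case.

```python
import csv
import io

SEPARATOR = ";"  # multi-value separator described in the guidelines


def split_values(cell):
    """Split a cell on semicolons, dropping empty pieces and outer whitespace."""
    return [part.strip() for part in cell.split(SEPARATOR) if part.strip()]


def check_rows(rows, known_labels, reference_column, has_type_column=True):
    """Return a list of human-readable problems found in the rows.

    The column headers "Name" and "Type" are assumptions made for this sketch.
    """
    # Labels are matched ignoring case, as the guidelines require.
    known = {label.casefold() for label in known_labels}
    problems = []
    for number, row in enumerate(rows, start=1):
        if not (row.get("Name") or "").strip():
            problems.append(f"row {number}: missing name")
        if has_type_column and not (row.get("Type") or "").strip():
            problems.append(f"row {number}: missing type")
        for label in split_values(row.get(reference_column) or ""):
            if label.casefold() not in known:
                problems.append(f"row {number}: unknown reference '{label}'")
    return problems


# Invented sample data standing in for a filled-in template.
sample = io.StringIO(
    "Name,Type,Related Instrument\n"
    "Anti-actin,Antibody,confocal microscope; Plate Reader\n"
    ",Antibody,Centrifuge\n"
)
rows = list(csv.DictReader(sample))
problems = check_rows(
    rows, ["Confocal Microscope", "plate reader"], "Related Instrument"
)
for problem in problems:
    print(problem)
```

With this sample, the first row passes because both references match a known label once case is ignored, while the second row is reported for a missing name and for a reference that matches nothing. A check like this only covers labels; values entered as URIs would need to be skipped, since the ETL process does not validate them either.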

...

  • In the template for the main resource, enter a list of semicolon-separated labels for the embedded resources, and fill in the type column if it exists (only one value is required, not a semicolon-separated list). This will result in the creation of "empty" embedded resources.
  • ETL this file (see below)
  • In order to ETL the rest of the properties for the embedded resources:
    • Generate a template/map for the embedded resource class
      • Templates generated for embedded classes will have two additional columns: Main Resource Name and Main Resource Type
    • Fill the template with the information for the embedded resource (one row per embedded resource), making sure the entries for Main Resource Name and Main Resource Type match the label and type previously used when ETLing the main resource. If there are different sub-types for different embedded resources (for example, for Phenotypes), those can be specified in the type field here.
    • Run the ETL command with the embedded resource file as input. Note that the -p and -eid parameters will be ignored if they are present.
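
Before running the ETL on the embedded resource file, it can help to confirm that each row points back to a main resource that was actually loaded. The sketch below is an illustration under stated assumptions, not part of the ETL tooling: it assumes both files are available as CSV, the "Name" and "Type" headers and all sample values are invented, and the helper names are hypothetical. Only the "Main Resource Name" and "Main Resource Type" headers come from the generated template. The comparison is exact apart from surrounding whitespace, which is a deliberately conservative choice.

```python
import csv
import io


def main_resource_keys(main_rows):
    """Collect the (name, type) pairs used in the main resource file."""
    return {
        ((row.get("Name") or "").strip(), (row.get("Type") or "").strip())
        for row in main_rows
    }


def unmatched_embedded_rows(main_rows, embedded_rows):
    """Return 1-based numbers of embedded rows with no matching main resource."""
    keys = main_resource_keys(main_rows)
    unmatched = []
    for number, row in enumerate(embedded_rows, start=1):
        key = (
            (row.get("Main Resource Name") or "").strip(),
            (row.get("Main Resource Type") or "").strip(),
        )
        if key not in keys:
            unmatched.append(number)
    return unmatched


# Invented sample data: one main resource with two embedded labels.
main_file = io.StringIO(
    "Name,Type,Phenotype\n"
    "Mouse line A,Organism,small size; short tail\n"
)
embedded_file = io.StringIO(
    "Name,Type,Main Resource Name,Main Resource Type\n"
    "small size,Phenotype,Mouse line A,Organism\n"
    "short tail,Phenotype,Mouse line B,Organism\n"
)
main_rows = list(csv.DictReader(main_file))
embedded_rows = list(csv.DictReader(embedded_file))
unmatched = unmatched_embedded_rows(main_rows, embedded_rows)
print(unmatched)
```

Here the second embedded row is flagged because "Mouse line B" does not appear in the main file. Any row reported this way should be corrected in the embedded resource file before it is used as ETL input.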

...