The Excel import service reads xls definitions for both types and entities and send them to openBIS. It is the replacement of the old master data scripts adding support for the creation of openBIS entities.
The goals are:
For common users an import format with the following features to avoid the shortcomings of the old format:
For advanced users like consultants and plugin developers a tool that allows to specify on an Excel sheet:
To support different use cases the import service supports the next modes, specifying one of them is mandatory.
All data can be arranged according to the needs of the user, in any number of files and any number of worksheets. All files have to be in one directory. Since there may be dependencies between types and entities described in distributed files, they should all be submitted together. The names of the files and worksheets are ignored by the service, the user is advised to use descriptive names that they can quickly remember/refer to later.
We want to define vocabularies and sample types with properties using these vocabularies. We can arrange our files in several ways:
Since all definitions can be distributed, the order of definitions is not important for the Excel import service.
All types of formatting are permitted, and users are encouraged to use them to make their excel files more readable. Adding any non text element (table, clipart) will cause the import to fail.
(A valid, but not easily readable, example)
A valid sheet has to start with definition on the first row.
Each definition has to be separated by one empty row (here it is shown in green)
Two or more consecutive empty rows mark the end of the definitions. Everything below these can be considered a comment.
Header rows can contain values that do not have corresponding elements in the properties row (can be left empty).
Empty columns are allowed between each attribute. However, it is possible that this feature might be disabled in future.
If attribute does not exist, it will be ignored.
Values that do not have corresponding headers will be ignored.
The case of letters in header rows is ignored.
Definition is a single description of an openBIS type or entity. A definition usually consists of:
For in depth explanation of Definitions, please see Definitions, defining entities and other special cases paragraph.
It is possible to create property types, which are not attached to any entity types. In such a case, Definition type row has to state PROPERTY_TYPE; Attribute header and value rows are omitted.
Version is a mandatory field for entity types, it just starts at 1; and during updating a type definition is expected to increment it; otherwise, the system will ignore the changes.
Additionally, use the keyword FORCE to reinstall the type even if is present and has been deleted.
The system keeps track of what versions of entities have been installed storing this information, so in the future when one updates their types, the version specified in the spreadsheet is checked against the stored version.
For every TYPE found in the Excel sheet the next algorithm is performed:
IF (TYPE.Version > STORED_VERSION) OR (TYPE.Version == FORCE):
IF TYPE EXISTS in openBIS:
ELSE IF FAIL_IF_EXISTS:
ELSE IF IGNORE_EXISTING:
PASS (We basically ignore as requested)
ELSE: // Doesn't exist branch
NEW_TYPE = TYPE.VERSION == 1
IF TYPE.Version == FORCE OR NEW_TYPE:
PASS (We basically ignore types that have been deleted by users)
PASS (We basically ignore types that have not been updated)
Most entities can be created, excluding DataSets. There are differences due to the nature of the defined elements themselves.
Spaces and projects cannot have properties, therefore they have to be omitted.
In the properties header and values rows, the user has to specify attributes and properties. The order in which they are listed is arbitrary.
Properties can be referred to as labels in the header row.
Vocabularies in properties value rows can be referred to by either the vocabulary code or the label.
Parent and child columns can be used to define relations between samples. Samples can be addressed by:
Scripts have to reside in .py files in the scripts directory within the folder that contains the Excel files.
Within scripts, files can be organised in any suitable setup.
In order to refer to a validation or dynamic script (e.g. valid.py above), the relative path (from the scripts directory) to the file has to be provided in the relevant column. See the example columns in green below.
<Not complete, subject to change>
Duplicates may be one of the following:
For further reference please see an example spreadsheet: