
Introduction

You can update several samples at once using a feature that works much like sample batch registration (import). This is much faster than editing samples one by one when many samples need to change. Besides sample properties, you can also update a sample's experiment, parent and container. The input is a TSV (tab-separated values) file or an Excel file.

Go to the Import -> Sample Update menu and select the sample type of the samples that you want to update (select '(multiple)' if you want to update samples of several types at once).

The form provides a link for downloading a file template that contains all the columns available for a batch update of samples of the chosen type. The template is a TSV (tab-separated values) file in the same format as the one used for sample batch registration without automatically generated sample codes. As a consequence, you can easily update attributes of samples that you previously registered with wrong or stale data. Before doing this, check that you will not overwrite changes someone else has made in the meantime (e.g. use the latest values taken from the sample browser's table export). Also make sure that the sample identifiers you use are up to date (e.g. nobody has moved the samples to a different space in the meantime).

TSV/Excel File Columns

Optionality

Apart from the "identifier" column, which is mandatory, only the columns for attributes that you want to change need to be present in the uploaded file.
You can remove all other columns; the corresponding values of the updated samples will be preserved.

Be Careful

If you leave a value in a column empty for a certain sample, the corresponding property data of the sample will be cleared.
In particular, a sample can be detached from an experiment, container sample or parent sample this way.
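
The two rules above (only changed columns are needed, and an empty cell clears a value) can be sketched as a small helper that writes an update file. This is only an illustration and not part of openBIS: the function name build_update_tsv and its arguments are hypothetical, and it assumes plain values without tabs or line breaks.

```python
import csv
import io


def build_update_tsv(changes, columns):
    """Return TSV text with 'identifier' plus only the columns to update.

    changes: dict mapping sample identifier -> dict of column -> new value.
    columns: the columns to update. A missing value is rejected instead of
             being written as an empty cell, because an empty cell would
             clear the value. Pass "" explicitly to clear on purpose.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["identifier", *columns])
    for identifier, values in changes.items():
        row = [identifier]
        for col in columns:
            if col not in values:
                raise ValueError(
                    f"{identifier}: no value for {col!r}; "
                    "an empty cell would clear it"
                )
            row.append(values[col])
        writer.writerow(row)
    return buf.getvalue()


print(build_update_tsv({"/SPACE_2/S_20": {"parent": "/SPACE_2/P_2"}}, ["parent"]))
```

Rejecting missing values keeps an accidental gap in the input data from silently detaching a sample from its experiment or parent.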

Identifier

The "identifier" column should contain full sample identifiers, e.g. /SPACE_1/SAMPLE_1. For samples from the Default Space (if one was provided in the batch update form), the sample code alone (e.g. SAMPLE_1) is enough.
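
As a rough sketch of this rule, a bare sample code can be thought of as being completed with the default space, while a full identifier is left untouched. The function resolve_identifier below is hypothetical and only mirrors the description above; it is not the openBIS implementation.

```python
def resolve_identifier(value, default_space=None):
    """Turn a bare sample code into a full identifier using the default space.

    Values that already start with '/' are treated as full identifiers.
    """
    if value.startswith("/"):
        return value
    if default_space is None:
        raise ValueError(f"{value!r} has no space part and no default space is set")
    # Accept the default space written as '/SPACE_2' or 'SPACE_2'.
    return f"/{default_space.strip('/')}/{value}"


print(resolve_identifier("SAMPLE_1", "/SPACE_1"))
print(resolve_identifier("/SPACE_1/SAMPLE_1", "/SPACE_2"))
```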

Default Space

"default_space" is an artificial column introduced to improve the readability of TSV/Excel files. Once defined, it is used as the default space for all identifiers that do not include the space part.

Default section

It is possible to define a "DEFAULT" section before the real data. In this section you can define defaults for any column of the TSV/Excel file (e.g. "default_space" or "experiment"), so those values do not have to be repeated on every line of the file. If needed, defaults from the "DEFAULT" section can be overridden in the actual data.

When updating multiple sample types, defaults can be defined on two levels: file level (before the first section) and section level (inside the section for a sample type; they can differ between sections). Section-level defaults override file-level defaults.
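
The precedence described here (row value, then section default, then file default) can be illustrated with a short lookup chain. This is a sketch under assumptions, not openBIS code: effective_row is a hypothetical helper, and it assumes that the row dictionary only contains the columns explicitly given for that row.

```python
from collections import ChainMap


def effective_row(row, section_defaults=None, file_defaults=None):
    """Merge a data row with defaults; earlier mappings win.

    Lookup order: the row itself, then section-level defaults,
    then file-level defaults.
    """
    return dict(ChainMap(row, section_defaults or {}, file_defaults or {}))


file_defaults = {"default_space": "/SPACE_1", "experiment": "PROJ_X/EXP_1"}
section_defaults = {"default_space": "/SPACE_2"}

# The section default replaces the file-level default_space,
# while the file-level experiment is still inherited.
print(effective_row({"identifier": "S_20"}, section_defaults, file_defaults))
```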

Examples

Samples of the same sample type

The files in the examples are not real TSV files that can be used in openBIS: their row data is aligned with the headers to improve readability.

All examples in this section assume that you first registered 4 samples of the same type with the file:

identifier	parent		experiment		SIZE	ORGANISM
/SPACE_1/S_10	/SPACE_1/P_1	/SPACE_1/PROJ_X/EXP_1	123	HUMAN
/SPACE_1/S_11	/SPACE_1/P_1	/SPACE_1/PROJ_X/EXP_1 	124	FLY
/SPACE_2/S_20	/SPACE_1/P_1	/SPACE_2/PROJ_Y/EXP_2	125	HUMAN
/SPACE_2/S_30	/SPACE_1/P_1	/SPACE_2/PROJ_Y/EXP_2	126	FLY

The sample batch update functionality is very flexible. Each example below shows one basic change; such changes can be combined in a single file.

Change parent

To change the parent of the last 2 samples you can use the same file that was used for registration, with only the parent values changed:

identifier	parent		experiment		SIZE	ORGANISM
/SPACE_1/S_10	/SPACE_1/P_1	/SPACE_1/PROJ_X/EXP_1	123	HUMAN
/SPACE_1/S_11	/SPACE_1/P_1	/SPACE_1/PROJ_X/EXP_1 	124	FLY
/SPACE_2/S_20	/SPACE_2/P_2	/SPACE_2/PROJ_Y/EXP_2	125	HUMAN
/SPACE_2/S_30	/SPACE_2/P_2	/SPACE_2/PROJ_Y/EXP_2	126	FLY

In fact, the only columns you need are the identifier and parent columns. To achieve the same goal you can remove all other columns as well as the rows of unchanged samples. Moreover, you can use the artificial "default_space" column and a DEFAULT section to make the file even simpler:

[DEFAULT]
default_space	/SPACE_2
parent		P_2
[DEFAULT]
identifier
S_20
S_30

The second file is also much safer to use: with the first file you could accidentally overwrite a change made by someone else in the meantime (e.g. someone could have changed the SIZE property value of one of the samples between registration and your update, and your update would change it back to the original value, which may be out of date).

Change properties

To update SIZE and ORGANISM properties of all 4 samples at once

  • multiplying all SIZE property values by a factor of 10,
  • tying the ORGANISM property value to the sample's space, so that samples from SPACE_1 get HUMAN while samples from SPACE_2 get FLY (the values were mixed in the registration file),

upload a file:

identifier	default_space	SIZE	ORGANISM
S_10		/SPACE_1	1230	HUMAN
S_11		/SPACE_1	1240	HUMAN
S_20		/SPACE_2	1250	FLY
S_30		/SPACE_2	1260	FLY
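
For many samples, rows like the ones above would typically be generated rather than typed by hand. The following is a minimal sketch of the two rules used in this example; the helper updated_properties and the ORGANISM_BY_SPACE mapping are hypothetical names, and the input is assumed to come from the registration file or a recent table export.

```python
# Mapping taken from the example: SPACE_1 -> HUMAN, SPACE_2 -> FLY.
ORGANISM_BY_SPACE = {"SPACE_1": "HUMAN", "SPACE_2": "FLY"}


def updated_properties(identifier, size):
    """Return the update row for one sample.

    identifier: full sample identifier such as '/SPACE_1/S_10'.
    size: current SIZE value as text; it is multiplied by 10.
    """
    space = identifier.split("/")[1]
    return {
        "identifier": identifier,
        "SIZE": str(int(size) * 10),
        "ORGANISM": ORGANISM_BY_SPACE[space],
    }


registered = [
    ("/SPACE_1/S_10", "123"),
    ("/SPACE_1/S_11", "124"),
    ("/SPACE_2/S_20", "125"),
    ("/SPACE_2/S_30", "126"),
]
print("\t".join(["identifier", "SIZE", "ORGANISM"]))
for identifier, size in registered:
    print("\t".join(updated_properties(identifier, size).values()))
```

This sketch writes full identifiers instead of using the default_space column, which is equivalent for the update and avoids one column.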

If you just want to remove the ORGANISM property value for samples from SPACE_2, use the file:

[DEFAULT]
default_space	/SPACE_2
[DEFAULT]
identifier	ORGANISM
S_20
S_30

Change experiment and space

Currently there is no way to change only the space of samples using batch update, as there is no space column. You can change the space indirectly by changing the experiment (as a rule, a sample can only be connected with an experiment from the same space as the sample).

On the other hand, if you remove the sample's connection with an experiment, the sample's space remains unchanged.
For example you could:

  • detach sample /SPACE_2/S_20 from its experiment,
  • change experiment assignment of sample /SPACE_2/S_30 to experiment /SPACE_3/PROJ_Z/EXP_3 changing the sample space to SPACE_3 at the same time,

using a file:

identifier	default_space	experiment
S_20		/SPACE_2
S_30		/SPACE_2	/SPACE_3/PROJ_Z/EXP_3

Note that after the update the identifiers of the two samples will be /SPACE_2/S_20 and /SPACE_3/S_30.
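
The effect on identifiers can be expressed as a small rule: the sample takes the space of its new experiment, and keeps its space when the experiment cell is empty. The sketch below only restates that rule; identifier_after_update is a hypothetical helper and does not talk to openBIS.

```python
def identifier_after_update(identifier, new_experiment):
    """Predict a sample identifier after an experiment change.

    identifier: current identifier, e.g. '/SPACE_2/S_30'.
    new_experiment: experiment identifier such as '/SPACE_3/PROJ_Z/EXP_3',
                    or '' when the sample is detached from its experiment.
    """
    _, space, code = identifier.split("/")
    if new_experiment:
        # The sample moves to the space of the new experiment.
        space = new_experiment.split("/")[1]
    return f"/{space}/{code}"


print(identifier_after_update("/SPACE_2/S_20", ""))
print(identifier_after_update("/SPACE_2/S_30", "/SPACE_3/PROJ_Z/EXP_3"))
```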

Samples of multiple sample types

This is a more advanced mode of sample batch update. Suppose that you registered samples similar to those in the previous section, but this time using the '(multiple)' option in sample batch registration (import) with the file:

[SAMPLE_TYPE_1]
[DEFAULT]
default_space	/SPACE_1
experiment	PROJ_X/EXP_1
[DEFAULT]
identifier	parent		SIZE
S_10		P_1		123
S_11		P_1		124
[SAMPLE_TYPE_2]
identifier	parent		experiment		ORGANISM
/SPACE_2/S_20	/SPACE_1/P_1	/SPACE_2/PROJ_Y/EXP_2	HUMAN
/SPACE_2/S_30	/SPACE_1/P_1	/SPACE_2/PROJ_Y/EXP_2	FLY

This time:

  • samples from SPACE_1 have type SAMPLE_TYPE_1, and only the SIZE property is assigned to this type,
  • samples from SPACE_2 have type SAMPLE_TYPE_2, and only the ORGANISM property is assigned to this type,
  • note that the SAMPLE_TYPE_1 section uses a DEFAULT section and the default_space column, which makes the file more readable.

Let's make changes similar to those in the previous section using sample batch update with the '(multiple)' sample type option chosen. You can do this all at once with a single file:

[SAMPLE_TYPE_1]
[DEFAULT]
default_space	/SPACE_1
experiment	PROJ_X/EXP_1
[DEFAULT]
identifier	parent		SIZE
S_10		P_1		1230
S_11		P_1		1240
[SAMPLE_TYPE_2]
[DEFAULT]
default_space	/SPACE_2
[DEFAULT]
identifier	parent		experiment		ORGANISM
S_20		P_2					FLY
S_30		P_2		/SPACE_3/PROJ_Z/EXP_3	FLY

or a file with unchanged columns removed:

[SAMPLE_TYPE_1]
identifier	SIZE
/SPACE_1/S_10	1230
/SPACE_1/S_11	1240
[SAMPLE_TYPE_2]
[DEFAULT]
default_space	/SPACE_2
[DEFAULT]
identifier	parent		experiment		ORGANISM
S_20		P_2					FLY
S_30		P_2		/SPACE_3/PROJ_Z/EXP_3

As a result you will:

  • for sample /SPACE_1/S_10
    • update SIZE property value from 123 to 1230,
  • for sample /SPACE_1/S_11
    • update SIZE property value from 124 to 1240,
  • for sample /SPACE_2/S_20
    • update parent sample from /SPACE_1/P_1 to /SPACE_2/P_2,
    • detach it from experiment /SPACE_2/PROJ_Y/EXP_2,
    • update ORGANISM property value from HUMAN to FLY,
  • for sample /SPACE_2/S_30
    • update parent sample from /SPACE_1/P_1 to /SPACE_2/P_2,
    • change experiment assignment from /SPACE_2/PROJ_Y/EXP_2 to /SPACE_3/PROJ_Z/EXP_3 changing the sample space from SPACE_2 to SPACE_3 at the same time,
    • clear ORGANISM property value.

All other data will be preserved.
