SDK Reference¶
Resolwe¶
- class resdk.Resolwe(username: str | None = None, password: str | None = None, url: str | None = None)[source]¶
Connect to a Resolwe server.
- Parameters:
- data_usage(**query_params)[source]¶
Get per-user data usage information.
Display number of samples, data objects and sum of data object sizes for currently logged-in user. For admin users, display data for all users.
- get_or_run(slug: str | None = None, input: dict = {})[source]¶
Return existing object if found, otherwise create new one.
- get_query_by_resource(resource: type[BaseResource]) ResolweQuery[source]¶
Get ResolweQuery for a given resource.
- Raises:
ValueError – if the resource is not a subclass of BaseResource.
- login(username: str | None = None, password: str | None = None)[source]¶
Perform the interactive login.
If only the username is given, prompt the user for the password via the shell. If the username is not given, prompt for an interactive login.
- run(slug: str | None = None, input: dict = {}, descriptor: dict | None = None, descriptor_schema: str | None = None, collection: Collection | None = None, data_name: str = '', process_resources: dict | None = None) Data[source]¶
Run process and return the corresponding Data object.
1. Upload files referenced in inputs
2. Create Data object with given inputs
3. Command is run that processes inputs into outputs
4. Return Data object
The processing runs asynchronously, so the returned Data object does not have an OK status or outputs when returned. Use data.update() to refresh the Data resource object.
- Parameters:
slug (str) – Process slug (human readable unique identifier)
input (dict) – Input values
descriptor (dict) – Descriptor values
descriptor_schema (str) – A valid descriptor schema slug
collection (int/resource) – Collection resource or its ID into which the data object should be included
data_name (str) – Default name of data object
process_resources (dict) – Process resources
- Returns:
data object that was just created
- Return type:
Data object
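Because processing is asynchronous, a caller usually has to refresh the returned object until it reaches a final state. The helper below is a minimal sketch of such a polling loop, not part of resdk: it relies only on the update() method and a status attribute, and the status codes "PR", "OK" and "ER" are assumptions. FakeData is a hypothetical stand-in so the example runs without a server.

```python
import time


def wait_until_done(data, poll_seconds=5.0, timeout_seconds=600.0,
                    final_statuses=("OK", "ER")):
    """Refresh ``data`` until its status is final or the timeout expires.

    ``data`` is any object with an ``update()`` method and a ``status``
    attribute. The codes in ``final_statuses`` are an assumption.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        data.update()  # re-read the object state
        if data.status in final_statuses:
            return data.status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {data.status!r} after {timeout_seconds} s")
        time.sleep(poll_seconds)


class FakeData:
    """Hypothetical stand-in for a Data object, used to run the helper offline."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.status = None

    def update(self):
        self.status = next(self._statuses)


fake = FakeData(["PR", "PR", "OK"])
final = wait_until_done(fake, poll_seconds=0.0)
```

With a real server the same function would presumably be called on the object returned by run(), with a poll interval of a few seconds.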
Resolwe Query¶
- class resdk.ResolweQuery(resolwe: Resolwe, resource: type[BaseResource], slug_field: str = 'slug')[source]¶
Query resource endpoints.
A Resolwe instance (for example “res”) has several endpoints:
res.data
res.collection
res.sample
res.process
…
Each such endpoint is an instance of the ResolweQuery class. ResolweQuery supports queries on corresponding objects, for example:
res.data.get(42)  # return Data object with ID 42
res.sample.filter(contributor=1)  # return all samples made by contributor 1
This object is lazily loaded, which means that the actual request is made only when needed. This enables composing multiple filters, for example:
res.data.filter(contributor=1).filter(name='My object')
is the same as:
res.data.filter(contributor=1, name='My object')
This is especially useful, because all endpoints at Resolwe instance are such queries and can be filtered further before transferring any data.
To get a list of all supported query parameters, use one that does not exist and you will get a helpful error message with a list of allowed ones.
res.data.filter(foo="bar")
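The lazy, chainable behaviour described above can be illustrated with a toy model. LazyQuery and fake_fetch below are hypothetical names and do not reflect how ResolweQuery is implemented; the sketch only shows that chained filters accumulate and that nothing is requested until the results are actually read.

```python
class LazyQuery:
    """Toy model of a lazily evaluated, chainable query (not resdk code)."""

    def __init__(self, fetch, filters=None):
        self._fetch = fetch              # callable that performs the "request"
        self._filters = dict(filters or {})
        self._cache = None               # filled on first real access

    @property
    def filters(self):
        return dict(self._filters)

    def filter(self, **filters):
        # Return a new query; the original is untouched and nothing is fetched.
        return LazyQuery(self._fetch, {**self._filters, **filters})

    def __iter__(self):
        if self._cache is None:
            self._cache = self._fetch(self._filters)
        return iter(self._cache)


requests_made = []


def fake_fetch(filters):
    """Record each simulated request and return one placeholder result."""
    requests_made.append(dict(filters))
    return ["placeholder result"]


chained = LazyQuery(fake_fetch).filter(contributor=1).filter(name="My object")
combined = LazyQuery(fake_fetch).filter(contributor=1, name="My object")
requests_before = len(requests_made)     # still zero: only filters were composed
results = list(chained)                  # first access triggers one request
```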
- all() ResolweQuery[source]¶
Return copy of the current queryset.
This is a handy way to get a newly created query without any filters.
- create(**model_data: dict) BaseResource[source]¶
Return new instance of current resource.
- delete(force: bool = False)[source]¶
Delete objects in current query.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- filter(**filters: dict) ResolweQuery[source]¶
Return clone of current query with added given filters.
- get(*args, **kwargs)[source]¶
Get object that matches given parameters.
If only one non-keyworded argument is given, it is treated as an ID if it is a number and as a slug otherwise.
- Parameters:
uid (int for ID or string for slug) – unique identifier - ID or slug
- Return type:
object of type self.resource
- Raises:
ValueError – if non-keyworded and keyworded arguments are combined or if more than one non-keyworded argument is given
LookupError – if no object or more than one object is returned
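The argument rules for get() can be summarised in a short sketch. Both functions below are hypothetical illustrations, not resdk code, and the rule "integers are IDs, everything else is a slug" is an assumption about how "number" is decided.

```python
def build_get_filters(*args, slug_field="slug", **kwargs):
    """Turn get()-style arguments into a filter dict (illustrative only)."""
    if args and kwargs:
        raise ValueError("Use either a positional identifier or keyword filters, not both.")
    if len(args) > 1:
        raise ValueError("At most one positional argument is allowed.")
    if args:
        uid = args[0]
        # Assumption: integers are IDs, anything else is treated as a slug.
        if isinstance(uid, int) and not isinstance(uid, bool):
            return {"id": uid}
        return {slug_field: uid}
    return dict(kwargs)


def exactly_one(results):
    """Return the single match or raise LookupError, as described for get()."""
    results = list(results)
    if len(results) != 1:
        raise LookupError(f"Expected exactly one object, got {len(results)}.")
    return results[0]
```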
- iterate(chunk_size: int = 100, show_progress: bool = False) Iterable[BaseResource][source]¶
Iterate through query.
This can come in handy when one wishes to iterate through hundreds or thousands of objects and would otherwise get “504 Gateway-timeout”.
The method cannot be used together with the following filters: limit, offset and ordering; combining it with any of them raises a ValueError.
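Chunked iteration of this kind can be sketched as a generator that requests one page at a time. The names iterate_in_chunks and fetch_page are hypothetical and this is not the resdk implementation; it only illustrates why a method that manages limit and offset itself would presumably reject them as user-supplied filters.

```python
def iterate_in_chunks(fetch_page, chunk_size=100):
    """Yield objects one by one while requesting them chunk_size at a time.

    ``fetch_page(limit, offset)`` is a hypothetical callable returning a list.
    """
    offset = 0
    while True:
        page = fetch_page(limit=chunk_size, offset=offset)
        yield from page
        if len(page) < chunk_size:   # short or empty page: nothing left
            return
        offset += chunk_size


records = list(range(250))           # stand-in for objects on the server
calls = []


def fake_fetch_page(limit, offset):
    """Simulate a paginated endpoint and record each request."""
    calls.append((limit, offset))
    return records[offset:offset + limit]


collected = list(iterate_in_chunks(fake_fetch_page, chunk_size=100))
```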
- search(text: str) ResolweQuery[source]¶
Full text search.
Resources¶
Resource classes¶
- class resdk.resources.base.BaseResource(resolwe: Resolwe, initial_data_source: DataSource = DataSource.USER, **model_data: dict)[source]¶
Abstract resource.
One and only one of the identifiers (slug, id or model_data) should be given.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)[source]¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- class resdk.resources.base.BaseResolweResource(resolwe, **model_data)[source]¶
Base class for Resolwe resources.
One and only one of the identifiers (slug, id or model_data) should be given.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- property permissions: PermissionsManager¶
Permissions.
- save()¶
Save resource to the server.
- class resdk.resources.Data(resolwe: Resolwe, **model_data: dict)[source]¶
Resolwe Data resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- property children¶
Get children of this Data object.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- download(file_name: str | None = None, field_name: str | None = None, download_dir: str | None = None, show_progress: bool = True) list[str][source]¶
Download Data object’s files and directories.
Download files and directories from the Resolwe server to the download directory (defaults to the current working directory).
Data objects can contain multiple files and directories. All are downloaded by default, but may be filtered by name or output field:
re.data.get(42).download(file_name='alignment7.bam')
re.data.get(42).download(field_name='bam')
- download_and_rename(custom_file_name: str, overwrite_existing: bool = False, field_name: str | None = None, file_name: str | None = None, download_dir: str | None = None)[source]¶
Download and rename a single file from the Data object.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- files(file_name: str | None = None, field_name: str | None = None) list[str][source]¶
Get list of downloadable file fields.
Filter files by file name or output field.
- property parents¶
Get parents of this Data object.
- property permissions: PermissionsManager¶
Permissions.
- restart(storage: int | None = None, memory: int | None = None, cores: int | None = None)[source]¶
Restart the data object.
The units for storage are gigabytes and for memory are megabytes.
The resources that are not specified (or set to None) are reset to their default values.
- save()¶
Save resource to the server.
- stdout() str[source]¶
Return process standard output (stdout.txt file content).
Fetch stdout.txt file from the corresponding Data object and return the file content as string. The string can be long and ugly.
- Return type:
string
- update()¶
Clear permissions cache and update the object.
- class resdk.resources.collection.BaseCollection(resolwe: Resolwe, **model_data: dict)[source]¶
Abstract collection resource.
One and only one of the identifiers (slug, id or model_data) should be given.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- download(file_name: str | None = None, field_name: str | None = None, download_dir: str | None = None)[source]¶
Download output files of associated Data objects.
Download files from the Resolwe server to the download directory (defaults to the current working directory).
Collections can contain multiple Data objects and Data objects can contain multiple files. All files are downloaded by default, but may be filtered by file name or Data object type:
re.collection.get(42).download(file_name='alignment7.bam')
re.collection.get(42).download(data_type='bam')
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- files(file_name: str | None = None, field_name: str | None = None)[source]¶
Return list of files in resource.
- property permissions: PermissionsManager¶
Permissions.
- save()¶
Save resource to the server.
- class resdk.resources.Collection(resolwe: Resolwe, **model_data: dict)[source]¶
Resolwe Collection resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- assign_to_billing_account(billing_account_name: str)[source]¶
Assign given collection to a billing account.
- create_background_relation(category, background, cases)¶
Create background relation.
- create_compare_relation(category, samples, labels=[])¶
Create compare relation.
- create_group_relation(category, samples, labels=[])¶
Create group relation.
- create_series_relation(category, samples, positions=[], labels=[])¶
Create series relation.
- Parameters:
category (str) – Category of relation (e.g. case-control, …)
samples (list) – List of samples to include in relation.
positions (list) – List of positions assigned to corresponding samples (e.g. 10, 20, 30). If given, it should be of the same length as samples. Note that these elements should be machine-sortable by default.
labels (list) – List of labels assigned to corresponding samples. If given, it should be of the same length as samples.
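The constraints on positions and labels can be checked before a relation is created. The function below is a hypothetical pre-flight check under the rules stated above (equal lengths, sortable positions); it is not part of resdk and does not contact a server.

```python
def check_series_arguments(samples, positions=(), labels=()):
    """Validate the shape of series-relation arguments (illustrative only)."""
    for name, values in (("positions", positions), ("labels", labels)):
        if values and len(values) != len(samples):
            raise ValueError(
                f"{name} has {len(values)} items but there are {len(samples)} samples."
            )
    if positions:
        try:
            sorted(positions)  # positions must be comparable with each other
        except TypeError as error:
            raise ValueError("positions are not sortable.") from error
    return True
```

For example, check_series_arguments(["s1", "s2", "s3"], positions=[10, 20, 30]) passes, while two positions for three samples raise a ValueError.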
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- download(file_name: str | None = None, field_name: str | None = None, download_dir: str | None = None)¶
Download output files of associated Data objects.
Download files from the Resolwe server to the download directory (defaults to the current working directory).
Collections can contain multiple Data objects and Data objects can contain multiple files. All files are downloaded by default, but may be filtered by file name or Data object type:
re.collection.get(42).download(file_name='alignment7.bam')
re.collection.get(42).download(data_type='bam')
- duplicate() Collection[source]¶
Duplicate (make copy of) collection object.
- Returns:
Duplicated collection
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- files(file_name: str | None = None, field_name: str | None = None)¶
Return list of files in resource.
- get_annotation_fields() Iterable[AnnotationField][source]¶
Get annotation fields associated with the collection.
- get_prediction_fields() Iterable[PredictionField][source]¶
Get prediction fields associated with the collection.
- property permissions: PermissionsManager¶
Permissions.
- save()¶
Save resource to the server.
- set_annotation_fields(annotation_fields: Iterable[AnnotationField])[source]¶
Set collection annotation fields.
The change is applied instantly.
- set_prediction_fields(prediction_fields: Iterable[PredictionField])[source]¶
Set collection prediction fields.
The change is applied instantly.
- update()¶
Clear cache and update resource fields from the server.
- class resdk.resources.Sample(resolwe, **model_data)[source]¶
Resolwe Sample resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- property background¶
Get background sample of the current one.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- download(file_name: str | None = None, field_name: str | None = None, download_dir: str | None = None)¶
Download output files of associated Data objects.
Download files from the Resolwe server to the download directory (defaults to the current working directory).
Collections can contain multiple Data objects and Data objects can contain multiple files. All files are downloaded by default, but may be filtered by file name or Data object type:
re.collection.get(42).download(file_name='alignment7.bam')
re.collection.get(42).download(data_type='bam')
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- files(file_name: str | None = None, field_name: str | None = None)¶
Return list of files in resource.
- get_annotation(full_path: str) AnnotationValue[source]¶
Get the AnnotationValue from full path.
- Raises:
LookupError – when field at the specified path does not exist.
- get_bam()¶
Return bam object on the sample.
- get_cuffquant()¶
Get cuffquant.
- get_expression()¶
Get expression.
- get_macs()¶
Return list of bed objects on the sample.
- get_primary_bam(fallback_to_bam=False)¶
Return primary bam object on the sample.
If the primary bam object is not present and fallback_to_bam is set to True, a bam object will be returned.
- get_reads(**filters)¶
Return the latest fastq object in sample.
If there are multiple fastq objects in sample (trimmed, filtered, subsampled…), return the latest one. If any other of the fastq objects is required, one can provide additional filter arguments that limit the search to one result.
- property is_background¶
Return True if given sample is background to any other and False otherwise.
- property latest_experiment¶
Get latest experiment.
- property permissions: PermissionsManager¶
Permissions.
- save()¶
Save resource to the server.
- set_annotation(full_path: str, value, force=False) AnnotationValue | None[source]¶
Create/update annotation value.
If value is None, the annotation is deleted and None is returned. If force is set to True, no explicit confirmation is required to delete the annotation.
- class resdk.resources.Relation(resolwe, **model_data)[source]¶
Resolwe Relation resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- property permissions: PermissionsManager¶
Permissions.
- save()¶
Save resource to the server.
- update()¶
Clear permissions cache and update the object.
- class resdk.resources.Process(resolwe, **model_data)[source]¶
Resolwe Process resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- property permissions: PermissionsManager¶
Permissions.
- save()¶
Save resource to the server.
- update()¶
Clear permissions cache and update the object.
- class resdk.resources.DescriptorSchema(resolwe: Resolwe, **model_data: dict)[source]¶
Resolwe DescriptorSchema resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- property permissions: PermissionsManager¶
Permissions.
- save()¶
Save resource to the server.
- update()¶
Clear permissions cache and update the object.
- class resdk.resources.AnnotationValue(resolwe: Resolwe, **model_data: dict)[source]¶
Resolwe AnnotationValue resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.AnnotationGroup(resolwe: Resolwe, **model_data: dict)[source]¶
Resolwe AnnotationGroup resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.AnnotationField(resolwe: Resolwe, **model_data: dict)[source]¶
Resolwe AnnotationField resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.PredictionField(resolwe: Resolwe, **model_data)[source]¶
Resolwe PredictionField resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.PredictionGroup(resolwe: Resolwe, **model_data)[source]¶
Resolwe PredictionGroup resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.PredictionPreset(resolwe: Resolwe, **model_data: dict)[source]¶
Resolwe PredictionPreset resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.PredictionValue(resolwe: Resolwe, **model_data)[source]¶
Resolwe PredictionValue resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.Variant(resolwe: Resolwe, initial_data_source: DataSource = DataSource.USER, **model_data: dict)[source]¶
ResolweBio Variant resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.VariantAnnotation(resolwe: Resolwe, initial_data_source: DataSource = DataSource.USER, **model_data: dict)[source]¶
VariantAnnotation resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.VariantAnnotationTranscript(resolwe: Resolwe, initial_data_source: DataSource = DataSource.USER, **model_data: dict)[source]¶
VariantAnnotationTranscript resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.VariantCall(resolwe, **model_data: Any)[source]¶
VariantCall resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.VariantExperiment(resolwe, **model_data: Any)[source]¶
Variant experiment resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.User(resolwe=None, **model_data)[source]¶
Resolwe User resource.
One and only one of the identifiers (slug, id or model_data) should be given.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.Group(resolwe=None, **model_data)[source]¶
Resolwe Group resource.
One and only one of the identifiers (slug, id or model_data) should be given.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- property users¶
Return list of users in group.
- class resdk.resources.Geneset(resolwe: Resolwe, genes: list[str] | None = None, source: str | None = None, species: str | None = None, **model_data: dict)[source]¶
Resolwe Geneset resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- property children¶
Get children of this Data object.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- download(file_name: str | None = None, field_name: str | None = None, download_dir: str | None = None, show_progress: bool = True) list[str]¶
Download Data object’s files and directories.
Download files and directories from the Resolwe server to the download directory (defaults to the current working directory).
Data objects can contain multiple files and directories. All are downloaded by default, but may be filtered by name or output field:
re.data.get(42).download(file_name='alignment7.bam')
re.data.get(42).download(field_name='bam')
- download_and_rename(custom_file_name: str, overwrite_existing: bool = False, field_name: str | None = None, file_name: str | None = None, download_dir: str | None = None)¶
Download and rename a single file from the Data object.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- files(file_name: str | None = None, field_name: str | None = None) list[str]¶
Get list of downloadable file fields.
Filter files by file name or output field.
- property parents¶
Get parents of this Data object.
- property permissions: PermissionsManager¶
Permissions.
- restart(storage: int | None = None, memory: int | None = None, cores: int | None = None)¶
Restart the data object.
The units for storage are gigabytes and for memory are megabytes.
The resources that are not specified (or set to None) are reset to their default values.
- save()[source]¶
Save Geneset to the server.
If Geneset is already on the server, update it with save() from the base class. Otherwise, create a new Geneset by running the process with slug “create-geneset”.
- set_operator(operator, other)[source]¶
Perform set operations on Geneset object by creating a new Geneset.
- Parameters:
operator – string -> set operation function name
other – Geneset object
- Returns:
new Geneset object
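Selecting a set operation by its function name can be illustrated with plain Python sets. This is only a sketch of the idea: apply_set_operator is a hypothetical helper, the gene symbols are example values, and which operator names set_operator actually accepts is not stated here.

```python
def apply_set_operator(operator, genes, other_genes):
    """Apply a set method chosen by name, e.g. "intersection" or "__or__"."""
    method = getattr(set(genes), operator, None)
    if not callable(method):
        raise ValueError(f"Unsupported operator: {operator!r}")
    # Sort only to make the result deterministic for display.
    return sorted(method(set(other_genes)))


first = ["BRCA1", "TP53", "EGFR"]
second = ["TP53", "MYC"]
shared = apply_set_operator("intersection", first, second)
merged = apply_set_operator("__or__", first, second)
```

The result is always a new collection of genes; neither input is modified, which mirrors the "creating a new Geneset" behaviour described above.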
- stdout() str¶
Return process standard output (stdout.txt file content).
Fetch stdout.txt file from the corresponding Data object and return the file content as string. The string can be long and ugly.
- Return type:
string
- update()¶
Clear permissions cache and update the object.
- class resdk.resources.Metadata(resolwe: Resolwe, **model_data: dict)[source]¶
Metadata resource.
- Parameters:
resolwe (Resolwe object) – Resolwe instance
model_data – Resource model data
- property children¶
Get children of this Data object.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- property df¶
Get table as pd.DataFrame.
- download(file_name: str | None = None, field_name: str | None = None, download_dir: str | None = None, show_progress: bool = True) list[str]¶
Download Data object’s files and directories.
Download files and directories from the Resolwe server to the download directory (defaults to the current working directory).
Data objects can contain multiple files and directories. All are downloaded by default, but may be filtered by name or output field:
re.data.get(42).download(file_name='alignment7.bam')
re.data.get(42).download(field_name='bam')
- download_and_rename(custom_file_name: str, overwrite_existing: bool = False, field_name: str | None = None, file_name: str | None = None, download_dir: str | None = None)¶
Download and rename a single file from the Data object.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- files(file_name: str | None = None, field_name: str | None = None) list[str]¶
Get list of downloadable file fields.
Filter files by file name or output field.
- property parents¶
Get parents of this Data object.
- property permissions: PermissionsManager¶
Permissions.
- restart(storage: int | None = None, memory: int | None = None, cores: int | None = None)¶
Restart the data object.
The units for storage are gigabytes and for memory are megabytes.
The resources that are not specified (or set to None) are reset to their default values.
- save()[source]¶
Save Metadata to the server.
If Metadata is already uploaded, update it. Otherwise, create a new one.
- set_index(df: DataFrame) DataFrame[source]¶
Set index of df to Sample ID.
If there is a column with Sample ID, just set that as index. If there is a Sample name or Sample slug column, map sample name / slug to sample IDs and set the IDs as an index. If there is no suitable column, raise an error. This also works if any of the above options is already an index with the appropriate name.
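The column-selection logic can be sketched without pandas by treating the table as a list of row dictionaries. Everything here is hypothetical: pick_sample_ids is not a resdk function, the two lookup dictionaries stand in for a name or slug lookup on the server, and checking Sample name before Sample slug is an assumption.

```python
def pick_sample_ids(rows, name_to_id=None, slug_to_id=None):
    """Return the sample ID for each row, following the rules described above.

    ``rows`` is a list of dicts, one per table row.
    """
    columns = set(rows[0]) if rows else set()
    if "Sample ID" in columns:
        return [row["Sample ID"] for row in rows]
    if "Sample name" in columns:
        return [name_to_id[row["Sample name"]] for row in rows]
    if "Sample slug" in columns:
        return [slug_to_id[row["Sample slug"]] for row in rows]
    raise ValueError("No Sample ID, Sample name or Sample slug column found.")


table = [
    {"Sample name": "liver-1", "Age": 41},
    {"Sample name": "liver-2", "Age": 57},
]
index = pick_sample_ids(table, name_to_id={"liver-1": 101, "liver-2": 102})
```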
- stdout() str¶
Return process standard output (stdout.txt file content).
Fetch stdout.txt file from the corresponding Data object and return the file content as string. The string can be long and ugly.
- Return type:
string
- property unique: bool¶
Get unique attribute.
This attribute tells if Metadata has one-to-one or one-to-many relation to collection samples.
- update()¶
Clear permissions cache and update the object.
- validate_df(df: DataFrame)[source]¶
Validate df property.
Validates that df:
is an instance of pandas.DataFrame
index contains sample IDs that match some samples:
If there are no matches, raise warning
If there are samples in df but not in collection, raise warning
If there are samples in collection but not in df, raise warning
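The ID checks listed above can be sketched without a server connection. This is a minimal illustration under stated assumptions, not the library's implementation: the function name check_sample_ids and its plain-iterable inputs are hypothetical.

```python
import warnings


def check_sample_ids(df_ids, collection_ids):
    """Warn about mismatches between table index IDs and collection sample IDs.

    Hypothetical helper for illustration only; not part of resdk.
    """
    df_ids, collection_ids = set(df_ids), set(collection_ids)
    if not df_ids & collection_ids:
        warnings.warn("No sample IDs in the table match the collection.")
    if df_ids - collection_ids:
        warnings.warn("Some samples are in the table but not in the collection.")
    if collection_ids - df_ids:
        warnings.warn("Some samples are in the collection but not in the table.")
```

Each condition only warns, so a table that partially overlaps with the collection is still accepted.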
- class resdk.resources.kb.Feature(resolwe: Resolwe, initial_data_source: DataSource = DataSource.USER, **model_data: dict)[source]¶
Knowledge base Feature resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
- class resdk.resources.kb.Mapping(resolwe: Resolwe, initial_data_source: DataSource = DataSource.USER, **model_data: dict)[source]¶
Knowledge base Mapping resource.
- delete(force=False)¶
Delete the resource object from the server.
- Parameters:
force (bool) – Do not trigger confirmation prompt. WARNING: Be sure that you really know what you are doing as deleted objects are not recoverable.
- classmethod fetch_object(resolwe: Resolwe, id: int | None = None, slug: str | None = None) BaseResource¶
Return resource instance that is uniquely defined by identifier.
- save()¶
Save resource to the server.
- update()¶
Update resource fields from the server.
Permissions¶
Resources like resdk.resources.Data,
resdk.resources.Collection, resdk.resources.Sample, and
resdk.resources.Process include a permissions attribute to manage
permissions. The permissions attribute is an instance of
resdk.resources.permissions.PermissionsManager.
- class resdk.resources.permissions.PermissionsManager(all_permissions, api_root, resolwe)[source]¶
Helper class to manage permissions of the BaseResource.
- property editors¶
Get users with edit permission.
- property owners¶
Get users with owner permission.
- set_group(group, perm)[source]¶
Set perm permission to group.
When assigning permissions, only the highest permission needs to be given. Permission hierarchy is:
none (no permissions)
view
edit
share
owner
Some examples:
collection = res.collection.get(...)
# Add share, edit and view permission to BioLab:
collection.permissions.set_group('biolab', 'share')
# Remove share and edit permission from BioLab:
collection.permissions.set_group('biolab', 'view')
# Remove all permissions from BioLab:
collection.permissions.set_group('biolab', 'none')
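The rule that only the highest permission needs to be given can be illustrated with a small standalone sketch. The helper implied_permissions is hypothetical and not part of resdk; it only mirrors the hierarchy listed above.

```python
# Permission names in increasing order, as listed in the hierarchy above.
HIERARCHY = ["none", "view", "edit", "share", "owner"]


def implied_permissions(perm):
    """Return every permission covered by granting `perm`.

    Hypothetical helper for illustration only; not part of resdk.
    """
    if perm not in HIERARCHY:
        raise ValueError(f"Unknown permission: {perm!r}")
    # "none" implies nothing; otherwise take everything from "view" up to `perm`.
    return HIERARCHY[1 : HIERARCHY.index(perm) + 1]
```

Under this reading, granting 'share' covers view, edit and share, which matches the comments in the examples above.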
- set_public(perm)[source]¶
Set perm permission for public.
Public can only get two sorts of permissions:
none (no permissions)
view
Some examples:
collection = res.collection.get(...)
# Add view permission to public:
collection.permissions.set_public('view')
# Remove view permission from public:
collection.permissions.set_public('none')
- set_user(user, perm)[source]¶
Set perm permission to user.
When assigning permissions, only the highest permission needs to be given. Permission hierarchy is:
none (no permissions)
view
edit
share
owner
Some examples:
collection = res.collection.get(...)
# Add share, edit and view permission to John:
collection.permissions.set_user('john', 'share')
# Remove share and edit permission from John:
collection.permissions.set_user('john', 'view')
# Remove all permissions from John:
collection.permissions.set_user('john', 'none')
- property viewers¶
Get users with view permission.
Utility functions¶
Resource utility functions.
- resdk.resources.utils.fill_spaces(word, desired_length)[source]¶
Fill spaces at the end until word reaches desired length.
- resdk.resources.utils.flatten_field(field, schema, path)[source]¶
Reduce dicts of dicts to dot separated keys.
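The idea of reducing nested dicts to dot-separated keys can be sketched as follows. This is a simplified, hypothetical helper: resdk's flatten_field additionally takes a schema and a path, which are not modelled here.

```python
def flatten(nested, prefix=""):
    """Flatten a dict of dicts into a dict with dot-separated keys.

    Hypothetical helper for illustration only; not the resdk implementation.
    """
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the accumulated key path.
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat
```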
- resdk.resources.utils.get_collection_id(collection)[source]¶
Return id attribute of the object if it is collection, otherwise return given value.
- resdk.resources.utils.get_data_id(data)[source]¶
Return id attribute of the object if it is data, otherwise return given value.
- resdk.resources.utils.get_descriptor_schema_id(dschema)[source]¶
Get descriptor schema id.
Return id attribute of the object if it is descriptor schema, otherwise return given value.
- resdk.resources.utils.get_process_id(process)[source]¶
Return id attribute of the object if it is process, otherwise return given value.
- resdk.resources.utils.get_relation_id(relation)[source]¶
Return id attribute of the object if it is relation, otherwise return given value.
- resdk.resources.utils.get_sample_id(sample)[source]¶
Return id attribute of the object if it is sample, otherwise return given value.
- resdk.resources.utils.get_user_id(user)[source]¶
Return id attribute of the object if it is user, otherwise return given value.
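The get_*_id helpers above share one pattern: accept either a resource object or a raw value and return something usable as an ID. A minimal sketch of that pattern follows; the Sample stand-in class and the name get_id_or_value are hypothetical and not part of resdk.

```python
class Sample:
    """Stand-in for a resource class; hypothetical, for illustration only."""

    def __init__(self, id):
        self.id = id


def get_id_or_value(obj, resource_class):
    """Return obj.id if obj is an instance of resource_class, else obj itself."""
    return obj.id if isinstance(obj, resource_class) else obj
```

This lets calling code pass a resource or its ID interchangeably, for example get_id_or_value(Sample(7), Sample) and get_id_or_value(7, Sample) give the same result.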
- resdk.resources.utils.is_collection(collection)[source]¶
Return True if passed object is Collection and False otherwise.
- resdk.resources.utils.is_data(data)[source]¶
Return True if passed object is Data and False otherwise.
- resdk.resources.utils.is_descriptor_schema(data)[source]¶
Return True if passed object is DescriptorSchema and False otherwise.
- resdk.resources.utils.is_group(group)[source]¶
Return True if passed object is Group and False otherwise.
- resdk.resources.utils.is_process(process)[source]¶
Return True if passed object is Process and False otherwise.
- resdk.resources.utils.is_relation(relation)[source]¶
Return True if passed object is Relation and False otherwise.
- resdk.resources.utils.is_sample(sample)[source]¶
Return True if passed object is Sample and False otherwise.
- resdk.resources.utils.is_user(user)[source]¶
Return True if passed object is User and False otherwise.
- resdk.resources.utils.iterate_fields(fields, schema)[source]¶
Recursively iterate over all DictField sub-fields.
ReSDK Tables¶
Helper classes for aggregating collection data in tabular format.
Table classes¶
- class resdk.tables.microarray.MATables(collection: Collection, cache_dir: str | None = None, progress_callable: Callable | None = None)[source]¶
A helper class to fetch collection’s microarray, qc and meta data.
This class enables fetching given collection’s data and returning it as tables which have samples in rows and microarray / qc / metadata in columns.
A simple example:
# Get Collection object
collection = res.collection.get("collection-slug")
# Fetch collection microarray and metadata
tables = MATables(collection)
meta = tables.meta
exp = tables.exp
- __init__(collection: Collection, cache_dir: str | None = None, progress_callable: Callable | None = None)[source]¶
Initialize class.
- Parameters:
collection – collection to use
cache_dir – cache directory location, if not specified system specific cache directory is used
progress_callable – custom callable that can be used to report progress. By default, progress is written to stderr with tqdm
- property exp: DataFrame¶
Return expressions values table as a pandas DataFrame object.
- property meta: DataFrame¶
Return samples metadata table as a pandas DataFrame object.
- Returns:
table of metadata
- class resdk.tables.ml_ready.MLTables(collection, name)[source]¶
Machine-learning ready tables.
- property exp¶
Get ML ready expressions as pandas.DataFrame.
These expressions are normalized and batch effect corrected - thus ready to be taken into ML procedures.
- class resdk.tables.rna.RNATables(collection: Collection, cache_dir: str | None = None, progress_callable: Callable | None = None, expression_source: str | None = None, expression_process_slug: str | None = None)[source]¶
A helper class to fetch collection’s expression and meta data.
This class enables fetching given collection’s data and returning it as tables which have samples in rows and expressions/metadata in columns.
When calling RNATables.exp, RNATables.rc and RNATables.meta for the first time, the corresponding data gets downloaded from the server. This data then gets cached in memory and on disc and is used in subsequent calls. If the data on the server changes, the updated version gets re-downloaded.
A simple example:
# Get Collection object
collection = res.collection.get("collection-slug")
# Fetch collection expressions and metadata
tables = RNATables(collection)
exp = tables.exp
rc = tables.rc
meta = tables.meta
- __init__(collection: Collection, cache_dir: str | None = None, progress_callable: Callable | None = None, expression_source: str | None = None, expression_process_slug: str | None = None)[source]¶
Initialize class.
- Parameters:
collection – collection to use
cache_dir – cache directory location, if not specified system specific cache directory is used
progress_callable – custom callable that can be used to report progress. By default, progress is written to stderr with tqdm
expression_source – Only consider samples in the collection with specified source
expression_process_slug – Only consider samples in the collection with specified process slug
- property exp: DataFrame¶
Return expressions table as a pandas DataFrame object.
Which type of expressions (TPM, CPM, FPKM, …) get returned depends on how the data was processed. The expression type can be checked in the returned table attribute attrs['exp_type']:
exp = tables.exp
print(exp.attrs['exp_type'])
- Returns:
table of expressions
- property meta: DataFrame¶
Return samples metadata table as a pandas DataFrame object.
- Returns:
table of metadata
- property qc: DataFrame¶
Return QC table as a pandas DataFrame object.
- Returns:
table of QC values
- property rc: DataFrame¶
Return expression counts table as a pandas DataFrame object.
- Returns:
table of counts
- property readable_columns: Dict[str, str]¶
Map of source gene ids to symbols.
This also gets fetched only once and then cached in memory and on disc. RNATables.exp or RNATables.rc must be called before this as the mapping is specific to just this data. Its intended use is to rename table column labels from gene ids to symbols.
Example of use:
exp = exp.rename(columns=tables.readable_columns)
- Returns:
dict with gene ids as keys and gene symbols as values
- class resdk.tables.methylation.MethylationTables(collection: Collection, cache_dir: str | None = None, progress_callable: Callable | None = None)[source]¶
A helper class to fetch collection’s methylation and meta data.
This class enables fetching given collection’s data and returning it as tables which have samples in rows and methylation/metadata in columns.
A simple example:
# Get Collection object
collection = res.collection.get("collection-slug")
# Fetch collection methylation and metadata
tables = MethylationTables(collection)
meta = tables.meta
beta = tables.beta
m_values = tables.mval
- __init__(collection: Collection, cache_dir: str | None = None, progress_callable: Callable | None = None)[source]¶
Initialize class.
- Parameters:
collection – collection to use
cache_dir – cache directory location, if not specified system specific cache directory is used
progress_callable – custom callable that can be used to report progress. By default, progress is written to stderr with tqdm
- property beta: DataFrame¶
Return beta values table as a pandas DataFrame object.
- property meta: DataFrame¶
Return samples metadata table as a pandas DataFrame object.
- Returns:
table of metadata
- property mval: DataFrame¶
Return m-values as a pandas DataFrame object.
- class resdk.tables.variant.VariantTables(collection: Collection, geneset: List[str] | None = None, filtering: bool = True, cache_dir: str | None = None, progress_callable: Callable | None = None)[source]¶
A helper class to fetch collection’s variant and meta data.
This class enables fetching given collection’s data and returning it as tables which have samples in rows and variants in columns.
A simple example:
# Get Collection object
collection = res.collection.get("collection-slug")
tables = VariantTables(collection)
# Get variant data
tables.variants
# Get depth per variant or coverage for specific base
tables.depth
tables.depth_a
tables.depth_c
tables.depth_g
tables.depth_t
- __init__(collection: Collection, geneset: List[str] | None = None, filtering: bool = True, cache_dir: str | None = None, progress_callable: Callable | None = None)[source]¶
Initialize class.
- Parameters:
collection – Collection to use.
geneset – Only consider mutations from this gene-set. Can be a list of gene symbols or a valid geneset Data object id / slug.
filtering – Only show variants that pass QC filters.
cache_dir – Cache directory location, if not specified system specific cache directory is used.
progress_callable – Custom callable that can be used to report progress. By default, progress is written to stderr with tqdm.
- property depth: DataFrame¶
Get depth table.
- property depth_a: DataFrame¶
Get depth table for adenine.
- property depth_c: DataFrame¶
Get depth table for cytosine.
- property depth_g: DataFrame¶
Get depth table for guanine.
- property depth_t: DataFrame¶
Get depth table for thymine.
- property filter: DataFrame¶
Get filter table.
Values can be:
PASS: Variant has passed filters
DP: Insufficient read depth (< 10.0)
QD: Insufficient quality normalized by depth (< 2.0)
FS: Insufficient phred-scaled p-value using Fisher’s exact test to detect strand bias (> 30.0)
SnpCluster: Variant is part of a cluster
For example, if a variant has read depth 8, GATK will mark it as DP.
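The thresholds above can be expressed as a short sketch. The function failed_filters and its parameter names are hypothetical; the actual labels are assigned upstream by GATK, and how several failing filters are combined in the table is not specified here.

```python
def failed_filters(depth, qual_by_depth, fisher_strand, in_cluster=False):
    """Return the filter labels a variant would receive, or ["PASS"].

    Hypothetical helper for illustration only; thresholds are taken from
    the list above and this is not how resdk or GATK compute the table.
    """
    labels = []
    if depth < 10.0:
        labels.append("DP")
    if qual_by_depth < 2.0:
        labels.append("QD")
    if fisher_strand > 30.0:
        labels.append("FS")
    if in_cluster:
        labels.append("SnpCluster")
    return labels or ["PASS"]
```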
- property geneset¶
Get geneset.
- property meta: DataFrame¶
Return samples metadata table as a pandas DataFrame object.
- Returns:
table of metadata
- property variants: DataFrame¶
Get variants table.
There are 4 possible values:
0 - wild-type, no variant
1 - heterozygous mutation
2 - homozygous mutation
NaN - QC filters are failing - mutation status is unreliable
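When inspecting individual cells of the variants table, the four values can be translated into readable labels. The sketch below is a hypothetical helper, not part of resdk, and assumes cells are plain numbers or float NaN.

```python
import math

# Meaning of the numeric codes, as listed above.
LABELS = {0: "wild-type", 1: "heterozygous", 2: "homozygous"}


def describe_variant(value):
    """Translate a cell of the variants table into a readable label."""
    if isinstance(value, float) and math.isnan(value):
        return "unreliable (QC filters failed)"
    return LABELS[int(value)]
```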
Exceptions¶
Custom ReSDK exceptions.
Logging¶
Module contents:
Parent logger for all modules in resdk library
Handler STDOUT_HANDLER is “turned off” by default
Handler configuration functions
Override sys.excepthook to log all uncaught exceptions
Parent logger¶
Loggers in resdk are named by their module name. This is achieved by:
logger = logging.getLogger(__name__)
This makes it easy to locate the source of a log message.
Logging handlers¶
The handler STDOUT_HANDLER is created but not
automatically added to ROOT_LOGGER, which means it does nothing by default.
The handler is activated when users call a logger configuration
function such as start_logging().
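The pattern described above (build the handler at import time, attach it only on request) can be sketched with the standard logging module. The logger name "mylib" and the function body are assumptions for illustration; this is not resdk's actual implementation.

```python
import logging
import sys

# Hypothetical names mirroring the ones mentioned above.
ROOT_LOGGER = logging.getLogger("mylib")
STDOUT_HANDLER = logging.StreamHandler(sys.stdout)


def start_logging(logging_level=logging.INFO):
    """Attach the pre-built handler and set the threshold level."""
    ROOT_LOGGER.setLevel(logging_level)
    STDOUT_HANDLER.setLevel(logging_level)
    # Guard against attaching the same handler twice on repeated calls.
    if STDOUT_HANDLER not in ROOT_LOGGER.handlers:
        ROOT_LOGGER.addHandler(STDOUT_HANDLER)
```

Until start_logging() is called, messages sent to the library logger are not written to stdout by this handler.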
Handler configuration functions¶
As a good logging practice, the library does not register handlers by default. The reason is that if the library is included in some application, developers of that application will probably want to register handlers themselves. Therefore, a user who wishes to register the pre-defined handlers can run:
import resdk
resdk.start_logging()
- resdk_logger.start_logging(logging_level=logging.INFO)¶
Start logging resdk with the default configuration.
- Parameters:
logging_level (int) – logging threshold level - integer in [0-50]
- Return type:
None
Logging levels:
logging.DEBUG(10)
logging.INFO(20)
logging.WARNING(30)
logging.ERROR(40)
logging.CRITICAL(50)
Log uncaught exceptions¶
All uncaught Python exceptions are handled by the function stored in
sys.excepthook. By replacing the default implementation, we can
modify it for our purposes - to log all uncaught exceptions.
Note#1: Modified behaviour (logging of all uncaught exceptions) applies only when running in non-interactive mode.
Note#2: Any exception can be caught/uncaught and it can happen in interactive/non-interactive mode. This makes 4 different scenarios. The sys.excepthook modification takes care of uncaught exceptions in non-interactive mode. In interactive mode, the user is notified directly if an exception is raised. If an exception is caught and not re-raised, it should be logged somehow, since it can provide valuable information for a developer when debugging. Therefore, we should use the following convention for logging in resdk: “Exceptions are explicitly logged only when they are caught and not re-raised.”
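A minimal sketch of this kind of hook, using only the standard library, is shown below. The logger name "mylib" and the use of sys.ps1 to detect an interactive session are assumptions for illustration; resdk's own implementation may differ.

```python
import logging
import sys

logger = logging.getLogger("mylib")  # hypothetical logger name


def log_uncaught(exc_type, exc_value, exc_traceback):
    """Log an uncaught exception, then defer to the default hook."""
    logger.error(
        "Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback)
    )
    sys.__excepthook__(exc_type, exc_value, exc_traceback)


# sys.ps1 exists only in interactive sessions, so the hook is installed
# only when running non-interactively.
if not hasattr(sys, "ps1"):
    sys.excepthook = log_uncaught
```

Delegating to sys.__excepthook__ keeps the usual traceback output, so logging is added on top of the default behaviour and does not replace it.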