immuneML.workflows.instructions.dataset_generation package


immuneML.workflows.instructions.dataset_generation.DatasetExportInstruction module

class immuneML.workflows.instructions.dataset_generation.DatasetExportInstruction.DatasetExportInstruction(datasets: List[Dataset], exporters: List[DataExporter], number_of_processes: int = 1, preprocessing_sequence: List[Preprocessor] = None, result_path: Path = None, name: str = None)

Bases: Instruction

DatasetExport instruction takes a list of datasets as input, optionally applies preprocessing steps, and outputs the data in specified formats.

  • datasets (list) – a list of datasets to export in all given formats

  • preprocessing_sequence (list) – the preprocessing sequence to apply to the dataset(s). This item is optional. When specified, the same preprocessing sequence is applied to all datasets.

  • exporters (list) – a list of formats in which to export the datasets, given under the export_formats key in the YAML specification. Valid formats are class names of any non-abstract class inheriting DataExporter.

  • number_of_processes (int) – how many processes to use during repertoire export (not used for sequence datasets)

YAML specification:

my_dataset_export_instruction: # user-defined instruction name
    type: DatasetExport # which instruction to execute
    datasets: # list of datasets to export
        - my_generated_dataset
        - my_dataset_from_adaptive
    preprocessing_sequence: my_preprocessing_sequence
    number_of_processes: 4
    export_formats: # list of formats to export the datasets to
        - AIRR
        - ImmuneML
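
The specification above can also be viewed as a plain Python dict. The sketch below mirrors the YAML keys and runs a few sanity checks before the specification is handed to immuneML. The helper check_export_spec and the defined_datasets argument are hypothetical and not part of immuneML. It assumes that datasets and export_formats must be non-empty lists and that number_of_processes defaults to 1, as in the constructor signature.

```python
from typing import Any, Dict, List

# Mirrors the YAML specification above as a plain Python dict.
export_spec: Dict[str, Any] = {
    "type": "DatasetExport",
    "datasets": ["my_generated_dataset", "my_dataset_from_adaptive"],
    "preprocessing_sequence": "my_preprocessing_sequence",  # optional
    "number_of_processes": 4,
    "export_formats": ["AIRR", "ImmuneML"],
}


def check_export_spec(spec: Dict[str, Any], defined_datasets: List[str]) -> List[str]:
    """Hypothetical helper (not part of immuneML): list problems found in the spec."""
    problems = []
    if spec.get("type") != "DatasetExport":
        problems.append("type must be DatasetExport")
    # Assumption: both keys are required and must be non-empty lists.
    for key in ("datasets", "export_formats"):
        value = spec.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"{key} must be a non-empty list")
    # Every exported dataset should refer to a dataset defined elsewhere in the spec.
    datasets = spec.get("datasets")
    if isinstance(datasets, list):
        for dataset_name in datasets:
            if dataset_name not in defined_datasets:
                problems.append(f"dataset {dataset_name!r} is not defined")
    # number_of_processes defaults to 1 in the constructor signature.
    n_processes = spec.get("number_of_processes", 1)
    if not isinstance(n_processes, int) or n_processes < 1:
        problems.append("number_of_processes must be a positive integer")
    return problems


if __name__ == "__main__":
    known = ["my_generated_dataset", "my_dataset_from_adaptive"]
    for problem in check_export_spec(export_spec, known):
        print(problem)
```

An empty list from check_export_spec means none of these checks failed. It does not guarantee that immuneML will accept the specification.
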

static get_documentation()

run(result_path: Path) → DatasetExportState

immuneML.workflows.instructions.dataset_generation.DatasetExportState module

class immuneML.workflows.instructions.dataset_generation.DatasetExportState.DatasetExportState(datasets: List[immuneML.data_model.dataset.Dataset.Dataset], formats: List[str], preprocessing_sequence: List[immuneML.preprocessing.Preprocessor.Preprocessor], paths: dict, result_path: pathlib.Path, name: str)

Bases: object

datasets: List[Dataset]
formats: List[str]
name: str
paths: dict
preprocessing_sequence: List[Preprocessor]
result_path: Path
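
The attributes listed above describe a simple record of what was exported and where. The following is a minimal sketch of such a record as a dataclass, using the same field names and order as the constructor. DatasetExportStateSketch is an illustrative stand-in, not the immuneML class. Dataset and Preprocessor are replaced by Any so that the example runs without immuneML, and the nested layout of paths (dataset name, then format, then location) is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, List


@dataclass
class DatasetExportStateSketch:
    """Illustrative stand-in for DatasetExportState; field names follow the listing above."""
    datasets: List[Any]                # Dataset objects in immuneML; Any keeps this self-contained
    formats: List[str]
    preprocessing_sequence: List[Any]  # Preprocessor objects in immuneML
    paths: dict
    result_path: Path
    name: str


state = DatasetExportStateSketch(
    datasets=["my_generated_dataset"],  # placeholder; real code would hold Dataset instances
    formats=["AIRR", "ImmuneML"],
    preprocessing_sequence=[],
    # Assumed layout: {dataset name: {format: export location}}
    paths={"my_generated_dataset": {"AIRR": Path("results/my_generated_dataset/AIRR")}},
    result_path=Path("results"),
    name="my_dataset_export_instruction",
)

# Walk the assumed paths structure to report where each export was written.
for dataset_name, by_format in state.paths.items():
    for export_format, location in by_format.items():
        print(f"{dataset_name} [{export_format}] -> {location}")
```
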

Module contents