immuneML.workflows.instructions.exploratory_analysis package

Submodules

immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisInstruction module

class immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisInstruction.ExploratoryAnalysisInstruction(exploratory_analysis_units: dict, name: Optional[str] = None)[source]

Bases: immuneML.workflows.instructions.Instruction.Instruction

Allows exploratory analysis of different datasets using encodings and reports.

Analysis is defined by a dictionary of ExploratoryAnalysisUnit objects that encapsulate a dataset, an encoding [optional] and a report to be executed on the [encoded] dataset. Each analysis specified under analyses is completely independent from all others.

Parameters

analyses (dict) – a dictionary of analyses to perform. The keys are the names of the analyses, and the value for each analysis is a dictionary with the following items:
    dataset – the dataset on which to perform the exploratory analysis.
    preprocessing_sequence – which preprocessing sequence to apply to the dataset. This item is optional.
    encoding – how to encode the dataset before running the report. This item is optional.
    labels – if encoding is specified, the relevant labels must be specified here.
    report – which report to run on the dataset. Reports specified here may be of the category Data reports or Encoding reports, depending on whether ‘encoding’ was specified.
number_of_processes (int) – how many processes should be created at once to speed up the analysis. For personal machines, 4 or 8 is usually a good choice.

YAML specification:

my_expl_analysis_instruction: # user-defined instruction name
    type: ExploratoryAnalysis # which instruction to execute
    analyses: # analyses to perform
        my_first_analysis: # user-defined name of the analysis
            dataset: d1 # dataset to use in the first analysis
            preprocessing_sequence: p1 # preprocessing sequence to use in the first analysis
            report: r1 # which report to generate using the dataset d1
        my_second_analysis: # user-defined name of another analysis
            dataset: d1 # dataset to use in the second analysis - can be the same or different from other analyses
            encoding: e1 # encoding to apply on the specified dataset (d1)
            report: r2 # which report to generate in the second analysis
            labels: # labels present in the dataset d1 which will be included in the encoded data on which report r2 will be run
                - celiac # name of the first label as present in the column of dataset's metadata file
                - CMV # name of the second label as present in the column of dataset's metadata file
    number_of_processes: 4 # number of parallel processes to create (could speed up the computation)
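
The rules listed under Parameters can be checked before a specification is run. The following is a minimal sketch in plain Python, assuming the analyses section has already been loaded into a dictionary. `check_analyses` is a hypothetical helper written for illustration and is not part of immuneML. Treating dataset and report as required is an assumption based on the other items being marked optional.

```python
# Hypothetical helper, not part of immuneML: checks the key rules described above.
REQUIRED_KEYS = {"dataset", "report"}  # assumption: every item not marked optional is required


def check_analyses(analyses: dict) -> list:
    """Return a list of human-readable problems found in an `analyses` dict."""
    problems = []
    for name, spec in analyses.items():
        missing = REQUIRED_KEYS - set(spec)
        if missing:
            problems.append(f"{name}: missing required key(s) {sorted(missing)}")
        # Labels must accompany an encoding.
        if "encoding" in spec and not spec.get("labels"):
            problems.append(f"{name}: 'encoding' is set but 'labels' is empty or missing")
    return problems


# The same two analyses as in the YAML specification above.
analyses = {
    "my_first_analysis": {"dataset": "d1", "preprocessing_sequence": "p1", "report": "r1"},
    "my_second_analysis": {"dataset": "d1", "encoding": "e1", "report": "r2",
                           "labels": ["celiac", "CMV"]},
}

problems = check_analyses(analyses)
print(problems)
```

Both example analyses pass the check. An analysis that sets encoding without labels, or omits report, would be reported by name.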

encode(unit: immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit.ExploratoryAnalysisUnit, result_path: pathlib.Path) → immuneML.data_model.dataset.Dataset.Dataset[source]

preprocess_dataset(unit: immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit.ExploratoryAnalysisUnit, result_path: pathlib.Path) → immuneML.data_model.dataset.Dataset.Dataset[source]

run(result_path: pathlib.Path)[source]

run_unit(unit: immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit.ExploratoryAnalysisUnit, result_path: pathlib.Path) → immuneML.reports.ReportResult.ReportResult[source]
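
The method signatures above, together with the class description, suggest that each unit is preprocessed, optionally encoded, and then passed to its report. A rough sketch of that flow follows, using plain Python callables as stand-ins. `SketchUnit` and `run_unit_sketch` are hypothetical names, the step order is an assumption, and none of this is immuneML's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SketchUnit:
    """Stand-in for an analysis unit; the fields are simplified assumptions."""
    dataset: list
    report: Callable[[list], str]
    preprocessing_sequence: List[Callable[[list], list]] = field(default_factory=list)
    encoder: Optional[Callable[[list], list]] = None


def run_unit_sketch(unit: SketchUnit) -> str:
    data = unit.dataset
    # 1. Apply each preprocessing step in order (skipped if none are given).
    for step in unit.preprocessing_sequence:
        data = step(data)
    # 2. Encode only when an encoder was specified.
    if unit.encoder is not None:
        data = unit.encoder(data)
    # 3. Run the report on the (possibly encoded) data.
    return unit.report(data)


unit = SketchUnit(
    dataset=["CASSL", "", "CASSQ"],
    preprocessing_sequence=[lambda seqs: [s for s in seqs if s]],  # drop empty sequences
    encoder=lambda seqs: [len(s) for s in seqs],                   # toy "encoding": sequence lengths
    report=lambda values: f"{len(values)} items, total {sum(values)}",
)
result = run_unit_sketch(unit)
print(result)
```

Because the encoder is optional, the same function covers both cases from the YAML specification: a report run directly on a preprocessed dataset, and a report run on an encoded one.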

immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisState module

class immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisState.ExploratoryAnalysisState(exploratory_analysis_units: dict, result_path: pathlib.Path = None, name: str = None)[source]

Bases: object

exploratory_analysis_units: dict

name: str = None

result_path: pathlib.Path = None

immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit module

class immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit.ExploratoryAnalysisUnit(dataset: immuneML.data_model.dataset.Dataset.Dataset, report: immuneML.reports.Report.Report, preprocessing_sequence: list = None, encoder: immuneML.encodings.DatasetEncoder.DatasetEncoder = None, label_config: immuneML.environment.LabelConfiguration.LabelConfiguration = None, number_of_processes: int = 1, report_result: immuneML.reports.ReportResult.ReportResult = None)[source]