- class immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisInstruction.ExploratoryAnalysisInstruction(exploratory_analysis_units: dict, name: str = None)
Allows exploratory analysis of different datasets using encodings and reports.
The analysis is defined by a dictionary of ExploratoryAnalysisUnit objects, each of which encapsulates a dataset, an optional encoding, and a report to be executed on the (optionally encoded) dataset. Each analysis specified under analyses is completely independent of all the others.
analyses (dict) – a dictionary of analyses to perform. The keys are the names of the different analyses, and the value for each analysis is a dictionary with the following items:
    - dataset – the dataset on which to perform the exploratory analysis.
    - preprocessing_sequence – which preprocessing sequence to apply to the dataset. Optional.
    - encoding – how to encode the dataset before running the report. Optional.
    - labels – if encoding is specified, the relevant labels must be specified here.
    - report – which report to run on the dataset. Reports specified here may be of the category Data reports or Encoding reports, depending on whether ‘encoding’ was specified.
number_of_processes (int) – how many processes should be created at once to speed up the analysis. For personal machines, 4 or 8 is usually a good choice.
YAML specification:

    my_expl_analysis_instruction: # user-defined instruction name
        type: ExploratoryAnalysis # which instruction to execute
        analyses: # analyses to perform
            my_first_analysis: # user-defined name of the analysis
                dataset: d1 # dataset to use in the first analysis
                preprocessing_sequence: p1 # preprocessing sequence to use in the first analysis
                report: r1 # which report to generate using the dataset d1
            my_second_analysis: # user-defined name of another analysis
                dataset: d1 # dataset to use in the second analysis - can be the same or different from other analyses
                encoding: e1 # encoding to apply on the specified dataset (d1)
                report: r2 # which report to generate in the second analysis
                labels: # labels present in the dataset d1 which will be included in the encoded data on which report r2 will be run
                    - celiac # name of the first label as present in the column of dataset's metadata file
                    - CMV # name of the second label as present in the column of dataset's metadata file
        number_of_processes: 4 # number of parallel processes to create (could speed up the computation)
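The rules stated for the items of each analysis can be checked before a specification is handed to immuneML. The following is a minimal sketch of such a check in plain Python. The function check_analyses and the constant ALLOWED_KEYS are hypothetical helpers written for illustration, not part of immuneML. Treating dataset and report as required is an assumption inferred from the fact that only preprocessing_sequence and encoding are described as optional.

```python
from typing import Any, Dict, List

# Item names taken from the parameter description of `analyses`.
ALLOWED_KEYS = {"dataset", "preprocessing_sequence", "encoding", "labels", "report"}


def check_analyses(analyses: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of problems found in an `analyses` mapping (empty if none).

    Hypothetical helper for illustration only; not an immuneML function.
    """
    problems = []
    for name, spec in analyses.items():
        unknown = set(spec) - ALLOWED_KEYS
        if unknown:
            problems.append(f"{name}: unknown keys {sorted(unknown)}")
        # Assumption: dataset and report are required for every analysis.
        for required in ("dataset", "report"):
            if required not in spec:
                problems.append(f"{name}: missing required key '{required}'")
        # Documented rule: labels must be given whenever an encoding is specified.
        if "encoding" in spec and not spec.get("labels"):
            problems.append(f"{name}: 'labels' must be given when 'encoding' is set")
    return problems


# The same two analyses as in the YAML specification, written as a Python dict.
analyses = {
    "my_first_analysis": {"dataset": "d1", "preprocessing_sequence": "p1", "report": "r1"},
    "my_second_analysis": {"dataset": "d1", "encoding": "e1", "report": "r2",
                           "labels": ["celiac", "CMV"]},
}

print(check_analyses(analyses))
```

Because each analysis is independent of the others, the check treats every entry on its own and never compares one analysis with another.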
- run(result_path: Path)
- class immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit.ExploratoryAnalysisUnit(dataset: immuneML.data_model.dataset.Dataset.Dataset, report: immuneML.reports.Report.Report, preprocessing_sequence: list = None, encoder: immuneML.encodings.DatasetEncoder.DatasetEncoder = None, label_config: immuneML.environment.LabelConfiguration.LabelConfiguration = None, number_of_processes: int = 1, report_result: immuneML.reports.ReportResult.ReportResult = None)
- number_of_processes: int = 1
- preprocessing_sequence: list = None
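The constructor signature above can be read as a record with two required fields and several optional ones. The sketch below mirrors that shape with a stand-in dataclass so the defaults are easy to see. UnitSketch is a hypothetical class written for illustration; it is not the immuneML implementation, and the immuneML types (Dataset, Report, DatasetEncoder, LabelConfiguration, ReportResult) are replaced with Any so the example runs without immuneML installed.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UnitSketch:
    """Hypothetical stand-in mirroring the ExploratoryAnalysisUnit signature."""
    dataset: Any                                   # required
    report: Any                                    # required
    preprocessing_sequence: Optional[list] = None  # optional, no preprocessing by default
    encoder: Any = None                            # optional, no encoding by default
    label_config: Any = None                       # only relevant when an encoder is set
    number_of_processes: int = 1                   # single process by default
    report_result: Any = None                      # no result at construction time


# Only the dataset and the report are needed; the strings are placeholders.
unit = UnitSketch(dataset="d1", report="r1")
print(unit.number_of_processes, unit.preprocessing_sequence, unit.encoder)
```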