immuneML.workflows.instructions.exploratory_analysis package
Submodules
immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisInstruction module
- class immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisInstruction.ExploratoryAnalysisInstruction(exploratory_analysis_units: dict, name: str = None)
Bases: Instruction
Allows exploratory analysis of different datasets using encodings and reports.
The analysis is defined by a dictionary of ExploratoryAnalysisUnit objects, each of which encapsulates a dataset, an optional encoding, and a report to be executed on the (optionally encoded) dataset. Each analysis specified under analyses is independent of all the others.
Specification arguments:
analyses (dict): a dictionary of analyses to perform. The keys are the names of the analyses, and the value for each analysis is a dictionary with the following items:
- dataset: the dataset on which to perform the exploratory analysis.
- preprocessing_sequence: which preprocessing sequence to apply to the dataset; optional.
- example_weighting: which example weighting strategy to use before encoding the data; optional.
- encoding: how to encode the dataset before running the report; optional.
- labels: if encoding is specified, the relevant labels should be specified here.
- dim_reduction: which dimensionality reduction method to apply to the encoded dataset; optional.
- report: which report to run on the dataset. Reports specified here may be of the category Data reports or Encoding reports, depending on whether ‘encoding’ was specified.
number_of_processes (int): how many processes should be created at once to speed up the analysis. For personal machines, 4 or 8 is usually a good choice.
YAML specification:
    instructions:
        my_expl_analysis_instruction: # user-defined instruction name
            type: ExploratoryAnalysis # which instruction to execute
            analyses: # analyses to perform
                my_first_analysis: # user-defined name of the analysis
                    dataset: d1 # dataset to use in the first analysis
                    preprocessing_sequence: p1 # preprocessing sequence to use in the first analysis
                    report: r1 # which report to generate using the dataset d1
                my_second_analysis: # user-defined name of another analysis
                    dataset: d1 # dataset to use in the second analysis - can be the same or different from other analyses
                    encoding: e1 # encoding to apply on the specified dataset (d1)
                    report: r2 # which report to generate in the second analysis
                    labels: # labels present in the dataset d1 which will be included in the encoded data on which report r2 will be run
                        - celiac # name of the first label as present in the column of dataset's metadata file
                        - CMV # name of the second label as present in the column of dataset's metadata file
                my_third_analysis: # user-defined name of another analysis
                    dataset: d1 # dataset to use in the third analysis - can be the same or different from other analyses
                    encoding: e1 # encoding to apply on the specified dataset (d1)
                    dim_reduction: umap # or None; which dimensionality reduction method to apply to encoded d1
                    report: r3 # which report to generate in the third analysis
            number_of_processes: 4 # number of parallel processes to create (could speed up the computation)
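The analyses section of the specification above can also be written as a plain Python dictionary, which makes its rules easy to check before a run. The sketch below is not part of immuneML: `check_analyses` is a hypothetical helper, and the rules it applies (dataset and report are required, the other items are optional, and the report category depends on whether an encoding is given) are read off the argument descriptions on this page rather than taken from the library's own parser.

```python
# Hypothetical helper, not part of immuneML: it only mirrors the argument
# list documented on this page.
REQUIRED_KEYS = {"dataset", "report"}
OPTIONAL_KEYS = {"preprocessing_sequence", "example_weighting", "encoding",
                 "labels", "dim_reduction"}


def check_analyses(analyses: dict) -> dict:
    """Return the expected report category for each analysis.

    Raises ValueError if an analysis lacks a required key or uses a key
    that is not in the documented list.
    """
    categories = {}
    for name, spec in analyses.items():
        keys = set(spec)
        missing = REQUIRED_KEYS - keys
        unknown = keys - REQUIRED_KEYS - OPTIONAL_KEYS
        if missing or unknown:
            raise ValueError(
                f"{name}: missing={sorted(missing)}, unknown={sorted(unknown)}")
        # With an encoding the report runs on encoded data (Encoding reports);
        # without one it runs on the dataset itself (Data reports).
        categories[name] = "Encoding reports" if "encoding" in spec else "Data reports"
    return categories


# The same three analyses as in the YAML specification, written as a dict.
analyses = {
    "my_first_analysis": {"dataset": "d1", "preprocessing_sequence": "p1",
                          "report": "r1"},
    "my_second_analysis": {"dataset": "d1", "encoding": "e1", "report": "r2",
                           "labels": ["celiac", "CMV"]},
    "my_third_analysis": {"dataset": "d1", "encoding": "e1",
                          "dim_reduction": "umap", "report": "r3"},
}

categories = check_analyses(analyses)
```

Here `categories` maps the first analysis to Data reports and the other two to Encoding reports, since only the latter specify an encoding.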
- encode(unit: ExploratoryAnalysisUnit, result_path: Path) -> Dataset
- preprocess_dataset(unit: ExploratoryAnalysisUnit, result_path: Path) -> Dataset
- run_report(unit: ExploratoryAnalysisUnit, result_path: Path)
- run_unit(unit: ExploratoryAnalysisUnit, result_path: Path) -> ReportResult
- weight_examples(unit: ExploratoryAnalysisUnit, result_path: Path)
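Read together with the argument descriptions, the method names suggest one plausible order in which a unit is processed: preprocess the dataset, optionally encode it, then run the report. The sketch below illustrates only that ordering with toy stand-in functions. It is an assumption drawn from this page, not the body of `run_unit`; none of the names in it are immuneML objects, and example weighting and dimensionality reduction are left out for brevity.

```python
from typing import Callable, List, Optional


def run_unit_sketch(dataset: list,
                    report: Callable[[list], str],
                    preprocessing_sequence: Optional[List[Callable[[list], list]]] = None,
                    encoder: Optional[Callable[[list], list]] = None) -> str:
    """Apply the optional steps in order, then run the report."""
    # 1. Preprocessing is optional; each step maps a dataset to a dataset.
    for step in preprocessing_sequence or []:
        dataset = step(dataset)
    # 2. Encoding is optional; if given, the report sees the encoded data.
    if encoder is not None:
        dataset = encoder(dataset)
    # 3. The report always runs last.
    return report(dataset)


# Toy stand-ins: sequences are strings, "encoding" is the sequence length.
def drop_short(seqs: list) -> list:
    return [s for s in seqs if len(s) >= 3]


def length_encoder(seqs: list) -> list:
    return [len(s) for s in seqs]


def summary_report(data: list) -> str:
    return f"{len(data)} examples: {data}"


sequences = ["CASS", "CA", "CASSLG"]
data_report = run_unit_sketch(sequences, summary_report, [drop_short])
encoding_report = run_unit_sketch(sequences, summary_report, [drop_short],
                                  length_encoder)
```

The first call corresponds to an analysis without an encoding, so the report receives the preprocessed sequences; the second adds an encoder, so the same report receives the encoded values instead.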
immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisState module
immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit module
- class immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisUnit.ExploratoryAnalysisUnit(dataset: immuneML.data_model.datasets.Dataset.Dataset, report: immuneML.reports.Report.Report, preprocessing_sequence: list = None, encoder: immuneML.encodings.DatasetEncoder.DatasetEncoder = None, example_weighting: immuneML.example_weighting.ExampleWeightingStrategy.ExampleWeightingStrategy = None, label_config: immuneML.environment.LabelConfiguration.LabelConfiguration = None, number_of_processes: int = 1, report_result: immuneML.reports.ReportResult.ReportResult = None, dim_reduction: immuneML.ml_methods.dim_reduction.DimRedMethod.DimRedMethod = None)
Bases: object
- dataset: Dataset
- dim_reduction: DimRedMethod = None
- encoder: DatasetEncoder = None
- example_weighting: ExampleWeightingStrategy = None
- label_config: LabelConfiguration = None
- number_of_processes: int = 1
- preprocessing_sequence: list = None
- report_result: ReportResult = None
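The constructor signature above has the shape of a Python dataclass in which only dataset and report lack defaults. A minimal sketch of that shape follows, assuming a hypothetical class name `UnitSketch` and using `object` as a stand-in for the immuneML types; it shows which fields must be supplied and which fall back to defaults, and is not the library's class definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnitSketch:
    # Stand-in for ExploratoryAnalysisUnit; `object` replaces immuneML types.
    dataset: object                                # required
    report: object                                 # required
    preprocessing_sequence: Optional[list] = None  # optional
    encoder: Optional[object] = None               # optional
    example_weighting: Optional[object] = None     # optional
    label_config: Optional[object] = None          # optional
    number_of_processes: int = 1                   # single process by default
    report_result: Optional[object] = None         # empty until a report has run
    dim_reduction: Optional[object] = None         # optional


# Only the two required fields are supplied; the rest take their defaults.
unit = UnitSketch(dataset="d1", report="r1")
```

With these defaults, a unit built from just a dataset and a report has no encoder and a single process, which matches the first analysis in the YAML specification.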